WorldWideScience

Sample records for gene expression-based classification

  1. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

    Science.gov (United States)

    Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

    2006-11-01

    To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
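
    The evaluation scheme in this abstract (a shrunken-centroid classifier assessed by 10-times-repeated 10-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn is available, uses its NearestCentroid with a shrink threshold as a stand-in for PAM, and runs on synthetic data. The matrix size, the labels, and the threshold value are all assumptions.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix: 77 samples x 200 probes.
n_samples, n_probes = 77, 200
y = np.array([0] * 40 + [1] * 37)   # hypothetical favorable/unfavorable labels
X = rng.normal(size=(n_samples, n_probes))
X[y == 1, :20] += 1.5               # only the first 20 probes carry class signal

# Nearest shrunken centroids; the threshold is an arbitrary illustrative value.
clf = NearestCentroid(shrink_threshold=0.5)

# 10-fold cross-validation repeated 10 times. The classifier (including the
# centroid shrinkage) is refit inside every fold, so no test sample leaks
# into training.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"{len(scores)} folds, mean accuracy {mean_accuracy:.3f}")
```

    In practice the shrink threshold would itself be tuned inside the training folds; it is fixed here only to keep the sketch short.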

  2. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and ...
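
    A rough sketch of the quantile discretization idea mentioned above: within each sample, expression values are replaced by the index of the quantile bin they fall into, so that measurements on different scales become comparable. The function name, the number of bins, and the toy values are assumptions made for illustration; the median rank score transformation is not shown.

```python
import numpy as np

def quantile_discretize(expr, n_bins=8):
    """Map each sample's expression values to integer bins of (nearly) equal size.

    expr has shape (n_samples, n_genes); n_bins is an illustrative choice.
    """
    expr = np.asarray(expr, dtype=float)
    # Rank the genes within each sample (0 = lowest expression).
    ranks = expr.argsort(axis=1).argsort(axis=1)
    # Convert ranks to bin indices 0 .. n_bins-1.
    return (ranks * n_bins) // expr.shape[1]

# Two hypothetical platforms measuring the same 8 genes on different scales.
cdna_log_ratios = np.array([[-2.0, -1.0, -0.5, 0.0, 0.3, 0.9, 1.5, 2.2]])
oligo_intensities = np.array([[40.0, 90.0, 150.0, 300.0, 450.0, 800.0, 1600.0, 5000.0]])

bins_cdna = quantile_discretize(cdna_log_ratios, n_bins=4)
bins_oligo = quantile_discretize(oligo_intensities, n_bins=4)
print(bins_cdna.tolist())   # [[0, 0, 1, 1, 2, 2, 3, 3]]
print(bins_oligo.tolist())  # [[0, 0, 1, 1, 2, 2, 3, 3]]
```

    Because only the within-sample ordering is used, the two toy profiles map to identical bin vectors even though their raw units differ.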

  3. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, and (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  4. Classification across gene expression microarray studies

    Directory of Open Access Journals (Sweden)

    Kuner Ruprecht

    2009-12-01

    Background: The increasing number of gene expression microarray studies represents an important resource in biomedical research. As a result, gene expression based diagnosis has entered clinical practice for patient stratification in breast cancer. However, the integration and combined analysis of microarray studies remains a challenge. We assessed the potential benefit of data integration on the classification accuracy and systematically evaluated the generalization performance of selected methods on four breast cancer studies comprising almost 1000 independent samples. To this end, we introduced an evaluation framework which aims to establish good statistical practice and a graphical way to monitor differences. The classification goal was to correctly predict estrogen receptor status (negative/positive) and histological grade (low/high) of each tumor sample in an independent study which was not used for the training. For the classification we chose support vector machines (SVM), predictive analysis of microarrays (PAM), random forest (RF) and k-top scoring pairs (kTSP). Guided by considerations relevant for classification across studies, we developed a generalization of kTSP, which we evaluated in addition. Our derived version (DV) aims to improve the robustness of the intrinsic invariance of kTSP with respect to technologies and preprocessing. Results: For each individual study the generalization error was benchmarked via complete cross-validation and was found to be similar for all classification methods. The misclassification rates were substantially higher in classification across studies, when each single study was used as an independent test set while all remaining studies were combined for the training of the classifier. However, with increasing number of independent microarray studies used in the training, the overall classification performance improved. DV performed better than the average and showed slightly less variance. In ...
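
    The invariance of kTSP "with respect to technologies and preprocessing" comes from the fact that it only compares the ordering of two genes within a sample. The sketch below implements a plain k-top-scoring-pairs classifier under simplifying assumptions (binary labels 0/1, exhaustive pair search, majority vote). It is not the derived version (DV) described in the abstract, and all names and data are hypothetical.

```python
import numpy as np

def fit_ktsp(X, y, k=3):
    """Pick k disjoint gene pairs whose ordering differs most between classes 0 and 1."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    # less[s, i, j] is True when gene i is expressed below gene j in sample s.
    less = X[:, :, None] < X[:, None, :]
    p0 = less[y == 0].mean(axis=0)   # estimate of P(X_i < X_j | class 0)
    p1 = less[y == 1].mean(axis=0)   # estimate of P(X_i < X_j | class 1)
    delta = p1 - p0
    # Visit each unordered pair once (i < j), highest |delta| first.
    iu, ju = np.triu_indices(X.shape[1], k=1)
    order = np.argsort(-np.abs(delta[iu, ju]), kind="stable")
    pairs, used = [], set()
    for idx in order:
        i, j = int(iu[idx]), int(ju[idx])
        if i in used or j in used:
            continue
        # Orient the pair (a, b) so that "gene a < gene b" is a vote for class 1.
        pairs.append((i, j) if delta[i, j] > 0 else (j, i))
        used.update((i, j))
        if len(pairs) == k:
            break
    return pairs

def predict_ktsp(pairs, X):
    """Majority vote over the selected pairs; returns 0/1 labels."""
    X = np.asarray(X, dtype=float)
    votes = np.array([X[:, a] < X[:, b] for a, b in pairs])  # shape (k, n_samples)
    return (votes.mean(axis=0) > 0.5).astype(int)

# Synthetic data: genes 0 and 1 swap their ordering between the two classes.
rng = np.random.default_rng(1)
y_train = np.array([0] * 20 + [1] * 20)
X_train = rng.normal(size=(40, 6))
X_train[y_train == 0, 0] -= 3.0
X_train[y_train == 0, 1] += 3.0
X_train[y_train == 1, 0] += 3.0
X_train[y_train == 1, 1] -= 3.0

pairs = fit_ktsp(X_train, y_train, k=1)
pred_raw = predict_ktsp(pairs, X_train)
pred_scaled = predict_ktsp(pairs, np.exp(X_train))  # strictly monotone rescaling
train_accuracy = (pred_raw == y_train).mean()
```

    Predictions on the raw and the exponentiated data agree because a strictly monotone transformation applied to a whole sample does not change which of two genes is larger.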

  5. A protein and mRNA expression-based classification of gastric cancer.

    Science.gov (United States)

    Setia, Namrata; Agoston, Agoston T; Han, Hye S; Mullen, John T; Duda, Dan G; Clark, Jeffrey W; Deshpande, Vikram; Mino-Kenudson, Mari; Srivastava, Amitabh; Lennerz, Jochen K; Hong, Theodore S; Kwak, Eunice L; Lauwers, Gregory Y

    2016-07-01

    The overall survival of gastric carcinoma patients remains poor despite improved control over known risk factors and surveillance. This highlights the need for new classifications, driven towards identification of potential therapeutic targets. Using sophisticated molecular technologies and analysis, three groups recently provided genetic and epigenetic molecular classifications of gastric cancer (The Cancer Genome Atlas, 'Singapore-Duke' study, and Asian Cancer Research Group). Suggested by these classifications, here, we examined the expression of 14 biomarkers in a cohort of 146 gastric adenocarcinomas and performed unsupervised hierarchical clustering analysis using less expensive and widely available immunohistochemistry and in situ hybridization. Ultimately, we identified five groups of gastric cancers based on Epstein-Barr virus (EBV) positivity, microsatellite instability, aberrant E-cadherin, and p53 expression; the remaining cases constituted a group characterized by normal p53 expression. In addition, the five categories correspond to the reported molecular subgroups by virtue of clinicopathologic features. Furthermore, evaluation between these clusters and survival using the Cox proportional hazards model showed a trend for superior survival in the EBV and microsatellite-instable related adenocarcinomas. In conclusion, we offer as a proposal a simplified algorithm that is able to reproduce the recently proposed molecular subgroups of gastric adenocarcinoma, using immunohistochemical and in situ hybridization techniques.

  6. Genome-Wide Comparative Gene Family Classification

    Science.gov (United States)

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  7. Gene expression-based molecular diagnostic system for malignant gliomas is superior to histological diagnosis.

    Science.gov (United States)

    Shirahata, Mitsuaki; Iwao-Koizumi, Kyoko; Saito, Sakae; Ueno, Noriko; Oda, Masashi; Hashimoto, Nobuo; Takahashi, Jun A; Kato, Kikuya

    2007-12-15

    Current morphology-based glioma classification methods do not adequately reflect the complex biology of gliomas, thus limiting their prognostic ability. In this study, we focused on anaplastic oligodendroglioma and glioblastoma, which typically follow distinct clinical courses. Our goal was to construct a clinically useful molecular diagnostic system based on gene expression profiling. The expression of 3,456 genes in 32 patients, 12 and 20 of whom had prognostically distinct anaplastic oligodendroglioma and glioblastoma, respectively, was measured by PCR array. Next to unsupervised methods, we did supervised analysis using a weighted voting algorithm to construct a diagnostic system discriminating anaplastic oligodendroglioma from glioblastoma. The diagnostic accuracy of this system was evaluated by leave-one-out cross-validation. The clinical utility was tested on a microarray-based data set of 50 malignant gliomas from a previous study. Unsupervised analysis showed divergent global gene expression patterns between the two tumor classes. A supervised binary classification model showed 100% (95% confidence interval, 89.4-100%) diagnostic accuracy by leave-one-out cross-validation using 168 diagnostic genes. Applied to a gene expression data set from a previous study, our model correlated better with outcome than histologic diagnosis, and also displayed 96.6% (28 of 29) consistency with the molecular classification scheme used for these histologically controversial gliomas in the original article. Furthermore, we observed that histologically diagnosed glioblastoma samples that shared anaplastic oligodendroglioma molecular characteristics tended to be associated with longer survival. Our molecular diagnostic system showed reproducible clinical utility and prognostic ability superior to traditional histopathologic diagnosis for malignant glioma.
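
    The abstract does not spell out its weighted voting scheme, so the following is only a sketch of one commonly described variant: each selected gene votes with a signal-to-noise weight relative to the midpoint between the class means, and accuracy is estimated by leave-one-out cross-validation with gene selection repeated inside every fold. Function names, the number of genes, and the synthetic data are assumptions.

```python
import numpy as np

def weighted_vote_fit(X, y, n_genes=10):
    """Select the n_genes with the largest |signal-to-noise| and store their weights."""
    a, b = X[y == 1], X[y == 0]
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    # Signal-to-noise weight per gene; the small constant avoids division by zero.
    w = (mu_a - mu_b) / (a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1) + 1e-12)
    top = np.argsort(-np.abs(w))[:n_genes]
    midpoints = (mu_a[top] + mu_b[top]) / 2.0
    return top, w[top], midpoints

def weighted_vote_predict(model, x):
    """Sum the weighted votes of the selected genes; a positive total means class 1."""
    top, w, midpoints = model
    return int(np.sum(w * (x[top] - midpoints)) > 0)

def loocv_accuracy(X, y, n_genes=10):
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        # Genes are re-selected without the held-out sample.
        model = weighted_vote_fit(X[keep], y[keep], n_genes)
        hits += int(weighted_vote_predict(model, X[i]) == y[i])
    return hits / len(y)

# Synthetic stand-in: 32 samples (12 vs 20), 100 genes, 10 of them informative.
rng = np.random.default_rng(3)
y = np.array([1] * 12 + [0] * 20)
X = rng.normal(size=(32, 100))
X[y == 1, :10] += 2.0
accuracy = loocv_accuracy(X, y, n_genes=10)
print(f"leave-one-out accuracy: {accuracy:.3f}")
```

    Repeating the gene selection inside each fold matters: choosing genes on all samples first and cross-validating afterwards would give an optimistically biased accuracy.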

  8. Multivariate Pattern Classification of Facial Expressions Based on Large-Scale Functional Connectivity.

    Science.gov (United States)

    Liang, Yin; Liu, Baolin; Li, Xianglin; Wang, Peiyuan

    2018-01-01

    How humans efficiently recognize others' facial expressions is an important question in cognitive neuroscience, and previous studies have identified specific cortical regions that show preferential activation to facial expressions. However, the potential contributions of connectivity patterns to the processing of facial expressions remained unclear. The present functional magnetic resonance imaging (fMRI) study explored whether facial expressions could be decoded from functional connectivity (FC) patterns using multivariate pattern analysis combined with machine learning algorithms (fcMVPA). We employed a block design experiment and collected neural activities while participants viewed facial expressions of six basic emotions (anger, disgust, fear, joy, sadness, and surprise). Both static and dynamic expression stimuli were included in our study. A behavioral experiment after scanning confirmed the validity of the facial stimuli presented during the fMRI experiment with classification accuracies and emotional intensities. We obtained whole-brain FC patterns for each facial expression and found that both static and dynamic facial expressions could be successfully decoded from the FC patterns. Moreover, we identified the expression-discriminative networks for the static and dynamic facial expressions, which span beyond the conventional face-selective areas. Overall, these results reveal that large-scale FC patterns may also contain rich expression information that allows facial expressions to be decoded accurately, suggesting a novel mechanism, involving general interactions between distributed brain regions, that contributes to human facial expression recognition.

  9. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States)]; Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States)]; Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States)]; Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)]; Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States)]; Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)]

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.
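
    To make "stepwise regression of dose received against individual gene transcript expression levels" concrete, here is a simplified forward-selection sketch. It adds one transcript at a time as long as R² improves by at least a chosen margin. Real stepwise procedures usually rely on significance tests, and the stopping rule, dose levels, effect sizes, and data below are assumptions rather than values from the study.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def forward_stepwise(X, y, max_terms=4, min_gain=0.01):
    """Greedily add the predictor that raises R^2 most; stop when the gain is small."""
    chosen, best = [], 0.0
    while len(chosen) < max_terms:
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        if not candidates:
            break
        r2, j = max((r_squared(X[:, chosen + [j]], y), j) for j in candidates)
        if r2 - best < min_gain:
            break
        chosen.append(j)
        best = r2
    return chosen, best

# Synthetic stand-in: 60 samples, 12 transcripts, two of which respond to dose.
rng = np.random.default_rng(2)
dose = rng.choice([0.0, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0], size=60)  # hypothetical dose levels (Gy)
expr = rng.normal(scale=0.3, size=(60, 12))
expr[:, 0] += 0.5 * dose   # strongly dose-responsive transcript
expr[:, 1] -= 0.1 * dose   # weakly dose-responsive transcript
chosen, r2 = forward_stepwise(expr, dose)
print(f"selected transcripts: {chosen}, R^2 = {r2:.3f}")
```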

  10. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, provided the dynamics of each node are modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well-developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis, that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs.

  11. Expression-based clustering of CAZyme-encoding genes of Aspergillus niger.

    Science.gov (United States)

    Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P

    2017-11-23

    The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation, enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to the presence of plant biomass related carbon sources were analyzed for the wild-type strain N402 grown on a large range of carbon sources and for the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes, which go beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified, based on their expression profile. Notably, in several cases the expression profile calls into question the function assignment of uncharacterized genes that was based on homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of enzymes encoded by these non-characterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and therefore an even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight into the complex regulatory systems that drive the conversion of plant biomass by fungi. In ...

  12. Gene expression based evidence of innate immune response activation in the epithelium with oral lichen planus

    Science.gov (United States)

    Adami, Guy R.; Yeung, Alexander C.F.; Stucki, Grant; Kolokythas, Antonia; Sroussi, Herve Y.; Cabay, Robert J.; Kuzin, Igor; Schwartz, Joel L.

    2014-01-01

    Objective: Oral lichen planus (OLP) is a disease of the oral mucosa of unknown cause producing lesions with an intense band-like inflammatory infiltrate of T cells to the subepithelium and keratinocyte cell death. We performed gene expression analysis of the oral epithelium of lesions in subjects with OLP and its sister disease, oral lichenoid reaction (OLR), in order to better understand the role of the keratinocytes in these diseases. Design: Fourteen patients with OLP or OLR were included in the study, along with a control group of 23 subjects with a variety of oral diseases and a normal group of 17 subjects with no clinically visible mucosal abnormalities. Various proteins have been associated with OLP, based on detection of secreted proteins or changes in RNA levels in tissue samples consisting of epithelium, stroma, and immune cells. The mRNA level of twelve of these genes expressed in the epithelium was tested in the three groups. Results: Four genes showed increased expression in the epithelium of OLP patients: CD14, CXCL1, IL8, and TLR1, and at least two of these proteins, TLR1 and CXCL1, were expressed at substantial levels in oral keratinocytes. Conclusions: Because of the large accumulation of T cells in lesions of OLP, it has long been thought to be an adaptive immunity malfunction. We provide evidence that there is increased expression of innate immune genes in the epithelium with this illness, suggesting a role for this process in the disease and a possible target for treatment. PMID:24581860

  13. Gene expression-based biological test for major depressive disorder: an advanced study

    Directory of Open Access Journals (Sweden)

    Watanabe S

    2017-02-01

    Shin-ya Watanabe,1 Shusuke Numata,1 Jun-ichi Iga,2 Makoto Kinoshita,1 Hidehiro Umehara,1 Kazuo Ishii,3 Tetsuro Ohmori1; 1Department of Psychiatry, Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima, 2Department of Neuropsychiatry, Molecules and Function, Ehime University Graduate School of Medicine, Ehime, 3Department of Applied Biological Science, Faculty of Agriculture, Tokyo University of Agriculture and Technology, Tokyo, Japan

    Purpose: Recently, we distinguished patients with major depressive disorder (MDD) from nonpsychiatric controls with high accuracy using a panel of five gene expression markers (ARHGAP24, HDAC5, PDGFC, PRNP, and SLC6A4) in leukocytes. In the present study, we examined whether this biological test is able to discriminate patients with MDD from those without MDD, including those with schizophrenia and bipolar disorder. Patients and methods: We measured messenger ribonucleic acid expression levels of the aforementioned five genes in peripheral leukocytes in 17 patients with schizophrenia and 36 patients with bipolar disorder using quantitative real-time polymerase chain reaction (PCR), and we combined these expression data with our previous expression data of 25 patients with MDD and 25 controls. Subsequently, a linear discriminant function was developed for use in discriminating between patients with MDD and without MDD. Results: This expression panel was able to segregate patients with MDD from those without MDD with a sensitivity and specificity of 64% and 67.9%, respectively. Conclusion: Further research to identify MDD-specific markers is needed to improve the performance of this biological test. Keywords: depressive disorder, biomarker, gene expression, schizophrenia, bipolar disorder
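
    As a worked check of the reported sensitivity and specificity, the group sizes given in the abstract (25 patients with MDD; 25 controls, 17 patients with schizophrenia, and 36 with bipolar disorder as the non-MDD group) can be combined with counts of correctly classified subjects. The counts of 16 and 53 used below are not reported in the abstract. They are assumptions, chosen because they reproduce the published percentages.

```python
# Group sizes taken from the abstract.
n_mdd = 25
n_non_mdd = 25 + 17 + 36   # controls + schizophrenia + bipolar disorder = 78

# Assumed counts (not reported in the abstract) that match the stated rates.
true_positives = 16        # MDD patients classified as MDD
true_negatives = 53        # non-MDD subjects classified as non-MDD

sensitivity = true_positives / n_mdd
specificity = true_negatives / n_non_mdd
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
# prints: sensitivity = 64.0%, specificity = 67.9%
```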

  14. Autonomous Bacterial Localization and Gene Expression Based on Nearby Cell Receptor Density

    Science.gov (United States)

    2013-01-22

    signal-peptide (lpp-ompA) sequences from the template vector, pTX101 (provided by Dr George Georgiou, University of Texas, Austin) (Francisco et al...generously providing the PCI-15B cell line, Dr George Georgiou for kindly providing the ompA surface display vector, and Dr Eiry Kobatake for providing...E, Wong WW, Suen JK, Bulter T, Lee SG, Liao JC (2005) A synthetic gene-metabolic oscillator. Nature 435: 118–122 Gardner TS, Cantor CR, Collins JJ

  15. Gene expression-based classifiers identify Staphylococcus aureus infection in mice and humans.

    Directory of Open Access Journals (Sweden)

    Sun Hee Ahn

    Staphylococcus aureus causes a spectrum of human infection. Diagnostic delays and uncertainty lead to treatment delays and inappropriate antibiotic use. A growing literature suggests the host's inflammatory response to the pathogen represents a potential tool to improve upon current diagnostics. The hypothesis of this study is that the host responds differently to S. aureus than to E. coli infection in a quantifiable way, providing a new diagnostic avenue. This study uses Bayesian sparse factor modeling and penalized binary regression to define peripheral blood gene-expression classifiers of murine and human S. aureus infection. The murine-derived classifier distinguished S. aureus infection from healthy controls and Escherichia coli-infected mice across a range of conditions (mouse and bacterial strain, time post infection) and was validated in outbred mice (AUC>0.97). A S. aureus classifier derived from a cohort of 94 human subjects distinguished S. aureus blood stream infection (BSI) from healthy subjects (AUC 0.99) and E. coli BSI (AUC 0.84). Murine and human responses to S. aureus infection share common biological pathways, allowing the murine model to classify S. aureus BSI in humans (AUC 0.84). Both murine and human S. aureus classifiers were validated in an independent human cohort (AUC 0.95 and 0.92, respectively). The approach described here lends insight into the conserved and disparate pathways utilized by mice and humans in response to these infections. Furthermore, this study advances our understanding of S. aureus infection and the host response to it, and identifies new diagnostic and therapeutic avenues.

  16. Classification and expression analyses of homeobox genes from ...

    Indian Academy of Sciences (India)

    We present here the first genome-wide classification and comparative genomic analysis of the 14 homeobox genes present in D. discoideum. Based on the structural alignment of the homeodomains, they can be broadly divided into TALE and non-TALE classes. When individual homeobox genes were compared with ...

  17. An Efficient Ensemble Learning Method for Gene Microarray Classification

    Directory of Open Access Journals (Sweden)

    Alireza Osareh

    2013-01-01

    Gene microarray analysis and classification have proven effective for the diagnosis of diseases and cancers. However, it has also been shown that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of Rotation Forest and AdaBoost techniques which in turn preserve both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

  18. Microsatellite Instability Use in Mismatch Repair Gene Sequence Variant Classification

    Directory of Open Access Journals (Sweden)

    Bryony A. Thompson

    2015-03-01

    Inherited mutations in the DNA mismatch repair genes (MMR) can cause MMR deficiency and increased susceptibility to colorectal and endometrial cancer. Microsatellite instability (MSI) is the defining molecular signature of MMR deficiency. The clinical classification of identified MMR gene sequence variants has a direct impact on the management of patients and their families. For a significant proportion of cases, sequence variants of uncertain clinical significance (also known as unclassified variants) are identified, constituting a challenge for genetic counselling and clinical management of families. The effect on protein function of these variants is difficult to interpret. The presence or absence of MSI in tumours can aid in determining the pathogenicity of associated unclassified MMR gene variants. However, there are some considerations that need to be taken into account when using MSI for variant interpretation. The use of MSI and other tumour characteristics in MMR gene sequence variant classification will be explored in this review.

  19. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  20. iSyTE 2.0: a database for expression-based gene discovery in the eye

    Science.gov (United States)

    Kakrana, Atul; Yang, Andrian; Anand, Deepti; Djordjevic, Djordje; Ramachandruni, Deepti; Singh, Abhyudai; Huang, Hongzhan

    2018-01-01

    Abstract Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.bioinformatics.udel.edu/iSyTE) based on well-curated, comprehensive genome-level lens expression data as a one-stop portal for the effective visualization and analysis of candidate genes in lens development and disease. iSyTE 2.0 includes all publicly available lens Affymetrix and Illumina microarray datasets representing a broad range of embryonic and postnatal stages from wild-type and specific gene-perturbation mouse mutants with eye defects. Further, we developed a new user-friendly web interface for direct access and cogent visualization of the curated expression data, which supports convenient searches and a range of downstream analyses. The utility of these new iSyTE 2.0 features is illustrated through examples of established genes associated with lens development and pathobiology, which serve as tutorials for its application by the end-user. iSyTE 2.0 will facilitate the prioritization of eye development and disease-linked candidate genes in studies involving transcriptomics or next-generation sequencing data, linkage analysis and GWAS approaches. PMID:29036527

  1. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets, defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.

  2. Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Directory of Open Access Journals (Sweden)

    Tsatsoulis Costas

    2010-05-01

    Background There is increasing evidence that gene location and surrounding genes influence the functionality of genes in the eukaryotic genome. Knowing the Gene Ontology Slim terms associated with a gene gives us insight into a gene's functionality by informing us how its gene product behaves in a cellular context using three different ontologies: molecular function, biological process, and cellular component. In this study, we analyzed whether we could classify a gene in Saccharomyces cerevisiae to its correct Gene Ontology Slim term using information about its location in the genome and information from its nearest-neighbouring genes using classification learning. Results We performed experiments to establish that the MultiBoostAB algorithm using the J48 classifier could correctly classify Gene Ontology Slim terms of a gene given information regarding the gene's location and information from its nearest-neighbouring genes for training. Different neighbourhood sizes were examined to determine how many nearest neighbours should be included around each gene to provide better classification rules. Our results show that by just incorporating neighbour information from each gene's two-nearest neighbours, the percentage of genes correctly classified to their Gene Ontology Slim term for each ontology reaches over 80% with high accuracy (reflected in F-measures over 0.80) of the classification rules produced. Conclusions We confirmed that in classifying genes to their correct Gene Ontology Slim term, the inclusion of neighbour information from those genes is beneficial. Knowing the location of a gene and the Gene Ontology Slim information from neighbouring genes gives us insight into that gene's functionality. This benefit is seen by just including information from a gene's two-nearest neighbouring genes.

  3. Gene selection for cancer classification with the help of bees.

    Science.gov (United States)

    Moosa, Johra Muhammad; Shakur, Rameen; Kaykobad, Mohammad; Rahman, Mohammad Sohel

    2016-08-10

    Development of biologically relevant models from gene expression data, notably microarray data, has become a topic of great interest in bioinformatics, clinical genetics, and oncology. Only a small number of the genes explored show expression that correlates significantly with a given phenotype. Gene selection enables researchers to obtain substantial insight into the genetic nature of the disease and the mechanisms responsible for it. Besides improving the performance of cancer classification, it can also cut down the time and cost of medical diagnoses. This study presents a modified Artificial Bee Colony (ABC) algorithm to select a minimum number of genes that are deemed to be significant for cancer while improving predictive accuracy. The search equation of ABC is believed to be good at exploration but poor at exploitation. To overcome this limitation, we have modified the ABC algorithm by incorporating the concept of pheromones, which is one of the major components of the Ant Colony Optimization (ACO) algorithm, and a new operation in which successive bees communicate to share their findings. The proposed algorithm is evaluated using a suite of ten publicly available datasets after the parameters are tuned with one of the datasets. The results are compared to other works that used the same datasets, and the performance of the proposed method is reported to be superior. The method presented in this paper can provide a subset of genes leading to more accurate classification results while the number of selected genes is smaller. Additionally, the proposed modified Artificial Bee Colony algorithm could conceivably be applied to problems in other areas as well.

  4. Patterns of Immune Infiltration in Breast Cancer and Their Clinical Implications: A Gene-Expression-Based Retrospective Study

    Science.gov (United States)

    Ali, H. Raza; Chlon, Leon; Pharoah, Paul D. P.; Caldas, Carlos

    2016-01-01

    Background Immune infiltration of breast tumours is associated with clinical outcome. However, past work has not accounted for the diversity of functionally distinct cell types that make up the immune response. The aim of this study was to determine whether differences in the cellular composition of the immune infiltrate in breast tumours influence survival and treatment response, and whether these effects differ by molecular subtype. Methods and Findings We applied an established computational approach (CIBERSORT) to bulk gene expression profiles of almost 11,000 tumours to infer the proportions of 22 subsets of immune cells. We investigated associations between each cell type and survival and response to chemotherapy, modelling cellular proportions as quartiles. We found that tumours with little or no immune infiltration were associated with different survival patterns according to oestrogen receptor (ER) status. In ER-negative disease, tumours lacking immune infiltration were associated with the poorest prognosis, whereas in ER-positive disease, they were associated with intermediate prognosis. Of the cell subsets investigated, T regulatory cells and M0 and M2 macrophages emerged as the most strongly associated with poor outcome, regardless of ER status. Among ER-negative tumours, CD8+ T cells (hazard ratio [HR] = 0.89, 95% CI 0.80–0.98; p = 0.02) and activated memory T cells (HR 0.88, 95% CI 0.80–0.97; p = 0.01) were associated with favourable outcome. T follicular helper cells (odds ratio [OR] = 1.34, 95% CI 1.14–1.57; p < 0.001) and memory B cells (OR = 1.18, 95% CI 1.0–1.39; p = 0.04) were associated with pathological complete response to neoadjuvant chemotherapy in ER-negative disease, suggesting a role for humoral immunity in mediating response to cytotoxic therapy. Unsupervised clustering analysis using immune cell proportions revealed eight subgroups of tumours, largely defined by the balance between M0, M1, and M2 macrophages, with distinct

  5. Supervised classification of combined copy number and gene expression data

    Directory of Open Access Journals (Sweden)

    Riccadonna S.

    2007-12-01

    In this paper we apply a predictive profiling method to genome copy number aberrations (CNA) in combination with gene expression and clinical data to identify molecular patterns of cancer pathophysiology. Predictive models and optimal feature lists for the platforms are developed by a complete validation SVM-based machine learning system. Ranked lists of genome CNA sites (assessed by comparative genomic hybridization arrays, aCGH) and of differentially expressed genes (assessed by microarray profiling with Affy HG-U133A chips) are computed and combined on a breast cancer dataset for the discrimination of Luminal/ER+ (Lum/ER+) and Basal-like/ER- classes. Different encodings are developed and applied to the CNA data, and predictive variable selection is discussed. We analyze the combination of profiling information between the platforms, also considering the pathophysiological data. A specific subset of patients is identified that has a different response to classification by chromosomal gains and losses and by differentially expressed genes, corroborating the idea that genomic CNA can represent an independent source for tumor classification.

  6. Succinate Dehydrogenase Subunit B (SDHB) Is Expressed in Neurofibromatosis 1-Associated Gastrointestinal Stromal Tumors (Gists): Implications for the SDHB Expression Based Classification of Gists

    Directory of Open Access Journals (Sweden)

    Jeanny H. Wang, Jerzy Lasota, Markku Miettinen

    2011-01-01

    Gastrointestinal Stromal Tumor (GIST) is the most common mesenchymal tumor of the digestive tract. GISTs develop with relatively high incidence in patients with Neurofibromatosis-1 syndrome (NF1). Mutational activation of KIT or PDGFRA is believed to be a driving force in the pathogenesis of familial and sporadic GISTs. Unlike those tumors, NF1-associated GISTs do not have KIT or PDGFRA mutations. Similarly, no mutational activation of KIT or PDGFRA has been identified in pediatric GISTs and in GISTs associated with Carney Triad and Carney-Stratakis Syndrome. KIT and PDGFRA wild-type tumors are expected to respond less well to imatinib treatment. Recently, Carney Triad and Carney-Stratakis Syndrome-associated GISTs and pediatric GISTs have been shown to have a loss of expression of succinate dehydrogenase subunit B (SDHB), a Krebs cycle/electron transport chain interface protein. It was proposed that GISTs can be divided into SDHB-positive (type 1) and SDHB-negative (type 2) tumors because of similarities in clinical features and response to imatinib treatment. In this study, SDHB expression was examined immunohistochemically in 22 well-characterized NF1-associated GISTs. All analyzed tumors expressed SDHB. Based on SDHB-expression status, NF1-associated GISTs belong to the type 1 category; however, similarly to SDHB type 2 tumors, they do not respond well to imatinib treatment. Therefore, a simple categorization of GISTs into SDHB-positive and SDHB-negative seems to be incomplete. A classification based on both SDHB expression status and KIT and PDGFRA mutation status characterizes GISTs more accurately and allows subdivision of SDHB-positive tumors into different clinico-genetic categories.

  7. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Background Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  8. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  9. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

    KAUST Repository

    Abusamra, Heba

    2013-01-01

    Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.

  10. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua

    2014-01-01

    expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene...... on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10...

  11. Gene masking - a technique to improve accuracy for cancer classification with high dimensionality in microarray data.

    Science.gov (United States)

    Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok

    2016-12-05

    High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
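    The idea of a binary-encoded mask evolved by a genetic algorithm can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the nearest-centroid classifier, the leave-one-out fitness, the size penalty, truncation selection, one-point crossover, and the toy data are all hypothetical choices, and the function names are invented here.

```python
import random

def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except sample i.
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += row[j]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    # Reward accuracy and lightly penalise the number of retained features.
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)

def gene_mask_ga(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 mask over features; 1 keeps a feature, 0 masks it out."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]        # truncation selection, elites survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

# Toy data (hypothetical): only feature 0 separates the two classes.
X = [[0.0, 5.0, 1.0, 3.0]] * 3 + [[10.0, 5.0, 1.0, 3.0]] * 3
y = [0, 0, 0, 1, 1, 1]
best = gene_mask_ga(X, y)
```

    Because the mask is evaluated through the classifier itself, this is a wrapper-style selection: the fitness function is the only place where the classifier appears, so another classifier could be substituted without changing the search loop.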

  12. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

    KAUST Repository

    Abusamra, Heba

    2013-05-01

    Microarray technology has enriched the study of gene expression in such a way that scientists are now able to measure the expression levels of thousands of genes in a single experiment. Microarray gene expression data have gained great importance in recent years due to their role in disease diagnosis and prognosis, which helps in choosing the appropriate treatment plan for patients. This technology has opened a new era in molecular classification, yet interpreting gene expression data remains a difficult problem and an active research area due to its inherent “high dimensional low sample size” nature. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed to help classify different tumor types correctly and consequently lead to a better understanding of genetic signatures as well as improved treatment strategies. This thesis presents a comparative study of state-of-the-art feature selection methods, classification methods, and their combinations, based on gene expression data. We compared the efficiency of three classification methods (support vector machines, k-nearest neighbor, and random forest) and eight feature selection methods (information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine). Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used for this study. Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in

  13. Classification between normal and tumor tissues based on the pair-wise gene expression ratio

    International Nuclear Information System (INIS)

    Yap, YeeLeng; Zhang, XueWu; Ling, MT; Wang, XiangHong; Wong, YC; Danchin, Antoine

    2004-01-01

    Precise classification of cancer types is critically important for early cancer diagnosis and treatment. Numerous efforts have been made to use gene expression profiles to improve the precision of tumor classification. However, reliable cancer-related signals are generally lacking. Using recent datasets on colon and prostate cancer, a data transformation procedure from single gene expression to pair-wise gene expression ratio is proposed. By making use of the internal consistency of each expression profiling dataset, this transformation improves the signal-to-noise ratio of the dataset and uncovers new relevant cancer-related signals (features). The efficiency of using the transformed dataset to perform normal/tumor classification was investigated using feature partitioning with informative features (gene annotation) as discriminating axes (single gene expression or pair-wise gene expression ratio). Classification results were compared to the original datasets for up to 10-feature model classifiers. From the colon and prostate datasets, 82 and 262 genes, respectively, that have high correlation to tissue phenotype were selected. Remarkably, data transformation of the highly noisy expression data lowered the coefficient of variation (CV) for the within-class samples and improved the correlation with tissue phenotypes. The transformed dataset exhibited lower CV when compared to that of single gene expression. In the colon cancer set, the minimum CV decreased from 45.3% to 16.5%. In prostate cancer, comparable CV was achieved with and without transformation. This improvement in CV, coupled with the improved correlation between the pair-wise gene expression ratio and tissue phenotypes, yielded higher classification efficiency, especially with the colon dataset, from 87.1% to 93.5%. Over 90% of the top ten discriminating axes in both datasets showed significant improvement after data transformation. The high classification efficiency achieved suggested
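    Why a ratio of two genes can have a lower within-class CV than either gene alone can be seen in a small sketch. The gene names and expression values are hypothetical, and the CV is computed here with the population standard deviation, which may differ from the definition used in the paper.

```python
from itertools import combinations
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV in percent: standard deviation relative to the mean."""
    return 100.0 * pstdev(values) / mean(values)

def pairwise_ratios(samples):
    """Map each gene pair (a, b) to the per-sample ratios a / b.

    `samples` is a list of dicts {gene_name: expression}, all values positive.
    """
    genes = sorted(samples[0])
    return {
        (a, b): [s[a] / s[b] for s in samples]
        for a, b in combinations(genes, 2)
    }

# Hypothetical within-class samples: a sample-wide scaling factor inflates
# the spread of each single gene but cancels out in the ratio.
tumor = [
    {"geneA": 100.0, "geneB": 50.0},
    {"geneA": 200.0, "geneB": 100.0},
    {"geneA": 400.0, "geneB": 200.0},
]
single_cv = coefficient_of_variation([s["geneA"] for s in tumor])
ratio_cv = coefficient_of_variation(pairwise_ratios(tumor)[("geneA", "geneB")])
```

    In this constructed case the ratio is constant across samples, so its CV is zero while the single-gene CV exceeds 50%. Real data would show a smaller gain, and the number of candidate pairs grows quadratically with the number of genes.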

  14. Gene selection for microarray data classification via subspace learning and manifold regularization.

    Science.gov (United States)

    Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui

    2017-12-19

    With the rapid development of DNA microarray technology, a large amount of genomic data has been generated. Classification of these microarray data is a challenging task since gene expression data often comprise thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data, with the irrelevant and redundant genes removed. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high dimensional microarray data into a lower dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator of different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better when compared with other state-of-the-art methods in terms of microarray data classification.

  15. A Comprehensive Classification and Evolutionary Analysis of Plant Homeobox Genes

    OpenAIRE

    Mukherjee, Krishanu; Brocchieri, Luciano; Bürglin, Thomas R.

    2009-01-01

    The full complement of homeobox transcription factor sequences, including genes and pseudogenes, was determined from the analysis of 10 complete genomes from flowering plants, moss, Selaginella, unicellular green algae, and red algae. Our exhaustive genome-wide searches resulted in the discovery in each class of a greater number of homeobox genes than previously reported. All homeobox genes can be unambiguously classified by sequence evolutionary analysis into 14 distinct classes also charact...

  16. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  17. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  18. SoFoCles: feature filtering for microarray classification based on gene ontology.

    Science.gov (United States)

    Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

    2010-02-01

    Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.

  19. Towards precise classification of cancers based on robust gene functional expression profiles

    Directory of Open Access Journals (Sweden)

    Zhu Jing

    2005-03-01

    Background: Development of robust and efficient methods for analyzing and interpreting high dimension gene expression profiles continues to be a focus in computational biology. The accumulated experimental evidence supports the assumption that genes express and perform their functions in modular fashions in cells. Therefore, there is an open space for development of timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results: Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we define a low dimension functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system, such as the Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure(s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion: This modular approach is demonstrated to be a powerful alternative approach to analyzing high dimension microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data. Furthermore, efficient integration with current biological knowledge
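
    The data-reduction step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which statistic serves as the summary measure, so the mean and the median are used here purely as assumptions; the gene names, module labels and the helper `module_profile` are hypothetical.

```python
from statistics import mean, median

def module_profile(expression, modules, summary=mean):
    """Collapse per-gene expression values into one value per functional module.

    expression: dict mapping gene -> expression value for one sample
    modules:    dict mapping module name -> list of annotated genes
    summary:    statistic used as the summary measure (assumed: mean or median)
    """
    profile = {}
    for name, genes in modules.items():
        # Use only the annotated genes that were actually measured.
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes
            profile[name] = summary(values)
    return profile

# Hypothetical sample: five genes annotated to two functional categories.
sample = {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 1.0, "g5": 3.0}
modules = {"cell_cycle": ["g1", "g2", "g3"], "apoptosis": ["g4", "g5", "g9"]}

print(module_profile(sample, modules))                  # mean per module
print(module_profile(sample, modules, summary=median))  # median per module
```

    Each sample is thereby represented by one value per module instead of one value per gene, which is the reduced profile a decision tree would then be trained on.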

  20. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  1. Development and validation of a gene expression-based signature to predict distant metastasis in locoregionally advanced nasopharyngeal carcinoma: a retrospective, multicentre, cohort study.

    Science.gov (United States)

    Tang, Xin-Ran; Li, Ying-Qin; Liang, Shao-Bo; Jiang, Wei; Liu, Fang; Ge, Wen-Xiu; Tang, Ling-Long; Mao, Yan-Ping; He, Qing-Mei; Yang, Xiao-Jing; Zhang, Yuan; Wen, Xin; Zhang, Jian; Wang, Ya-Qin; Zhang, Pan-Pan; Sun, Ying; Yun, Jing-Ping; Zeng, Jing; Li, Li; Liu, Li-Zhi; Liu, Na; Ma, Jun

    2018-03-01

    Gene expression patterns can be used as prognostic biomarkers in various types of cancers. We aimed to identify a gene expression pattern for individual distant metastatic risk assessment in patients with locoregionally advanced nasopharyngeal carcinoma. In this multicentre, retrospective, cohort analysis, we included 937 patients with locoregionally advanced nasopharyngeal carcinoma from three Chinese hospitals: the Sun Yat-sen University Cancer Center (Guangzhou, China), the Affiliated Hospital of Guilin Medical University (Guilin, China), and the First People's Hospital of Foshan (Foshan, China). Using microarray analysis, we profiled mRNA gene expression between 24 paired locoregionally advanced nasopharyngeal carcinoma tumours from patients at Sun Yat-sen University Cancer Center with or without distant metastasis after radical treatment. Differentially expressed genes were examined using digital expression profiling in a training cohort (Guangzhou training cohort; n=410) to build a gene classifier using a penalised regression model. We validated the prognostic accuracy of this gene classifier in an internal validation cohort (Guangzhou internal validation cohort, n=204) and two external independent cohorts (Guilin cohort, n=165; Foshan cohort, n=158). The primary endpoint was distant metastasis-free survival. Secondary endpoints were disease-free survival and overall survival. We identified 137 differentially expressed genes between metastatic and non-metastatic locoregionally advanced nasopharyngeal carcinoma tissues. A distant metastasis gene signature for locoregionally advanced nasopharyngeal carcinoma (DMGN) that consisted of 13 genes was generated to classify patients into high-risk and low-risk groups in the training cohort. Patients with high-risk scores in the training cohort had shorter distant metastasis-free survival (hazard ratio [HR] 4·93, 95% CI 2·99-8·16; padvanced nasopharyngeal carcinoma and might be able to predict which patients benefit

  2. Transcription activator-like effector-mediated regulation of gene expression based on the inducible packaging and delivery via designed extracellular vesicles

    International Nuclear Information System (INIS)

    Lainšček, Duško; Lebar, Tina; Jerala, Roman

    2017-01-01

    Transcription activator-like effector (TALE) proteins present a powerful tool for genome editing and engineering, enabling introduction of site-specific mutations, gene knockouts or regulation of the transcription levels of selected genes. TALE nucleases or TALE-based transcription regulators are introduced into mammalian cells mainly via delivery of the coding genes. Here we report an extracellular vesicle-mediated delivery of TALE transcription regulators and their ability to upregulate the reporter gene in target cells. Designed transcriptional activator TALE-VP16 fused to the appropriate dimerization domain was enriched as a cargo protein within extracellular vesicles produced by mammalian HEK293 cells stimulated by Ca-ionophore and using blue light- or rapamycin-inducible dimerization systems. Blue light illumination or rapamycin increased the amount of the TALE-VP16 activator in extracellular vesicles and their addition to the target cells resulted in an increased expression of the reporter gene upon addition of extracellular vesicles to the target cells. This technology therefore represents an efficient delivery for the TALE-based transcriptional regulators. - Highlights: • Inducible dimerization enriched cargo proteins within extracellular vesicles (EV). • Farnesylation surpassed LAMP-1 fusion proteins for the EV packing. • Extracellular vesicles were able to deliver TALE regulators to mammalian cells. • TALE mediated transcriptional activation was achieved by designed EV.

  3. RNAi and Homologous Over-Expression Based Functional Approaches Reveal Triterpenoid Synthase Gene-Cycloartenol Synthase Is Involved in Downstream Withanolide Biosynthesis in Withania somnifera.

    Directory of Open Access Journals (Sweden)

    Smrati Mishra

    Withania somnifera Dunal is one of the most commonly used medicinal plants in Ayurvedic and indigenous medicine, owing to the therapeutic potential of its major chemical constituents, the withanolides. Withanolide biosynthesis requires the activities of several enzymes in vivo. Cycloartenol synthase (CAS) is an important enzyme in the withanolide biosynthetic pathway, catalyzing cyclization of 2,3-oxidosqualene into cycloartenol. In the present study, we have cloned full-length WsCAS from Withania somnifera by a homology-based PCR method. For gene function investigation, we constructed three RNAi gene-silencing constructs in the backbone of the RNAi vector pGSA and a full-length over-expression construct. These constructs were transformed into Agrobacterium strain GV3101 for plant transformation in W. somnifera. Molecular and metabolite analysis was performed in putative Withania transformants. The PCR and Southern blot results showed the genomic integration of these RNAi and overexpression construct(s) in the Withania genome. The qRT-PCR analysis showed that the expression of the WsCAS gene was considerably downregulated in stable transgenic silenced Withania lines compared with the non-transformed control, and HPLC analysis showed that withanolide content was greatly reduced in silenced lines. Transgenic plants overexpressing the CAS gene displayed enhanced levels of CAS transcript and withanolide content compared to non-transformed controls. To our knowledge, this work is the first full-proof report of functional validation of any metabolic pathway gene in W. somnifera at the whole-plant level, and it will be useful for understanding the regulatory role of different genes involved in the biosynthesis of withanolides.

  4. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

    KAUST Repository

    Abusamra, Heba

    2013-11-01

    Microarray gene expression data have gained great importance in recent years due to their role in disease diagnosis and prognosis, which helps in choosing the appropriate treatment plan for patients. This technology has ushered in a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their inherent “high dimensional low sample size” nature. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed to help correctly classify different tumor types, and consequently lead to a better understanding of genetic signatures as well as improved treatment strategies. This paper presents a comparative study of state-of-the-art feature selection methods, classification methods, and their combination, based on gene expression data. We compared the efficiency of three classification methods (support vector machines, k-nearest neighbor and random forest) and eight feature selection methods (information gain, twoing rule, sum minority, max minority, Gini index, sum of variances, t-statistics, and one-dimension support vector machine). Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.
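
    As a rough illustration of one combination named in this abstract (t-statistic ranking, a nearest-neighbor classifier and five-fold cross validation), the sketch below uses a tiny synthetic data set. The data, the use of k = 1, the Welch form of the t-statistic, and the decision to re-select genes inside every training fold are all assumptions made for the example; the abstract does not specify them.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Welch t-statistic between two groups of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def rank_genes(X, y):
    """Gene indices sorted by decreasing |t| between class 0 and class 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(t_statistic(a, b)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def predict_1nn(train_X, train_y, x, genes):
    """Label of the nearest training sample, using only the selected genes."""
    def dist(row):
        return sum((row[j] - x[j]) ** 2 for j in genes)
    nearest = min(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return train_y[nearest]

def cross_validate(X, y, n_genes, folds=5):
    """k-fold accuracy; genes are re-selected inside every training fold."""
    correct = 0
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        tr_X = [X[i] for i in train_idx]
        tr_y = [y[i] for i in train_idx]
        genes = rank_genes(tr_X, tr_y)[:n_genes]
        correct += sum(predict_1nn(tr_X, tr_y, X[i], genes) == y[i]
                       for i in test_idx)
    return correct / len(X)

# Synthetic data: 10 samples, 3 genes; only gene 0 tracks the class label.
y = [i % 2 for i in range(10)]
X = [[2.0 * y[i] + 0.1 * (i % 3), (i * 7 % 5) / 5, (i * 3 % 4) / 4]
     for i in range(10)]

print(rank_genes(X, y))                 # gene 0 ranks first
print(cross_validate(X, y, n_genes=1))  # accuracy with a single gene
```

    Ranking genes on the training part of each fold only keeps the held-out samples out of the selection step, so the reported accuracy is not inflated by the feature selection itself.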

  5. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

    KAUST Repository

    Abusamra, Heba

    2013-01-01

    Microarray gene expression data have gained great importance in recent years due to their role in disease diagnosis and prognosis, which helps in choosing the appropriate treatment plan for patients. This technology has ushered in a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their inherent “high dimensional low sample size” nature. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed to help correctly classify different tumor types, and consequently lead to a better understanding of genetic signatures as well as improved treatment strategies. This paper presents a comparative study of state-of-the-art feature selection methods, classification methods, and their combination, based on gene expression data. We compared the efficiency of three classification methods (support vector machines, k-nearest neighbor and random forest) and eight feature selection methods (information gain, twoing rule, sum minority, max minority, Gini index, sum of variances, t-statistics, and one-dimension support vector machine). Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.

  6. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    Directory of Open Access Journals (Sweden)

    List Markus

    2014-06-01

    Selecting the most promising treatment strategy for breast cancer crucially depends on determining the correct subtype. In recent years, gene expression profiling has been investigated as an alternative to histochemical methods. Since databases like TCGA provide easy and unrestricted access to gene expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene selection problem. However, more diverse data from other OMICS technologies are available, including methylation. We hypothesize that combining methylation and gene expression data could already lead to a largely improved classification model, since the resulting model will reflect differences not only on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10-20% and classification error of 1-50%, depending on breast cancer subtype and model. The gene expression model was clearly superior to the methylation model, which was also reflected in the combined model, which mainly selected features from gene expression data. However, the methylation model was able to identify unique features not considered as relevant by the gene expression model, which might provide deeper insights into breast cancer subtype differentiation on an epigenetic level.

  7. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  8. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.
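
    The mRMR filter stage mentioned in this abstract can be sketched as a greedy search. This covers only the filter, not the ABC search or the SVM evaluation, and it assumes discretized expression levels and the difference form of the criterion (relevance minus mean redundancy); the gene profiles and the helper names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def mrmr(genes, labels, k):
    """Greedily pick k genes maximizing relevance minus mean redundancy."""
    selected = []
    candidates = list(genes)

    def score(g):
        relevance = mutual_info(genes[g], labels)
        if not selected:
            return relevance
        redundancy = sum(mutual_info(genes[g], genes[s])
                         for s in selected) / len(selected)
        return relevance - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical discretized profiles (0 = low, 1 = high) for 8 samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 0],  # most relevant
    "g2": [0, 0, 0, 1, 1, 1, 1, 0],  # relevant but largely redundant with g1
    "g3": [0, 0, 1, 0, 1, 1, 0, 1],  # equally relevant, less redundant
}

print(mrmr(genes, labels, k=2))
```

    In this toy case g2 and g3 carry the same information about the class, but g3 overlaps less with the gene already chosen, so the redundancy term decides between them.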

  9. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Background: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results: With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions: Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.

  10. Classification and expression analyses of homeobox genes from ...

    Indian Academy of Sciences (India)

    2015-04-27

    Apr 27, 2015 ... Supplementary materials pertaining to this article are available on the Journal of Biosciences Website at .... Bank (PDB) was used as a template and the quality of the model was ... different genes and also to place them in a framework that ..... Kim JS, Seo JH, Yim HS and Kang SO 2011 Homeoprotein Hbx4.

  11. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased notably in recent years. However, because of the large number of genes in microarray gene expression data and the mostly nonlinear relations among them, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and a Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  12. Pathogenic classification of LPL gene variants reported to be associated with LPL deficiency

    DEFF Research Database (Denmark)

    Rodrigues, Rute; Artieda, Marta; Tejedor, Diego

    2016-01-01

    into the deleterious effect of the mutations is clinically essential. METHODS: We used gene sequencing followed by in-vivo/in-vitro and in-silico tools for classification. We classified 125 rare LPL mutations in 33 subjects thought to have LPL deficiency and in 314 subjects selected for very SHTG. RESULTS: Of the 33...

  13. A Region-Based GeneSIS Segmentation Algorithm for the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    Stelios K. Mylonas

    2015-03-01

    This paper proposes an object-based segmentation/classification scheme for remotely sensed images, based on a novel variant of the recently proposed Genetic Sequential Image Segmentation (GeneSIS) algorithm. GeneSIS segments the image in an iterative manner, whereby at each iteration a single object is extracted via a genetic-based object extraction algorithm. Contrary to the previous pixel-based GeneSIS where the candidate objects to be extracted were evaluated through the fuzzy content of their included pixels, in the newly developed region-based GeneSIS algorithm, a watershed-driven fine segmentation map is initially obtained from the original image, which serves as the basis for the forthcoming GeneSIS segmentation. Furthermore, in order to enhance the spatial search capabilities, we introduce a more descriptive encoding scheme in the object extraction algorithm, where the structural search modules are represented by polygonal shapes. Our objectives in the new framework are posed as follows: enhance the flexibility of the algorithm in extracting more flexible object shapes, assure high level classification accuracies, and reduce the execution time of the segmentation, while at the same time preserving all the inherent attributes of the GeneSIS approach. Finally, exploiting the inherent attribute of GeneSIS to produce multiple segmentations, we also propose two segmentation fusion schemes that operate on the ensemble of segmentations generated by GeneSIS. Our approaches are tested on an urban and two agricultural images. The results show that region-based GeneSIS has considerably lower computational demands compared to the pixel-based one. Furthermore, the suggested methods achieve higher classification accuracies and good segmentation maps compared to a series of existing algorithms.

  14. An enhanced topologically significant directed random walk in cancer classification using gene expression datasets

    Directory of Open Access Journals (Sweden)

    Choon Sen Seah

    2017-12-01

    Microarray technology has become one of the elementary tools for researchers to study the genome of organisms. As the complexity and heterogeneity of cancer are increasingly appreciated through genomic analysis, cancer classification has become an important emerging trend. Significant directed random walk is proposed as a cancer classification approach with higher sensitivity of risk gene prediction and higher accuracy of cancer classification. In this paper, the methodology and materials used for the experiment are presented. A tuning parameter selection method and weight as a parameter are applied in the proposed approach. A gene expression dataset is used as the input dataset, while a pathway dataset is used to build a directed graph, as the reference dataset, to complete the bias process in the random walk approach. In addition, we demonstrate that our approach can improve sensitive predictions with higher accuracy and biologically meaningful classification results. Significant directed random walk is compared with directed random walk to show the improvement in terms of sensitivity of prediction and accuracy of cancer classification.

  15. Feature Genes Selection Using Supervised Locally Linear Embedding and Correlation Coefficient for Microarray Classification.

    Science.gov (United States)

    Xu, Jiucheng; Mu, Huiyu; Wang, Yun; Huang, Fangzhou

    2018-01-01

    The selection of feature genes with high recognition ability from the gene expression profiles has gained great significance in biology. However, most of the existing methods have a high time complexity and poor classification performance. Motivated by this, an effective feature selection method, called supervised locally linear embedding and Spearman's rank correlation coefficient (SLLE-SC²), is proposed which is based on the concept of locally linear embedding and correlation coefficient algorithms. Supervised locally linear embedding takes into account class label information and improves the classification performance. Furthermore, Spearman's rank correlation coefficient is used to remove the coexpression genes. The experiment results obtained on four public tumor microarray datasets illustrate that our method is valid and feasible.
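
    The redundancy-removal step described in this abstract, dropping coexpressed genes with Spearman's rank correlation, might look roughly like the sketch below. It does not cover the supervised locally linear embedding stage, and the greedy keep-first order, the 0.9 threshold and the gene profiles are assumptions made for illustration.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def spearman(a, b):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(ranks(a), ranks(b))

def drop_coexpressed(genes, threshold=0.9):
    """Keep a gene only if |rho| with every already-kept gene is below threshold."""
    kept = []
    for name, profile in genes.items():
        if all(abs(spearman(profile, genes[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical expression profiles across five samples.
genes = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.7],  # rises with geneA: coexpressed
    "geneC": [3.0, 1.0, 4.0, 5.0, 2.0],
}

print(drop_coexpressed(genes))
```

    Because the correlation is computed on ranks, geneB is flagged as coexpressed with geneA even though the two are not on the same scale.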

  16. Sparse representation of multi parametric DCE-MRI features using K-SVD for classifying gene expression based breast cancer recurrence risk

    Science.gov (United States)

    Mahrooghy, Majid; Ashraf, Ahmed B.; Daye, Dania; Mies, Carolyn; Rosen, Mark; Feldman, Michael; Kontos, Despina

    2014-03-01

    We evaluate the prognostic value of sparse representation-based features by applying the K-SVD algorithm on multiparametric kinetic, textural, and morphologic features in breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). K-SVD is an iterative dimensionality reduction method that optimally reduces the initial feature space by updating the dictionary columns jointly with the sparse representation coefficients. Therefore, by using K-SVD, we not only provide sparse representation of the features and condense the information in a few coefficients but also reduce the dimensionality. The extracted K-SVD features are evaluated by a machine learning algorithm including a logistic regression classifier for the task of classifying high versus low breast cancer recurrence risk as determined by a validated gene expression assay. The features are evaluated using ROC curve analysis and leave-one-out cross validation for different sparse representation and dimensionality reduction numbers. Optimal sparse representation is obtained when the number of dictionary elements is 4 (K=4) and maximum non-zero coefficients is 2 (L=2). We compare K-SVD with ANOVA based feature selection for the same prognostic features. The ROC results show that the AUC of the K-SVD based (K=4, L=2), the ANOVA based, and the original features (i.e., no dimensionality reduction) are 0.78, 0.71, and 0.68, respectively. From the results, it can be inferred that by using sparse representation of the originally extracted multi-parametric, high-dimensional data, we can condense the information on a few coefficients with the highest predictive value. In addition, the dimensionality reduction introduced by K-SVD can prevent models from over-fitting.
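
    The AUC figures quoted in this abstract summarize ROC curves. As a minimal sketch of what that number measures, the function below computes the AUC as the fraction of (high-risk, low-risk) pairs that a score orders correctly. The scores and labels are invented for the example; in the study they would come from held-out predictions of the classifier.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical held-out risk scores: 1 = high recurrence risk, 0 = low.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

print(round(auc(scores, labels), 3))
```

    An AUC of 0.5 corresponds to a score that orders pairs no better than chance, and 1.0 to a score that separates the two groups perfectly.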

  17. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms prove effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification.

  18. Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

    Directory of Open Access Journals (Sweden)

    Lixiong Xu

    2017-01-01

    As one of the most effective function mining algorithms, the Gene Expression Programming (GEP) algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data research, GEP suffers from low efficiency due to its long mining processes. To improve the efficiency of GEP in big data research, especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using the MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.

  19. Large-scale gene function analysis with the PANTHER classification system.

    Science.gov (United States)

    Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

    2013-08-01

    The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.

  20. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available. Abstract. Background: Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results: The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion: This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
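
    The relevance/redundancy trade-off described in this abstract can be illustrated with a small sketch. It covers only a greedy selection step, not the clustering, partitioning, or leave-one-out stages. The normalization (mutual information divided by the geometric mean of the two entropies), the score (relevance minus mean redundancy), and all names and data are assumptions made for illustration, not the authors' published procedure.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def nmi(xs, ys):
    """Mutual information I(X;Y) divided by sqrt(H(X) * H(Y)).

    This normalization is an assumption; the abstract does not give one.
    """
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0  # a constant variable carries no information
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return mutual_info / sqrt(hx * hy)


def select_genes(genes, labels, k):
    """Greedily pick k genes with high relevance and low redundancy."""
    selected = []
    candidates = sorted(genes)  # sorted for deterministic tie-breaking
    while candidates and len(selected) < k:
        def score(gene):
            relevance = nmi(genes[gene], labels)
            if not selected:
                return relevance
            redundancy = sum(nmi(genes[gene], genes[s]) for s in selected) / len(selected)
            return relevance - redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical discretized expression levels for nine samples in three classes.
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
genes = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1, 1],  # separates class 0 from the rest
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1, 1],  # exact copy of gene_a (redundant)
    "gene_c": [0, 0, 0, 0, 0, 0, 1, 1, 1],  # separates class 2 from the rest
}
chosen = select_genes(genes, labels, k=2)
print(chosen)
```

    In this toy data the duplicated gene is as relevant as the original but is penalized for redundancy, so the second pick is the gene that adds new class information.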

  1. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Full Text Available. Abstract. Background: Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods: A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results: Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions: This actionable gene

  2. Tumor Classification Using High-Order Gene Expression Profiles Based on Multilinear ICA

    Directory of Open Access Journals (Sweden)

    Ming-gang Du

    2009-01-01

    Full Text Available. Motivation. Independent Components Analysis (ICA) maximizes the statistical independence of the representational components of a training gene expression profiles (GEP) ensemble, but it cannot distinguish relations between the different factors, or different modes, and it is not applicable to high-order GEP data mining. In order to generalize ICA, we introduce Multilinear-ICA and apply it to tumor classification using high-order GEP. Firstly, we introduce the basic concepts and operations of tensors and present the Support Vector Machine (SVM) classifier and Multilinear-ICA. Secondly, the higher-scoring genes of the original high-order GEP are selected using t-statistics and tabulated into tensors. Thirdly, Multilinear-ICA is applied to the tensors. Finally, the SVM is used to classify the tumor subtypes. Results. To show the validity of the proposed method, we apply it to tumor classification using high-order GEP. Though we only use three datasets, the experimental results show that the method is effective and feasible. Through this survey, we hope to gain some insight into the problem of high-order GEP tumor classification, in aid of further developing more effective tumor classification algorithms.

  3. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available. Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  4. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  5. Angiotensinogen gene polymorphism predicts hypertension, and iridological constitutional classification enhances the risk for hypertension in Koreans.

    Science.gov (United States)

    Cho, Joo-Jang; Hwang, Woo-Jun; Hong, Seung-Heon; Jeong, Hyun-Ja; Lee, Hye-Jung; Kim, Hyung-Min; Um, Jae-Young

    2008-05-01

    This study investigated the relationship between iridological constitution and angiotensinogen (AGN) gene polymorphism in hypertensives. In addition to the angiotensin converting enzyme gene, the AGN genotype is one of the most well studied genetic markers of hypertension. Iridology, a form of complementary and alternative medicine, is the diagnosis of medical conditions by noting irregularities of the pigmentation in the iris. Iridological constitution has a strong familial aggregation and is implicated in heredity. Therefore, the study classified 87 hypertensive patients with familial history of cerebral infarction and controls (n = 88) according to iris constitution, and determined AGN genotype. As a result, the AGN/TT genotype was associated with hypertension (chi2 = 13.413, p iridological constitutional classification increased the relative risk for hypertension in the subjects with AGN/T allele. These results suggest that AGN polymorphism predicts hypertension, and iridological constitutional classification enhances the risk for hypertension associated with AGN/T in a Korean population.

  6. Classification of genes and putative biomarker identification using distribution metrics on expression profiles.

    Directory of Open Access Journals (Sweden)

    Hung-Chung Huang

    Full Text Available. BACKGROUND: Identification of genes with switch-like properties will facilitate discovery of regulatory mechanisms that underlie these properties, and will provide knowledge for the appropriate application of Boolean networks in gene regulatory models. As switch-like behavior is likely associated with tissue-specific expression, these gene products are expected to be plausible candidates as tissue-specific biomarkers. METHODOLOGY/PRINCIPAL FINDINGS: In a systematic classification of genes and search for biomarkers, gene expression profiles (GEPs) of more than 16,000 genes from 2,145 mouse array samples were analyzed. Four distribution metrics (mean, standard deviation, kurtosis and skewness) were used to classify GEPs into four categories: predominantly-off, predominantly-on, graded (rheostatic), and switch-like genes. The arrays under study were also grouped and examined by tissue type. For example, arrays were categorized as 'brain group' and 'non-brain group'; the Kolmogorov-Smirnov distance and Pearson correlation coefficient were then used to compare GEPs between brain and non-brain for each gene. We were thus able to identify tissue-specific biomarker candidate genes. CONCLUSIONS/SIGNIFICANCE: The methodology employed here may be used to facilitate disease-specific biomarker discovery.
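
    As an illustration of the quantities named in this abstract, the sketch below computes the four distribution metrics for one expression profile and the two-sample Kolmogorov-Smirnov distance between two groups of arrays. The abstract gives no cut-offs for the four gene categories, so none are applied here; the function names and the toy values are hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def distribution_metrics(values):
    """Mean, standard deviation, skewness and excess kurtosis (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = sqrt(m2)
    skewness = m3 / sd ** 3 if sd > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0
    return {"mean": mean, "sd": sd, "skewness": skewness, "kurtosis": kurtosis}


def ks_distance(group_a, group_b):
    """Largest absolute gap between the two empirical cumulative distributions."""
    a, b = sorted(group_a), sorted(group_b)
    gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical expression values for one gene.
bimodal_profile = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
metrics = distribution_metrics(bimodal_profile)
print(metrics)

# Hypothetical 'brain group' versus 'non-brain group' values for the same gene.
brain, non_brain = [9.5, 10.0, 10.5], [0.1, 0.2, 0.3]
print(ks_distance(brain, non_brain))
```

    A two-level profile such as the toy example yields a strongly negative excess kurtosis, which is one reason kurtosis is a natural metric when looking for on/off expression patterns.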

  7. Classification of gene expression data: A hubness-aware semi-supervised approach.

    Science.gov (United States)

    Buza, Krisztian

    2016-04-01

    Classification of gene expression data is the common denominator of various biomedical recognition tasks. However, obtaining class labels for large training samples may be difficult or even impossible in many cases. Therefore, semi-supervised classification techniques are required, as semi-supervised classifiers take advantage of unlabeled data. Gene expression data is high-dimensional, which gives rise to the phenomena known under the umbrella of the curse of dimensionality, one of its recently explored aspects being the presence of hubs, or hubness for short. Therefore, hubness-aware classifiers have been developed recently, such as Naive Hubness-Bayesian k-Nearest Neighbor (NHBNN). In this paper, we propose a semi-supervised extension of NHBNN which follows the self-training schema. As one of the core components of self-training is the certainty score, we propose a new hubness-aware certainty score. We performed experiments on publicly available gene expression data. These experiments show that the proposed classifier outperforms its competitors. We investigated the impact of each of the components (classification algorithm, semi-supervised technique, hubness-aware certainty score) separately and showed that each of these components is relevant to the performance of the proposed approach. Our results imply that our approach may increase classification accuracy and reduce computational costs (i.e., runtime). Based on the promising results presented in the paper, we envision that hubness-aware techniques will be used in various other biomedical machine learning tasks. In order to accelerate this process, we made an implementation of hubness-aware machine learning techniques publicly available in the PyHubs software package (http://www.biointelligence.hu/pyhubs) implemented in Python, one of the most popular programming languages of data science. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
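
    The self-training schema mentioned in this abstract can be sketched generically as follows. This is not NHBNN and not the PyHubs API: the base classifier is a plain nearest-neighbour rule, and the certainty score is a simple distance margin standing in for the paper's hubness-aware score. All names and data are hypothetical.

```python
def predict_with_certainty(x, labeled):
    """Return (label, certainty) using the nearest labeled point of each class."""
    nearest = {}
    for features, label in labeled:
        dist = sum((a - b) ** 2 for a, b in zip(x, features)) ** 0.5
        if label not in nearest or dist < nearest[label]:
            nearest[label] = dist
    ranked = sorted(nearest.items(), key=lambda item: item[1])
    best_label, best_dist = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("inf")
    # Stand-in certainty score: margin between the two closest classes.
    return best_label, runner_up - best_dist


def self_train(labeled, unlabeled, per_round=1):
    """Repeatedly move the most certain unlabeled points into the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        scored = [(predict_with_certainty(x, labeled), x) for x in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _certainty), x in scored[:per_round]:
            labeled.append((x, label))
            pool.remove(x)
    return labeled


# Hypothetical one-dimensional example: two labeled samples, four unlabeled.
seed = [((0.0,), "A"), ((10.0,), "B")]
unlabeled = [(1.0,), (9.0,), (3.0,), (7.0,)]
final = dict(self_train(seed, unlabeled))
print(final)
```

    Points close to a labeled example are absorbed first, so later and harder points are classified against a larger training set; the quality of the certainty score decides whether this helps or propagates early mistakes.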

  8. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

    Science.gov (United States)

    Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

    2016-01-01

    Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.

  9. Inflammation, Adenoma and Cancer: Objective Classification of Colon Biopsy Specimens with Gene Expression Signature

    Directory of Open Access Journals (Sweden)

    Orsolya Galamb

    2008-01-01

    Full Text Available. Gene expression analysis of colon biopsies using high-density oligonucleotide microarrays can contribute to the understanding of local pathophysiological alterations and to functional classification of adenoma (15 samples), colorectal carcinomas (CRC) (15) and inflammatory bowel diseases (IBD) (14). Total RNA was extracted, amplified and biotinylated from frozen colonic biopsies. Genome-wide gene expression profile was evaluated by HGU133plus2 microarrays and verified by RT-PCR. We applied two independent methods for data normalization and used PAM for feature selection. Leave-one-out stepwise discriminant analysis was performed. Top validated genes included collagenIVα1, lipocalin-2, calumenin, aquaporin-8 genes in CRC; CD44, met proto-oncogene, chemokine ligand-12, ADAM-like decysin-1 and ATP-binding cassette-A8 genes in adenoma; and lipocalin-2, ubiquitin D and IFITM2 genes in IBD. Best differentiating markers between ulcerative colitis and Crohn's disease were cyclin-G2; tripartite motif-containing-31; TNFR shedding aminopeptidase regulator-1 and AMICA. The discriminant analysis was able to classify 96.2% of the samples correctly overall using 7 discriminatory genes (indoleamine-pyrrole-2,3-dioxygenase, ectodermal-neural cortex, TIMP3, fucosyltransferase-8, collectin sub-family member 12, carboxypeptidase D, and transglutaminase-2). Using routine biopsy samples we successfully performed whole genomic microarray analysis to identify discriminative signatures. Our results provide further insight into the pathophysiological background of colonic diseases. The results establish a data warehouse which can be mined further.

  10. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  11. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-01-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  12. Structural organization and classification of cytochrome P450 genes in flax (Linum usitatissimum L.).

    Science.gov (United States)

    Babu, Peram Ravindra; Rao, Khareedu Venkateswara; Reddy, Vudem Dashavantha

    2013-01-15

    Flax CYPome analysis resulted in the identification of 334 putative cytochrome P450 (CYP450) genes in the cultivated flax genome. Classification of flax CYP450 genes based on the sequence similarity with Arabidopsis orthologs and CYP450 nomenclature, revealed 10 clans representing 44 families and 98 subfamilies. CYP80, CYP83, CYP92, CYP702, CYP705, CYP708, CYP728, CYP729, CYP733 and CYP736 families are absent in the flax genome. The subfamily members exhibited conserved sequences, length of exons and phasing of introns. Similarity search of the genomic resources of wild flax species Linum bienne with CYP450 coding sequences of the cultivated flax, revealed the presence of 127 CYP450 gene orthologs, indicating amplification of novel CYP450 genes in the cultivated flax. Seven families CYP73, 74, 75, 76, 77, 84 and 709, coding for enzymes associated with phenylpropanoid/fatty acid metabolism, showed extensive gene amplification in the flax. About 59% of the flax CYP450 genes were present in the EST libraries. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2008-04-01

    Full Text Available. Abstract. Background: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all compute that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
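
    To make the contrast with all-against-all comparison concrete, here is a generic sketch of greedy incremental clustering: each new sequence is compared only with the representatives of existing clusters and either joins the first sufficiently similar one or founds a new cluster. The k-mer Jaccard similarity, the threshold, and the toy sequences are assumptions for illustration; the abstract does not describe the authors' similarity measure or implementation.

```python
def kmers(sequence, k=3):
    """Set of all overlapping substrings of length k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Shared k-mers divided by all k-mers seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def incremental_cluster(sequences, threshold=0.5, k=3):
    """Assign each sequence to the first cluster whose representative is similar enough."""
    clusters = []  # list of (representative k-mer set, member list)
    # Longest sequences first, so each representative is the longest member.
    for seq in sorted(sequences, key=len, reverse=True):
        profile = kmers(seq, k)
        for rep_profile, members in clusters:
            if jaccard(profile, rep_profile) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((profile, [seq]))  # no match: start a new cluster
    return [members for _rep, members in clusters]


# Hypothetical peptide fragments.
peptides = ["MKTAYIAKQR", "MKTAYIAKQ", "GGGSLLPWWE", "GGGSLLPWW"]
groups = incremental_cluster(peptides)
print(groups)
```

    The number of comparisons grows with the number of clusters rather than with the square of the number of sequences, which is the general reason incremental schemes scale to large datasets.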

  14. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available. Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  15. Minimal gene selection for classification and diagnosis prediction based on gene expression profile

    Directory of Open Access Journals (Sweden)

    Alireza Mehridehnavi

    2013-01-01

    Conclusion: We have shown that using the two most significant genes, ranked by their S/N ratios, together with a suitable selection of training samples, can classify DLBCL patients with rather good results. With the aid of these methods we could compensate for the small number of patients, improve classification accuracy, and reduce computational complexity and hence running time.
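
    The record does not define its S/N ratio. A minimal sketch of ranking genes by a signal-to-noise score, assuming the definition widely used for expression data (difference of class means divided by the sum of class standard deviations), might look like this; the gene names and values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b); assumed definition, not taken from the record."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def top_genes(expression, k=2):
    """Rank genes by absolute S/N score and keep the k strongest."""
    ranked = sorted(
        expression,
        key=lambda gene: abs(signal_to_noise(*expression[gene])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical expression values: gene -> (class A samples, class B samples).
expression = {
    "gene_1": ([5.0, 5.2, 4.8], [1.0, 1.1, 0.9]),
    "gene_2": ([2.0, 3.0, 4.0], [2.5, 3.5, 4.5]),
    "gene_3": ([1.0, 1.2, 0.8], [3.0, 3.3, 2.7]),
}
best_two = top_genes(expression, k=2)
print(best_two)
```

    The score is undefined when both classes have zero variance for a gene, so real data would need a guard or a small variance floor.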

  16. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Full Text Available. Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  17. Incorporating rich background knowledge for gene named entity classification and recognition

    Directory of Open Access Journals (Sweden)

    Yang Zhihao

    2009-07-01

    Full Text Available. Abstract. Background: Gene named entity classification and recognition are crucial preliminary steps of text mining in biomedical literature. Machine learning based methods have been used in this area with great success. In most state-of-the-art systems, elaborately designed lexical features, such as words, n-grams, and morphology patterns, have played a central part. However, this type of feature tends to cause extreme sparseness in feature space. As a result, out-of-vocabulary (OOV) terms in the training data are not modeled well due to lack of information. Results: We propose a general framework for gene named entity representation, called feature coupling generalization (FCG). The basic idea is to generate higher level features using term frequency and co-occurrence information of highly indicative features in a huge amount of unlabeled data. We examine its performance in a named entity classification task, which is designed to remove non-gene entries in a large dictionary derived from online resources. The results show that new features generated by FCG outperform lexical features by 5.97 F-score and 10.85 for OOV terms. Also in this framework each extension yields significant improvements and the sparse lexical features can be transformed into both a lower dimensional and more informative representation. A forward maximum match method based on the refined dictionary produces an F-score of 86.2 on the BioCreative 2 GM test set. Then we combined the dictionary with a conditional random field (CRF) based gene mention tagger, achieving an F-score of 89.05, which improves the performance of the CRF-based tagger by 4.46 with little impact on the efficiency of the recognition system. A demo of the NER system is available at http://202.118.75.18:8080/bioner.
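
    Forward maximum matching against a dictionary, as mentioned in this abstract, can be sketched as follows: scan the tokens left to right and, at each position, take the longest dictionary entry that starts there. The tokenization, the case-insensitive comparison, the `max_len` window, and the tiny dictionary are assumptions for illustration, not details of the authors' system.

```python
def forward_maximum_match(tokens, dictionary, max_len=5):
    """Return (start, end, text) spans for the longest dictionary match at each position."""
    entries = {tuple(entry.lower().split()) for entry in dictionary}
    mentions = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate window first and shrink until a match is found.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(t.lower() for t in tokens[i:i + n]) in entries:
                matched = n
                break
        if matched:
            mentions.append((i, i + matched, " ".join(tokens[i:i + matched])))
            i += matched  # continue after the matched span
        else:
            i += 1
    return mentions


# Hypothetical dictionary and sentence.
gene_dictionary = ["tumor necrosis factor", "tumor necrosis factor alpha", "p53"]
sentence = "Expression of tumor necrosis factor alpha and p53 increased".split()
found = forward_maximum_match(sentence, gene_dictionary)
print(found)
```

    Preferring the longest match keeps "tumor necrosis factor alpha" as one mention instead of stopping at the shorter entry it contains.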

  18. Molecular sub-classification of renal epithelial tumors using meta-analysis of gene expression microarrays.

    Directory of Open Access Journals (Sweden)

    Thomas Sanford

    Full Text Available To evaluate the accuracy of the sub-classification of renal cortical neoplasms using molecular signatures. A search of publicly available databases was performed to identify microarray datasets with multiple histologic sub-types of renal cortical neoplasms. Meta-analytic techniques were utilized to identify differentially expressed genes for each histologic subtype. The lists of genes obtained from the meta-analysis were used to create predictive signatures through the use of a pair-based method. These signatures were organized into an algorithm to sub-classify renal neoplasms. The use of these signatures according to our algorithm was validated on several independent datasets. We identified three Gene Expression Omnibus datasets that fit our criteria to develop a training set. All of the datasets in our study utilized the Affymetrix platform. The final training dataset included 149 samples represented by the four most common histologic subtypes of renal cortical neoplasms: 69 clear cell, 41 papillary, 16 chromophobe, and 23 oncocytomas. When validation of our signatures was performed on external datasets, we were able to correctly classify 68 of the 72 samples (94%). The correct classification by subtype was 19/20 (95%) for clear cell, 14/14 (100%) for papillary, 17/19 (89%) for chromophobe, and 18/19 (95%) for oncocytomas. Through the use of meta-analytic techniques, we were able to create an algorithm that sub-classified renal neoplasms on a molecular level with 94% accuracy across multiple independent datasets. This algorithm may aid in selecting molecular therapies and may improve the accuracy of subtyping of renal cortical tumors.
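
    The per-subtype and overall validation accuracies quoted in this abstract can be checked with a few lines of Python. The counts are copied from the abstract; the variable and function names are illustrative only.

```python
# Correct / total counts per subtype, as reported in the abstract.
validation = {
    "clear cell": (19, 20),
    "papillary": (14, 14),
    "chromophobe": (17, 19),
    "oncocytoma": (18, 19),
}


def accuracy(correct, total):
    """Fraction of samples classified correctly."""
    return correct / total


per_subtype = {name: accuracy(c, n) for name, (c, n) in validation.items()}
total_correct = sum(c for c, _ in validation.values())
total_samples = sum(n for _, n in validation.values())
overall = accuracy(total_correct, total_samples)

for name, value in per_subtype.items():
    print(f"{name}: {value:.1%}")
print(f"overall: {total_correct}/{total_samples} = {overall:.1%}")
```

    The subtype counts sum to the 68 of 72 samples given for the overall figure, so the reported numbers are internally consistent.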

  19. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is therefore that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  20. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is therefore that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  1. Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Cheung Leo

    2007-02-01

    Full Text Available Background: Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at the genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which however are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potential to achieve this goal. Results: A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make

  2. Classification based upon gene expression data: bias and precision of error rates.

    Science.gov (United States)

    Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L

    2007-06-01

    Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
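
    The recommendation to report class error rates and a conditional risk can be made concrete with a small Python sketch. The formula used here, risk = sum over true classes i of prior_i times sum over predicted classes j of cost[i][j] times P(predicted j | true i), is a standard formulation assumed for illustration; the confusion counts, priors, and costs are invented, and the paper's exact definition may differ.

```python
def class_error_rates(confusion):
    """Per-class error rate; confusion[i][j] counts class-i samples predicted as j."""
    return [1 - row[i] / sum(row) for i, row in enumerate(confusion)]


def overall_error_rate(confusion):
    """Simple overall misclassification rate across all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(row[i] for i, row in enumerate(confusion))
    return 1 - correct / total


def conditional_risk(confusion, priors, costs):
    """Expected cost, weighting each true class by its prior probability."""
    risk = 0.0
    for i, row in enumerate(confusion):
        n_i = sum(row)
        for j, count in enumerate(row):
            risk += priors[i] * costs[i][j] * count / n_i
    return risk


# Hypothetical two-class example: class 0 = non-diseased, class 1 = diseased.
confusion = [[40, 10],   # true class 0: 40 correct, 10 misclassified
             [5, 45]]    # true class 1: 5 misclassified, 45 correct
priors = [0.9, 0.1]      # assumed population prevalence, unlike the balanced sample
costs = [[0, 1],         # a false positive costs 1
         [5, 0]]         # a false negative costs 5

errors = class_error_rates(confusion)
overall = overall_error_rate(confusion)
risk = conditional_risk(confusion, priors, costs)
```

    In this toy case the overall error rate is 0.15 while the conditional risk is 0.23, which shows how a single error figure can hide the effect of unequal priors and costs.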

  3. Temporal expression-based analysis of metabolism.

    Directory of Open Access Journals (Sweden)

    Sara B Collins

    Full Text Available Metabolic flux is frequently rerouted through cellular metabolism in response to dynamic changes in the intra- and extra-cellular environment. Capturing the mechanisms underlying these metabolic transitions in quantitative and predictive models is a prominent challenge in systems biology. Progress in this regard has been made by integrating high-throughput gene expression data into genome-scale stoichiometric models of metabolism. Here, we extend previous approaches to perform a Temporal Expression-based Analysis of Metabolism (TEAM). We apply TEAM to understanding the complex metabolic dynamics of the respiratorily versatile bacterium Shewanella oneidensis grown under aerobic, lactate-limited conditions. TEAM predicts temporal metabolic flux distributions using time-series gene expression data. Increased predictive power is achieved by supplementing these data with a large reference compendium of gene expression, which allows us to take into account the unique character of the distribution of expression of each individual gene. We further propose a straightforward method for studying the sensitivity of TEAM to changes in its fundamental free threshold parameter θ, and reveal that discrete zones of distinct metabolic behavior arise as this parameter is changed. By comparing the qualitative characteristics of these zones to additional experimental data, we are able to constrain the range of θ to a small, well-defined interval. In parallel, the sensitivity analysis reveals the inherently difficult nature of dynamic metabolic flux modeling: small errors early in the simulation propagate to relatively large changes later in the simulation. We expect that handling such "history-dependent" sensitivities will be a major challenge in the future development of dynamic metabolic-modeling techniques.

  4. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM

  5. Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

    Science.gov (United States)

    Doostparast Torshizi, Abolfazl; Petzold, Linda R

    2018-01-01

    Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from the Human Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  6. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    Science.gov (United States)

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan

    2014-10-01

    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods. © 2014 The Authors.
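
    The lowest common ancestor idea behind the RDB-LCA approach can be sketched generically in Python. This is not the authors' implementation, which the abstract does not describe; the function name and the two example lineages are assumptions chosen to show how near-identical 16S hits collapse to a genus-level call.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared prefix of the given taxonomic lineages.

    lineages: list of lists, each ordered from the highest rank downwards.
    """
    if not lineages:
        return []
    shared = []
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared.append(ranks[0])
        else:
            break  # lineages diverge at this rank
    return shared


# Hypothetical best hits for one isolate: two species that 16S sequencing
# often cannot separate.
hits = [
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus mitis"],
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus oralis"],
]
call = lowest_common_ancestor(hits)
```

    When the hits disagree only at the species rank, the call stops at the genus, which mirrors the abstract's remark that the remaining isolates were mostly identified at the genus level.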

  7. Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

    Directory of Open Access Journals (Sweden)

    Calvo-Dmgz D.

    2012-12-01

    Full Text Available DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking diagnosis of diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated over three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but our model is able to provide a biologically interpretable explanation of how it classifies new samples.

  8. Association between traditional clinical high-risk features and gene expression profile classification in uveal melanoma.

    Science.gov (United States)

    Nguyen, Brandon T; Kim, Ryan S; Bretana, Maria E; Kegley, Eric; Schefler, Amy C

    2018-02-01

    To evaluate the association between traditional clinical high-risk features of uveal melanoma patients and gene expression profile (GEP). This was a retrospective, single-center, case series of patients with uveal melanoma. Eighty-three patients met inclusion criteria for the study. Patients were examined for the following clinical risk factors: drusen/retinal pigment epithelium (RPE) changes, vascularity on B-scan, internal reflectivity on A-scan, subretinal fluid (SRF), orange pigment, apical tumor height/thickness, and largest basal dimensions (LBD). A novel point system was created to grade the high-risk clinical features of each tumor. Further analyses were performed to assess the degree of association between GEP and each individual risk factor, total clinical risk score, vascularity, internal reflectivity, American Joint Committee on Cancer (AJCC) tumor stage classification, apical tumor height/thickness, and LBD. Of the 83 total patients, 41 were classified as GEP class 1A, 17 as class 1B, and 25 as class 2. The presence of orange pigment, SRF, low internal reflectivity and vascularity on ultrasound, and apical tumor height/thickness ≥ 2 mm were not statistically significantly associated with GEP class. Lack of drusen/RPE changes demonstrated a trend toward statistical association with GEP class 2 compared to class 1A/1B. LBD and advancing AJCC stage were statistically associated with higher GEP class. In this cohort, AJCC stage classification and LBD were the only clinical features statistically associated with GEP class. Clinicians should use caution when inferring the growth potential of melanocytic lesions solely from traditional funduscopic and ultrasonographic risk factors without GEP data.

  9. A re-assessment of gene-tag classification approaches for describing var gene expression patterns during human Plasmodium falciparum malaria parasite infections.

    Science.gov (United States)

    Githinji, George; Bull, Peter C

    2017-01-01

    PfEMP1 are variant parasite antigens that are inserted on the surface of Plasmodium falciparum infected erythrocytes (IE). Through interactions with various host molecules, PfEMP1 mediate IE sequestration in tissues and play a key role in the pathology of severe malaria. PfEMP1 is encoded by a diverse multi-gene family called var. Previous studies have shown that expression of specific subsets of var genes is associated with low levels of host immunity and severe malaria. However, in most clinical studies to date, full-length var gene sequences were unavailable and various approaches have been used to make comparisons between var gene expression profiles in different parasite isolates using limited information. Several studies have relied on the classification of a 300 - 500 base-pair "DBLα tag" region in the DBLα domain located at the 5' end of most var genes. We assessed the relationship between various DBLα tag classification methods, and sequence features that are only fully assessable through full-length var gene sequences. We compared these different sequence features in full-length var genes from six fully sequenced laboratory isolates. These comparisons show that despite a long history of recombination, DBLα sequence tag classification can provide functional information on important features of full-length var genes. Notably, a specific subset of DBLα tags previously defined as "group A-like" is associated with CIDRα1 domains proposed to bind to endothelial protein C receptor. This analysis helps to bring together different sources of data that have been used to assess var gene expression in clinical parasite isolates.

  10. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data and for the development of classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset or across datasets on similar or dissimilar microarray platforms. Combining classification results from seven classifiers into one voting decision performed significantly better during internal validation as well as external validation in similar microarray platforms than the underlying classification methods. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
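
    Combining seven classifiers into one voting decision, as described in this abstract, can be sketched as a simple unweighted majority vote. The abstract does not state the voting rule, so the unweighted rule, the function name, and the toy predictions below are assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers.

    predictions: list of per-classifier label lists, all of equal length.
    Returns one voted label per sample.
    """
    voted = []
    for sample_votes in zip(*predictions):
        # With an odd number of binary classifiers there can be no tie.
        label, _ = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical outputs of seven classifiers for three patients
# (1 = metastasis predicted, 0 = no metastasis predicted).
classifier_outputs = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
]
consensus = majority_vote(classifier_outputs)
```

    Using an odd number of classifiers for a two-class outcome avoids ties, so no tie-breaking rule is needed in this sketch.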

  11. MO-DE-207B-03: Improved Cancer Classification Using Patient-Specific Biological Pathway Information Via Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Young, M; Craft, D [Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States)

    2016-06-15

    Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve

  12. Gene expression profiling, pathway analysis and subtype classification reveal molecular heterogeneity in hepatocellular carcinoma and suggest subtype specific therapeutic targets.

    Science.gov (United States)

    Agarwal, Rahul; Narayan, Jitendra; Bhattacharyya, Amitava; Saraswat, Mayank; Tomar, Anil Kumar

    2017-10-01

    A very low 5-year survival rate among hepatocellular carcinoma (HCC) patients is mainly due to lack of early stage diagnosis, distant metastasis and high risk of postoperative recurrence. Hence ascertaining novel biomarkers for early diagnosis and patient specific therapeutics is crucial and urgent. Here, we have performed a comprehensive analysis of the expression data of 423 HCC patients (373 tumors and 50 controls) downloaded from The Cancer Genome Atlas (TCGA) followed by pathway enrichment by gene ontology annotations, subtype classification and overall survival analysis. The differential gene expression analysis using non-parametric Wilcoxon test revealed a total of 479 up-regulated and 91 down-regulated genes in HCC compared to controls. The list of top differentially expressed genes mainly consists of tumor/cancer associated genes, such as AFP, THBS4, LCN2, GPC3, NUF2, etc. The genes over-expressed in HCC were mainly associated with cell cycle pathways. In total, 59 kinase-associated genes were found over-expressed in HCC, including TTK, MELK, BUB1, NEK2, BUB1B, AURKB, PLK1, CDK1, PKMYT1, PBK, etc. Overall, four distinct HCC subtypes were predicted using a consensus clustering method. Each subtype was unique in terms of gene expression, pathway enrichment and median survival. In conclusion, this study has identified a number of interesting genes that could be exploited in future as potential diagnostic and prognostic markers of HCC, and the subtype classification may guide improved and more specific therapy. Copyright © 2017 Elsevier Inc. All rights reserved.
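
    The non-parametric Wilcoxon test mentioned in this abstract compares tumor and control expression of one gene by ranks. The sketch below computes the rank-sum statistic and the equivalent Mann-Whitney U with midranks for ties. That the authors used the two-sample rank-sum variant is an assumption, the expression values are invented, and no p-value is computed here.

```python
def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_u(tumor, control):
    """Return (W, U): rank sum of the tumor group and its Mann-Whitney U."""
    ranks = midranks(list(tumor) + list(control))
    n_tumor = len(tumor)
    w = sum(ranks[:n_tumor])
    u = w - n_tumor * (n_tumor + 1) / 2
    return w, u


# Hypothetical expression values for a single gene.
tumor_expr = [5.1, 6.3, 7.2]
control_expr = [1.0, 2.4, 3.3, 4.8]
w_stat, u_stat = rank_sum_u(tumor_expr, control_expr)
```

    Here every tumor value exceeds every control value, so U reaches its maximum of 3 x 4 = 12; in a real analysis the statistic would be converted to a p-value and corrected for testing many genes.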

  13. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Background: A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided the high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results of separate experiments. However, only some studies have been aware of the importance of prior information in cancer classification. Methods: Together with the application of support vector machine as the discriminant approach, we proposed one modified method that incorporated prior knowledge into cancer classification based on gene expression data to improve accuracy. A public well-known dataset, Malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed by software R 2.80. Results: The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in training set and from 98.51% to 99.06% in test set. The standard deviations of the modified method decreased from 0.26% to 0 in training set and from 3.04% to 2.10% in test set. Conclusion: The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have a good future not only in practice but also in methodology.

  14. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene’s contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show the proposed gene selection methods can select a compact gene subset which can not only be used to build high quality cancer classifiers but also show biological relevance.

  15. BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data.

    Science.gov (United States)

    Guo, Yang; Liu, Shuhui; Li, Zhanhuai; Shang, Xuequn

    2018-04-11

    The classification of cancer subtypes is of great importance to cancer disease diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially deep learning based approaches. Recently, the deep forest model has been proposed as an alternative to deep neural networks to learn hyper-representations by using cascade ensemble decision trees. It has been shown that the deep forest model has competitive or even better performance than deep neural networks to some extent. However, the standard deep forest model may face overfitting and ensemble diversity challenges when dealing with small sample size and high-dimensional biology data. In this paper, we propose a deep learning model, so-called BCDForest, to address cancer subtype classification on small-scale biology datasets, which can be viewed as a modification of the standard deep forest model. BCDForest differs from the standard deep forest model in the following two main contributions: First, a method named multi-class-grained scanning is proposed to train multiple binary classifiers to encourage diversity of the ensemble. Meanwhile, the fitting quality of each classifier is considered in representation learning. Second, we propose a boosting strategy to emphasize more important features in cascade forests, and thus to propagate the benefits of discriminative features among cascade layers to improve the classification performance. Systematic comparison experiments on both microarray and RNA-Seq gene expression datasets demonstrate that our method consistently outperforms the state-of-the-art methods in application of cancer subtype classification. The multi-class-grained scanning and boosting strategy in our model provide an effective solution to ease the overfitting challenge and improve the robustness of the deep forest model working on small-scale data. Our model provides a useful approach to the classification of cancer subtypes

  16. Cell of origin associated classification of B-cell malignancies by gene signatures of the normal B-cell hierarchy.

    Science.gov (United States)

    Johnsen, Hans Erik; Bergkvist, Kim Steve; Schmitz, Alexander; Kjeldsen, Malene Krag; Hansen, Steen Møller; Gaihede, Michael; Nørgaard, Martin Agge; Bæch, John; Grønholdt, Marie-Louise; Jensen, Frank Svendsen; Johansen, Preben; Bødker, Julie Støve; Bøgsted, Martin; Dybkær, Karen

    2014-06-01

    Recent findings have suggested biological classification of B-cell malignancies as exemplified by the "activated B-cell-like" (ABC), the "germinal-center B-cell-like" (GCB) and primary mediastinal B-cell lymphoma (PMBL) subtypes of diffuse large B-cell lymphoma and "recurrent translocation and cyclin D" (TC) classification of multiple myeloma. Biological classification of B-cell derived cancers may be refined by a direct and systematic strategy where identification and characterization of normal B-cell differentiation subsets are used to define the cancer cell of origin phenotype. Here we propose a strategy combining multiparametric flow cytometry, global gene expression profiling and biostatistical modeling to generate B-cell subset specific gene signatures from sorted normal human immature, naive, germinal centrocytes and centroblasts, post-germinal memory B-cells, plasmablasts and plasma cells from available lymphoid tissues including lymph nodes, tonsils, thymus, peripheral blood and bone marrow. This strategy will provide an accurate image of the stage of differentiation, which prospectively can be used to classify any B-cell malignancy and eventually purify tumor cells. This report briefly describes the current models of the normal B-cell subset differentiation in multiple tissues and the pathogenesis of malignancies originating from the normal germinal B-cell hierarchy.

  17. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization.

    Science.gov (United States)

    Vafaee Sharbaf, Fatemeh; Mosafer, Sara; Moattar, Mohammad Hossein

    2016-06-01

    This paper proposes an approach for gene selection in microarray data. The approach begins with a filter step that uses the Fisher criterion to reduce the initial set of genes, and hence the search space and time complexity. Then, a wrapper approach based on cellular learning automata (CLA) optimized with the ant colony method (ACO) is used to find the set of features that improves classification accuracy. CLA is applied because of its capability to learn and model complicated relationships. The features selected in the last phase are evaluated using the ROC curve, and the smallest feature subset that remains most effective is determined. The classifiers evaluated in the proposed framework are K-nearest neighbor, support vector machine, and naïve Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy.
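
    The filter step described above can be illustrated with a minimal sketch. It assumes the common two-class form of the Fisher criterion, (m0 - m1)^2 / (v0 + v1), which the abstract does not spell out; the gene names, the expression values, and the helper names `fisher_score` and `top_genes` are hypothetical. The CLA/ACO wrapper stage is not sketched here.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher criterion for one gene: (m0 - m1)^2 / (v0 + v1).

    This specific form is an assumption; the source only names the criterion.
    """
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    denom = pvariance(class0) + pvariance(class1)
    if denom == 0:
        # Degenerate case: no within-class spread at all.
        return float("inf") if mean(class0) != mean(class1) else 0.0
    return (mean(class0) - mean(class1)) ** 2 / denom


def top_genes(expression, labels, k):
    """Rank genes by Fisher score (highest first) and keep the best k.

    `expression` maps a gene name to a list of values, one per sample.
    """
    ranked = sorted(
        expression,
        key=lambda gene: fisher_score(expression[gene], labels),
        reverse=True,
    )
    return ranked[:k]


# Synthetic toy data: six samples, three per class.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],  # well separated between classes
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # uninformative
    "gene_c": [0.5, 1.5, 1.0, 1.5, 2.5, 2.0],  # weakly separated
}
selected = top_genes(expression, labels, 2)
print(selected)
```

    A filter of this kind scores each gene independently, so it is cheap enough to run on all genes before a more expensive wrapper search is applied to the survivors.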

  18. Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE), Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS), and Gradient based Leave-one-out Gene Selection (GLGS). To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction.
Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and
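
    Two of the performance measurements named above, testing accuracy and MCC (Matthews correlation coefficient), can be computed from a binary confusion matrix. The sketch below uses the standard MCC definition; the labels and predictions are synthetic, and the function names are hypothetical rather than taken from the paper.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Synthetic example: 8 test samples, 2 of them misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
score = mcc(tp, tn, fp, fn)
print(accuracy, score)
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a more informative summary when the two classes are imbalanced, as is common in small clinical datasets.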

  19. Medusa structure of the gene regulatory network: dominance of transcription factors in cancer subtype classification.

    Science.gov (United States)

    Guo, Yuchun; Feng, Ying; Trivedi, Niraj S; Huang, Sui

    2011-05-01

    Gene expression profiles consisting of tens of thousands of transcripts are used for clustering of tissue, such as tumors, into subtypes, often without considering the underlying reason that the distinct patterns of expression arise because of constraints in the realization of gene expression profiles imposed by the gene regulatory network. The topology of this network has been suggested to consist of a regulatory core of genes, represented most prominently by transcription factors (TFs) and microRNAs, that influence the expression of other genes, and of a periphery of 'enslaved' effector genes that are regulated but not regulating. This 'medusa' architecture implies that the core genes are much stronger determinants of the realized gene expression profiles. To test this hypothesis, we examined the clustering of gene expression profiles into known tumor types to quantitatively demonstrate that TFs and, even more markedly, microRNAs are much stronger discriminators of tumor type specific gene expression patterns than the same number of randomly selected or metabolic genes. These findings lend support to the hypothesis of a medusa architecture and of the canalizing nature of regulation by microRNAs. They also reveal the degree of freedom for the expression of peripheral genes that are less stringently associated with a tissue type specific global gene expression profile.

  20. Gene Structures, Classification, and Expression Models of the DREB Transcription Factor Subfamily in Populus trichocarpa

    Directory of Open Access Journals (Sweden)

    Yunlin Chen

    2013-01-01

    We identified 75 dehydration-responsive element-binding (DREB) protein genes in Populus trichocarpa. We analyzed gene structures, phylogenies, domain duplications, genome localizations, and expression profiles. The phylogenic construction suggests that the PtrDREB gene subfamily can be classified broadly into six subtypes (DREB A-1 to A-6) in Populus. The chromosomal localizations of the PtrDREB genes indicated 18 segmental duplication events involving 36 genes and six redundant PtrDREB genes were involved in tandem duplication events. There were fewer introns in the PtrDREB subfamily. The motif composition of PtrDREB was highly conserved in the same subtype. We investigated expression profiles of this gene subfamily from different tissues and/or developmental stages. Sixteen genes present in the digital expression analysis had high levels of transcript accumulation. The microarray results suggest that 18 genes were upregulated. We further examined the stress responsiveness of 15 genes by qRT-PCR. A digital northern analysis showed that the PtrDREB17, 18, and 32 genes were highly induced in leaves under cold stress, and the same expression trends were shown by qRT-PCR. Taken together, these observations may lay the foundation for future functional analyses to unravel the biological roles of Populus’ DREB genes.

  1. Speculation with spiculation? - Three independent gene fragments and biochemical characters versus morphology in demosponge higher classification

    NARCIS (Netherlands)

    Erpenbeck, D.J.G.; Breeuwer, J.A.J.; Parra-Velandia, F.J.; van Soest, R.W.M.

    2006-01-01

    Demosponge higher-level systematics is currently a subject of major changes due to the simplicity and paucity of complex morphological characters. Still, sponge classification is primarily based on morphological features. The systematics of the demosponge order Agelasida has been exceptionally

  2. Classification and evolutionary analysis of the basic helix-loop-helix gene family in the green anole lizard, Anolis carolinensis.

    Science.gov (United States)

    Liu, Ake; Wang, Yong; Zhang, Debao; Wang, Xuhua; Song, Huifang; Dang, Chunwang; Yao, Qin; Chen, Keping

    2013-08-01

    Basic helix-loop-helix (bHLH) proteins play essential regulatory roles in a variety of biological processes. These highly conserved proteins form a large transcription factor superfamily, and are commonly identified in large numbers within animal, plant, and fungal genomes. The bHLH domain has been well studied in many animal species, but has not yet been characterized in non-avian reptiles. In this study, we identified 102 putative bHLH genes in the genome of the green anole lizard, Anolis carolinensis. Based on phylogenetic analysis, these genes were classified into 43 families, with 43, 24, 16, 3, 10, and 3 members assigned into groups A, B, C, D, E, and F, respectively, and 3 members categorized as "orphans". Within-group evolutionary relationships inferred from the phylogenetic analysis were consistent with highly conserved patterns observed for introns and additional domains. Results from phylogenetic analysis of the H/E(spl) family suggest that genome and tandem gene duplications have contributed to this family's expansion. Our classification and evolutionary analysis has provided insights into the evolutionary diversification of animal bHLH genes, and should aid future studies on bHLH protein regulation of key growth and developmental processes.

  3. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii).

    Science.gov (United States)

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the "proline knot" motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1-3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants.

  4. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii).

    Directory of Open Access Journals (Sweden)

    Heping Cao

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the "proline knot" motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1-3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants.

  5. Gene Structures, Evolution, Classification and Expression Profiles of the Aquaporin Gene Family in Castor Bean (Ricinus communis L.).

    Directory of Open Access Journals (Sweden)

    Zhi Zou

    Aquaporins (AQPs) are a class of integral membrane proteins that facilitate the passive transport of water and other small solutes across biological membranes. Castor bean (Ricinus communis L., Euphorbiaceae), an important non-edible oilseed crop, is widely cultivated for industrial, medicinal and cosmetic purposes. Its recently available genome provides an opportunity to analyze specific gene families. In this study, a total of 37 full-length AQP genes were identified from the castor bean genome, which were assigned to five subfamilies, including 10 plasma membrane intrinsic proteins (PIPs), 9 tonoplast intrinsic proteins (TIPs), 8 NOD26-like intrinsic proteins (NIPs), 6 X intrinsic proteins (XIPs) and 4 small basic intrinsic proteins (SIPs), on the basis of sequence similarities. Functional prediction based on the analysis of the aromatic/arginine (ar/R) selectivity filter, Froger's positions and specificity-determining positions (SDPs) showed a remarkable difference in substrate specificity among subfamilies. Homology analysis supported the expression of all 37 RcAQP genes in at least one of the examined tissues, e.g., root, leaf, flower, seed and endosperm. Furthermore, global expression profiles with deep transcriptome sequencing data revealed diverse expression patterns among various tissues. The current study presents the first genome-wide analysis of the AQP gene family in castor bean. Results obtained from this study provide valuable information for future functional analysis and utilization.

  6. Hierarchical information representation and efficient classification of gene expression microarray data

    OpenAIRE

    Bosio, Mattia

    2014-01-01

    In the field of computational biology, microarrays are used to measure the activity of thousands of genes at once and create a global picture of cellular function. Microarrays allow scientists to analyze expression of many genes in a single experiment quickly and efficiently. Even if microarrays are a consolidated research technology nowadays and the trends in high-throughput data analysis are shifting towards new technologies like Next Generation Sequencing (NGS), an optimum method for sample...

  7. Gene Expression Profiling of Early Stage Non-Small Cell Lung Cancer

    NARCIS (Netherlands)

    J. Hou (Jun)

    2010-01-01

    NSCLC is a highly heterogeneous malignancy with a poor prognosis. Treatment for NSCLC is currently based on a combination of pathological staging and histological classification. Recently, gene expression-based NSCLC profiling has proven a superior approach to stratify cancer cases with

  8. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Directory of Open Access Journals (Sweden)

    Dawyndt Peter

    2010-01-01

    Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the

  9. From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

    Science.gov (United States)

    Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard

    2010-01-30

    Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails.
Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial

  10. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Science.gov (United States)

    2010-01-01

    Background: Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results: In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions: FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for

  11. A multi-gene phylogeny of Chlorophyllum (Agaricaceae, Basidiomycota): new species, new combination and infrageneric classification

    Directory of Open Access Journals (Sweden)

    Zai-Wei Ge

    2018-03-01

    Taxonomic and phylogenetic studies of Chlorophyllum were carried out on the basis of morphological differences and molecular phylogenetic analyses. Based on the phylogeny inferred from the internal transcribed spacer (ITS), the partial large subunit nuclear ribosomal DNA (nrLSU), the second largest subunit of RNA polymerase II (rpb2) and translation elongation factor 1-α (tef1) sequences, six well-supported clades and 17 phylogenetic species are recognised. Within this phylogenetic framework and considering the diagnostic morphological characters, two new species, C. africanum and C. palaeotropicum, are described. In addition, a new infrageneric classification of Chlorophyllum is proposed, in which the genus is divided into six sections. One new combination is also made. This study provides a robust basis for a more detailed investigation of diversity and biogeography of Chlorophyllum.

  12. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on three microarray data sets of leukemia, breast, and colon cancer. Supervised machine learning algorithms thus enable...

  13. Gene expression profiling for molecular classification of multiple myeloma in newly diagnosed patients

    NARCIS (Netherlands)

    Broyl, Annemiek; Hose, Dirk; Lokhorst, Henk; de Knegt, Yvonne; Peeters, Justine; Jauch, Anna; Bertsch, Uta; Buijs, Arjan; Stevens-Kroef, Marian; Beverloo, H. Berna; Vellenga, Edo; Zweegman, Sonja; Kersten, Marie-Josée; van der Holt, Bronno; el Jarari, Laila; Mulligan, George; Goldschmidt, Hartmut; van Duin, Mark; Sonneveld, Pieter

    2010-01-01

    To identify molecularly defined subgroups in multiple myeloma, gene expression profiling was performed on purified CD138(+) plasma cells of 320 newly diagnosed myeloma patients included in the Dutch-Belgian/German HOVON-65/GMMG-HD4 trial. Hierarchical clustering identified 10 subgroups; 6

  14. Minimising Immunohistochemical False Negative ER Classification Using a Complementary 23 Gene Expression Signature of ER Status

    DEFF Research Database (Denmark)

    Li, Qiyuan; Eklund, Aron Charles; Birkbak, Nicolai Juul

    2010-01-01

    with clinical outcome. METHODOLOGY/PRINCIPAL FINDINGS: Firstly, ER status was discriminated by fitting the bimodal expression of ESR1 to a mixed Gaussian model. The discriminative power of ESR1 suggested bimodal expression as an efficient way to stratify breast cancer; therefore we identified a set of genes...
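
    The idea of discriminating ER status by fitting a bimodal expression distribution to a mixed Gaussian model can be sketched with a two-component, one-dimensional mixture fitted by expectation-maximization. This is an illustrative sketch only: the expression values are synthetic, the initialization and iteration count are arbitrary choices, and the function names are hypothetical; the record above does not describe the authors' fitting procedure.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return exp(-((x - mu) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def fit_two_gaussians(xs, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Component 0 starts at the smallest value and component 1 at the largest,
    so component 1 ends up as the high-expression mode on bimodal data.
    """
    mu = [min(xs), max(xs)]
    overall = sum(xs) / len(xs)
    var0 = sum((x - overall) ** 2 for x in xs) / len(xs)
    var = [var0, var0]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate means, variances and mixing weights.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor avoids a collapsing component
            weight[k] = nk / len(xs)
    return mu, var, weight


def classify(x, mu, var, weight):
    """Assign a sample to the more probable mixture component."""
    p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
    return "high" if p[1] > p[0] else "low"


# Synthetic log-expression values with two clear modes.
values = [2.1, 1.8, 2.4, 2.0, 1.9, 8.9, 9.2, 9.0, 8.7, 9.4]
mu, var, weight = fit_two_gaussians(values)
print(mu, classify(2.5, mu, var, weight), classify(8.0, mu, var, weight))
```

    Once the mixture is fitted, the posterior probability of the high-expression component gives a data-driven threshold instead of a fixed cut-off chosen by hand.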

  15. Regularization strategies for hyperplane classifiers: application to cancer classification with gene expression data.

    Science.gov (United States)

    Andries, Erik; Hagstrom, Thomas; Atlas, Susan R; Willman, Cheryl

    2007-02-01

    Linear discrimination, from the point of view of numerical linear algebra, can be treated as solving an ill-posed system of linear equations. In order to generate a solution that is robust in the presence of noise, these problems require regularization. Here, we examine the ill-posedness involved in the linear discrimination of cancer gene expression data with respect to outcome and tumor subclasses. We show that a filter factor representation, based upon Singular Value Decomposition, yields insight into the numerical ill-posedness of the hyperplane-based separation when applied to gene expression data. We also show that this representation yields useful diagnostic tools for guiding the selection of classifier parameters, thus leading to improved performance.
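
    For readers unfamiliar with the term, the filter factor representation mentioned above has a standard textbook form, written out below. The notation (data matrix X, label vector y, weight vector w, regularization parameter λ, truncation index k) is assumed here for illustration and is not taken from the paper; Tikhonov regularization and truncated SVD are shown as two common choices of filter, not necessarily the ones the authors study.

```latex
% Hyperplane fitting posed as a linear system X w \approx y,
% with the singular value decomposition X = \sum_{i=1}^{r} \sigma_i u_i v_i^{T}.
\[
  w_{\mathrm{reg}} \;=\; \sum_{i=1}^{r} f_i \, \frac{u_i^{T} y}{\sigma_i} \, v_i ,
  \qquad 0 \le f_i \le 1 .
\]
% Two common filters:
\[
  f_i^{\mathrm{Tikhonov}} \;=\; \frac{\sigma_i^{2}}{\sigma_i^{2} + \lambda^{2}},
  \qquad
  f_i^{\mathrm{TSVD}} \;=\;
  \begin{cases}
    1, & i \le k, \\
    0, & i > k .
  \end{cases}
\]
```

    Setting every f_i to 1 recovers the unregularized minimum-norm least-squares solution, in which small singular values σ_i amplify noise in y; the filters damp exactly those terms.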

  16. Genome-wide identification, classification and expression profiling of nicotianamine synthase (NAS) gene family in maize

    OpenAIRE

    Zhou, Xiaojin; Li, Suzhen; Zhao, Qianqian; Liu, Xiaoqing; Zhang, Shaojun; Sun, Cheng; Fan, Yunliu; Zhang, Chunyi; Chen, Rumei

    2013-01-01

    Background Nicotianamine (NA), a ubiquitous molecule in plants, is an important metal ion chelator and the main precursor for phytosiderophores biosynthesis. Considerable progress has been achieved in cloning and characterizing the functions of nicotianamine synthase (NAS) in plants including barley, Arabidopsis and rice. Maize is not only an important cereal crop, but also a model plant for genetics and evolutionary study. The genome sequencing of maize was completed, and many gene families ...

  17. A chemometric evaluation of the underlying physical and chemical patterns that support near infrared spectroscopy of barley seeds as a tool for explorative classification of endosperm genes and gene combinations

    DEFF Research Database (Denmark)

    Jacobsen, Susanne; Søndergaard, Ib; Møller, Birthe

    2005-01-01

    Analysis (PCA). Risø mutants R-13, R-29 (high (1→3, 1→4)-β-glucan, low starch) and R-1508 (high lysine, reduced starch), near-isogenic controls, normal lines and recombinants were studied. Based on proteome analysis results, six antimicrobial proteins were followed during endosperm development...... revealing pleiotropic gene effects in expression timing that support the gene classification. To verify that NIR spectroscopy data represents a physio-chemical fingerprint of the barley seed, physical and chemical spectral components were partially separated by Multiple Scatter Correction...... and their genetic classification ability verified. Wavelength bands with known water binding and (1→3, 1→4)-β-glucan assignments were successfully predicted by partial least squares regression giving insight into how NIR-data works in classification. Highly reproducible gene-specific, covariate...

  18. Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database

    DEFF Research Database (Denmark)

    Thompson, Bryony A; Spurdle, Amanda B; Plazzer, John-Paul

    2014-01-01

    and apply a standardized classification scheme to constitutional variants in the Lynch syndrome-associated genes MLH1, MSH2, MSH6 and PMS2. Unpublished data submission was encouraged to assist in variant classification and was recognized through microattribution. The scheme was refined by multidisciplinary...... are now possible for 1,370 variants that were not obviously protein truncating from nomenclature. This large-scale endeavor will facilitate the consistent management of families suspected to have Lynch syndrome and demonstrates the value of multidisciplinary collaboration in the curation......The clinical classification of hereditary sequence variants identified in disease-related genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test...

  19. Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database

    NARCIS (Netherlands)

    Thompson, Bryony A.; Spurdle, Amanda B.; Plazzer, John-Paul; Greenblatt, Marc S.; Akagi, Kiwamu; Al-Mulla, Fahd; Bapat, Bharati; Bernstein, Inge; Capella, Gabriel; den Dunnen, Johan T.; du Sart, Desiree; Fabre, Aurelie; Farrell, Michael P.; Farrington, Susan M.; Frayling, Ian M.; Frebourg, Thierry; Goldgar, David E.; Heinen, Christopher D.; Holinski-Feder, Elke; Kohonen-Corish, Maija; Robinson, Kristina Lagerstedt; Leung, Suet Yi; Martins, Alexandra; Moller, Pal; Morak, Monika; Nystrom, Minna; Peltomaki, Paivi; Pineda, Marta; Qi, Ming; Ramesar, Rajkumar; Rasmussen, Lene Juel; Royer-Pokora, Brigitte; Scott, Rodney J.; Sijmons, Rolf; Tavtigian, Sean V.; Tops, Carli M.; Weber, Thomas; Wijnen, Juul; Woods, Michael O.; Macrae, Finlay; Genuardi, Maurizio

    The clinical classification of hereditary sequence variants identified in disease-related genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test and

  20. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    Science.gov (United States)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers work only with labeled data and neglect the large amount of data that lacks sufficient follow-up information. Consequently, the small sample size limits the design of appropriate classifiers. In this paper, a transductive learning method is presented that combines a filtering strategy within the transductive framework with a progressive labeling strategy. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, and it can effectively solve the problem of estimating the proportion of positive and negative samples in the working set. Our experimental results demonstrate that the proposed technique has great potential in cancer prediction based on gene expression.

  1. Closing the loop: from paper to protein annotation using supervised Gene Ontology classification.

    Science.gov (United States)

    Gobeill, Julien; Pasche, Emilie; Vishnyakova, Dina; Ruch, Patrick

    2014-01-01

    Curating gene function from the literature with Gene Ontology (GO) concepts is a particularly time-consuming task in genomics, and help from bioinformatics is in high demand to keep up with the flow of publications. In 2004, the first BioCreative challenge already included a task of automatic GO concept assignment from full text. At that time, results were judged far from reaching the performance required by real curation workflows. In particular, supervised approaches produced the most disappointing results because of a lack of training data. Ten years later, the available curation data have grown massively. In 2013, the BioCreative IV GO task revisited the automatic GO assignment task. For this task, we investigated the power of our supervised classifier, GOCat. GOCat computes similarities between an input text and already curated instances contained in a knowledge base to infer GO concepts. Subtask A consisted of selecting GO evidence sentences for a relevant gene in a full text. For this, we designed a state-of-the-art supervised statistical approach, using a naïve Bayes classifier and the official training set, and obtained fair results. Subtask B consisted of predicting GO concepts from the previous output. For this, we applied GOCat and reached leading results, up to 65% for hierarchical recall in the top 20 outputted concepts. Contrary to previous competitions, machine learning has this time outperformed standard dictionary-based approaches. Thanks to BioCreative IV, we were able to design a complete workflow for curation: given a gene name and a full text, this system is able to select evidence sentences for curation and to deliver highly relevant GO concepts. The observed performance is sufficient for use in a real semiautomatic curation workflow. GOCat is available at http://eagl.unige.ch/GOCat/ and http://eagl.unige.ch/GOCat4FT/.
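
    The similarity-based inference described in this abstract can be illustrated with a minimal nearest-neighbour sketch. It is not GOCat's implementation: the bag-of-words cosine similarity, the function names (`cosine`, `predict_concepts`), the parameters `k` and `top_n`, and the toy knowledge base are all assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_concepts(text, knowledge_base, k=3, top_n=2):
    """Rank concepts by similarity-weighted votes from the k most similar curated texts.

    knowledge_base is a list of (curated_text, [concept, ...]) pairs (hypothetical format).
    """
    query = Counter(text.lower().split())
    scored = sorted(
        ((cosine(query, Counter(doc.lower().split())), labels)
         for doc, labels in knowledge_base),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = Counter()
    for score, labels in scored:
        for label in labels:
            votes[label] += score
    return [label for label, _ in votes.most_common(top_n)]


# Toy knowledge base; texts and labels are invented for illustration.
toy_kb = [
    ("kinase phosphorylates substrate protein", ["protein kinase activity"]),
    ("transcription factor binds promoter dna", ["DNA binding"]),
    ("kinase activity regulates cell cycle", ["protein kinase activity", "cell cycle"]),
]
ranked = predict_concepts(
    "novel kinase phosphorylates cell cycle protein", toy_kb, k=2, top_n=2
)
```

    Weighting each vote by its similarity score lets a concept shared by several close neighbours outrank one attached to a single neighbour; a real system would need proper tokenization, term weighting and a far larger knowledge base.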

  2. Two-gene signature improves the discriminatory power of IASLC/ATS/ERS classification to predict the survival of patients with early-stage lung adenocarcinoma

    Directory of Open Access Journals (Sweden)

    Sun Y

    2016-07-01

    Yifeng Sun,1,* Likun Hou,2,* Yu Yang,1 Huikang Xie,2 Yang Yang,1 Zhigang Li,1 Heng Zhao,1 Wen Gao,3 Bo Su4 1Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiaotong University, 2Department of Pathology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, 3Department of Thoracic Surgery, Shanghai Huadong Hospital, Fudan University School of Medicine, Shanghai, 4Central Lab, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, People’s Republic of China *These authors contributed equally to this work Background: In this study, we investigated the contribution of a gene expression–based signature (composed of BAG1, BRCA1, CDC6, CDK2AP1, ERBB3, FUT3, IL11, LCK, RND3, and SH3BGR) to survival prediction for early-stage lung adenocarcinoma categorized by the new International Association for the Study of Lung Cancer (IASLC)/the American Thoracic Society (ATS)/the European Respiratory Society (ERS) classification. We also aimed to verify whether the gene signature improves the risk discrimination of IASLC/ATS/ERS classification in early-stage lung adenocarcinoma. Patients and methods: Total RNA was extracted from 93 patients with pathologically confirmed TNM stage Ia and Ib lung adenocarcinoma. The mRNA expression levels of ten genes in the signature (BAG1, BRCA1, CDC6, CDK2AP1, ERBB3, FUT3, IL11, LCK, RND3, and SH3BGR) were detected using real-time polymerase chain reaction. Each patient was categorized according to the new IASLC/ATS/ERS classification by assessing hematoxylin–eosin-stained slides. The corresponding Kaplan–Meier survival analysis by the log-rank statistic, multivariate Cox proportional hazards modeling, and c-index calculation were conducted using the programming language R (Version 2.15.1) with the “risksetROC” package. Results: The multivariate analysis demonstrated that the risk factor of the ten-gene expression signature can significantly improve the discriminatory

  3. Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

    Directory of Open Access Journals (Sweden)

    Sakellariou Argiris

    2012-10-01

    Background: A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. Results: We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and the affinity propagation (AP) clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. Conclusions: mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes that classify unknown samples from both small and large patient cohorts with high accuracy.

  4. Genome-wide identification, characterization and classification of ionotropic glutamate receptor genes (iGluRs) in the malaria vector Anopheles sinensis (Diptera: Culicidae).

    Science.gov (United States)

    Wang, Ting-Ting; Si, Feng-Ling; He, Zheng-Bo; Chen, Bin

    2018-01-15

    Ionotropic glutamate receptors (iGluRs) are conserved ligand-gated ion channel receptors, and ionotropic receptors (IRs) were revealed as a new family of iGluRs. Their subdivision was unsettled, and their characteristics are little known. Anopheles sinensis is a major malaria vector in eastern Asia, and its genome was recently well sequenced and annotated. We identified iGluR genes in the An. sinensis genome, analyzed their characteristics including gene structure, genome distribution, domains and specific sites by bioinformatic methods, and deduced phylogenetic relationships of all iGluRs in An. sinensis, Anopheles gambiae and Drosophila melanogaster. Based on the characteristics and phylogenetics, we generated the classification of iGluRs, and comparatively analyzed the intron number and selective pressure of three iGluRs subdivisions, iGluR group, Antenna IR and Divergent IR subfamily. A total of 56 iGluR genes were identified and named in the whole-genome of An. sinensis. These genes were located on 18 scaffolds, and 31 of them (29 being IRs) are distributed into 10 clusters that are suggested to form mainly from recent gene duplication. These iGluRs can be divided into four groups: NMDA, non-NMDA, Antenna IR and Divergent IR based on feature comparison and phylogenetic analysis. IR8a and IR25a were suggested to be monophyletic, named as Putative in the study, and moved from the Antenna subfamily in the IR family to the non-NMDA group as a sister of traditional non-NMDA. The generated iGluRs of genes (including NMDA and regenerated non-NMDA) are relatively conserved, and have a more complicated gene structure, smaller ω values and some specific functional sites. The iGluR genes in An. sinensis, An. gambiae and D. melanogaster have amino-terminal domain (ATD), ligand binding domain (LBD) and Lig_Chan domains, except for IR8a that only has the LBD and Lig_Chan domains. However, the new concept IR family of genes (including regenerated Antenna IR, and Divergent

  5. Gene features selection for three-class disease classification via multiple orthogonal partial least square discriminant analysis and S-plot using microarray data.

    Science.gov (United States)

    Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu

    2013-01-01

    DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.

  6. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

    Science.gov (United States)

    2018-01-01

    One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets have been published, they are usually in very poor agreement with each other, and very few of the genes have proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in The Cancer Genome Atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes. PMID:29470520

  7. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations

    Science.gov (United States)

    2014-01-01

    Introduction: Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. Methods: An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). Results: The original training cohort reached a statistically significant difference (p risk groups. Conclusions: Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments. PMID:24996446

  8. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations.

    Science.gov (United States)

    Wang, Dong-Yu; Done, Susan J; Mc Cready, David R; Leong, Wey L

    2014-07-04

    Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). The original training cohort reached a statistically significant difference (p value of the triad classification was reproduced in the second independent internal cohort and the new external validation cohort. CMTC achieved even higher prognostic significance when all available patients were analyzed (n = 4,851). Oncogenic pathways Myc, E2F1, Ras and β-catenin were again implicated in the high-risk groups. Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments.

  9. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et al. 2013). The analysis aims at three goals: the classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors into the topic of game classifications.

  10. Classification of Osteogenesis Imperfecta revisited

    NARCIS (Netherlands)

    van Dijk, F. S.; Pals, G.; van Rijn, R. R.; Nikkels, P. G. J.; Cobben, J. M.

    2010-01-01

    In 1979 Sillence proposed a classification of Osteogenesis Imperfecta (OI) in OI types I, II, III and IV. In 2004 and 2007 this classification was expanded with OI types V-VIII because of distinct clinical features and/or different causative gene mutations. We propose a revised classification of OI

  11. Locus-Specific Databases and Recommendations to Strengthen Their Contribution to the Classification of Variants in Cancer Susceptibility Genes

    NARCIS (Netherlands)

    Greenblatt, Marc S.; Brody, Lawrence C.; Foulkes, William D.; Genuardi, Maurizio; Hofstra, Robert M. W.; Olivier, Magali; Plon, Sharon E.; Sijmons, Rolf H.; Sinilnikova, Olga; Spurdle, Amanda B.

    2008-01-01

    Locus-specific databases (LSDBs) are curated collections of sequence variants in genes associated with disease. LSDBs of cancer-related genes often serve as a critical resource to researchers, diagnostic laboratories, clinicians, and others in the cancer genetics community. LSDBs are poised to play

  12. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2016-01-01

    Among non-small cell lung cancer (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered as distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to a NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is a feature selection algorithm, indeed. Additionally, we applied SAMGSR to AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. Few overlaps between these two resulting gene signatures illustrate that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  13. Classification of early-stage non-small cell lung cancer by weighing gene expression profiles with connectivity information.

    Science.gov (United States)

    Zhang, Ao; Tian, Suyan

    2018-05-01

    Pathway-based feature selection algorithms, which utilize biological information contained in pathways to guide which features/genes should be selected, have evolved quickly and become widespread in the field of bioinformatics. Based on how the pathway information is incorporated, we classify pathway-based feature selection algorithms into three major categories: penalty, stepwise forward, and weighting. Compared to the first two categories, the weighting methods have been underutilized even though they are usually the simplest ones. In this article, we constructed three different connectivity information-based weights for each gene and then conducted feature selection upon the resulting weighted gene expression profiles. Using both simulations and a real-world application, we have demonstrated that when the data-driven connectivity information constructed from the data of the specific disease under study is considered, the resulting weighted gene expression profiles slightly outperform the original expression profiles. In summary, a big challenge faced by the weighting method is how to estimate pathway knowledge-based weights more accurately and precisely. Only once this issue is resolved will wide utilization of the weighting methods be possible. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
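
    The weighting idea in this abstract can be sketched in a few lines. The abstract does not define its three weights, so the scheme below is an assumed stand-in: each gene's connectivity is the sum of absolute Pearson correlations with every other gene in the same dataset, scaled so the most connected gene has weight 1. The function names and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def connectivity_weights(expr):
    """expr maps gene -> expression values across samples; returns gene -> weight in [0, 1]."""
    genes = list(expr)
    raw = {
        g: sum(abs(pearson(expr[g], expr[h])) for h in genes if h != g)
        for g in genes
    }
    top = max(raw.values()) or 1.0  # avoid division by zero when no gene is connected
    return {g: raw[g] / top for g in genes}


def weight_profiles(expr):
    """Scale each gene's expression profile by its connectivity weight."""
    weights = connectivity_weights(expr)
    return {g: [weights[g] * v for v in values] for g, values in expr.items()}


# Toy data: genes A and B are perfectly correlated, gene C is only weakly related.
toy_expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 1]}
toy_weights = connectivity_weights(toy_expr)
toy_weighted = weight_profiles(toy_expr)
```

    Feature selection would then run on the weighted profiles instead of the raw ones, so that weakly connected genes such as C contribute less.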

  14. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks | Center for Cancer Research

    Science.gov (United States)

    The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in

  15. Variations and classification of toxic epitopes related to celiac disease among α-gliadin genes from four Aegilops genomes.

    Science.gov (United States)

    Li, Jie; Wang, Shunli; Li, Shanshan; Ge, Pei; Li, Xiaohui; Ma, Wujun; Zeller, F J; Hsam, Sai L K; Yan, Yueming

    2012-07-01

    The α-gliadins are associated with human celiac disease. A total of 23 noninterrupted full open reading frame α-gliadin genes and 19 pseudogenes were cloned and sequenced from C, M, N, and U genomes of four diploid Aegilops species. Sequence comparison of α-gliadin genes from Aegilops and Triticum species demonstrated an existence of extensive allelic variations in Gli-2 loci of the four Aegilops genomes. Specific structural features were found including the compositions and variations of two polyglutamine domains (QI and QII) and four T cell stimulatory toxic epitopes. The mean numbers of glutamine residues in the QI domain in C and N genomes and the QII domain in C, N, and U genomes were much higher than those in Triticum genomes, and the QI domain in C and N genomes and the QII domain in C, M, N, and U genomes displayed greater length variations. Interestingly, the types and numbers of four T cell stimulatory toxic epitopes in α-gliadins from the four Aegilops genomes were significantly less than those from Triticum A, B, D, and their progenitor genomes. Relationships between the structural variations of the two polyglutamine domains and the distributions of four T cell stimulatory toxic epitopes were found, resulting in the α-gliadin genes from the Aegilops and Triticum genomes to be classified into three groups.

  16. A multifactorial likelihood model for MMR gene variant classification incorporating probabilities based on sequence bioinformatics and tumor characteristics: a report from the Colon Cancer Family Registry.

    Science.gov (United States)

    Thompson, Bryony A; Goldgar, David E; Paterson, Carol; Clendenning, Mark; Walters, Rhiannon; Arnold, Sven; Parsons, Michael T; Walsh, Michael D; Gallinger, Steven; Haile, Robert W; Hopper, John L; Jenkins, Mark A; Lemarchand, Loic; Lindor, Noralane M; Newcomb, Polly A; Thibodeau, Stephen N; Young, Joanne P; Buchanan, Daniel D; Tavtigian, Sean V; Spurdle, Amanda B

    2013-01-01

    Mismatch repair (MMR) gene sequence variants of uncertain clinical significance are often identified in suspected Lynch syndrome families, and this constitutes a challenge for both researchers and clinicians. Multifactorial likelihood model approaches provide a quantitative measure of MMR variant pathogenicity, but first require input of likelihood ratios (LRs) for different MMR variation-associated characteristics from appropriate, well-characterized reference datasets. Microsatellite instability (MSI) and somatic BRAF tumor data for unselected colorectal cancer probands of known pathogenic variant status were used to derive LRs for tumor characteristics using the Colon Cancer Family Registry (CFR) resource. These tumor LRs were combined with variant segregation within families, and estimates of prior probability of pathogenicity based on sequence conservation and position, to analyze 44 unclassified variants identified initially in Australasian Colon CFR families. In addition, in vitro splicing analyses were conducted on the subset of variants based on bioinformatic splicing predictions. The LR in favor of pathogenicity was estimated to be ~12-fold for a colorectal tumor with a BRAF mutation-negative MSI-H phenotype. For 31 of the 44 variants, the posterior probabilities of pathogenicity were such that altered clinical management would be indicated. Our findings provide a working multifactorial likelihood model for classification that carefully considers mode of ascertainment for gene testing. © 2012 Wiley Periodicals, Inc.
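
    The arithmetic behind a multifactorial likelihood model is Bayes' rule in odds form: posterior odds equal prior odds multiplied by each likelihood ratio. The sketch below illustrates only that step and is not the registry's model. It assumes the likelihood ratios are independent, and the prior of 0.10 is a hypothetical value; only the LR of roughly 12 for a BRAF mutation-negative MSI-H tumor comes from the abstract.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability with independent likelihood ratios (odds form of Bayes' rule)."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)  # convert probability to odds
    for lr in likelihood_ratios:
        odds *= lr                # each independent line of evidence scales the odds
    return odds / (1.0 + odds)    # convert odds back to probability


# Hypothetical variant: assumed prior of 0.10, tumor LR of 12 as reported in the abstract.
example_posterior = posterior_probability(0.10, [12.0])
```

    Under these assumptions the posterior is 4/7, or about 0.57, which shows how a single 12-fold likelihood ratio moves a low prior into uncertain territory without settling the classification on its own.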

  17. Molecular characterization and classification of Trypanosoma spp. Venezuelan isolates based on microsatellite markers and kinetoplast maxicircle genes.

    Science.gov (United States)

    Sánchez, E; Perrone, T; Recchimuzzi, G; Cardozo, I; Biteau, N; Aso, P M; Mijares, A; Baltz, T; Berthier, D; Balzano-Nogueira, L; Gonzatti, M I

    2015-10-15

    Livestock trypanosomoses, caused by three species of the Trypanozoon subgenus, Trypanosoma brucei brucei, T. evansi and T. equiperdum, are widely distributed throughout the world and constitute an important limitation for the production of animal protein. T. evansi and T. equiperdum are morphologically indistinguishable parasites that evolved from a common ancestor but acquired important biological differences, including host range, mode of transmission, distribution, clinical symptoms and pathogenicity. At a molecular level, T. evansi is characterized by the complete loss of the maxicircles of the kinetoplastic DNA, while T. equiperdum has retained maxicircle fragments similar to those present in T. brucei. T. evansi causes the disease known as Surra, Derrengadera or "mal de cadeiras", while T. equiperdum is the etiological agent of dourine or "mal du coit", characterized by venereal transmission and white patches in the genitalia. Nine Venezuelan Trypanosoma spp. isolates, from horse, donkey or capybara, were genotyped and classified using microsatellite analyses and maxicircle genes. The variables from the microsatellite data and the Procyclin PE repeats matrices were combined using the Hill-Smith method and compared to a group of T. evansi, T. equiperdum and T. brucei reference strains from South America, Asia and Africa using Coinertia analysis. Four maxicircle genes (cytb, cox1, a6 and nd8) were amplified by PCR from TeAp-N/D1 and TeGu-N/D1, the two Venezuelan isolates that grouped with the T. equiperdum STIB841/OVI strain. These maxicircle sequences were analyzed by nucleotide BLAST and aligned to orthologous genes from the Trypanozoon subgenus by MUSCLE tools. Phylogenetic trees were constructed using Maximum Parsimony (MP) and Maximum Likelihood (ML) with the MEGA5.1® software. We characterized microsatellite markers and Procyclin PE repeats of nine Venezuelan Trypanosoma spp. isolates with various degrees of virulence in a mouse model, and compared them to a

  18. A molecular phylogeny of the Cephinae (Hymenoptera, Cephidae) based on mtDNA COI gene: a test of traditional classification

    Directory of Open Access Journals (Sweden)

    Mahir Budak

    2011-09-01

    Cephinae is traditionally divided into three tribes and about 24 genera based on morphology and host utilization. There has been no study testing the monophyly of taxa under a strict phylogenetic criterion. A molecular phylogeny of Cephinae based on a total of 68 sequences of mtDNA COI gene, representing seven genera of Cephinae, is reconstructed to test the traditional limits and relationships of taxa. Monophyly of the traditional tribes is not supported. Monophyly of the genera is largely supported except for Pachycephus. A few host shift events are suggested based on phylogenetic relationships among taxa. These results indicate that a more robust phylogeny is required for a more plausible conclusion. We also report two species of Cephus for the first time from Turkey.

  19. Update on diabetes classification.

    Science.gov (United States)

    Thomas, Celeste C; Philipson, Louis H

    2015-01-01

    This article highlights the difficulties in creating a definitive classification of diabetes mellitus in the absence of a complete understanding of the pathogenesis of the major forms. This brief review shows the evolving nature of the classification of diabetes mellitus. No classification scheme is ideal, and all have some overlap and inconsistencies. The only form of diabetes that can be accurately diagnosed by DNA sequencing, monogenic diabetes, remains undiagnosed in more than 90% of the individuals who have diabetes caused by one of the known gene mutations. The point of classification, or taxonomy, of disease should be to give insight into both pathogenesis and treatment. It remains a source of frustration that all schemes of diabetes mellitus continue to fall short of this goal. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung; Ryu, Tae Woo; Heo, Hyoungsam; Seo, Seungwon; Lee, Doheon; Hur, Cheolgoo

    2011-01-01

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  1. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung

    2011-04-30

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  2. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  3. A comprehensive simulation study on classification of RNA-Seq data.

    Directory of Open Access Journals (Sweden)

    Gökmen Zararsız

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. As in differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count

  4. Genome-wide identification, phylogenetic classification, and exon-intron structure characterisation of the tubulin and actin genes in flax (Linum usitatissimum).

    Science.gov (United States)

    Pydiura, Nikolay; Pirko, Yaroslav; Galinousky, Dmitry; Postovoitova, Anastasiia; Yemets, Alla; Kilchevsky, Aleksandr; Blume, Yaroslav

    2018-06-08

    Flax (Linum usitatissimum L.) is a valuable food and fiber crop cultivated for its quality fiber and seed oil. α-, β-, γ-tubulins and actins are the main structural proteins of the cytoskeleton. α- and γ-tubulin and actin genes have not been characterized yet in the flax genome. In this study, we have identified 6 α-tubulin genes, 13 β-tubulin genes, 2 γ-tubulin genes, and 15 actin genes in the flax genome and analysed the phylogenetic relationships between flax and A. thaliana tubulin and actin genes. Six α-tubulin genes are represented by 3 paralogous pairs; among 13 β-tubulin genes 7 different isotypes can be distinguished, 6 of which are encoded by two paralogous genes each. γ-tubulin is represented by a paralogous pair of genes, one of which may not be functional. Fifteen actin genes represent 7 paralogous pairs - 7 actin isotypes and a sequentially duplicated copy of one of the genes of one of the isotypes. Exon-intron structure analysis has shown intron length polymorphism within the β-tubulin genes and intron number variation among the α-tubulin genes: 3 or 4 introns are found in two or four genes, respectively. Intron positioning occurs at conservative sites, as observed in numerous other plant species. Flax actin genes show both intron length polymorphisms and variation in the number of introns, which may be 2 or 3. These data will be useful to support further studies on the specificity, functioning, regulation and evolution of the flax cytoskeleton proteins. This article is protected by copyright. All rights reserved.

  5. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  6. Importance of detecting FLT3 and NPM1 gene mutations in acute myeloid leukemia - World Health Organization Classification 2008

    Directory of Open Access Journals (Sweden)

    Marley Aparecida Licínio

    2010-01-01

    Acute myeloid leukemia (AML) is a group of malignancies characterized by uncontrolled proliferation of hematopoietic cells resulting from mutations that occur at different stages in the differentiation of myeloid precursor cells. In 2008, the World Health Organization (WHO-2008) published a new classification for cancers of the hematopoietic and lymphoid system. According to this classification, FLT3 and NPM1 gene mutations should be investigated for a more precise diagnosis and prognostic stratification of AML patients. It is well known that the presence of FLT3 gene mutations is considered an unfavorable prognostic factor and that type A mutations in the NPM1 gene carry a favorable prognosis. Accordingly, in developed countries, analysis of FLT3 and NPM1 gene mutations has been considered an important prognostic factor in therapeutic decisions for patients diagnosed with AML. Given this information, analysis of mutations in the FLT3 gene (internal tandem duplication, ITD, and the D835 point mutation) and in the NPM1 gene is extremely important, as these serve as molecular markers for the diagnosis, prognosis, and monitoring of minimal residual disease in patients with AML.

  7. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  8. Image analysis for gene expression based phenotype characterization in yeast cells

    NARCIS (Netherlands)

    Tleis, M.

    2016-01-01

    Image analysis of objects in the microscope scale requires accuracy so that measurements can be used to differentiate between groups of objects that are being studied. This thesis deals with measurements in yeast biology that are obtained through microscope images. We study the algorithms and

  9. Heterogeneity wavelet kinetics from DCE-MRI for classifying gene expression based breast cancer recurrence risk.

    Science.gov (United States)

    Mahrooghy, Majid; Ashraf, Ahmed B; Daye, Dania; Mies, Carolyn; Feldman, Michael; Rosen, Mark; Kontos, Despina

    2013-01-01

    Breast tumors are heterogeneous lesions. Intra-tumor heterogeneity presents a major challenge for cancer diagnosis and treatment. Few studies have worked on capturing tumor heterogeneity from imaging. Most studies to date consider aggregate measures for tumor characterization. In this work we capture tumor heterogeneity by partitioning tumor pixels into subregions and extracting heterogeneity wavelet kinetic (HetWave) features from breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to obtain the spatiotemporal patterns of the wavelet coefficients and contrast agent uptake from each partition. Using a genetic algorithm for feature selection, and a logistic regression classifier with leave-one-out cross-validation, we tested our proposed HetWave features for the task of classifying breast cancer recurrence risk. The classifier based on our features gave an ROC AUC of 0.78, outperforming previously proposed kinetic, texture, and spatial enhancement variance features, which give AUCs of 0.69, 0.64, and 0.65, respectively.
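
The evaluation protocol named in this abstract, leave-one-out cross-validation scored by ROC AUC, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the toy data and all function names are hypothetical, and a nearest-centroid scorer stands in for the genetic-algorithm feature selection and logistic regression used in the study.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loocv_scores(X, y):
    """Hold out each sample once, fit on the rest, score the held-out sample."""
    scores = []
    for i in range(len(X)):
        train = [(x, t) for j, (x, t) in enumerate(zip(X, y)) if j != i]
        c1 = centroid([x for x, t in train if t == 1])
        c0 = centroid([x for x, t in train if t == 0])
        # Stand-in scorer (assumption): higher means closer to the class-1 centroid.
        scores.append(sq_dist(X[i], c0) - sq_dist(X[i], c1))
    return scores


# Hypothetical two-feature toy data, three samples per class.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
y = [0, 0, 0, 1, 1, 1]
auc = roc_auc(y, loocv_scores(X, y))
```

Because each held-out sample is scored by a model that never saw it, the pooled scores give a single AUC estimate even when the data set is too small for a separate test set.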

  10. Protein expression based multimarker analysis of breast cancer samples

    International Nuclear Information System (INIS)

    Presson, Angela P; Horvath, Steve; Yoon, Nam K; Bagryanova, Lora; Mah, Vei; Alavi, Mohammad; Maresh, Erin L; Rajasekaran, Ayyappan K; Goodglick, Lee; Chia, David

    2011-01-01

    Tissue microarray (TMA) data are commonly used to validate the prognostic accuracy of tumor markers. For example, breast cancer TMA data have led to the identification of several promising prognostic markers of survival time. Several studies have shown that TMA data can also be used to cluster patients into clinically distinct groups. Here we use breast cancer TMA data to cluster patients into distinct prognostic groups. We apply weighted correlation network analysis (WGCNA) to TMA data consisting of 26 putative tumor biomarkers measured on 82 breast cancer patients. Based on this analysis we identify three groups of patients with low (5.4%), moderate (22%) and high (50%) mortality rates, respectively. We then develop a simple threshold rule using a subset of three markers (p53, Na-KATPase-β1, and TGF β receptor II) that can approximately define these mortality groups. We compare the results of this correlation network analysis with results from a standard Cox regression analysis. We find that the rule-based grouping variable (referred to as WGCNA*) is an independent predictor of survival time. While WGCNA* is based on protein measurements (TMA data), it was validated in two independent Affymetrix microarray gene expression data sets (which measure mRNA abundance). We find that the WGCNA patient groups differed by 35% from mortality groups defined by a more conventional stepwise Cox regression analysis approach. We show that correlation network methods, which are primarily used to analyze the relationships between gene products, are also useful for analyzing the relationships between patients and for defining distinct patient groups based on TMA data. We identify a rule based on three tumor markers for predicting breast cancer survival outcomes.

  11. Population Level Purifying Selection and Gene Expression Shape Subgenome Evolution in Maize.

    Science.gov (United States)

    Pophaly, Saurabh D; Tellier, Aurélien

    2015-12-01

    The maize ancestor experienced a recent whole-genome duplication (WGD) followed by gene erosion which generated two subgenomes, the dominant subgenome (maize1) experiencing fewer deletions than maize2. We take advantage of available extensive polymorphism and gene expression data in maize to study purifying selection and gene expression divergence between WGD retained paralog pairs. We first report a strong correlation in nucleotide diversity between duplicate pairs, except for upstream regions. We then show that maize1 genes are under stronger purifying selection than maize2. WGD retained genes have higher gene dosage and biased Gene Ontologies consistent with previous studies. The relative gene expression of paralogs across tissues demonstrates that 98% of duplicate pairs have either subfunctionalized in a tissuewise manner or have diverged consistently in their expression thereby preventing functional complementation. Tissuewise subfunctionalization seems to be a hallmark of transcription factors, whereas consistent repression occurs for macromolecular complexes. We show that dominant gene expression is a strong determinant of the strength of purifying selection, explaining the inferred stronger negative selection on maize1 genes. We propose a novel expression-based classification of duplicates which is more robust to explain observed polymorphism patterns than the subgenome location. Finally, upstream regions of repressed genes exhibit an enrichment in transposable elements which indicates a possible mechanism for expression divergence. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  13. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...... that call for inquiries into the theoretical foundation of bibliographic classification theory....

  14. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility

  15. Prediction of metastasis from low-malignant breast cancer by gene expression profiling

    DEFF Research Database (Denmark)

    Thomassen, Mads; Tan, Qihua; Eiriksdottir, Freyja

    2007-01-01

    Promising results for prediction of outcome in breast cancer have been obtained by genome wide gene expression profiling. Some studies have suggested that an extensive overtreatment of breast cancer patients might be reduced by risk assessment with gene expression profiling. A patient group hardly examined in these studies is the low-risk patients for whom outcome is very difficult to predict with currently used methods. These patients do not receive adjuvant treatment according to the guidelines of the Danish Breast Cancer Cooperative Group (DBCG). In this study, 26 tumors from low-risk patients...... with different characteristics and risk, expression-based classification specifically developed in low-risk patients have higher predictive power in this group.

  16. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Considering the explosive growth of data, the increasing amount of text data places higher requirements on the performance of text categorization, which existing classification methods cannot satisfy. Based on the study of existing text classification technology and semantics, this paper puts forward a Chinese text classification-oriented SAW (Structural Auxiliary Word) algorithm. The algorithm uses the special space effect of Chinese text where words...

  17. Identification of high-risk cutaneous melanoma tumors is improved when combining the online American Joint Committee on Cancer Individualized Melanoma Patient Outcome Prediction Tool with a 31-gene expression profile-based classification.

    Science.gov (United States)

    Ferris, Laura K; Farberg, Aaron S; Middlebrook, Brooke; Johnson, Clare E; Lassen, Natalie; Oelschlager, Kristen M; Maetzold, Derek J; Cook, Robert W; Rigel, Darrell S; Gerami, Pedram

    2017-05-01

    A significant proportion of patients with American Joint Committee on Cancer (AJCC)-defined early-stage cutaneous melanoma have disease recurrence and die. A 31-gene expression profile (GEP) that accurately assesses metastatic risk associated with primary cutaneous melanomas has been described. We sought to compare accuracy of the GEP in combination with risk determined using the web-based AJCC Individualized Melanoma Patient Outcome Prediction Tool. GEP results from 205 stage I/II cutaneous melanomas with sufficient clinical data for prognostication using the AJCC tool were classified as low (class 1) or high (class 2) risk. Two 5-year overall survival cutoffs (AJCC 79% and 68%), reflecting survival for patients with stage IIA or IIB disease, respectively, were assigned for binary AJCC risk. Cox univariate analysis revealed significant risk classification of distant metastasis-free and overall survival (hazard ratio range 3.2-9.4, P risk by GEP but low risk by AJCC. Specimens reflect tertiary care center referrals; more effective therapies have been approved for clinical use after accrual. The GEP provides valuable prognostic information and improves identification of high-risk melanomas when used together with the AJCC online prediction tool. Copyright © 2016 American Academy of Dermatology, Inc. Published by Elsevier Inc. All rights reserved.

  18. Asteroid taxonomic classifications

    International Nuclear Information System (INIS)

    Tholen, D.J.

    1989-01-01

    This paper reports on three taxonomic classification schemes developed and applied to the body of available color and albedo data. Asteroid taxonomic classifications according to two of these schemes are reproduced

  19. Hand eczema classification

    DEFF Research Database (Denmark)

    Diepgen, T L; Andersen, Klaus Ejner; Brandao, F M

    2008-01-01

    of the disease is rarely evidence based, and a classification system for different subdiagnoses of hand eczema is not agreed upon. Randomized controlled trials investigating the treatment of hand eczema are called for. For this, as well as for clinical purposes, a generally accepted classification system...... A classification system for hand eczema is proposed. Conclusions: It is suggested that this classification be used in clinical work and in clinical trials....

  20. Classification with support hyperplanes

    NARCIS (Netherlands)

    G.I. Nalbantov (Georgi); J.C. Bioch (Cor); P.J.F. Groenen (Patrick)

    2006-01-01

    A new classification method is proposed, called Support Hyperplanes (SHs). To solve the binary classification task, SHs consider the set of all hyperplanes that do not make classification mistakes, referred to as semi-consistent hyperplanes. A test object is classified using

  1. Standard classification: Physics

    International Nuclear Information System (INIS)

    1977-01-01

    This is a draft standard classification of physics. The conception is based on the physics part of the systematic catalogue of the Bayerische Staatsbibliothek and on the classification given in standard textbooks. The ICSU-AB classification now used worldwide by physics information services was not taken into account. (BJ) [de

  2. Classification of refrigerants

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-07-01

    This document is based on the US standard ANSI/ASHRAE 34, published in 2001 and entitled 'Designation and safety classification of refrigerants'. The classification organizes, on an international basis, the refrigerants in use worldwide by assigning each a code that corresponds to its chemical composition. This note explains the codification: prefix, suffixes (hydrocarbons and derived fluids, azeotropic and non-azeotropic mixtures, various organic compounds, non-organic compounds), and safety classification (toxicity, flammability, case of mixtures). (J.S.)
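
As an illustration of how such a code maps to chemical composition, the sketch below decodes the numeric part of a designation for one narrow case only: saturated, bromine-free compounds of the methane, ethane, and propane series. It relies on the commonly cited numbering rule (hundreds digit = carbon atoms minus 1, tens digit = hydrogen atoms plus 1, units digit = fluorine atoms, remaining bonds filled by chlorine). The function name is hypothetical, and isomer suffix letters, cyclic compounds, blends, and inorganic refrigerants are deliberately out of scope.

```python
def decode_refrigerant(number: int) -> dict:
    """Decode the numeric part of a designation such as 22 or 134.

    Assumption: the compound is a saturated, bromine-free halocarbon or
    hydrocarbon of the methane, ethane, or propane series (numbers 10-299).
    """
    if not 10 <= number < 300:
        raise ValueError("only the methane, ethane and propane series are handled")
    carbons = number // 100 + 1
    hydrogens = (number // 10) % 10 - 1
    fluorines = number % 10
    # A saturated acyclic compound with C carbons has 2C + 2 single-bond sites.
    chlorines = 2 * carbons + 2 - hydrogens - fluorines
    if hydrogens < 0 or chlorines < 0:
        raise ValueError("digits do not describe a saturated compound")
    return {"C": carbons, "H": hydrogens, "F": fluorines, "Cl": chlorines}


r22 = decode_refrigerant(22)    # chlorodifluoromethane, CHClF2
r134 = decode_refrigerant(134)  # tetrafluoroethane, C2H2F4 (suffix letter ignored)
r290 = decode_refrigerant(290)  # propane, C3H8
```

The suffix letter in a designation such as 134a distinguishes isomers with the same atom counts, so it cannot be recovered from the digits alone.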

  3. Classification, disease, and diagnosis.

    Science.gov (United States)

    Jutel, Annemarie

    2011-01-01

    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  4. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  5. Classification of Flotation Frothers

    Directory of Open Access Journals (Sweden)

    Jan Drzymala

    2018-02-01

    In this paper, a scheme of flotation frothers classification is presented. The scheme first indicates the physical system in which a frother is present; four such systems are distinguished: pure state, aqueous solution, aqueous solution/gas system, and aqueous solution/gas/solid system. As a result, there are numerous classifications of flotation frothers. The classifications can be organized into a scheme described in detail in this paper. The paper shows that a meaningful classification of frothers relies on choosing the physical system and then the feature, trend, parameter or parameters according to which the classification is performed. The proposed classification can play a useful role in the characterization and evaluation of flotation frothers.

  6. Ontologies vs. Classification Systems

    DEFF Research Database (Denmark)

    Madsen, Bodil Nistrup; Erdman Thomsen, Hanne

    2009-01-01

    What is an ontology compared to a classification system? Is a taxonomy a kind of classification system or a kind of ontology? These are questions that we meet when working with people from industry and public authorities, who need methods and tools for concept clarification, for developing meta...... data sets or for obtaining advanced search facilities. In this paper we will present an attempt at answering these questions. We will give a presentation of various types of ontologies and briefly introduce terminological ontologies. Furthermore we will argue that classification systems, e.g. product...... classification systems and meta data taxonomies, should be based on ontologies....

  7. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data

    Directory of Open Access Journals (Sweden)

    Gokmen Zararsiz

    2017-10-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom’s precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
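
To make the idea of a sparse, weight-aware centroid classifier more concrete, the sketch below shows two generic building blocks: a precision-weighted mean and the soft-thresholding step used by nearest shrunken centroids. It is a simplified illustration, not the voomNSC formulation itself: the abstract does not give the exact weighted statistics, so the function names, the single `scale` argument, and the toy numbers are all assumptions.

```python
def weighted_mean(values, weights):
    """Mean of `values` where each observation counts in proportion to its weight."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become zero."""
    magnitude = max(abs(d) - delta, 0.0)
    return magnitude if d >= 0 else -magnitude


def shrunken_offsets(class_means, overall_mean, scale, delta):
    """Standardized class-versus-overall differences after soft-thresholding.

    `scale` stands in for the gene's pooled standard deviation term
    (an assumption; real implementations also add class-size factors).
    """
    return [soft_threshold((m - overall_mean) / scale, delta) for m in class_means]


# Hypothetical gene A: classes differ clearly, so its offsets survive shrinkage.
gene_a = shrunken_offsets([5.0, 1.0], overall_mean=3.0, scale=1.0, delta=0.5)
# Hypothetical gene B: classes barely differ, so every offset is shrunk to zero.
gene_b = shrunken_offsets([3.1, 2.9], overall_mean=3.0, scale=1.0, delta=0.5)
# A gene whose offsets are all zero no longer influences classification.
selected = [name for name, off in (("A", gene_a), ("B", gene_b)) if any(off)]
```

Raising `delta` zeroes out more genes, which is how this family of classifiers arrives at a small candidate biomarker set.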

  8. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    Science.gov (United States)

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.

  9. Classification of radiological procedures

    International Nuclear Information System (INIS)

    1989-01-01

    A classification for departments in Danish hospitals which use radiological procedures. The classification codes consist of 4 digits, where the first 2 are the codes for the main groups. The first digit represents the procedure's topographical object and the second the techniques. The last 2 digits describe individual procedures. (CLS)
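
The four-digit layout described here can be sketched as a small parser. The type and function names are hypothetical, and the example code `1234` is made up for illustration; the source gives no actual code values.

```python
from typing import NamedTuple


class ProcedureCode(NamedTuple):
    topography: str  # first digit: the procedure's topographical object
    technique: str   # second digit: the technique
    procedure: str   # last two digits: the individual procedure

    @property
    def main_group(self) -> str:
        """The first two digits together form the main-group code."""
        return self.topography + self.technique


def parse_procedure_code(code: str) -> ProcedureCode:
    """Split a four-digit code into its three parts, rejecting malformed input."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected exactly four ASCII digits")
    return ProcedureCode(code[0], code[1], code[2:])


parsed = parse_procedure_code("1234")  # hypothetical code value
```

Keeping the parts as strings preserves leading zeros, which matter for codes such as `0105`.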

  10. Colombia: Territorial classification

    International Nuclear Information System (INIS)

    Mendoza Morales, Alberto

    1998-01-01

    The article discusses approaches to territorial classification, thematic axes, principles of territorial management and occupation, political-administrative units, and administrative regions, among other topics. Territorial classification is understood as the spatial distribution, over the country's territory, of geographical configurations, human communities, political-administrative units, and land uses, urban and rural, existing and proposed.

  11. Munitions Classification Library

    Science.gov (United States)

    2016-04-04

    members of the community to make their own additions to any, or all, of the classification libraries. The next phase entailed data collection over less...... Final Report, August 2014 - August 2015. Munitions Classification Library. Mr. Craig Murray, Parsons; Dr. Thomas H. Bell, Leidos

  12. Recursive automatic classification algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Bauman, E V; Dorofeyuk, A A

    1982-03-01

    A variational statement of the automatic classification problem is given. The dependence of the form of the optimal partition surface on the form of the classification objective functional is investigated. A recursive algorithm is proposed for maximising a functional of reasonably general form. The convergence problem is analysed in connection with the proposed algorithm. 8 references.

  13. Library Classification 2020

    Science.gov (United States)

    Harris, Christopher

    2013-01-01

    In this article the author explores how a new library classification system might be designed using some aspects of the Dewey Decimal Classification (DDC) and ideas from other systems to create something that works for school libraries in the year 2020. By examining what works well with the Dewey Decimal System, what features should be carried…

  14. Spectroscopic classification of transients

    DEFF Research Database (Denmark)

    Stritzinger, M. D.; Fraser, M.; Hummelmose, N. N.

    2017-01-01

    We report the spectroscopic classification of several transients based on observations taken with the Nordic Optical Telescope (NOT) equipped with ALFOSC, over the nights 23-25 August 2017.

  15. CLASIFICACIÓN NO SUPERVISADA DE COBERTURAS VEGETALES SOBRE IMÁGENES DIGITALES DE SENSORES REMOTOS: “LANDSAT - ETM+” NONSUPERVISED CLASSIFICATION OF VEGETABLE COVERS ON DIGITAL IMAGES OF REMOTE SENSORS: "LANDSAT - ETM+"

    Directory of Open Access Journals (Sweden)

    Mauricio Arango Gutiérrez

    2005-06-01

    The diversity of plant species present in Colombia, and the lack of an inventory of them, suggest the need for a process that eases the work of researchers in these disciplines. Satellite remote sensors such as LANDSAT ETM+ and unsupervised artificial intelligence techniques such as Self-Organizing Maps (SOM) could provide a viable alternative for rapidly obtaining information on zones with different vegetation covers across the national territory. The zone proposed for the case study had been classified in a supervised manner by the maximum likelihood method in another forestry research project, which discriminated eight types of vegetation cover. This information served as the benchmark for evaluating the performance of the unsupervised classifiers ISODATA and SOM. However, the information provided by the images first had to be cleaned according to data use and quality criteria, so that suitable information was used for these unsupervised methods. To this end, several concepts were used, such as image statistics, the spectral behaviour of plant communities, the characteristics of the sensor, and average divergence, which made it possible to define the best bands and their combinations. Principal component analysis was then applied to these, reducing the amount of data while retaining a large percentage of the information. The unsupervised techniques were applied to the cleaned data, modifying some parameters that might improve the convergence of the methods. The results were compared with the supervised classification by means of confusion matrices, and it is concluded that the unsupervised classification methods do not converge well with this process in the case of vegetation covers.
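
    The comparison by confusion matrix mentioned in this abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the study's procedure: cluster labels from an unsupervised method are arbitrary, so each cluster is mapped to its most frequent reference class before agreement is measured. The function names and the toy pixel labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(reference, clusters):
    """Count pixels for every (reference class, unsupervised cluster) pair."""
    return Counter(zip(reference, clusters))


def majority_agreement(reference, clusters):
    """Map each cluster to its most frequent reference class and return
    the fraction of pixels that agree under that mapping."""
    best = {}  # cluster -> (reference class, pixel count)
    for (ref, cluster), n in confusion_matrix(reference, clusters).items():
        if n > best.get(cluster, (None, 0))[1]:
            best[cluster] = (ref, n)
    return sum(n for _, n in best.values()) / len(reference)


# Hypothetical labels for six pixels.
reference = ["forest", "forest", "forest", "pasture", "pasture", "water"]
clusters = [0, 0, 1, 1, 1, 2]
matrix = confusion_matrix(reference, clusters)
agreement = majority_agreement(reference, clusters)
```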

  16. DOE LLW classification rationale

    International Nuclear Information System (INIS)

    Flores, A.Y.

    1991-01-01

    This report describes the rationale behind the US Department of Energy's classification of low-level radioactive waste (LLW). It is based on the Nuclear Regulatory Commission's classification system. DOE site operators met to review the qualifications and characteristics of the classification systems. They evaluated performance objectives, developed waste classification tables, and compiled dose limits on the waste. A goal of the LLW classification system was to allow each disposal site the freedom to develop limits on radionuclide inventories and concentrations according to its own site-specific characteristics. This goal was achieved with the adoption of a performance objectives system based on a performance assessment, with site-specific environmental conditions and engineered disposal systems.

  17. Constructing criticality by classification

    DEFF Research Database (Denmark)

    Machacek, Erika

    2017-01-01

    " in the bureaucratic practice of classification: Experts construct material criticality in assessments as they allot information on the materials to the parameters of the assessment framework. In so doing, they ascribe a new set of connotations to the materials, namely supply risk, and their importance to clean energy......, legitimizing a criticality discourse.Specifically, the paper introduces a typology delineating the inferences made by the experts from their produced recommendations in the classification of rare earth element criticality. The paper argues that the classification is a specific process of constructing risk....... It proposes that the expert bureaucratic practice of classification legitimizes (i) the valorisation that was made in the drafting of the assessment framework for the classification, and (ii) political operationalization when enacted that might have (non-)distributive implications for the allocation of public...

  18. Difference of protein 53 expression based on radiation therapy response in cervical cancer

    Science.gov (United States)

    Pasaribu, H. P.; Lubis, L. I.; Dina, S.; Simanjuntak, R. Y.; Siregar, H. S.; Rivany, R.

    2018-03-01

    Cervical cancer is one of the most common gynecological cancers in women and the leading cause of death in developing countries. An analytic study with a case-control design was conducted to determine the difference in p53 expression based on radiation therapy response in cervical cancer. The study was performed in the Obstetrics and Gynecology Department and the Pathology Department of Adam Malik General Hospital, Medan, from January to February 2017. Fifteen paraffin blocks from cervical cancer patients with incomplete response were obtained as study samples, and 15 paraffin blocks from cervical cancer patients with complete response were obtained as control samples. The samples were collected by consecutive sampling, and an immunohistochemical assessment of p53 expression was done to assess apoptosis count and radiation response. Data were analyzed using Kruskal-Wallis with confidence interval 83.5% and p... cervical cancer.

  19. Classification of movement disorders.

    Science.gov (United States)

    Fahn, Stanley

    2011-05-01

    The classification of movement disorders has evolved. Even the terminology has shifted, from an anatomical one of extrapyramidal disorders to a phenomenological one of movement disorders. The history of how this shift came about is described. The history of both the definitions and the classifications of the various neurologic conditions is then reviewed. First is a review of movement disorders as a group; then, the evolving classifications for 3 of them--parkinsonism, dystonia, and tremor--are covered in detail. Copyright © 2011 Movement Disorder Society.

  20. Modulation of gene expression made easy

    DEFF Research Database (Denmark)

    Solem, Christian; Jensen, Peter Ruhdal

    2002-01-01

    A new approach for modulating gene expression, based on randomization of promoter (spacer) sequences, was developed. The method was applied to chromosomal genes in Lactococcus lactis and shown to generate libraries of clones with broad ranges of expression levels of target genes. In one example...... that the method can be applied to modulating the expression of native genes on the chromosome. We constructed a series of strains in which the expression of the las operon, containing the genes pfk, pyk, and ldh, was modulated by integrating a truncated copy of the pfk gene. Importantly, the modulation affected...

  1. Learning Apache Mahout classification

    CERN Document Server

    Gupta, Ashish

    2015-01-01

    If you are a data scientist who has some experience with the Hadoop ecosystem and machine learning methods and want to try out classification on large datasets using Mahout, this book is ideal for you. Knowledge of Java is essential.

  2. CLASSIFICATION OF VIRUSES

    Indian Academy of Sciences (India)

    CLASSIFICATION OF VIRUSES. On basis of morphology. On basis of chemical composition. On basis of structure of genome. On basis of mode of replication.

  3. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft......-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic......, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s....
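
    For readers unfamiliar with the harmonic product spectrum, a minimal sketch of the general technique is given here. It is not the authors' implementation: it assumes a naive DFT on one short frame, three harmonics, and a synthetic test tone, and the names `magnitude_spectrum`, `hps_pitch`, and `harmonic_tone` are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitudes of DFT bins 0 .. N/2 - 1 (naive O(N^2) DFT, fine for short frames)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2)
    ]


def hps_pitch(signal, sample_rate, harmonics=3):
    """Estimate pitch as the bin k that maximizes |X[k]| * |X[2k]| * ... * |X[harmonics*k]|."""
    mags = magnitude_spectrum(signal)
    limit = len(mags) // harmonics  # keep harmonics * k inside the spectrum
    best_bin, best_value = 1, 0.0
    for k in range(1, limit):  # skip the DC bin
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= mags[h * k]
        if product > best_value:
            best_bin, best_value = k, product
    return best_bin * sample_rate / len(signal)


def harmonic_tone(f0, sample_rate, n, amplitudes):
    """Synthetic tone: sum of sines at f0, 2*f0, ... with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (h + 1) * t / sample_rate)
            for h, a in enumerate(amplitudes))
        for t in range(n)
    ]


# Synthetic 250 Hz tone with two overtones, sampled at 8 kHz (illustrative values).
tone = harmonic_tone(250.0, 8000, 512, [1.0, 0.5, 0.25])
estimate = hps_pitch(tone, 8000)
```

    Multiplying the spectrum by its downsampled copies reinforces the bin where the fundamental and its harmonics line up, which is why the method is robust when an overtone is stronger than the fundamental.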

  4. Towards secondary fingerprint classification

    CSIR Research Space (South Africa)

    Msiza, IS

    2011-07-01

    an accuracy figure of 76.8%. This small difference between the two figures is indicative of the validity of the proposed secondary classification module. Keywords: fingerprint core; fingerprint delta; primary classification; secondary classification I..., namely, the fingerprint core and the fingerprint delta. Forensically, a fingerprint core is defined as the innermost turning point where the fingerprint ridges form a loop, while the fingerprint delta is defined as the point where these ridges form a...

  5. Expected Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2005-08-01

    Every time we make a classification based on a test score, we should expect some number of misclassifications. Some examinees whose true ability is within a score range will have observed scores outside of that range. A procedure for providing a classification table of true and expected scores is developed for polytomously scored items under item response theory and applied to state assessment data. A simplified procedure for estimating the table entries is also presented.
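
    The idea that an examinee's observed score can fall outside the range containing the true score can be illustrated with a short sketch. It is a simplification, not the procedure developed in the paper: it assumes normally distributed measurement error with a known standard error, and the function names, cut scores, and score values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(true_score, sem, cut_scores):
    """Probability that the observed score lands in each category.

    cut_scores must be ascending; with c cut scores there are c + 1 categories.
    Assumes observed = true + normal error with standard deviation sem.
    """
    bounds = [-math.inf] + list(cut_scores) + [math.inf]
    return [
        normal_cdf((hi - true_score) / sem) - normal_cdf((lo - true_score) / sem)
        for lo, hi in zip(bounds, bounds[1:])
    ]


# Illustrative values: true score 55, standard error 5, cut scores at 50 and 70.
probs = category_probabilities(55.0, 5.0, [50.0, 70.0])
```

    Summing such probabilities over the distribution of true scores gives one row of an expected classification table per true category.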

  6. Latent classification models

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2005-01-01

    parametric family of distributions. In this paper we propose a new set of models for classification in continuous domains, termed latent classification models. The latent classification model can roughly be seen as combining the Naive Bayes model with a mixture of factor analyzers, thereby relaxing the assumptions...... classification model, and we demonstrate empirically that the accuracy of the proposed model is significantly higher than the accuracy of other probabilistic classifiers....

  7. 78 FR 68983 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-11-18

    ...-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing... regulations to allow for the addition of an optional cotton futures classification procedure--identified and... response to requests from the U.S. cotton industry and ICE, AMS will offer a futures classification option...

  8. Supernova Photometric Lightcurve Classification

    Science.gov (United States)

    Zaidi, Tayeb; Narayan, Gautham

    2016-01-01

    This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data were primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernova types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II). Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).

  9. PASTEC: an automatic transposable element classification tool.

    Directory of Open Access Journals (Sweden)

    Claire Hoede

    SUMMARY: The classification of transposable elements (TEs) is a key step towards deciphering their potential impact on the genome. However, this process is often based on manual sequence inspection by TE experts. With the wealth of genomic sequences now available, this task requires automation, making it accessible to most scientists. We propose a new tool, PASTEC, which classifies TEs by searching for structural features and similarities. This tool outperforms currently available software for TE classification. The main innovation of PASTEC is the search for HMM profiles, which is useful for inferring the classification of unknown TEs on the basis of conserved functional domains of the proteins. In addition, PASTEC is the only tool providing an exhaustive spectrum of possible classifications to the order level of the Wicker hierarchical TE classification system. It can also automatically classify other repeated elements, such as SSR (Simple Sequence Repeats), rDNA or potential repeated host genes. Finally, the output of this new tool is designed to facilitate manual curation by providing biologists with all the evidence accumulated for each TE consensus. AVAILABILITY: PASTEC is available as a REPET module or standalone software (http://urgi.versailles.inra.fr/download/repet/REPET_linux-x64-2.2.tar.gz). It requires a Unix-like system. There are two standalone versions: one of which is parallelized (requiring Sun Grid Engine or Torque), and the other of which is not.

  10. A comparative evaluation of sequence classification programs

    Directory of Open Access Journals (Sweden)

    Bazinet Adam L

    2012-05-01

    Background: A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for ‘barcoding genes’ like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. Results: We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. Conclusions: We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.
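
    Evaluating a classifier on data of known composition, as described above, can be sketched as follows. The definitions here are assumptions made for illustration and may differ from those used in the paper: sensitivity is taken as correct assignments over all reads, precision as correct assignments over reads that received any label. The function name and the example labels are hypothetical.

```python
def evaluate_classifier(true_labels, predicted_labels):
    """Return (sensitivity, precision) for one program on one data set.

    A predicted label of None means the read was left unclassified.
    """
    total = len(true_labels)
    classified = sum(1 for p in predicted_labels if p is not None)
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if p == t)
    sensitivity = correct / total
    precision = correct / classified if classified else 0.0
    return sensitivity, precision


# Hypothetical reads with known origin and one program's assignments.
truth = ["E. coli", "E. coli", "B. subtilis", "B. subtilis"]
calls = ["E. coli", None, "B. subtilis", "E. coli"]
sensitivity, precision = evaluate_classifier(truth, calls)
```

    A program that declines to label uncertain reads can score high on precision and low on sensitivity, so both figures are needed to compare tools.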

  11. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high-accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  12. Phylogenetic classification and the universal tree.

    Science.gov (United States)

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished.

  13. Genes and Gene Therapy

    Science.gov (United States)

    ... correctly, a child can have a genetic disorder. Gene therapy is an experimental technique that uses genes to ... or prevent disease. The most common form of gene therapy involves inserting a normal gene to replace an ...

  14. Transporter taxonomy - a comparison of different transport protein classification schemes.

    Science.gov (United States)

    Viereck, Michael; Gaulton, Anna; Digles, Daniela; Ecker, Gerhard F

    2014-06-01

    Currently, there are more than 800 well characterized human membrane transport proteins (including channels and transporters) and there are estimates that about 10% (approx. 2000) of all human genes are related to transport. Membrane transport proteins are of interest as potential drug targets, for drug delivery, and as a cause of side effects and drug–drug interactions. In light of the development of Open PHACTS, which provides an open pharmacological space, we analyzed selected membrane transport protein classification schemes (Transporter Classification Database, ChEMBL, IUPHAR/BPS Guide to Pharmacology, and Gene Ontology) for their ability to serve as a basis for pharmacology driven protein classification. A comparison of these membrane transport protein classification schemes by using a set of clinically relevant transporters as use-case reveals the strengths and weaknesses of the different taxonomy approaches.

  15. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Mahmoodian, Hamid; Marhaban, M.H.; Abdulrahim, Raha; Rosli, Rozita; Saripan, Iqbal

    2011-01-01

    The classification of cancer tumors based on gene expression profiles has been studied extensively. A wide variety of cancer datasets have been analyzed with various methods of gene selection and classification to identify the behavior of the genes in tumors and to find the relationships between them and disease outcome. Interpretability of the model, which in this study is provided by fuzzy rules and linguistic variables, has rarely been considered. In addition, creating a high-performance fuzzy classifier that uses a subset of significant genes selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. First, different subsets of genes, selected by different methods, were used to generate primary fuzzy classifiers separately; the proposed algorithm was then applied to combine the genes associated in the primary classifiers and generate a new classifier. The results show that the fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes through linguistic variables.

  16. Sequence Classification: 891809 [

    Lifescience Database Archive (English)

    unknown function, expressed during sporulation; not required for sporulation, but gene exhibits genetic interactions with other genes required for sporulation; Spr6p || http://www.ncbi.nlm.nih.gov/protein/6320961 ...

  17. Cellular image classification

    CERN Document Server

    Xu, Xiang; Lin, Feng

    2017-01-01

    This book introduces new techniques for cellular image feature extraction, pattern recognition and classification. The authors use the antinuclear antibodies (ANAs) in patient serum as the subjects and the Indirect Immunofluorescence (IIF) technique as the imaging protocol to illustrate the applications of the described methods. Throughout the book, the authors provide evaluations for the proposed methods on two publicly available human epithelial (HEp-2) cell datasets: ICPR2012 dataset from the ICPR'12 HEp-2 cell classification contest and ICIP2013 training dataset from the ICIP'13 Competition on cells classification by fluorescent image analysis. First, the reading of imaging results is significantly influenced by one’s qualification and reading systems, causing high intra- and inter-laboratory variance. The authors present a low-order LP21 fiber mode for optical single cell manipulation and imaging staining patterns of HEp-2 cells. A focused four-lobed mode distribution is stable and effective in optical...

  18. Bosniak classification system

    DEFF Research Database (Denmark)

    Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens

    2016-01-01

    BACKGROUND: The Bosniak classification was originally based on computed tomographic (CT) findings. Magnetic resonance (MR) and contrast-enhanced ultrasonography (CEUS) imaging may demonstrate findings that are not depicted at CT, and there may not always be a clear correlation between the findings...... at MR and CEUS imaging and those at CT. PURPOSE: To compare diagnostic accuracy of MR, CEUS, and CT when categorizing complex renal cystic masses according to the Bosniak classification. MATERIAL AND METHODS: From February 2011 to June 2012, 46 complex renal cysts were prospectively evaluated by three...... readers. Each mass was categorized according to the Bosniak classification and CT was chosen as gold standard. Kappa was calculated for diagnostic accuracy and data was compared with pathological results. RESULTS: CT images found 27 BII, six BIIF, seven BIII, and six BIV. Forty-three cysts could...

  19. Bosniak Classification system

    DEFF Research Database (Denmark)

    Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens

    2014-01-01

    Background: The Bosniak classification is a diagnostic tool for the differentiation of cystic changes in the kidney. The process of categorizing renal cysts may be challenging, involving a series of decisions that may affect the final diagnosis and clinical outcome such as surgical management....... Purpose: To investigate the inter- and intra-observer agreement among experienced uroradiologists when categorizing complex renal cysts according to the Bosniak classification. Material and Methods: The original categories of 100 cystic renal masses were chosen as “Gold Standard” (GS), established...... to the calculated weighted κ all readers performed “very good” for both inter-observer and intra-observer variation. Most variation was seen in cysts categorized as Bosniak II, IIF, and III. These results show that radiologists who evaluate complex renal cysts routinely may apply the Bosniak classification...
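
    A weighted κ for two readers over ordered categories can be sketched as below. The weighting scheme used in the study is not stated in this excerpt, so linear disagreement weights are assumed here; the function name and the example ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted Cohen's kappa for two raters over ordered categories.

    Computed as 1 - (weighted observed disagreement / weighted chance disagreement).
    Undefined (division by zero) when chance disagreement is zero, for example
    when both raters use a single category throughout.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 for the extreme categories
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical readings of six cysts by two readers.
bosniak = ["I", "II", "IIF", "III", "IV"]
reader_1 = ["II", "II", "IIF", "III", "IV", "II"]
reader_2 = ["II", "IIF", "IIF", "III", "IV", "II"]
kappa = weighted_kappa(reader_1, reader_2, bosniak)
```

    With linear weights, a disagreement between adjacent categories such as II and IIF is penalized less than one between II and IV.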

  20. Acoustic classification of dwellings

    DEFF Research Database (Denmark)

    Berardi, Umberto; Rasmussen, Birgit

    2014-01-01

    insulation performance, national schemes for sound classification of dwellings have been developed in several European countries. These schemes define acoustic classes according to different levels of sound insulation. Due to the lack of coordination among countries, a significant diversity in terms...... exchanging experiences about constructions fulfilling different classes, reducing trade barriers, and finally increasing the sound insulation of dwellings.......Schemes for the classification of dwellings according to different building performances have been proposed in the last years worldwide. The general idea behind these schemes relates to the positive impact a higher label, and thus a better performance, should have. In particular, focusing on sound...

  1. Minimum Error Entropy Classification

    CERN Document Server

    Marques de Sá, Joaquim P; Santos, Jorge M F; Alexandre, Luís A

    2013-01-01

    This book explains the minimum error entropy (MEE) concept applied to data classification machines. Theoretical results on the inner workings of the MEE concept, in its application to solving a variety of classification problems, are presented in the wider realm of risk functionals. Researchers and practitioners also find in the book a detailed presentation of practical data classifiers using MEE. These include multi‐layer perceptrons, recurrent neural networks, complex-valued neural networks, modular neural networks, and decision trees. A clustering algorithm using a MEE‐like concept is also presented. Examples, tests, evaluation experiments and comparison with similar machines using classic approaches complement the descriptions.

  2. Classification of iconic images

    OpenAIRE

    Zrianina, Mariia; Kopf, Stephan

    2016-01-01

    Iconic images represent an abstract topic and use a presentation that is intuitively understood within a certain cultural context. For example, the abstract topic “global warming” may be represented by a polar bear standing alone on an ice floe. Such images are widely used in media and their automatic classification can help to identify high-level semantic concepts. This paper presents a system for the classification of iconic images. It uses a variation of the Bag of Visual Words approach wi...

  3. Casemix classification systems.

    Science.gov (United States)

    Fetter, R B

    1999-01-01

    The idea of using casemix classification to manage hospital services is not new, but has been limited by available technology. It was not until after the introduction of Medicare in the United States in 1965 that serious attempts were made to measure hospital production in order to contain spiralling costs. This resulted in a system of casemix classification known as diagnosis related groups (DRGs). This paper traces the development of DRGs and their evolution from the initial version to the All Patient Refined DRGs developed in 1991.

  4. Information gathering for CLP classification

    Directory of Open Access Journals (Sweden)

    Ida Marcello

    2011-01-01

    Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level, and a list of harmonised classifications is included in Annex VI of the classification, labelling and packaging (CLP) Regulation. If a chemical substance is not included in the harmonised classification list, it must be self-classified, based on available information, according to the requirements of Annex I of the CLP Regulation. CLP stipulates that harmonised classification will be performed for carcinogenic, mutagenic or toxic to reproduction substances (CMR substances), for respiratory sensitisers category 1, and for other hazard classes on a case-by-case basis. The first step of classification is the gathering of available and relevant information. This paper presents the procedure for gathering information and obtaining data. Data quality is also discussed.

  5. The paradox of atheoretical classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2016-01-01

    A distinction can be made between “artificial classifications” and “natural classifications,” where artificial classifications may adequately serve some limited purposes, but natural classifications are overall most fruitful by allowing inference and thus many different purposes. There is strong...... support for the view that a natural classification should be based on a theory (and, of course, that the most fruitful theory provides the most fruitful classification). Nevertheless, atheoretical (or “descriptive”) classifications are often produced. Paradoxically, atheoretical classifications may...... be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. Based on such successes one may ask: Should the claim that classifications ideally are natural...

  6. Whole Blood mRNA Expression-Based Prognosis of Metastatic Renal Cell Carcinoma.

    Science.gov (United States)

    Giridhar, Karthik V; Sosa, Carlos P; Hillman, David W; Sanhueza, Cristobal; Dalpiaz, Candace L; Costello, Brian A; Quevedo, Fernando J; Pitot, Henry C; Dronca, Roxana S; Ertz, Donna; Cheville, John C; Donkena, Krishna Vanaja; Kohli, Manish

    2017-11-03

    The Memorial Sloan Kettering Cancer Center (MSKCC) prognostic score is based on clinical parameters. We analyzed whole blood mRNA expression in metastatic clear cell renal cell carcinoma (mCCRCC) patients and compared it to the MSKCC score for predicting overall survival. In a discovery set of 19 patients with mRCC, we performed whole transcriptome RNA sequencing and selected eighteen candidate genes for further evaluation based on associations with overall survival and statistical significance. In an independent validation set of 47 patients with mCCRCC, transcript expression of the 18 candidate genes was quantified using a customized NanoString probeset. Cox regression multivariate analysis confirmed that two of the candidate genes were significantly associated with overall survival. Higher expression of BAG1 [hazard ratio (HR) of 0.14, p < 0.0001, 95% confidence interval (CI) 0.04-0.36] and NOP56 (HR 0.13, p < 0.0001, 95% CI 0.05-0.34) was associated with better prognosis. A prognostic model incorporating expression of BAG1 and NOP56 into the MSKCC score improved prognostication significantly over a model using the MSKCC prognostic score only (p < 0.0001). Prognostication using whole blood mRNA gene profiling in mCCRCC is feasible and should be prospectively confirmed in larger studies.

  7. Whole Blood mRNA Expression-Based Prognosis of Metastatic Renal Cell Carcinoma

    Directory of Open Access Journals (Sweden)

    Karthik V. Giridhar

    2017-11-01

    The Memorial Sloan Kettering Cancer Center (MSKCC) prognostic score is based on clinical parameters. We analyzed whole blood mRNA expression in metastatic clear cell renal cell carcinoma (mCCRCC) patients and compared it to the MSKCC score for predicting overall survival. In a discovery set of 19 patients with mRCC, we performed whole transcriptome RNA sequencing and selected eighteen candidate genes for further evaluation based on associations with overall survival and statistical significance. In an independent validation set of 47 patients with mCCRCC, transcript expression of the 18 candidate genes was quantified using a customized NanoString probeset. Cox regression multivariate analysis confirmed that two of the candidate genes were significantly associated with overall survival. Higher expression of BAG1 [hazard ratio (HR) of 0.14, p < 0.0001, 95% confidence interval (CI) 0.04–0.36] and NOP56 (HR 0.13, p < 0.0001, 95% CI 0.05–0.34) was associated with better prognosis. A prognostic model incorporating expression of BAG1 and NOP56 into the MSKCC score improved prognostication significantly over a model using the MSKCC prognostic score only (p < 0.0001). Prognostication using whole blood mRNA gene profiling in mCCRCC is feasible and should be prospectively confirmed in larger studies.

  8. MERRF Classification: Implications for Diagnosis and Clinical Trials.

    Science.gov (United States)

    Finsterer, Josef; Zarrouk-Mahjoub, Sinda; Shoffner, John M

    2018-03-01

    Given the etiologic heterogeneity of disease classification using clinical phenomenology, we employed contemporary criteria to classify variants associated with myoclonic epilepsy with ragged-red fibers (MERRF) syndrome and to assess the strength of evidence of gene-disease associations. Standardized approaches are used to clarify the definition of MERRF, which is essential for patient diagnosis, patient classification, and clinical trial design. Systematic literature and database search with application of standardized assessment of gene-disease relationships using modified Smith criteria and of variants reported to be associated with MERRF using modified Yarham criteria. Review of available evidence supports a gene-disease association for two MT-tRNAs and for POLG. Using modified Smith criteria, definitive evidence of a MERRF gene-disease association is identified for MT-TK. Strong gene-disease evidence is present for MT-TL1 and POLG. Functional assays that directly associate variants with oxidative phosphorylation impairment were critical to mtDNA variant classification. In silico analysis was of limited utility to the assessment of individual MT-tRNA variants. With the use of contemporary classification criteria, several mtDNA variants previously reported as pathogenic or possibly pathogenic are reclassified as neutral variants. MERRF is primarily an MT-TK disease, with pathogenic variants in this gene accounting for ~90% of MERRF patients. Although MERRF is phenotypically and genotypically heterogeneous, myoclonic epilepsy is the clinical feature that distinguishes MERRF from other categories of mitochondrial disorders. Given its low frequency in mitochondrial disorders, myoclonic epilepsy is not explained simply by an impairment of cellular energetics. Although MERRF phenocopies can occur in other genes, additional data are needed to establish a MERRF disease-gene association. 
This approach to MERRF emphasizes standardized classification rather than clinical

  9. Combined genetic and splicing analysis of BRCA1 c.[594-2A>C; 641A>G] highlights the relevance of naturally occurring in-frame transcripts for developing disease gene variant classification algorithms

    OpenAIRE

    de la Hoya, Miguel; Soukarieh, Omar; López-Perolio, Irene; Vega, Ana; Walker, Logan C.; van Ierland, Yvette; Baralle, Diana; Santamariña, Marta; Lattimore, Vanessa; Wijnen, Juul; Whiley, Philip; Blanco, Ana; Raponi, Michela; Hauke, Jan; Wappenschmidt, Barbara

    2016-01-01

    A recent analysis using family history weighting and co-observation classification modeling indicated that BRCA1 c.594-2A > C (IVS9-2A > C), previously described to cause exon 10 skipping (a truncating alteration), displays characteristics inconsistent with those of a high risk pathogenic BRCA1 variant. We used large-scale genetic and clinical resources from the ENIGMA, CIMBA and BCAC consortia to assess pathogenicity of c.594-2A > C. The combined odds for causality considering case-control, ...

  10. Ecosystem classification, Chapter 2

    Science.gov (United States)

    M.J. Robin-Abbott; L.H. Pardo

    2011-01-01

    The ecosystem classification in this report is based on the ecoregions developed through the Commission for Environmental Cooperation (CEC) for North America (CEC 1997). Only ecosystems that occur in the United States are included. CEC ecoregions are described, with slight modifications, below (CEC 1997) and shown in Figures 2.1 and 2.2. We chose this ecosystem...

  11. The classification of phocomelia.

    Science.gov (United States)

    Tytherleigh-Strong, G; Hooper, G

    2003-06-01

    We studied 24 patients with 44 phocomelic upper limbs. Only 11 limbs could be grouped in the classification system of Frantz and O'Rahilly. The non-classifiable limbs were further studied and their characteristics identified. It is confirmed that phocomelia is not an intercalary defect.

  12. Principles for ecological classification

    Science.gov (United States)

    Dennis H. Grossman; Patrick Bourgeron; Wolf-Dieter N. Busch; David T. Cleland; William Platts; G. Ray; C. Robins; Gary Roloff

    1999-01-01

    The principal purpose of any classification is to relate common properties among different entities to facilitate understanding of evolutionary and adaptive processes. In the context of this volume, it is to facilitate ecosystem stewardship, i.e., to help support ecosystem conservation and management objectives.

  13. Mimicking human texture classification

    NARCIS (Netherlands)

    Rogowitz, B.E.; van Rikxoort, Eva M.; van den Broek, Egon; Pappas, T.N.; Schouten, Theo E.; Daly, S.J.

    2005-01-01

    In an attempt to mimic human (colorful) texture classification by a clustering algorithm three lines of research have been encountered, in which as test set 180 texture images (both their color and gray-scale equivalent) were drawn from the OuTex and VisTex databases. First, a k-means algorithm was

  14. Classification, confusion and misclassification

    African Journals Online (AJOL)

    The classification of objects and phenomena in science and nature has fascinated academics since Carl Linnaeus, the Swedish botanist and zoologist, created his binomial description of living things in the 1700s and probably long before in accounts of others in textbooks long since gone. It must have concerned human ...

  15. Classifications in popular music

    NARCIS (Netherlands)

    van Venrooij, A.; Schmutz, V.; Wright, J.D.

    2015-01-01

    The categorical system of popular music, such as genre categories, is a highly differentiated and dynamic classification system. In this article we present work that studies different aspects of these categorical systems in popular music. Following the work of Paul DiMaggio, we focus on four

  16. Shark Teeth Classification

    Science.gov (United States)

    Brown, Tom; Creel, Sally; Lee, Velda

    2009-01-01

    On a recent autumn afternoon at Harmony Leland Elementary in Mableton, Georgia, students in a fifth-grade science class investigated the essential process of classification--the act of putting things into groups according to some common characteristics or attributes. While they may have honed these skills earlier in the week by grouping their own…

  17. Text document classification

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana

    No. 62 (2005), pp. 53-54. ISSN 0926-4981. R&D Projects: GA AV ČR IAA2075302; GA AV ČR KSK1019101; GA MŠk 1M0572. Institutional research plan: CEZ:AV0Z10750506. Keywords: document representation * categorization * classification. Subject RIV: BD - Theory of Information

  18. Classification in Medical Imaging

    DEFF Research Database (Denmark)

    Chen, Chen

    Classification is extensively used in the context of medical image analysis for the purpose of diagnosis or prognosis. In order to classify image content correctly, one needs to extract efficient features with discriminative properties and build classifiers based on these features. In addition...... on characterizing human faces and emphysema disease in lung CT images....

  19. Improving Student Question Classification

    Science.gov (United States)

    Heiner, Cecily; Zachary, Joseph L.

    2009-01-01

    Students in introductory programming classes often articulate their questions and information needs incompletely. Consequently, the automatic classification of student questions to provide automated tutorial responses is a challenging problem. This paper analyzes 411 questions from an introductory Java programming course by reducing the natural…

  20. NOUN CLASSIFICATION IN ESAHIE

    African Journals Online (AJOL)

    The present work deals with noun classification in Esahie (Kwa, Niger ... phonological information influences the noun (form) class system of Esahie. ... between noun classes and (grammatical) Gender is interrogated (in the light of ..... the (A) argument precedes the verb and the (P) argument follows the verb in a simple.

  1. Dynamic Latent Classification Model

    DEFF Research Database (Denmark)

    Zhong, Shengtong; Martínez, Ana M.; Nielsen, Thomas Dyhre

    as possible. Motivated by this problem setting, we propose a generative model for dynamic classification in continuous domains. At each time point the model can be seen as combining a naive Bayes model with a mixture of factor analyzers (FA). The latent variables of the FA are used to capture the dynamics...

  2. Classification of myocardial infarction

    DEFF Research Database (Denmark)

    Saaby, Lotte; Poulsen, Tina Svenstrup; Hosbond, Susanne Elisabeth

    2013-01-01

    The classification of myocardial infarction into 5 types was introduced in 2007 as an important component of the universal definition. In contrast to the plaque rupture-related type 1 myocardial infarction, type 2 myocardial infarction is considered to be caused by an imbalance between demand...

  3. Event Classification using Concepts

    NARCIS (Netherlands)

    Boer, M.H.T. de; Schutte, K.; Kraaij, W.

    2013-01-01

    The semantic gap is one of the challenges in the GOOSE project. In this paper a Semantic Event Classification (SEC) system is proposed as an initial step in tackling the semantic gap challenge in the GOOSE project. This system uses semantic text analysis, multiple feature detectors using the BoW

  4. NEW CLASSIFICATION OF ECOPOLICES

    Directory of Open Access Journals (Sweden)

    VOROBYOV V. V.

    2016-09-01

    Problem statement. Ecopolices are the newest stage of urban planning. They have to be considered as material and energy informational structures included in the dynamic-evolutionary matrix networks of exchange processes in ecosystems. However, no ecopolice classifications have been developed on the basis of such approaches, and this determines the topicality of the article. Analysis of publications on theoretical and applied aspects of ecopolice formation showed that work on them is carried out mainly in the context of the latest scientific and technological achievements in various fields of knowledge. These settlements are technocratic. They are connected with the morphology of space and the network structures of regional and local natural ecosystems, have no independent stability, and cannot exist without continuous human support. In other words, they do not fit the idea of an ecopolice. The time has come for an objective, symbiotic search for a concept of ecopolices, together with the development of their classifications. Purpose statement. To develop the objective evidence for ecopolices and to propose a new classification of them. Conclusion. The classification of ecopolices should rest on the idea of correlating the elements of their general plans and the types of human activity with the natural mechanism of receiving, processing and transmitting material, energy and information between geo-ecosystems, the planet, man, the material part of ecopolices and the Cosmos. A new ecopolice classification should be based on the principles of multi-dimensional, time-spaced symbiotic clarity with ecosystem exchange networks. With this approach, the function of an ecopolice derives not from a subjective anthropocentric economy but from the holistic objective of the Genesis paradigm; in other words, not from the Consequence but from the Cause.

  5. Efficient Fingercode Classification

    Science.gov (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm, which is an essential component in many critical security application systems, e.g., systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by first classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation, an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research first investigates the various fast search algorithms in vector quantization (VQ) and their potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.
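
    The classify-then-search idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it uses a plain full-search K-nearest neighbor vote rather than any pyramid-based fast search, and the two-dimensional feature vectors, class names ("whorl", "loop"), person identifiers and function names are all hypothetical stand-ins for real fingercode data.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, labelled, k=3):
    """Stage 1: majority vote among the k labelled vectors nearest to the query."""
    nearest = sorted(labelled, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def identify(query, database, labelled, k=3):
    """Stage 2: match the query only against records of the predicted class."""
    predicted = knn_classify(query, labelled, k)
    candidates = database.get(predicted, [])
    if not candidates:
        return predicted, None
    best = min(candidates, key=lambda record: euclidean(query, record[1]))
    return predicted, best[0]


# Hypothetical toy data: (feature vector, class label) pairs for the classifier.
labelled = [
    ((0.0, 0.1), "whorl"), ((0.1, 0.0), "whorl"), ((0.2, 0.1), "whorl"),
    ((1.0, 1.1), "loop"), ((0.9, 1.0), "loop"), ((1.1, 0.9), "loop"),
]

# Hypothetical enrolled records, bucketed by class: (person id, feature vector).
database = {
    "whorl": [("alice", (0.05, 0.05)), ("bob", (0.3, 0.2))],
    "loop": [("carol", (1.0, 1.0)), ("dave", (1.3, 0.8))],
}

if __name__ == "__main__":
    print(identify((0.95, 1.05), database, labelled))
```

    The saving comes from the second stage: only the records in one class bucket are compared with the query, instead of every enrolled record.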

  6. Differential Classification of Dementia

    Directory of Open Access Journals (Sweden)

    E. Mohr

    1995-01-01

    In the absence of biological markers, dementia classification remains complex both in terms of characterization as well as early detection of the presence or absence of dementing symptoms, particularly in diseases with possible secondary dementia. An empirical, statistical approach using neuropsychological measures was therefore developed to distinguish demented from non-demented patients and to identify differential patterns of cognitive dysfunction in neurodegenerative disease. Age-scaled neurobehavioral test results (Wechsler Adult Intelligence Scale-Revised and Wechsler Memory Scale) from Alzheimer's (AD) and Huntington's (HD) patients, matched for intellectual disability, as well as normal controls were used to derive a classification formula. Stepwise discriminant analysis accurately (99% correct) distinguished controls from demented patients, and separated the two patient groups (79% correct). Variables discriminating between HD and AD patient groups consisted of complex psychomotor tasks, visuospatial function, attention and memory. The reliability of the classification formula was demonstrated with a new, independent sample of AD and HD patients which yielded virtually identical results (classification accuracy for dementia: 96%; AD versus HD: 78%). To validate the formula, the discriminant function was applied to Parkinson's (PD) patients, 38% of whom were classified as demented. The validity of the classification was demonstrated by significant PD subgroup differences on measures of dementia not included in the discriminant function. Moreover, a majority of demented PD patients (65%) were classified as having an HD-like pattern of cognitive deficits, in line with previous reports of the subcortical nature of PD dementia. This approach may thus be useful in classifying presence or absence of dementia and in discriminating between dementia subtypes in cases of secondary or coincidental dementia.

  7. 78 FR 54970 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-09-09

    ... Service 7 CFR Part 27 [AMS-CN-13-0043] RIN 0581-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing Service, USDA. ACTION: Proposed rule. SUMMARY: The... optional cotton futures classification procedure--identified and known as ``registration'' by the U.S...

  8. 32 CFR 2700.22 - Classification guides.

    Science.gov (United States)

    2010-07-01

    ... SECURITY INFORMATION REGULATIONS Derivative Classification § 2700.22 Classification guides. OMSN shall... direct derivative classification, shall identify the information to be protected in specific and uniform...

  9. IAEA Classification of Uranium Deposits

    International Nuclear Information System (INIS)

    Bruneton, Patrice

    2014-01-01

    Classifications of uranium deposits follow two general approaches, focusing on: • descriptive features such as the geotectonic position, the host rock type, the orebody morphology, …… : « geologic classification »; • or on genetic aspects: « genetic classification »

  10. The future of general classification

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2013-01-01

    Discusses problems related to accessing multiple collections using a single retrieval language. Surveys the concepts of interoperability and switching language. Finds that mapping between more indexing languages always will be an approximation. Surveys the issues related to general classification...... and contrasts that to special classifications. Argues for the use of general classifications to provide access to collections nationally and internationally....

  11. [Headache: classification and diagnosis].

    Science.gov (United States)

    Carbaat, P A T; Couturier, E G M

    2016-11-01

    There are many types of headache and, moreover, many people have different types of headache at the same time. Adequate treatment is possible only on the basis of the correct diagnosis. Technically and in terms of content the current diagnostics process for headache is based on the 'International Classification of Headache Disorders' (ICHD-3-beta) that was produced under the auspices of the International Headache Society. This classification is based on a distinction between primary and secondary headaches. The most common primary headache types are the tension type headache, migraine and the cluster headache. Application of uniform diagnostic concepts is essential to come to the most appropriate treatment of the various types of headache.

  12. Classification of hand eczema

    DEFF Research Database (Denmark)

    Agner, T; Aalto-Korte, K; Andersen, K E

    2015-01-01

    BACKGROUND: Classification of hand eczema (HE) is mandatory in epidemiological and clinical studies, and also important in clinical work. OBJECTIVES: The aim was to test a recently proposed classification system of HE in clinical practice in a prospective multicentre study. METHODS: Patients were...... recruited from nine different tertiary referral centres. All patients underwent examination by specialists in dermatology and were checked using relevant allergy testing. Patients were classified into one of the six diagnostic subgroups of HE: allergic contact dermatitis, irritant contact dermatitis, atopic...... system investigated in the present study was useful, being able to give an appropriate main diagnosis for 89% of HE patients, and for another 7% when using two main diagnoses. The fact that more than half of the patients had one or more additional diagnoses illustrates that HE is a multifactorial disease....

  13. Sound classification of dwellings

    DEFF Research Database (Denmark)

    Rasmussen, Birgit

    2012-01-01

    National schemes for sound classification of dwellings exist in more than ten countries in Europe, typically published as national standards. The schemes define quality classes reflecting different levels of acoustical comfort. Main criteria concern airborne and impact sound insulation between...... dwellings, facade sound insulation and installation noise. The schemes have been developed, implemented and revised gradually since the early 1990s. However, due to lack of coordination between countries, there are significant discrepancies, and new standards and revisions continue to increase the diversity...... is needed, and a European COST Action TU0901 "Integrating and Harmonizing Sound Insulation Aspects in Sustainable Urban Housing Constructions", has been established and runs 2009-2013, one of the main objectives being to prepare a proposal for a European sound classification scheme with a number of quality...

  14. Sequence Classification: 890824 [

    Lifescience Database Archive (English)

    Full Text Available hyphal/invasive growth pathways; cooperates with Tec1p transcription factor to regulate genes specific for invasive growth; Ste12p || http://www.ncbi.nlm.nih.gov/protein/6321876 ... ...ion factor that is activated by a MAP kinase signaling cascade, activates genes involved in mating or pseudo

  15. Identification and classification of genes regulated by phosphatidylinositol 3-kinase- and TRKB-mediated signalling pathways during neuronal differentiation in two subtypes of the human neuroblastoma cell line SH-SY5Y

    Directory of Open Access Journals (Sweden)

    Sakaki Yoshiyuki

    2008-10-01

    Background: SH-SY5Y cells exhibit a neuronal phenotype when treated with all-trans retinoic acid (RA), but the molecular mechanism of activation in the signalling pathway mediated by phosphatidylinositol 3-kinase (PI3K) is unclear. To investigate this mechanism, we compared the gene expression profiles in SK-N-SH cells and two subtypes of SH-SY5Y cells (SH-SY5Y-A and SH-SY5Y-E), each of which show a different phenotype during RA-mediated differentiation. Findings: SH-SY5Y-A cells differentiated in the presence of RA, whereas RA-treated SH-SY5Y-E cells required additional treatment with brain-derived neurotrophic factor (BDNF) for full differentiation. After exposing cells to a PI3K inhibitor, LY294002, we identified 386 genes and categorised these genes into two clusters dependent on the PI3K signalling pathway during RA-mediated differentiation in SH-SY5Y-A cells. Transcriptional regulation of the gene cluster, including 158 neural genes, was greatly reduced in SK-N-SH cells and partially impaired in SH-SY5Y-E cells, which is consistent with a defect in the neuronal phenotype of these cells. Additional stimulation with BDNF induced a set of neural genes that were down-regulated in RA-treated SH-SY5Y-E cells but were abundant in differentiated SH-SY5Y-A cells. Conclusion: We identified gene clusters controlled by PI3K- and TRKB-mediated signalling pathways during the differentiation of two subtypes of SH-SY5Y cells. The TRKB-mediated bypass pathway compensates for impaired neural function generated by defects in several signalling pathways, including PI3K in SH-SY5Y-E cells. Our expression profiling data will be useful for further elucidation of the signal transduction-transcriptional network involving PI3K or TRKB.

  16. Identification and classification of genes regulated by phosphatidylinositol 3-kinase- and TRKB-mediated signalling pathways during neuronal differentiation in two subtypes of the human neuroblastoma cell line SH-SY5Y.

    Science.gov (United States)

    Nishida, Yuichiro; Adati, Naoki; Ozawa, Ritsuko; Maeda, Aasami; Sakaki, Yoshiyuki; Takeda, Tadayuki

    2008-10-28

    SH-SY5Y cells exhibit a neuronal phenotype when treated with all-trans retinoic acid (RA), but the molecular mechanism of activation in the signalling pathway mediated by phosphatidylinositol 3-kinase (PI3K) is unclear. To investigate this mechanism, we compared the gene expression profiles in SK-N-SH cells and two subtypes of SH-SY5Y cells (SH-SY5Y-A and SH-SY5Y-E), each of which show a different phenotype during RA-mediated differentiation. SH-SY5Y-A cells differentiated in the presence of RA, whereas RA-treated SH-SY5Y-E cells required additional treatment with brain-derived neurotrophic factor (BDNF) for full differentiation. After exposing cells to a PI3K inhibitor, LY294002, we identified 386 genes and categorised these genes into two clusters dependent on the PI3K signalling pathway during RA-mediated differentiation in SH-SY5Y-A cells. Transcriptional regulation of the gene cluster, including 158 neural genes, was greatly reduced in SK-N-SH cells and partially impaired in SH-SY5Y-E cells, which is consistent with a defect in the neuronal phenotype of these cells. Additional stimulation with BDNF induced a set of neural genes that were down-regulated in RA-treated SH-SY5Y-E cells but were abundant in differentiated SH-SY5Y-A cells. We identified gene clusters controlled by PI3K- and TRKB-mediated signalling pathways during the differentiation of two subtypes of SH-SY5Y cells. The TRKB-mediated bypass pathway compensates for impaired neural function generated by defects in several signalling pathways, including PI3K in SH-SY5Y-E cells. Our expression profiling data will be useful for further elucidation of the signal transduction-transcriptional network involving PI3K or TRKB.

  17. Granular loess classification based

    International Nuclear Information System (INIS)

    Browzin, B.S.

    1985-01-01

    This paper discusses how loess might be identified by two index properties: the granulometric composition and the dry unit weight. These two indices are necessary but not always sufficient for identification of loess. On the basis of analyses of samples from three continents, it was concluded that the 0.01-0.5-mm fraction deserves the name loessial fraction. Based on the loessial fraction concept, a granulometric classification of loess is proposed. A triangular chart is used to classify loess

  18. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
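
    As a rough illustration of how a tree-structured rule can be built, the Python sketch below finds the best binary split of one numeric variable by weighted Gini impurity. The abstract does not name a splitting criterion, so the use of Gini impurity is an assumption made here, as are the function names and the toy data; a full tree would apply the same step recursively to each resulting subset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best rule 'value <= threshold'.

    Returns (None, impurity of the unsplit node) if no split improves on it.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical toy data: one numeric variable and two classes.
    print(best_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
```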

  19. CLASSIFICATION OF CRIMINAL GROUPS

    OpenAIRE

    Natalia Romanova

    2013-01-01

    New types of criminal groups are emerging in modern society. These types have their special criminal subculture. The research objective is to develop new parameters of classification of modern criminal groups, create a new typology of criminal groups and identify some features of their subculture. Research methodology is based on the system approach that includes using the method of analysis of documentary sources (materials of a criminal case), method of conversations with the members of the...

  20. Decimal Classification Editions

    Directory of Open Access Journals (Sweden)

    Zenovia Niculescu

    2009-01-01

    The study approaches the evolution of Dewey Decimal Classification editions from the perspective of updating the terminology, reallocating and expanding the main and auxiliary structure of the Dewey indexing language. The comparative analysis of DDC editions emphasizes the efficiency of the Dewey scheme from the point of view of improving the informational offer, through basic index terms, revised and developed, as well as valuing the auxiliary notations.

  1. Decimal Classification Editions

    OpenAIRE

    Zenovia Niculescu

    2009-01-01

    The study approaches the evolution of Dewey Decimal Classification editions from the perspective of updating the terminology, reallocating and expanding the main and auxiliary structure of the Dewey indexing language. The comparative analysis of DDC editions emphasizes the efficiency of the Dewey scheme from the point of view of improving the informational offer, through basic index terms, revised and developed, as well as valuing the auxiliary notations.

  2. Genetic classification and distinguishing of Staphylococcus species based on different partial gap, 16S rRNA, hsp60, rpoB, sodA, and tuf gene sequences.

    Science.gov (United States)

    Ghebremedhin, B; Layer, F; König, W; König, B

    2008-03-01

    The analysis of 16S rRNA gene sequences has been the technique generally used to study the evolution and taxonomy of staphylococci. However, the results of this method do not correspond to the results of polyphasic taxonomy, and the related species cannot always be distinguished from each other. Thus, new phylogenetic markers for Staphylococcus spp. are needed. We partially sequenced the gap gene (approximately 931 bp), which encodes the glyceraldehyde-3-phosphate dehydrogenase, for 27 Staphylococcus species. The partial sequences had 24.3 to 96% interspecies homology and were useful in the identification of staphylococcal species (F. Layer, B. Ghebremedhin, W. König, and B. König, J. Microbiol. Methods 70:542-549, 2007). The DNA sequence similarities of the partial staphylococcal gap sequences were found to be lower than those of 16S rRNA (approximately 97%), rpoB (approximately 86%), hsp60 (approximately 82%), and sodA (approximately 78%). Phylogenetically derived trees revealed four statistically supported groups: S. hyicus/S. intermedius, S. sciuri, S. haemolyticus/S. simulans, and S. aureus/epidermidis. The branching of S. auricularis, S. cohnii subsp. cohnii, and the heterogeneous S. saprophyticus group, comprising S. saprophyticus subsp. saprophyticus and S. equorum subsp. equorum, was not reliable. Thus, the phylogenetic analysis based on the gap gene sequences revealed similarities between the dendrograms based on other gene sequences (e.g., the S. hyicus/S. intermedius and S. sciuri groups) as well as differences, e.g., the grouping of S. arlettae and S. kloosii in the gap-based tree. From our results, we propose the partial sequencing of the gap gene as an alternative molecular tool for the taxonomical analysis of Staphylococcus species and for decreasing the possibility of misidentification.

  3. Genetic Classification and Distinguishing of Staphylococcus Species Based on Different Partial gap, 16S rRNA, hsp60, rpoB, sodA, and tuf Gene Sequences

    Science.gov (United States)

    Ghebremedhin, B.; Layer, F.; König, W.; König, B.

    2008-01-01

    The analysis of 16S rRNA gene sequences has been the technique generally used to study the evolution and taxonomy of staphylococci. However, the results of this method do not correspond to the results of polyphasic taxonomy, and the related species cannot always be distinguished from each other. Thus, new phylogenetic markers for Staphylococcus spp. are needed. We partially sequenced the gap gene (∼931 bp), which encodes the glyceraldehyde-3-phosphate dehydrogenase, for 27 Staphylococcus species. The partial sequences had 24.3 to 96% interspecies homology and were useful in the identification of staphylococcal species (F. Layer, B. Ghebremedhin, W. König, and B. König, J. Microbiol. Methods 70:542-549, 2007). The DNA sequence similarities of the partial staphylococcal gap sequences were found to be lower than those of 16S rRNA (∼97%), rpoB (∼86%), hsp60 (∼82%), and sodA (∼78%). Phylogenetically derived trees revealed four statistically supported groups: S. hyicus/S. intermedius, S. sciuri, S. haemolyticus/S. simulans, and S. aureus/epidermidis. The branching of S. auricularis, S. cohnii subsp. cohnii, and the heterogeneous S. saprophyticus group, comprising S. saprophyticus subsp. saprophyticus and S. equorum subsp. equorum, was not reliable. Thus, the phylogenetic analysis based on the gap gene sequences revealed similarities between the dendrograms based on other gene sequences (e.g., the S. hyicus/S. intermedius and S. sciuri groups) as well as differences, e.g., the grouping of S. arlettae and S. kloosii in the gap-based tree. From our results, we propose the partial sequencing of the gap gene as an alternative molecular tool for the taxonomical analysis of Staphylococcus species and for decreasing the possibility of misidentification. PMID:18174295

  4. Classifications of track structures

    International Nuclear Information System (INIS)

    Paretzke, H.G.

    1984-01-01

    When ionizing particles interact with matter they produce random topological structures of primary activations which represent the initial boundary conditions for all subsequent physical, chemical and/or biological reactions. There are two important aspects of research on such track structures, namely their experimental or theoretical determination on the one hand and, on the other, the quantitative classification of these complex structures, which is a basic prerequisite for understanding the mechanisms of radiation action. This paper deals only with the latter topic, i.e. the problems encountered in and possible approaches to quantitative ordering and grouping of these multidimensional objects by their degrees of similarity with respect to their efficiency in producing certain final radiation effects, i.e. to their "radiation quality." Various attempts at taxonometric classification with respect to radiation efficiency have been made in basic and applied radiation research, including macro- and microdosimetric concepts as well as track entities and stopping-power-based theories. This paper does not review those well-known approaches but rather outlines and discusses alternative methods new to this field of radiation research which have some very promising features and which could possibly solve at least some major classification problems.

  5. Neuromuscular disease classification system

    Science.gov (United States)

    Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen

    2013-06-01

    Diagnosis of neuromuscular diseases is based on subjective visual assessment of biopsies from patients by the pathologist specialist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies through muscle biopsy images of fluorescence microscopy is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. A feature extraction step is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases and 58 structural features that the human eye cannot see, based on the assumption that the biopsy is considered as a graph, where the nodes are represented by each fiber, and two nodes are connected if two fibers are adjacent. A feature selection using sequential forward selection and sequential backward selection methods, a classification using a Fuzzy ARTMAP neural network, and a study of grading the severity are performed on these two sets of features. A database consisting of 91 images was used: 71 images for the training step and 20 as the test. A classification error of 0% was obtained. It is concluded that the addition of features undetectable by the human visual inspection improves the categorization of atrophic patterns.
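
    The graph view of a biopsy described above (fibers as nodes, adjacent fibers joined by an edge) can be illustrated with a minimal sketch. It assumes the segmentation is already available as a small grid of integer fiber labels with no background pixels; the function name `adjacency_graph`, the toy grid and the use of node degree as a structural feature are illustrative assumptions, not the authors' 58 features.

```python
def adjacency_graph(label_grid):
    """Map each fiber label to the set of labels it touches (4-neighbourhood)."""
    graph = {}
    rows, cols = len(label_grid), len(label_grid[0])
    for r in range(rows):
        for c in range(cols):
            a = label_grid[r][c]
            graph.setdefault(a, set())
            # Looking only right and down visits every neighbouring pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = label_grid[rr][cc]
                    if b != a:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical 3x3 segmentation with four fibers labelled 1..4.
grid = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 3, 3],
]
graph = adjacency_graph(grid)

# One simple structural feature: the number of neighbours of each fiber.
degree = {fiber: len(neighbours) for fiber, neighbours in graph.items()}
```

    In a real pipeline the label image would come from the watershed segmentation, and boundary or background pixels would need separate handling.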

  6. An automated cirrus classification

    Science.gov (United States)

    Gryspeerdt, Edward; Quaas, Johannes; Goren, Tom; Klocke, Daniel; Brueck, Matthias

    2018-05-01

    Cirrus clouds play an important role in determining the radiation budget of the earth, but many of their properties remain uncertain, particularly their response to aerosol variations and to warming. Part of the reason for this uncertainty is the dependence of cirrus cloud properties on the cloud formation mechanism, which itself is strongly dependent on the local meteorological conditions. In this work, a classification system (Identification and Classification of Cirrus or IC-CIR) is introduced to identify cirrus clouds by the cloud formation mechanism. Using reanalysis and satellite data, cirrus clouds are separated into four main types: orographic, frontal, convective and synoptic. Through a comparison to convection-permitting model simulations and back-trajectory-based analysis, it is shown that these observation-based regimes can provide extra information on the cloud-scale updraughts and the frequency of occurrence of liquid-origin ice, with the convective regime having higher updraughts and a greater occurrence of liquid-origin ice compared to the synoptic regimes. Despite having different cloud formation mechanisms, the radiative properties of the regimes are not distinct, indicating that retrieved cloud properties alone are insufficient to completely describe them. This classification is designed to be easily implemented in GCMs, helping improve future model-observation comparisons and leading to improved parametrisations of cirrus cloud processes.

  7. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-09-07

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real-world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.

  8. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan; Wang, Yi; Zhao, Shiguang; Gao, Xin

    2014-01-01

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real-world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.
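
    The quantity at the centre of this abstract, the mutual information between a classifier's response and the true label, can be illustrated with a minimal sketch. It assumes discrete responses and a simple plug-in (frequency-count) estimate; the paper models mutual information with entropy estimation inside a gradient-based objective, which this sketch does not reproduce. The function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(responses, labels):
    """Plug-in estimate of I(R; Y) in nats from paired discrete samples."""
    n = len(responses)
    joint = Counter(zip(responses, labels))
    count_r = Counter(responses)
    count_y = Counter(labels)
    mi = 0.0
    for (r, y), count in joint.items():
        p_joint = count / n
        # p(r, y) / (p(r) * p(y)) rewritten in terms of raw counts.
        mi += p_joint * math.log(p_joint * n * n / (count_r[r] * count_y[y]))
    return mi


# A response that determines the label carries maximal information ...
informative = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
# ... while a response independent of the label carries none.
uninformative = mutual_information([0, 1, 0, 1], [0, 0, 1, 1])
```

    A regularizer in the spirit of the abstract would reward classifiers whose responses score high on such a measure, alongside the usual error and complexity terms.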

  9. Nonlinear programming for classification problems in machine learning

    Science.gov (United States)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In recent years this field has become increasingly relevant owing to many practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling, etc. Classification deals with the separation of sets by means of appropriate separation surfaces, which are generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), the use of nonlinear separating surfaces has received some attention in recent years. The objective of this work is to recall some of these proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  10. Sequence Classification: 890604 [

    Lifescience Database Archive (English)

    scriptional coactivator SKIP, can activate transcription of a reporter gene; interacts with splicing factors Prp22p and Prp46p; Prp45p || http://www.ncbi.nlm.nih.gov/protein/6319287 ...

  11. Sequence Classification: 889958 [

    Lifescience Database Archive (English)

    e-responsive elements (ICREs), required for derepression of phospholipid biosynthetic genes in response to inositol depletion; Ino2p || http://www.ncbi.nlm.nih.gov/protein/6320328 ...

  12. Sequence Classification: 889932 [

    Lifescience Database Archive (English)

    ctor, involved in the expression of genes during nutrient limitation; also involved in the negative regulation of DPP1 and PHR1; Gis1p || http://www.ncbi.nlm.nih.gov/protein/6320301 ...

  13. Genomewide identification, classification and analysis of NAC type ...

    Indian Academy of Sciences (India)

    Supplementary data: Genomewide identification, classification and analysis of NAC type gene family in maize. Xiaojian Peng, Yang Zhao, Xiaoming Li, Min Wu, Wenbo Chai, Lei Sheng, Yu Wang, Qing Dong,. Haiyang Jiang and Beijiu Cheng. J. Genet. 94, 377–390. Table 1. Detailed information of NAC proteins in maize.

  14. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Vol. 13, No. 3 (2013), pp. 199-208. ISSN 1567-7249. Institutional research plan: CEZ:AV0Z50520701. Keywords: Mitocans; Anti-cancer therapeutics; Classification. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 3.524, year: 2013

  15. Inherited epidermolysis bullosa : Updated recommendations on diagnosis and classification

    NARCIS (Netherlands)

    Fine, Jo-David; Bruckner-Tuderman, Leena; Eady, Robin A. J.; Bauer, Eugene A.; Bauer, Johann W.; Has, Cristina; Heagerty, Adrian; Hintner, Helmut; Hovnanian, Alain; Jonkman, Marcel F.; Leigh, Irene; Marinkovich, M. Peter; Martinez, Anna E.; McGrath, John A.; Mellerio, Jemima E.; Moss, Celia; Murrell, Dedee F.; Shimizu, Hiroshi; Uitto, Jouni; Woodley, David; Zambruno, Giovanna

    Background: Several new targeted genes and clinical subtypes have been identified since publication in 2008 of the report of the last international consensus meeting on diagnosis and classification of epidermolysis bullosa (EB). As a correlate, new clinical manifestations have been seen in several

  16. Multiclass gene selection using Pareto-fronts.

    Science.gov (United States)

    Rajapakse, Jagath C; Mundra, Piyushkumar A

    2013-01-01

    Filter methods are often used for selection of genes in multiclass sample classification by using microarray data. Such techniques usually tend to bias toward a few classes that are easily distinguishable from other classes due to imbalances of strong features and sample sizes of different classes. It could therefore lead to selection of redundant genes while missing the relevant genes, leading to poor classification of tissue samples. In this manuscript, we propose to decompose multiclass ranking statistics into class-specific statistics and then use Pareto-front analysis for selection of genes. This alleviates the bias induced by class intrinsic characteristics of dominating classes. The use of Pareto-front analysis is demonstrated on two filter criteria commonly used for gene selection: F-score and KW-score. A significant improvement in classification performance and reduction in redundancy among top-ranked genes were achieved in experiments with both synthetic and real-benchmark data sets.
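
    The Pareto-front idea described above can be illustrated with a minimal sketch: each gene has one score per class, and a gene is kept if no other gene is at least as good for every class and strictly better for one. The gene names, the scores and the helper names `dominates` and `pareto_front` are hypothetical; the paper's class-specific F-score and KW-score statistics are not reproduced here.

```python
def dominates(a, b):
    """True if score vector a is >= b for every class and > b for at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the names of genes whose class-specific scores are not dominated."""
    return sorted(
        gene
        for gene, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != gene)
    )


# Hypothetical class-specific relevance scores for a three-class problem.
scores = {
    "geneA": (0.9, 0.1, 0.2),  # strong for class 1 only
    "geneB": (0.2, 0.8, 0.3),  # strong for class 2 only
    "geneC": (0.1, 0.1, 0.1),  # weak everywhere, dominated by geneA
    "geneD": (0.5, 0.5, 0.9),  # strong for class 3
}
front = pareto_front(scores)
```

    Ranking by a single aggregated score could favour genes that separate only the easiest class, whereas the front keeps genes that are best for at least one trade-off between classes.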

  17. Classification of IRAS asteroids

    International Nuclear Information System (INIS)

    Tedesco, E.F.; Matson, D.L.; Veeder, G.J.

    1989-01-01

    Albedos and spectral reflectances are essential for classifying asteroids. For example, classes E, M and P are indistinguishable without albedo data. Colorimetric data are available for about 1000 asteroids but, prior to IRAS, albedo data were available for only about 200. IRAS broke this bottleneck by providing albedo data on nearly 2000 asteroids. Hence, excepting absolute magnitudes, the albedo and size are now the most common asteroid physical parameters known. In this chapter the authors present the results of analyses of IRAS-derived asteroid albedos, discuss their application to asteroid classification, and mention several studies which might be done to further exploit this data set.

  18. SPORT FOOD ADDITIVE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    I. P. Prokopenko

    2015-01-01

    Correctly organized nutritional and pharmacological support is an important component of an athlete's preparation for competitions, maintenance of optimal shape, and fast recovery and rehabilitation after injuries and fatigue. Special products of enhanced biological value (BAS) for athletes' nutrition are used for this purpose. Easy-to-use energy sources, yielded materials and biologically active substances are administered into the athlete's organism; these regulate and activate metabolic reactions that proceed with difficulty during certain kinds of physical training. The article presents a classification of sport supplements which can be used before warm-up and training, after training and during breaks in competition.

  19. Radioactive facilities classification criteria

    International Nuclear Information System (INIS)

    Briso C, H.A.; Riesle W, J.

    1992-01-01

    Appropriate classification of radioactive facilities into groups of comparable risk constitutes one of the problems faced by most Regulatory Bodies. Regarding the radiological risk, the main facts to be considered are the radioactive inventory and the processes to which these radionuclides are subjected. Normally, operations are ruled by strict safety procedures. Thus, the total activity of the radionuclides existing in a given facility is the varying feature that defines its risk. In order to rely on a quantitative criterion and, considering that the Annual Limits of Intake are widely accepted references, an index based on these limits, to support decisions related to radioactive facilities, is proposed. (author)

  20. Transcriptome classification reveals molecular subtypes in psoriasis

    Directory of Open Access Journals (Sweden)

    Ainali Chrysanthi

    2012-09-01

    Background: Psoriasis is an immune-mediated disease characterised by chronically elevated pro-inflammatory cytokine levels, leading to aberrant keratinocyte proliferation and differentiation. Although certain clinical phenotypes, such as plaque psoriasis, are well defined, it is currently unclear whether there are molecular subtypes that might impact on prognosis or treatment outcomes. Results: We present a pipeline for patient stratification through a comprehensive analysis of gene expression in paired lesional and non-lesional psoriatic tissue samples, compared with controls, to establish differences in RNA expression patterns across all tissue types. Ensembles of decision tree predictors were employed to cluster psoriatic samples on the basis of gene expression patterns and reveal gene expression signatures that best discriminate molecular disease subtypes. This multi-stage procedure was applied to several published psoriasis studies and a comparison of gene expression patterns across datasets was performed. Conclusion: Overall, classification of psoriasis gene expression patterns revealed distinct molecular sub-groups within the clinical phenotype of plaque psoriasis. Enrichment for TGFb and ErbB signaling pathways, noted in one of the two psoriasis subgroups, suggested that this group may be more amenable to therapies targeting these pathways. Our study highlights the potential biological relevance of using ensemble decision tree predictors to determine molecular disease subtypes, in what may initially appear to be a homogenous clinical group. The R code used in this paper is available upon request.

  1. Identification and classification of genes regulated by phosphatidylinositol 3-kinase- and TRKB-mediated signalling pathways during neuronal differentiation in two subtypes of the human neuroblastoma cell line SH-SY5Y

    OpenAIRE

    Nishida, Yuichiro; Adati, Naoki; Ozawa, Ritsuko; Maeda, Aasami; Sakaki, Yoshiyuki; Takeda, Tadayuki

    2008-01-01

    Background: SH-SY5Y cells exhibit a neuronal phenotype when treated with all-trans retinoic acid (RA), but the molecular mechanism of activation in the signalling pathway mediated by phosphatidylinositol 3-kinase (PI3K) is unclear. To investigate this mechanism, we compared the gene expression profiles in SK-N-SH cells and two subtypes of SH-SY5Y cells (SH-SY5Y-A and SH-SY5Y-E), each of which show a different phenotype during RA-mediated differentiation. Findings: SH-SY5Y-A cells diffe...

  2. Significant Deregulated Pathways in Diabetes Type II Complications Identified through Expression Based Network Biology

    Science.gov (United States)

    Ukil, Sanchaita; Sinha, Meenakshee; Varshney, Lavneesh; Agrawal, Shipra

    Type 2 Diabetes is a complex multifactorial disease, which alters several signaling cascades giving rise to serious complications. It is one of the major risk factors for cardiovascular diseases. The present research work describes an integrated functional network biology approach to identify pathways that get transcriptionally altered and lead to complex complications thereby amplifying the phenotypic effect of the impaired disease state. We have identified two sub-network modules, which could be activated under abnormal circumstances in diabetes. Present work describes key proteins such as P85A and SRC serving as important nodes to mediate alternate signaling routes during diseased condition. P85A has been shown to be an important link between stress responsive MAPK and CVD markers involved in fibrosis. MAPK8 has been shown to interact with P85A and further activate CTGF through VEGF signaling. We have traced a novel and unique route correlating inflammation and fibrosis by considering P85A as a key mediator of signals. The next sub-network module shows SRC as a junction for various signaling processes, which results in interaction between NF-kB and beta catenin to cause cell death. The powerful interaction between these important genes in response to transcriptionally altered lipid metabolism and impaired inflammatory response via SRC causes apoptosis of cells. The crosstalk between inflammation, lipid homeostasis and stress, and their serious effects downstream have been explained in the present analyses.

  3. Supply chain planning classification

    Science.gov (United States)

    Hvolby, Hans-Henrik; Trienekens, Jacques; Bonde, Hans

    2001-10-01

    Industry is experiencing a need to shift focus from internal production planning towards planning in the supply network. In this respect, customer-oriented thinking is becoming almost a common good among companies in the supply network. An increase in the use of information technology is needed to enable companies to better tune their production planning with customers and suppliers. Information technology opportunities and supply chain planning systems help companies to monitor and control their supplier network. In spite of these developments, most links in today's supply chains make individual plans, because the real demand information is not available throughout the chain. The current systems and processes of the supply chains are not designed to meet the requirements now placed upon them. For long-term relationships with suppliers and customers, an integrated decision-making process is needed in order to obtain a satisfactory result for all parties, especially when customized production and short lead times are in focus. An effective value chain makes inventory available and visible among the value chain members, minimizes response time and optimizes total inventory value held throughout the chain. In this paper a supply chain planning classification grid is presented, based on current manufacturing classifications and supply chain planning initiatives.

  4. Waste classification sampling plan

    International Nuclear Information System (INIS)

    Landsman, S.D.

    1998-01-01

    The purpose of this sampling plan is to explain the method used to collect and analyze data necessary to verify and/or determine the radionuclide content of the B-Cell decontamination and decommissioning waste stream so that the correct waste classification for the waste stream can be made, and to collect samples for studies of decontamination methods that could be used to remove fixed contamination present on the waste. The scope of this plan is to establish the technical basis for collecting samples and compiling quantitative data on the radioactive constituents present in waste generated during deactivation activities in B-Cell. Sampling and radioisotopic analysis will be performed on the fixed layers of contamination present on structural material and internal surfaces of process piping and tanks. In addition, dose rate measurements on existing waste material will be performed to determine the fraction of dose rate attributable to both removable and fixed contamination. Samples will also be collected to support studies of decontamination methods that are effective in removing the fixed contamination present on the waste. Sampling performed under this plan will meet criteria established in BNF-2596, Data Quality Objectives for the B-Cell Waste Stream Classification Sampling, J. M. Barnett, May 1998

  5. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  6. Classification of smooth Fano polytopes

    DEFF Research Database (Denmark)

    Øbro, Mikkel

    A simplicial lattice polytope containing the origin in the interior is called a smooth Fano polytope, if the vertices of every facet form a basis of the lattice. The study of smooth Fano polytopes is motivated by their connection to toric varieties. The thesis concerns the classification of smooth Fano polytopes up to isomorphism. A smooth Fano -polytope can have at most vertices. In case of vertices an explicit classification is known. The thesis contains the classification in case of vertices. Classifications of smooth Fano -polytopes for fixed exist only for . In the thesis an algorithm for the classification of smooth Fano -polytopes for any given is presented. The algorithm has been implemented and used to obtain the complete classification for .

  7. Small-scale classification schemes

    DEFF Research Database (Denmark)

    Hertzum, Morten

    2004-01-01

    Small-scale classification schemes are used extensively in the coordination of cooperative work. This study investigates the creation and use of a classification scheme for handling the system requirements during the redevelopment of a nation-wide information system. This requirements classification inherited a lot of its structure from the existing system and rendered requirements that transcended the framework laid out by the existing system almost invisible. As a result, the requirements classification became a defining element of the requirements-engineering process, though its main effects remained largely implicit. The requirements classification contributed to constraining the requirements-engineering process by supporting the software engineers in maintaining some level of control over the process. This way, the requirements classification provided the software engineers...

  8. Albinism: classification, clinical characteristics, and recent findings.

    Science.gov (United States)

    Summers, C Gail

    2009-06-01

    To describe the clinical characteristics and recent findings in the heterogeneous group of inherited disorders of melanin biosynthesis grouped as "albinism." The current classification of albinism, and the cutaneous, ocular, and central nervous system characteristics are presented. Recent clinical findings are summarized. Albinism is now classified based on genes known to be responsible for albinism. Foveal hypoplasia is invariably present and individuals with albinism often have delayed visual development, reduced vision, nystagmus, a positive angle kappa, strabismus, iris transillumination, and absent or reduced melanin pigment in the fundi. A visual-evoked potential can document the excessive retinostriate decussation seen in albinism. Grating acuity can be used to document delayed visual development in preverbal children. Glasses are often needed to improve visual acuity and binocular alignment. Albinism is caused by several different genes. Heterogeneity in clinical phenotype indicates that expressivity is variable.

  9. The molecular classification of hereditary endocrine diseases.

    Science.gov (United States)

    Ye, Lei; Ning, Guang

    2015-12-01

    Hereditary endocrine diseases are an important group of diseases with great heterogeneity. The current classification for hereditary endocrine disease is mostly based upon anatomy, which is helpful for pathophysiological interpretation, but does not address the pathogenic variability associated with different underlying genetic causes. Identification of an endocrinopathy-associated genetic alteration provides evidence for differential diagnosis, discovery of non-classical disease, and the potential for earlier diagnosis and targeted therapy. Molecular diagnosis should be routinely applied when managing patients with suspicion of hereditary disease. To enhance the accurate diagnosis and treatment of patients with hereditary endocrine diseases, we propose categorization of endocrine diseases into three groups based upon the function of the mutant gene: cell differentiation, hormone synthesis and action, and tumorigenesis. Each category was further grouped according to the specific gene function. We believe that this format would facilitate practice of precision medicine in the field of hereditary endocrine diseases.

  10. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  11. Unsupervised Classification Using Immune Algorithm

    OpenAIRE

    Al-Muallim, M. T.; El-Kouatly, R.

    2012-01-01

    An unsupervised classification algorithm based on the clonal selection principle, named Unsupervised Clonal Selection Classification (UCSC), is proposed in this paper. The proposed algorithm is data driven and self-adaptive: it adjusts its parameters to the data to make the classification operation as fast as possible. The performance of UCSC is evaluated by comparing it with the well-known K-means algorithm using several artificial and real-life data sets. The experiments show that the proposed U...

  12. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases; according to WHO data, as of 2015 there were 8.8 million deaths caused by cancer, and this number will increase every year if the disease is not addressed earlier. Microarray data have become one of the most popular resources for cancer-identification studies in the field of health, since they can be used to examine gene expression levels in particular cell samples and so analyze thousands of genes simultaneously. By using data mining techniques, we can classify a microarray sample and thus identify whether or not it is cancerous. In this paper we discuss research that applies data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, together with a simulation of the Random Forest algorithm with dimensionality reduction using Relief. The results show the performance (accuracy) of the classification algorithms (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest) and indicate that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms. It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique on microarray data.

  13. Reliability of Oronasal Fistula Classification.

    Science.gov (United States)

    Sitzman, Thomas J; Allori, Alexander C; Matic, Damir B; Beals, Stephen P; Fisher, David M; Samson, Thomas D; Marcus, Jeffrey R; Tse, Raymond W

    2018-01-01

    Objective: Oronasal fistula is an important complication of cleft palate repair that is frequently used to evaluate surgical quality, yet reliability of fistula classification has never been examined. The objective of this study was to determine the reliability of oronasal fistula classification both within individual surgeons and between multiple surgeons. Design: Using intraoral photographs of children with repaired cleft palate, surgeons rated the location of palatal fistulae using the Pittsburgh Fistula Classification System. Intrarater and interrater reliability scores were calculated for each region of the palate. Participants: Eight cleft surgeons rated photographs obtained from 29 children. Results: Within individual surgeons, reliability for each region of the Pittsburgh classification ranged from moderate to almost perfect (κ = .60-.96). By contrast, reliability between surgeons was lower, ranging from fair to substantial (κ = .23-.70). Between-surgeon reliability was lowest for the junction of the soft and hard palates (κ = .23). Within-surgeon and between-surgeon reliability were almost perfect for the more general classification of fistula in the secondary palate (κ = .95 and κ = .83, respectively). Conclusions: This is the first reliability study of fistula classification. We show that the Pittsburgh Fistula Classification System is reliable when used by an individual surgeon, but less reliable when used among multiple surgeons. Comparisons of fistula occurrence among surgeons may be subject to less bias if they use the more general classification of "presence or absence of fistula of the secondary palate" rather than the Pittsburgh Fistula Classification System.
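
    The κ values reported above are chance-corrected agreement scores. As a minimal sketch of how such a score is computed, the example below uses Cohen's kappa for two raters; the abstract does not say which kappa variant the authors used, so this choice, the function name `cohens_kappa` and the ratings are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items.

    Undefined (division by zero) when both raters use a single identical
    category for every item, because chance agreement is then 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Agreement expected by chance if the raters labelled independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight photographs: is a fistula present?
surgeon_1 = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
surgeon_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "no"]
kappa = cohens_kappa(surgeon_1, surgeon_2)
```

    Here the two raters agree on six of eight photographs, but because much of that agreement would be expected by chance, the kappa value is well below the raw agreement rate.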

  14. Classification of radioactive waste

    International Nuclear Information System (INIS)

    1994-01-01

    Radioactive wastes are generated in a number of different kinds of facilities and arise in a wide range of concentrations of radioactive materials and in a variety of physical and chemical forms. To simplify their management, a number of schemes have evolved for classifying radioactive waste according to the physical, chemical and radiological properties of significance to those facilities managing this waste. These schemes have led to a variety of terminologies, differing from country to country and even between facilities in the same country. This situation makes it difficult for those concerned to communicate with one another regarding waste management practices. This document revises and updates earlier IAEA references on radioactive waste classification systems given in IAEA Technical Reports Series and Safety Series. Guidance regarding exemption of materials from regulatory control is consistent with IAEA Safety Series and the RADWASS documents published under IAEA Safety Series. 11 refs, 2 figs, 2 tab

  15. Nonlinear estimation and classification

    CERN Document Server

    Hansen, Mark; Holmes, Christopher; Mallick, Bani; Yu, Bin

    2003-01-01

    Researchers in many disciplines face the formidable task of analyzing massive amounts of high-dimensional and highly-structured data. This is due in part to recent advances in data collection and computing technologies. As a result, fundamental statistical research is being undertaken in a variety of different fields. Driven by the complexity of these new problems, and fueled by the explosion of available computer power, highly adaptive, non-linear procedures are now essential components of modern "data analysis," a term that we liberally interpret to include speech and pattern recognition, classification, data compression and signal processing. The development of new, flexible methods combines advances from many sources, including approximation theory, numerical analysis, machine learning, signal processing and statistics. The proposed workshop intends to bring together eminent experts from these fields in order to exchange ideas and forge directions for the future.

  16. Automatic diabetic retinopathy classification

    Science.gov (United States)

    Bravo, María. A.; Arbeláez, Pablo A.

    2017-11-01

    Diabetic retinopathy (DR) is a disease in which the retina is damaged due to an increase in the blood pressure of small vessels. DR is the major cause of blindness for diabetics. It has been shown that early diagnosis can play a major role in the prevention of visual loss and blindness. This work proposes a computer-based approach for the detection of DR in back-of-the-eye images based on the use of convolutional neural networks (CNNs). Our CNN uses deep architectures to classify Back-of-the-eye Retinal Photographs (BRP) into 5 stages of DR. Our method combines several preprocessing images of BRP to obtain an ACA score of 50.5%. Furthermore, we explore subproblems by training a larger CNN of our main classification task.

  17. Sequence Classification: 860875 [

    Lifescience Database Archive (English)

    Full Text Available >ENSP00000301613 pep:known chromosome:NCBI35:17:71781727:71798928:-1 gene:ENSG00000129646 transcript:ENST00000301613 ||ref|||| http://www.ensembl.org/Homo_sapiens/protview?peptide=ENSP00000301613 ... Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB

  18. Sequence Classification: 524859 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|62181272|ref|YP_217689.1| H inversion...: regulation of flagellar gene expression by site-specific inversion of DNA || http://www.ncbi.nlm.nih.gov/protein/62181272 ...

  19. Sequence Classification: 891024 [

    Lifescience Database Archive (English)

    Full Text Available >gi|6321241|ref|NP_011318.1| Protein with an N-terminal kelch-like domain, putative negative regulator of early meiotic gene expression; required, with... Non-TMB Non-TMH Non-TMB Non-TMB TMB TMB

  20. Sequence Classification: 891825 [

    Lifescience Database Archive (English)

    Full Text Available >gi|6320979|ref|NP_011058.1| Protein with an N-terminal kelch-like domain, putative negative regulator of early meiotic gene expression; required, with... Non-TMB Non-TMH Non-TMB Non-TMB TMB TMB

  1. Sequence Classification: 894156 [

    Lifescience Database Archive (English)

    Full Text Available >gi|6324548|ref|NP_014617.1| Protein involved in determination of longevity; LAG2 gene is preferentially expressed in young cells; overexpressi... Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB

  2. Sequence Classification: 778660 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|17507815|ref|NP_491621.1| the gene is express...ed protein retains C2H2 zinc-finger at its N-terminal region. like (1G100) || http://www.ncbi.nlm.nih.gov/protein/17507815 ...

  3. Sequence Classification: 894820 [

    Lifescience Database Archive (English)

    Full Text Available ...n of unknown function that is involved in DNA repair; mutant is sensitive to the base analog, 6-N-hydroxylaminopurine, while gene disruption does not increase the rate of spontaneous mutagenesis; Ham1p || http://www.ncbi.nlm.nih.gov/protein/6322529 ...

  4. Sequence Classification: 889823 [

    Lifescience Database Archive (English)

    Full Text Available ...ion factor that stimulates expression of proteasome genes; Rpn4p levels are in turn regulated by the 26S proteasome in a negative feedback control mechanism; RPN4 is transcriptionally regulated by various stress responses; Rpn4p || http://www.ncbi.nlm.nih.gov/protein/6320184 ...

  5. Intrinsic subtypes from PAM50 gene expression assay in a population-based breast cancer cohort: differences by age, race, and tumor characteristics.

    Science.gov (United States)

    Sweeney, Carol; Bernard, Philip S; Factor, Rachel E; Kwan, Marilyn L; Habel, Laurel A; Quesenberry, Charles P; Shakespear, Kaylynn; Weltzien, Erin K; Stijleman, Inge J; Davis, Carole A; Ebbert, Mark T W; Castillo, Adrienne; Kushi, Lawrence H; Caan, Bette J

    2014-05-01

    Data are lacking to describe gene expression-based breast cancer intrinsic subtype patterns for population-based patient groups. We studied a diverse cohort of women with breast cancer from the Life After Cancer Epidemiology and Pathways studies. RNA was extracted from 1 mm punches from fixed tumor tissue. Quantitative reverse-transcriptase PCR was conducted for the 50 genes that comprise the PAM50 intrinsic subtype classifier. In a subcohort of 1,319 women, the overall subtype distribution based on PAM50 was 53.1% luminal A, 20.5% luminal B, 13.0% HER2-enriched, 9.8% basal-like, and 3.6% normal-like. Among low-risk endocrine-positive tumors (i.e., estrogen and progesterone receptor positive by immunohistochemistry, HER2 negative, and low histologic grade), only 76.5% were categorized as luminal A by PAM50. Continuous-scale luminal A, luminal B, HER2-enriched, and normal-like scores from PAM50 were mutually positively correlated. Basal-like score was inversely correlated with other subtypes. The proportion with non-luminal A subtype decreased with older age at diagnosis, P Trend < 0.0001. Compared with non-Hispanic Whites, African American women were more likely to have basal-like tumors, age-adjusted OR = 4.4 [95% confidence intervals (CI), 2.3-8.4], whereas Asian and Pacific Islander women had reduced odds of basal-like subtype, OR = 0.5 (95% CI, 0.3-0.9). Our data indicate that over 50% of breast cancers treated in the community have luminal A subtype. Gene expression-based classification shifted some tumors categorized as low risk by surrogate clinicopathologic criteria to higher-risk subtypes. Subtyping in a population-based cohort revealed distinct profiles by age and race. ©2014 AACR.

  6. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, represented at the genetic level by DNA, to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  7. Hazard classification or risk assessment

    DEFF Research Database (Denmark)

    Hass, Ulla

    2013-01-01

    The EU classification of substances for e.g. reproductive toxicants is hazard based and does not address the risk such substances may pose through normal, or extreme, use. Such hazard classification complies with the consumer's right to know. It is also an incentive to careful use and storage...

  8. Seismic texture classification. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Vinther, R.

    1997-12-31

    The seismic texture classification method is a seismic attribute that can both recognize the general reflectivity styles and locate variations from these. The seismic texture classification performs a statistical analysis of the seismic section (or volume), aiming at describing the reflectivity. The seismic textures are classified based on a set of reference reflectivities. The result of the seismic texture classification is a display of seismic texture categories showing both the styles of reflectivity from the reference set and interpolations and extrapolations from these. The display is interpreted as statistical variations in the seismic data. The seismic texture classification is applied to seismic sections and volumes from the Danish North Sea representing both horizontal stratifications and salt diapirs. The attribute succeeded in recognizing both the general structure of successions and variations from these. Also, the seismic texture classification is not only able to display variations in prospective areas (1-7 sec. TWT) but can also be applied to deep seismic sections. The seismic texture classification is tested on a deep reflection seismic section (13-18 sec. TWT) from the Baltic Sea. Applied to this section, the seismic texture classification succeeded in locating the Moho, which could not be located using conventional interpretation tools. The seismic texture classification is a seismic attribute which can display general reflectivity styles and deviations from these and enhance variations not found by conventional interpretation tools. (LN)

  9. Efficient AUC optimization for classification

    NARCIS (Netherlands)

    Calders, T.; Jaroszewicz, S.; Kok, J.N.; Koronacki, J.; Lopez de Mantaras, R.; Matwin, S.; Mladenic, D.; Skowron, A.

    2007-01-01

    In this paper we show an efficient method for inducing classifiers that directly optimize the area under the ROC curve. Recently, AUC gained importance in the classification community as a means to compare the performance of classifiers. Because most classification methods do not optimize this
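    For readers unfamiliar with the measure, the area under the ROC curve equals the fraction of (positive, negative) pairs that a classifier ranks in the right order. The sketch below computes it by direct pair counting. It only illustrates the quantity being optimized and is not the efficient method of the paper; the function name and the scores are hypothetical.

```python
def auc_from_scores(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical classifier scores: three of the four pairs are ordered correctly.
auc = auc_from_scores([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    The pairwise loop costs time proportional to the number of positives times the number of negatives, which is one reason efficient alternatives are of interest.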

  10. Dewey Decimal Classification: A Quagmire.

    Science.gov (United States)

    Gamaluddin, Ahmad Fouad

    1980-01-01

    A survey of 660 Pennsylvania school librarians indicates that, though there is limited professional interest in the Library of Congress Classification system, Dewey Decimal Classification (DDC) appears to be firmly entrenched. This article also discusses the relative merits of DDC, the need for a uniform system, librarianship preparation, and…

  11. Latent class models for classification

    NARCIS (Netherlands)

    Vermunt, J.K.; Magidson, J.

    2003-01-01

    An overview is provided of recent developments in the use of latent class (LC) and other types of finite mixture models for classification purposes. Several extensions of existing models are presented. Two basic types of LC models for classification are defined: supervised and unsupervised

  12. 45 CFR 601.5 - Derivative classification.

    Science.gov (United States)

    2010-10-01

    ... CLASSIFICATION AND DECLASSIFICATION OF NATIONAL SECURITY INFORMATION § 601.5 Derivative classification. Distinct... 45 Public Welfare 3 2010-10-01 2010-10-01 false Derivative classification. 601.5 Section 601.5... classification guide, need not possess original classification authority. (a) If a person who applies derivative...

  13. 12 CFR 403.4 - Derivative classification.

    Science.gov (United States)

    2010-01-01

    ... SAFEGUARDING OF NATIONAL SECURITY INFORMATION § 403.4 Derivative classification. (a) Use of derivative classification. (1) Unlike original classification which is an initial determination, derivative classification... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Derivative classification. 403.4 Section 403.4...

  14. 32 CFR 2001.15 - Classification guides.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Classification guides. 2001.15 Section 2001.15..., NATIONAL ARCHIVES AND RECORDS ADMINISTRATION CLASSIFIED NATIONAL SECURITY INFORMATION Classification § 2001.15 Classification guides. (a) Preparation of classification guides. Originators of classification...

  15. Vietnamese Document Representation and Classification

    Science.gov (United States)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English and little research has been done on Vietnamese document classification, or indeed, on any kind of Vietnamese language processing, and only a few small corpora are available for research. We created a large Vietnamese text corpus with about 18000 documents, and manually classified them based on different criteria such as topics and styles, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that best performance can be achieved using syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and using Information gain and an external dictionary for feature selection.
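    A syllable-pair representation of the kind described can be sketched as counts of adjacent syllable pairs. The sketch assumes that syllables are separated by whitespace, as in written Vietnamese, and it ignores punctuation and the dictionary-based feature selection mentioned in the abstract. The function name and the sample text are hypothetical.

```python
from collections import Counter


def syllable_pair_features(text):
    """Count adjacent syllable pairs in a whitespace-separated text."""
    syllables = text.lower().split()
    pairs = zip(syllables, syllables[1:])
    return Counter(" ".join(pair) for pair in pairs)


# Hypothetical input of five syllables yields four overlapping pairs.
features = syllable_pair_features("học sinh học sinh học")
```

    The resulting counts could serve as the feature vector passed to a classifier such as an SVM.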

  16. Sequence Classification: 768697 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|17554126|ref|NP_498100.1| the gene is express...ed protein retains C2H2 zinc-finger at its N-terminal region. family member (57.9 kD) (3G277) || http://www.ncbi.nlm.nih.gov/protein/17554126 ...

  17. Sequence Classification: 768696 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|17554128|ref|NP_498099.1| the gene is express...ed protein retains C2H2 zinc-finger at its N-terminal region (162.6 kD) (3G269) || http://www.ncbi.nlm.nih.gov/protein/17554128 ...

  18. Classification of titanium dioxide

    International Nuclear Information System (INIS)

    Macias B, L.R.; Garcia C, R.M.; Maya M, M.E.; Ita T, A. De; Palacios G, J.

    2002-01-01

    In this work X-ray diffraction (XRD), scanning electron microscopy (SEM) and X-ray energy-dispersive spectroscopy are used to achieve a complete identification of the phases and phase mixtures of a crystalline material such as titanium dioxide. The problem to be solved is how to distinguish a sample of titanium dioxide from a titanium dioxide pigment. A standard sample of titanium dioxide with a NIST certificate is used, which indicates a purity of 99.74% for the TiO2. The following procedure is recommended: a) Analyse both the titanium dioxide pigment sample and the titanium dioxide standard by X-ray diffraction, expecting to find no differences. b) Perform a chemical analysis by X-ray energy-dispersive spectroscopy in a microscope, taking advantage of the high vacuum since oxygen is being analysed; if aluminium oxide is found in a proportion greater than 1%, the sample is established to be a titanium dioxide pigment, but if the proportion is lower, it is only titanium dioxide. This type of analysis is an application of nuclear techniques useful for the tariff classification of merchandise that is considered difficult to recognize. (Author)

  19. Classification of new particles

    International Nuclear Information System (INIS)

    Karl, G.

    1976-01-01

    A classification of the new particles is proposed. Hadrons are constructed from quarks corresponding to several different representations of an SU(3) color group, with confined color. The new family of resonances, related to psi/J, is assigned to color-antisextet quarks Q. These new quarks Q do not form mixed mesons q-barQ with old antiquarks but can form mixed baryons Qqq. We speculate on the relation between color and mass. High-mass recurrences of the psi/J family are expected to have associated large changes in the cross section for electron-positron annihilation (ΔR > 4). A conjectured mass formula, which relates the masses of psi/J and ω, predicts the masses of possible recurrences of the psi/J particle. Other experimental implications at presently available energies are discussed, especially the necessity for an isovector partner for psi/J, and for pseudoscalar mesons at 1.8--2.2 GeV, some of which can decay into two photons.

  20. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a multiclass classification problem. One approach to solving it is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: an extremely large number of features (genes) and only a small number of samples. Applying machine learning to a microarray gene expression dataset consists mainly of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, improved to handle multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the features selected on each subset are fed to separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While an ordinary SVM finds a single optimum hyperplane, the main objective of twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
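    The idea of turning a multiclass problem into several binary problems can be sketched with a one-vs-rest decomposition. The abstract does not say which decomposition the authors used, so one-vs-rest is an assumption here. The class name "glioma", the sample labels and the scores are hypothetical; only "normal" and "MD" appear in the abstract.

```python
def one_vs_rest_labels(labels):
    """For each class, build a binary label vector: 1 for that class, 0 otherwise."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_one_vs_rest(scores_by_class):
    """Pick the class whose binary classifier returns the highest score."""
    return max(scores_by_class, key=scores_by_class.get)


# Hypothetical sample labels for four tumour samples.
labels = ["normal", "MD", "glioma", "MD"]
binary_problems = one_vs_rest_labels(labels)

# Hypothetical decision scores from three binary classifiers for one test sample.
predicted = predict_one_vs_rest({"normal": -0.2, "MD": 0.7, "glioma": 0.1})
```

    Each binary label vector would be used to train one classifier (for example one TWSVM), and the scores of all classifiers are compared at prediction time.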

  1. GeneBins: a database for classifying gene expression data, with application to plant genome arrays

    Directory of Open Access Journals (Sweden)

    Weiller Georg

    2007-03-01

    Full Text Available Abstract Background To interpret microarray experiments, several ontological analysis tools have been developed. However, current tools are limited to specific organisms. Results We developed a bioinformatics system to assign the probe set sequences of any organism to a hierarchical functional classification modelled on KEGG ontology. The GeneBins database currently supports the functional classification of expression data from four Affymetrix arrays; Arabidopsis thaliana, Oryza sativa, Glycine max and Medicago truncatula. An online analysis tool to identify relevant functions is also provided. Conclusion GeneBins provides resources to interpret gene expression results from microarray experiments. It is available at http://bioinfoserver.rsbs.anu.edu.au/utils/GeneBins/

  2. Music classification with MPEG-7

    Science.gov (United States)

    Crysandt, Holger; Wellhausen, Jens

    2003-01-01

    Driven by the increasing amount of music available electronically, the need for and possibility of automatic classification systems for music become more and more important. Currently most search engines for music are based on textual descriptions like artist and/or title. This paper presents a system for automatic music description, classification and visualization for a set of songs. The system is designed to extract significant features of a piece of music in order to find songs of similar genre or similar sound characteristics. The description is done with the help of MPEG-7 only. The classification and visualization are done with the self-organizing map algorithm.

  3. Systema Naturae. Classification of living things.

    OpenAIRE

    Alexey Shipunov

    2007-01-01

    An original classification of living organisms containing four kingdoms (Monera, Protista, Vegetabilia and Animalia), 60 phyla and 254 classes is presented. The classification is based on the latest available information.

  4. Progression in nuclear classification

    International Nuclear Information System (INIS)

    Wang Yuying

    1999-01-01

    This book summarizes the author's achievements over the last 30 years in classifying nuclei by a new method, through which a new fundamental law of the nuclear layer of the material world is found. The law is explained by the hypothesis that a nucleus is made up of two kinds of nucleon clusters, deuterons and tritons. Its concrete content is as follows. A new method is advanced that analyzes data on naturally abundant nuclei using the relationship between the numbers of protons and neutrons. The relationships for each nucleus are expanded to 4 sets: S+H=Z, H+Z=N, Z+N=A and S-H=K. The similarity between proton and neutron is extended to a similarity among p, n, deuteron, triton and He-5 clusters. According to the distribution law of nuclei of the same kind, the upper limits of the stable region should both be '44s'. The new fundamental law of the nuclear system is 1, 2, 4, 8, 16, 8, 4, 2, 1. To explain the new law, a hypothesis that the nucleus is made up of deuterons and tritons is developed, and a nuclear field of whole numbers is built up. The book also relates this to the unity of the motion of matter: the systematics of atomic nuclei, the most fundamental form, is similar to the chromosome numbers of mankind, the most advanced form. These achievements will shake the foundations of traditional nuclear science, in which magic numbers are the basis, and will set new tasks for the development of nuclear theory. They open up a new field of fundamental research. The book offers new knowledge to researchers, teachers and students in universities and polytechnic schools, and scientific workers can consult it in research and technical development work. It is suitable for the collections of public and university libraries and laboratories. At a time when the nation is being made prosperous through science and education, the book is readable for scientific and technical workers and for amateurs of natural science.

  5. Classification and clinical assessment

    Directory of Open Access Journals (Sweden)

    F. Cantini

    2012-06-01

    Full Text Available There are at least nine classification criteria for psoriatic arthritis (PsA) that have been proposed and used in clinical studies. With the exception of the ESSG and Bennett rules, all of the other criteria sets perform well in identifying PsA patients. As the CASPAR criteria are based on a robust study methodology, they are considered the current reference standard. However, while there seems to be no doubt that they are very good at classifying PsA patients (very high specificity), they might not be sensitive enough to diagnose patients with unknown early PsA. The vast clinical heterogeneity of PsA makes its assessment very challenging. Peripheral joint involvement is measured by 78/76 joint counts, spine involvement by the instruments used for ankylosing spondylitis (AS), dactylitis by involved digit count or by the Leeds dactylitis index, enthesitis by the number of affected entheses (several indices available) and psoriasis by the Psoriasis Area and Severity Index (PASI). Peripheral joint damage can be assessed by a modified van der Heijde-Sharp scoring system and axial damage by the methods used for AS or by the Psoriatic Arthritis Spondylitis Radiology Index (PASRI). As in other arthritides, global evaluation of disease activity and severity by patient and physician and assessment of disability and quality of life are widely used. Finally, composite indices that capture several clinical manifestations of PsA have been proposed, and a new instrument, the Psoriatic ARthritis Disease Activity Score (PASDAS), is currently being developed.

  6. The classification of easement

    Directory of Open Access Journals (Sweden)

    Popov Danica D.

    2015-01-01

    Full Text Available An easement is a right enjoyed by the owner of land over the land of another, such as rights of way, rights of light, rights of support, or rights to a flow of air or water. The dominant tenement is the land owned by the possessor of the easement, and the servient tenement is the land over which the right is enjoyed. An easement must exist for the accommodation and better enjoyment of the land to which it is annexed; otherwise it may amount to a mere licence. An easement benefits and binds the land itself and therefore continues despite any change of ownership of either the dominant or the servient tenement, although it will be extinguished if the two tenements come into common ownership. An easement can only be enjoyed in respect of land, which means two parcels of land: a dominant tenement, to which the benefit of the easement attaches, and a servient tenement, which bears the burden of the easement. A positive easement consists of a right to do something on the land of another; a negative easement restricts the use the owner of the servient tenement may make of his land. An easement may relate to land or to a house built on land. A further classification distinguishes easements on the ground from easements under the ground. An easement shall be exercised in accordance with the principle of restriction, that is, so as to burden the servient tenement as little as possible. When there is doubt about the extent of an easement, the interpretation that is less burdensome for the servient tenement shall prevail. New needs of the dominant estate do not result in an expansion of the servitude. The article compares the Draft Code of Property and Other Real Estate with the Draft Civil Code of Serbia.

  7. Critical Evaluation of Headache Classifications.

    Science.gov (United States)

    Özge, Aynur

    2013-08-01

    Transforming a subjective sensation like headache into an objective state, and establishing a common language for a complaint that can be both a symptom and a disease in itself, have kept investigators busy for years. Each proposed recommendation has left behind a set of patients who do not meet the criteria. While work toward the most ideal and most comprehensive classification continued, criticism arose that the classifications were drifting away from daily practice. In this article, the classification adventure of scientists who work in the area of headache is summarized. More specifically, the 2 classifications made by the International Headache Society (IHS) and the point reached with the 3rd classification, which is still being worked on, are discussed together with headache subtypes. It is presented with the wish and belief that it will contribute to readers and young investigators who are interested in this subject.

  8. The last classification of vasculitis

    NARCIS (Netherlands)

    Kallenberg, Cees G. M.

    2008-01-01

    Systemic vasculitides are a group of diverse conditions characterized by inflammation of the blood vessels. To obtain homogeneity in clinical characteristics, prognosis, and response to treatment, patients with vasculitis should be classified into defined disease categories. Many classification

  9. Radon classification of building ground

    International Nuclear Information System (INIS)

    Slunga, E.

    1988-01-01

    The Laboratories of Building Technology and Soil Mechanics and Foundation Engineering at the Helsinki University of Technology in cooperation with The Ministry of the Environment have proposed a radon classification for building ground. The proposed classification is based on the radon concentration in soil pores and on the permeability of the foundation soil. The classification includes four radon classes: negligible, normal, high and very high. Depending on the radon class the radon-technical solution for structures is chosen. It is proposed that the classification be done in general terms in connection with the site investigations for the planning of land use and in more detail in connection with the site investigations for an individual house. (author)

  10. Deep Learning for ECG Classification

    Science.gov (United States)

    Pyakillya, B.; Kazachenko, N.; Mikhailovsky, N.

    2017-10-01

    The importance of ECG classification is very high now due to the many current medical applications in which this problem arises. Currently, there are many machine learning (ML) solutions which can be used for analyzing and classifying ECG data. However, the main disadvantage of these ML solutions is the use of heuristic hand-crafted or engineered features with shallow feature learning architectures. The problem lies in the possibility of not finding the most appropriate features, which would give high classification accuracy in this ECG problem. One proposed solution is to use deep learning architectures in which the first layers of convolutional neurons behave as feature extractors and, at the end, some fully-connected (FCN) layers are used for making the final decision about ECG classes. In this work a deep learning architecture with 1D convolutional layers and FCN layers for ECG classification is presented and some classification results are shown.
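    How a 1D convolutional layer can act as a feature extractor is illustrated below with a single hand-set filter, a ReLU activation and global max pooling. In a trained network the kernel values are learned, so the fixed kernel, the function names and the toy signal are all hypothetical and do not reproduce the architecture of the paper.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation, the operation used in convolutional layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive responses, set negative ones to zero."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Summarise a feature map by its strongest response."""
    return max(values)


# Hypothetical signal with one sharp peak, and a kernel that responds to peaks.
signal = [0, 0, 1, 5, 1, 0, 0]
kernel = [-1, 2, -1]
feature_map = conv1d(signal, kernel)
peak_feature = global_max_pool(relu(feature_map))
```

    The pooled value would be one input to the fully-connected layers that make the final class decision.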

  11. Vehicle classification using mobile sensors.

    Science.gov (United States)

    2013-04-01

    In this research, the feasibility of using mobile traffic sensors for binary vehicle classification on arterial roads is investigated. Features (e.g., speed related, acceleration/deceleration related, etc.) are extracted from vehicle traces (passeng...

  12. Classification of remotely sensed images

    CSIR Research Space (South Africa)

    Dudeni, N

    2008-10-01

    Full Text Available For this research, the researchers examine various existing image classification algorithms with the aim of demonstrating how these algorithms can be applied to remote sensing images. These algorithms are broadly divided into supervised...

  13. Classification of Building Object Types

    DEFF Research Database (Denmark)

    Jørgensen, Kaj Asbjørn

    2011-01-01

    made. This is certainly the case in the Danish development. Based on the theories about these abstraction mechanisms, the basic principles for classification systems are presented and the observed misconceptions are analysed and explained. Furthermore, it is argued that the purpose of classification...... systems has changed and that new opportunities should be explored. Some proposals for new applications are presented and carefully aligned with IT opportunities. Especially, the use of building modelling will give new benefits and many of the traditional uses of classification systems will instead...... be managed by software applications and on the basis of building models. Classification systems with taxonomies of building object types have many application opportunities but can still be beneficial in data exchange between building construction partners. However, this will be performed by new methods...

  14. VT Biodiversity Project - Bedrock Classification

    Data.gov (United States)

    Vermont Center for Geographic Information — (Link to Metadata) This dataset is a five category, nine sub-category classification of the bedrock units appearing on the Centennial Geologic Map of Vermont. The...

  15. Classification of Cortical Brain Malformations

    Directory of Open Access Journals (Sweden)

    J Gordon Millichap

    2008-03-01

    Full Text Available Clinical, radiological, and genetic classifications of 113 cases of malformations of cortical development (MCD) were evaluated at the Erasmus Medical Center-Sophia Children's Hospital, Rotterdam, the Netherlands.

  16. Identifying colon cancer risk modules with better classification performance based on human signaling network.

    Science.gov (United States)

    Qu, Xiaoli; Xie, Ruiqiang; Chen, Lina; Feng, Chenchen; Zhou, Yanyan; Li, Wan; Huang, Hao; Jia, Xu; Lv, Junjie; He, Yuehan; Du, Youwen; Li, Weiguo; Shi, Yuchen; He, Weiming

    2014-10-01

    Identifying differences between normal and tumor samples from a modular perspective may help to improve our understanding of the mechanisms responsible for colon cancer. Many cancer studies have shown that signaling transduction and biological pathways are disturbed in disease states, and expression profiles can distinguish variations in diseases. In this study, we integrated a weighted human signaling network and gene expression profiles to select risk modules associated with tumor conditions. Risk modules as classification features by our method had a better classification performance than other methods, and one risk module for colon cancer had a good classification performance for distinguishing between normal/tumor samples and between tumor stages. All genes in the module were annotated to the biological process of positive regulation of cell proliferation, and were highly associated with colon cancer. These results suggested that these genes might be the potential risk genes for colon cancer. Copyright © 2013. Published by Elsevier Inc.

  17. Phylogenetic classification of bony fishes.

    Science.gov (United States)

    Betancur-R, Ricardo; Wiley, Edward O; Arratia, Gloria; Acero, Arturo; Bailly, Nicolas; Miya, Masaki; Lecointre, Guillaume; Ortí, Guillermo

    2017-07-06

    Fish classifications, as those of most other taxonomic groups, are being transformed drastically as new molecular phylogenies provide support for natural groups that were unanticipated by previous studies. A brief review of the main criteria used by ichthyologists to define their classifications during the last 50 years, however, reveals slow progress towards using an explicit phylogenetic framework. Instead, the trend has been to rely, in varying degrees, on deep-rooted anatomical concepts and authority, often mixing taxa with explicit phylogenetic support with arbitrary groupings. Two leading sources in ichthyology frequently used for fish classifications (JS Nelson's volumes of Fishes of the World and W. Eschmeyer's Catalog of Fishes) fail to adopt a global phylogenetic framework despite much recent progress made towards the resolution of the fish Tree of Life. The first explicit phylogenetic classification of bony fishes was published in 2013, based on a comprehensive molecular phylogeny (www.deepfin.org). We here update the first version of that classification by incorporating the most recent phylogenetic results. The updated classification presented here is based on phylogenies inferred using molecular and genomic data for nearly 2000 fishes. A total of 72 orders (and 79 suborders) are recognized in this version, compared with 66 orders in version 1. The phylogeny resolves placement of 410 families, or ~80% of the total of 514 families of bony fishes currently recognized. The ordinal status of 30 percomorph families included in this study, however, remains uncertain (incertae sedis in the series Carangaria, Ovalentaria, or Eupercaria). Comments to support taxonomic decisions and comparisons with conflicting taxonomic groups proposed by others are presented. We also highlight cases where morphological support exists for the groups being classified. This version of the phylogenetic classification of bony fishes is substantially improved, providing resolution

  18. A classification of chinese culture

    OpenAIRE

    Fan, Y

    2000-01-01

    This paper presents a classification of Chinese Cultural Values (CCVs). Although there exist great differences between Mainland China, Hong Kong and Taiwan, it is still possible to identify certain core cultural values that are shared by the Chinese people no matter where they live. Based on the original list by the Chinese Cultural Connection (1987), the paper creates a new list that contains 71 core values against 40 in the old. The implications and limitations of the classification are...

  19. Classification of pyodestructive pulmonary diseases

    International Nuclear Information System (INIS)

    Muromskij, Yu.A.; Semivolkov, V.I.; Shlenova, L.A.

    1993-01-01

    A classification of pyodestructive lung diseases, their complications and outcomes is proposed which makes it possible for physicians engaged in studying respiratory organ pathology to orient themselves in problems of diagnosis and treatment tactics. The classification was developed on the basis of studying the disease anamnesis and its clinical course, as well as the results of roentgenological and morphological studies of more than 10000 patients.

  20. Quantum computing for pattern classification

    OpenAIRE

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2014-01-01

    It is well known that for certain tasks, quantum computing outperforms classical computing. A growing number of contributions try to use this advantage in order to improve or extend classical machine learning algorithms by methods of quantum information theory. This paper gives a brief introduction into quantum machine learning using the example of pattern classification. We introduce a quantum pattern classification algorithm that draws on Trugenberger's proposal for measuring the Hamming di...

  1. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    Full Text Available The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to approach the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.
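
    The four steps listed in this abstract can be illustrated with a small sketch. The abstract does not define the similarity measure or the node influence model, so both are stand-ins here: cosine similarity is assumed for step one, and a hypothetical influence score (mean similarity of a training sample to the other training samples of its own class) is assumed for step two. The expression values and labels are made-up toy data.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(samples):
    # Step 1: pairwise similarity (assumed: cosine similarity).
    return [[cosine_similarity(a, b) for b in samples] for a in samples]


def node_influence(sim, labels):
    # Step 2: hypothetical influence model, NOT the paper's definition:
    # mean similarity to the other training samples of the same class.
    influence = []
    for i, label in enumerate(labels):
        peers = [sim[i][j] for j, other in enumerate(labels)
                 if j != i and other == label]
        influence.append(sum(peers) / len(peers) if peers else 1.0)
    return influence


def class_scores(test_sample, train_x, train_y, influence):
    # Step 3: influence-weighted sum of similarities, accumulated per class.
    scores = {}
    for x, label, weight in zip(train_x, train_y, influence):
        scores[label] = scores.get(label, 0.0) + weight * cosine_similarity(test_sample, x)
    return scores


def classify(test_sample, train_x, train_y, influence):
    # Step 4: assign the class with the highest score.
    scores = class_scores(test_sample, train_x, train_y, influence)
    return max(scores, key=scores.get)


# Toy expression profiles (three genes per sample), invented for illustration.
train_x = [[5.0, 1.0, 0.2], [4.8, 1.2, 0.1], [0.3, 1.0, 5.1], [0.2, 0.9, 4.7]]
train_y = ["tumor", "tumor", "normal", "normal"]

sim = similarity_matrix(train_x)
influence = node_influence(sim, train_y)
predicted = classify([4.9, 1.1, 0.3], train_x, train_y, influence)
```

    Because step three is a plain sum, a class with many more training samples would accumulate larger scores; whether and how the original method corrects for this is not stated in the abstract.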

  2. Current Trends in the Molecular Classification of Renal Neoplasms

    Directory of Open Access Journals (Sweden)

    Andrew N. Young

    2006-01-01

    Full Text Available Renal cell carcinoma (RCC) is the most common form of kidney cancer in adults. RCC is a significant challenge for pathologic diagnosis and clinical management. The primary approach to diagnosis is by light microscopy, using the World Health Organization (WHO) classification system, which defines histopathologic tumor subtypes with distinct clinical behavior and underlying genetic mutations. However, light microscopic diagnosis of RCC subtypes is often difficult due to variable histology. In addition, the clinical behavior of RCC is highly variable and therapeutic response rates are poor. Few clinical assays are available to predict outcome in RCC or correlate behavior with histology. Therefore, novel RCC classification systems based on gene expression should be useful for diagnosis, prognosis, and treatment. Recent microarray studies have shown that renal tumors are characterized by distinct gene expression profiles, which can be used to discover novel diagnostic and prognostic biomarkers. Here, we review clinical features of kidney cancer, the WHO classification system, and the growing role of molecular classification for diagnosis, prognosis, and therapy of this disease.

  3. Information Classification on University Websites

    DEFF Research Database (Denmark)

    Nawaz, Ather; Clemmensen, Torkil; Hertzum, Morten

    2011-01-01

    Websites are increasingly used as a medium for providing information to university students. The quality of a university website depends on how well the students’ information classification fits with the structure of the information on the website. This paper investigates the information...... classification of 14 Danish and 14 Pakistani students and compares it with the information classification of their university website. Brainstorming, card sorting, and task exploration activities were used to discover similarities and differences in the participating students’ classification of website...... information and their ability to navigate the websites. The results of the study indicate group differences in user classification and related task-performance differences. The main implications of the study are that (a) the edit distance appears a useful measure in cross-country HCI research and practice...

  4. Ototoxicity (cochleotoxicity) classifications: A review.

    Science.gov (United States)

    Crundwell, Gemma; Gomersall, Phil; Baguley, David M

    2016-01-01

    Drug-mediated ototoxicity, specifically cochleotoxicity, is a concern for patients receiving medications for the treatment of serious illness. A number of classification schemes exist, most of which are based on pure-tone audiometry, in order to assist non-audiological/non-otological specialists in the identification and monitoring of iatrogenic hearing loss. This review identifies the primary classification systems used in cochleotoxicity monitoring. By bringing together classifications published in discipline-specific literature, the paper aims to increase awareness of their relative strengths and limitations in the assessment and monitoring of ototoxic hearing loss and to indicate how future classification systems may improve upon the status quo. Literature review. PubMed identified 4878 articles containing the search term ototox*. A systematic search identified 13 key classification systems. Cochleotoxicity classification systems can be divided into those which focus on hearing change from a baseline audiogram and those that focus on the functional impact of the hearing loss. Common weaknesses of these grading scales included a lack of sensitivity to small adverse changes in hearing thresholds, a lack of high-frequency audiometry (>8 kHz), and lack of indication of which changes are likely to be clinically significant for communication and quality of life.

  6. Evolution and classification of the CRISPR-Cas systems

    Science.gov (United States)

    Makarova, Kira S.; Haft, Daniel H.; Barrangou, Rodolphe; Brouns, Stan J. J.; Charpentier, Emmanuelle; Horvath, Philippe; Moineau, Sylvain; Mojica, Francisco J. M.; Wolf, Yuri I.; Yakunin, Alexander F.; van der Oost, John; Koonin, Eugene V.

    2012-01-01

    The CRISPR–Cas (clustered regularly interspaced short palindromic repeats–CRISPR-associated proteins) modules are adaptive immunity systems that are present in many archaea and bacteria. These defence systems are encoded by operons that have an extraordinarily diverse architecture and a high rate of evolution for both the cas genes and the unique spacer content. Here, we provide an updated analysis of the evolutionary relationships between CRISPR–Cas systems and Cas proteins. Three major types of CRISPR–Cas system are delineated, with a further division into several subtypes and a few chimeric variants. Given the complexity of the genomic architectures and the extremely dynamic evolution of the CRISPR–Cas systems, a unified classification of these systems should be based on multiple criteria. Accordingly, we propose a 'polythetic' classification that integrates the phylogenies of the most common cas genes, the sequence and organization of the CRISPR repeats and the architecture of the CRISPR–cas loci. PMID:21552286

  7. Gene Therapy

    Science.gov (United States)

    Gene therapy Overview Gene therapy involves altering the genes inside your body's cells in an effort to treat or stop disease. Genes contain your ... that don't work properly can cause disease. Gene therapy replaces a faulty gene or adds a new ...

  8. Ebolavirus Classification Based on Natural Vectors

    Science.gov (United States)

    Zheng, Hui; Yin, Changchuan; Hoang, Tung; He, Rong Lucy; Yang, Jie

    2015-01-01

    According to the WHO, ebolaviruses have resulted in 8818 human deaths in West Africa as of January 2015. To better understand the evolutionary relationship of the ebolaviruses and infer virulence from the relationship, we applied the alignment-free natural vector method to classify the newest ebolaviruses. The dataset includes three new Guinea viruses as well as 99 viruses from Sierra Leone. For the viruses of the family Filoviridae, both genus label classification and species label classification achieve an accuracy rate of 100%. We represented the relationships among Filoviridae viruses by Unweighted Pair Group Method with Arithmetic Mean (UPGMA) phylogenetic trees and found that the filoviruses can be separated well into three genera. We performed the phylogenetic analysis on the relationship among different species of Ebolavirus by their coding-complete genomes and seven viral protein genes (glycoprotein [GP], nucleoprotein [NP], VP24, VP30, VP35, VP40, and RNA polymerase [L]). The topology of the phylogenetic tree by the viral protein VP24 shows consistency with the variations of virulence of ebolaviruses. The result suggests that VP24 could be a pharmaceutical target for treating or preventing ebolavirus infection. PMID:25803489
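
    The abstract names the alignment-free natural vector method without giving its formula. As the method is commonly described, a DNA sequence of length n is mapped to a 12-dimensional vector holding, for each nucleotide k, its count n_k, the mean of its positions, and a normalized second central moment of those positions (divided by n_k times n), and sequences are compared by Euclidean distance. The sketch below follows that description; the nearest-neighbour labelling step and the toy sequences are assumptions added for illustration and are not the authors' pipeline.

```python
import math

NUCLEOTIDES = "ACGT"


def natural_vector(seq):
    """Return [counts..., mean positions..., second moments...] for A, C, G, T."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for k in NUCLEOTIDES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == k]  # 1-based positions
        n_k = len(positions)
        if n_k == 0:
            # Nucleotide absent: all three descriptors are set to zero.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_label(query, references):
    """Hypothetical helper: label of the reference sequence whose natural
    vector is closest to the query's (1-nearest-neighbour)."""
    q = natural_vector(query)
    return min(references, key=lambda label: distance(q, natural_vector(references[label])))


# Toy sequences, invented for illustration; real genomes are far longer.
references = {"group_x": "ACGTACGA", "group_y": "TTTTTTTT"}
label = nearest_label("ACGTACGT", references)
```

    Because no alignment is needed, each genome is reduced to its vector once, and all later comparisons are cheap distance computations between vectors.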

  9. Gene coexpression network analysis as a source of functional annotation for rice genes.

    Directory of Open Access Journals (Sweden)

    Kevin L Childs

    Full Text Available With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional

  10. Dystonia: an update on phenomenology, classification, pathogenesis and treatment.

    Science.gov (United States)

    Balint, Bettina; Bhatia, Kailash P

    2014-08-01

    This article will highlight recent advances in dystonia with focus on clinical aspects such as the new classification, syndromic approach, new gene discoveries and genotype-phenotype correlations. Broadening of phenotype of some of the previously described hereditary dystonias and environmental risk factors and trends in treatment will be covered. Based on phenomenology, a new consensus update on the definition, phenomenology and classification of dystonia and a syndromic approach to guide diagnosis have been proposed. Terminology has changed and 'isolated dystonia' is used wherein dystonia is the only motor feature apart from tremor, and the previously called heredodegenerative dystonias and dystonia plus syndromes are now subsumed under 'combined dystonia'. The recently discovered genes ANO3, GNAL and CIZ1 appear not to be a common cause of adult-onset cervical dystonia. Clinical and genetic heterogeneity underlie myoclonus-dystonia, dopa-responsive dystonia and deafness-dystonia syndrome. ALS2 gene mutations are a newly recognized cause for combined dystonia. The phenotypic and genotypic spectra of ATP1A3 mutations have considerably broadened. Two new genome-wide association studies identified new candidate genes. A retrospective analysis suggested complicated vaginal delivery as a modifying risk factor in DYT1. Recent studies confirm lasting therapeutic effects of deep brain stimulation in isolated dystonia, good treatment response in myoclonus-dystonia, and suggest that early treatment correlates with a better outcome. Phenotypic classification continues to be important to recognize particular forms of dystonia and this includes syndromic associations. There are a number of genes underlying isolated or combined dystonia and there will be further new discoveries with the advances in genetic technologies such as exome and whole-genome sequencing. The identification of new genes will facilitate better elucidation of pathogenetic mechanisms and possible corrective

  11. Annotation and Classification of CRISPR-Cas Systems.

    Science.gov (United States)

    Makarova, Kira S; Koonin, Eugene V

    2015-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods.

  12. What is new in genetics and osteogenesis imperfecta classification?

    Directory of Open Access Journals (Sweden)

    Eugênia R. Valadares

    2014-11-01

    Conclusions: Considering the discovery of new genes and limited genotype‐phenotype correlation, the use of next‐generation sequencing tools has become useful in molecular studies of OI cases. The recommendation of the Nosology Group of the International Society of Skeletal Dysplasias is to maintain the classification of Sillence as the prototypical form, universally accepted to classify the degree of severity in OI, while maintaining it free from direct molecular reference.

  13. Annotation and Classification of CRISPR-Cas Systems

    Science.gov (United States)

    Makarova, Kira S.; Koonin, Eugene V.

    2018-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods. PMID:25981466

  14. HIV classification using coalescent theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ming [Los Alamos National Laboratory; Leitner, Thomas K [Los Alamos National Laboratory; Korber, Bette T [Los Alamos National Laboratory

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality highly depends on this system. Due to the history of creation of the current HIV-1 nomenclature, the current one contains inconsistencies like: The phylogenetic distance between the subtype B and D is remarkably small compared with other pairs of subtypes. In fact, it is more like the distance of a pair of subsubtypes Robertson et al. (2000); Subtypes E and I do not exist any more since they were discovered to be composed of recombinants Robertson et al. (2000); It is currently discussed whether -- instead of CRF02 being a recombinant of subtype A and G -- subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype Abecasis et al. (2007); There are 8 complete and over 400 partial HIV genomes in the LANL-database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary like all complex classification systems that were created manually. To this end, it is desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast mutating and recombining viruses. Our work addresses the simpler subproblem to score classifications of given input sequences of some virus species (classification denotes a partition of the input sequences in several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARG) of the input sequences under restrictions determined by the given classification. These restrictions are imposed in order to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov Chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our

  15. 5 CFR 1312.7 - Derivative classification.

    Science.gov (United States)

    2010-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.7 Derivative classification. A derivative classification... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Derivative classification. 1312.7 Section...

  16. 32 CFR 2400.15 - Classification guides.

    Science.gov (United States)

    2010-07-01

    ... REGULATIONS TO IMPLEMENT E.O. 12356; OFFICE OF SCIENCE AND TECHNOLOGY POLICY INFORMATION SECURITY PROGRAM Derivative Classification § 2400.15 Classification guides. (a) OSTP shall issue and maintain classification guides to facilitate the proper and uniform derivative classification of information. These guides shall...

  17. 7 CFR 51.1860 - Color classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Color classification. 51.1860 Section 51.1860... STANDARDS) United States Standards for Fresh Tomatoes 1 Color Classification § 51.1860 Color classification... illustrating the color classification requirements, as set forth in this section. This visual aid may be...

  18. 22 CFR 42.11 - Classification symbols.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Classification symbols. 42.11 Section 42.11... NATIONALITY ACT, AS AMENDED Classification and Foreign State Chargeability § 42.11 Classification symbols. A... visa symbol to show the classification of the alien. Immigrants Symbol Class Section of law Immediate...

  19. 28 CFR 345.20 - Position classification.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 2 2010-07-01 2010-07-01 false Position classification. 345.20 Section... INDUSTRIES (FPI) INMATE WORK PROGRAMS Position Classification § 345.20 Position classification. (a) Inmate... the objectives and principles of pay classification as a part of the routine orientation of new FPI...

  20. 7 CFR 51.2284 - Size classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Size classification. 51.2284 Section 51.2284... Size classification. The following classifications are provided to describe the size of any lot... shall conform to the requirements of the specified classification as defined below: (a) Halves. Lot...

  1. 22 CFR 9.8 - Classification challenges.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Classification challenges. 9.8 Section 9.8 Foreign Relations DEPARTMENT OF STATE GENERAL SECURITY INFORMATION REGULATIONS § 9.8 Classification... classification status is improper are expected and encouraged to challenge the classification status of the...

  2. 7 CFR 28.911 - Review classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Review classification. 28.911 Section 28.911... REGULATIONS COTTON CLASSING, TESTING, AND STANDARDS Cotton Classification and Market News Service for Producers Classification § 28.911 Review classification. (a) A producer may request one review...

  3. 46 CFR 503.54 - Original classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 9 2010-10-01 2010-10-01 false Original classification. 503.54 Section 503.54 Shipping... Program § 503.54 Original classification. (a) No Commission Member or employee has the authority to... classification, it shall be sent to the appropriate agency with original classification authority over the...

  4. 32 CFR 2001.21 - Original classification.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Original classification. 2001.21 Section 2001.21... Markings § 2001.21 Original classification. (a) Primary markings. At the time of original classification... authority. The name and position, or personal identifier, of the original classification authority shall...

  5. Aberrant Gene Expression in Acute Myeloid Leukaemia

    DEFF Research Database (Denmark)

    Bagger, Frederik Otzen

    model to investigate the role of telomerase in AML, we were able to translate the observed effect into human AML patients and identify specific genes involved, which also predict survival patterns in AML patients. During these studies we have applied methods for investigating differentially expressed......-based gene-lookup webservices, called HemaExplorer and BloodSpot. These web-services support the aim of making data and analysis of haematopoietic cells from mouse and human accessible for researchers without bioinformatics expertise. Finally, in order to aid the analysis of the very limited number...

  6. Featureless classification of light curves

    Science.gov (United States)

    Kügler, S. D.; Gianniotis, N.; Polsterer, K. L.

    2015-08-01

    In the era of rapidly increasing amounts of time series data, classification of variable objects has become the main objective of time-domain astronomy. Classification of irregularly sampled time series is particularly difficult because the data cannot be represented naturally as a vector which can be directly fed into a classifier. In the literature, various statistical features serve as vector representations. In this work, we represent time series by a density model. The density model captures all the information available, including measurement errors. Hence, we view this model as a generalization to the static features which directly can be derived, e.g. as moments from the density. Similarity between each pair of time series is quantified by the distance between their respective models. Classification is performed on the obtained distance matrix. In the numerical experiments, we use data from the OGLE (Optical Gravitational Lensing Experiment) and ASAS (All Sky Automated Survey) surveys and demonstrate that the proposed representation performs up to par with the best currently used feature-based approaches. The density representation preserves all static information present in the observational data, in contrast to a less-complete description by features. The density representation is an upper boundary in terms of information made available to the classifier. Consequently, the predictive power of the proposed classification depends on the choice of similarity measure and classifier, only. Due to its principled nature, we advocate that this new approach of representing time series has potential in tasks beyond classification, e.g. unsupervised learning.
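
    The step "classification is performed on the obtained distance matrix" can be illustrated with a nearest-neighbour vote over precomputed distances. This is only a minimal sketch: the abstract does not say which classifier the authors used, and the distance values, class names and the helper `knn_predict` are hypothetical.

```python
from collections import Counter


def knn_predict(dist_row, train_labels, k=3):
    """Predict a label from the distances between one unlabelled object
    and every labelled training object (one row of a distance matrix)."""
    # Indices of the k closest training objects.
    nearest = sorted(range(len(dist_row)), key=lambda i: dist_row[i])[:k]
    votes = Counter(train_labels[i] for i in nearest)
    # Majority vote among the k neighbours.
    return votes.most_common(1)[0][0]


# Hypothetical model-to-model distances: 2 unlabelled light curves (rows)
# against 4 labelled light curves (columns).
train_labels = ["class_a", "class_a", "class_b", "class_b"]
dist = [
    [0.1, 0.2, 0.9, 0.8],
    [0.7, 0.9, 0.2, 0.1],
]
predictions = [knn_predict(row, train_labels, k=3) for row in dist]
print(predictions)
```

    Because the classifier sees only distances, the same code works for any similarity measure between density models, which is the point the abstract makes about the choice of similarity measure and classifier.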

  7. A Semisupervised Cascade Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Stamatis Karlos

    2016-01-01

    Classification is one of the most important tasks of data mining techniques, which have been adopted by several modern applications. The shortage of labeled data in the majority of these applications has shifted interest towards semisupervised methods. Under such schemes, the use of collected unlabeled data combined with a clearly smaller set of labeled examples leads to similar or even better classification accuracy compared with supervised algorithms, which use labeled examples exclusively during the training phase. A novel approach for improving semisupervised classification using the Cascade Classifier technique is presented in this paper. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the class probability distribution of the initial data. The second-level classifier is supplied with the new dataset and extracts the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier and the speed of C4.5 for final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and found that the presented technique has better accuracy in most cases.
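
    The cascade step described above (a base classifier's prediction appended as an extra feature for a second-level classifier) can be sketched as follows. This is an illustration under stated assumptions, not the authors' NB∇C4.5 method: a nearest-centroid classifier stands in for Naive Bayes, a 1-nearest-neighbour rule stands in for C4.5, the self-training loop is omitted, and all data and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def centroid_fit(X, y):
    """Base classifier (stand-in): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def centroid_predict(model, row):
    """Label of the closest class centroid."""
    return min(model, key=lambda lab: sq_dist(model[lab], row))


def cascade_augment(model, X, label_codes):
    """Cascade step: append the base prediction (as a number) to each row."""
    return [row + [label_codes[centroid_predict(model, row)]] for row in X]


def one_nn_predict(X_train, y_train, row):
    """Second-level classifier (stand-in): label of the nearest training row."""
    best = min(range(len(X_train)), key=lambda i: sq_dist(X_train[i], row))
    return y_train[best]


# Hypothetical labelled data and one query instance.
X_lab = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
y_lab = ["neg", "neg", "pos", "pos"]
codes = {"neg": 0.0, "pos": 1.0}

base = centroid_fit(X_lab, y_lab)
X_aug = cascade_augment(base, X_lab, codes)            # enlarged feature space
query_aug = cascade_augment(base, [[0.8, 0.7]], codes)
final = [one_nn_predict(X_aug, y_lab, q) for q in query_aug]
print(final)
```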

  8. What is new in genetics and osteogenesis imperfecta classification?

    Directory of Open Access Journals (Sweden)

    Eugênia R. Valadares

    2014-12-01

    OBJECTIVE: Literature review of new genes related to osteogenesis imperfecta (OI) and an update of its classification. SOURCES: Literature review in the PubMed and OMIM databases, followed by selection of relevant references. SUMMARY OF THE FINDINGS: In 1979, Sillence et al. developed a classification of OI subtypes based on clinical features and disease severity: OI type I, mild, common, with blue sclera; OI type II, perinatal lethal form; OI type III, severe and progressively deforming, with normal sclera; and OI type IV, moderate severity with normal sclera. Approximately 90% of individuals with OI are heterozygous for mutations in the COL1A1 and COL1A2 genes, with a dominant pattern of inheritance or sporadic mutations. After 2006, mutations were identified in the CRTAP, FKBP10, LEPRE1, PLOD2, PPIB, SERPINF1, SERPINH1, SP7, WNT1, BMP1, and TMEM38B genes, associated with recessive OI, and in the IFITM5 gene, associated with dominant OI. Mutations in PLS3 were recently identified in families with osteoporosis and fractures, with an X-linked inheritance pattern. In addition to the genetic complexity of the molecular basis of OI, extensive phenotypic variability resulting from individual loci has also been documented. CONCLUSIONS: Considering the discovery of new genes and limited genotype-phenotype correlation, the use of next-generation sequencing tools has become useful in molecular studies of OI cases. The recommendation of the Nosology Group of the International Society of Skeletal Dysplasias is to maintain the classification of Sillence as the prototypical form, universally accepted to classify the degree of severity in OI, while maintaining it free from direct molecular reference.

  9. Rock suitability classification RSC 2012

    Energy Technology Data Exchange (ETDEWEB)

    McEwen, T. (ed.) [McEwen Consulting, Leicester (United Kingdom); Kapyaho, A. [Geological Survey of Finland, Espoo (Finland); Hella, P. [Saanio and Riekkola, Helsinki (Finland); Aro, S.; Kosunen, P.; Mattila, J.; Pere, T.

    2012-12-15

    This report presents Posiva's Rock Suitability Classification (RSC) system, developed for locating suitable rock volumes for repository design and construction. The RSC system comprises both the revised rock suitability criteria and the procedure for the suitability classification during the construction of the repository. The aim of the classification is to avoid such features of the host rock that may be detrimental to the favourable conditions within the repository, either initially or in the long term. This report also discusses the implications of applying the RSC system for the fulfilment of the regulatory requirements concerning the host rock as a natural barrier and the site's overall suitability for hosting a final repository of spent nuclear fuel.

  10. Rock suitability classification RSC 2012

    International Nuclear Information System (INIS)

    McEwen, T.; Kapyaho, A.; Hella, P.; Aro, S.; Kosunen, P.; Mattila, J.; Pere, T.

    2012-12-01

    This report presents Posiva's Rock Suitability Classification (RSC) system, developed for locating suitable rock volumes for repository design and construction. The RSC system comprises both the revised rock suitability criteria and the procedure for the suitability classification during the construction of the repository. The aim of the classification is to avoid such features of the host rock that may be detrimental to the favourable conditions within the repository, either initially or in the long term. This report also discusses the implications of applying the RSC system for the fulfilment of the regulatory requirements concerning the host rock as a natural barrier and the site's overall suitability for hosting a final repository of spent nuclear fuel.

  11. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  12. Oral epithelial dysplasia classification systems

    DEFF Research Database (Denmark)

    Warnakulasuriya, S; Reibel, J; Bouquot, J

    2008-01-01

    At a workshop coordinated by the WHO Collaborating Centre for Oral Cancer and Precancer in the United Kingdom issues related to potentially malignant disorders of the oral cavity were discussed by an expert group. The consensus views of the Working Group are presented in a series of papers....... In this report, we review the oral epithelial dysplasia classification systems. The three classification schemes [oral epithelial dysplasia scoring system, squamous intraepithelial neoplasia and Ljubljana classification] were presented and the Working Group recommended epithelial dysplasia grading for routine...... use. Although most oral pathologists possibly recognize and accept the criteria for grading epithelial dysplasia, firstly based on architectural features and then of cytology, there is great variability in their interpretation of the presence, degree and significance of the individual criteria...

  13. Effective Feature Selection for Classification of Promoter Sequences.

    Directory of Open Access Journals (Sweden)

    Kouser K

    Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way for functional analysis of promoters and for discovering the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM) features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and the ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (a feature selection method) but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expression in complex eukaryotic species.

  14. SHIP CLASSIFICATION FROM MULTISPECTRAL VIDEOS

    Directory of Open Access Journals (Sweden)

    Frederique Robert-Inacio

    2012-05-01

    Surveillance of a seaport can be achieved by different means: radar, sonar, cameras, radio communications and so on. Such surveillance aims, on the one hand, to manage cargo and tanker traffic, and, on the other hand, to prevent terrorist attacks in sensitive areas. In this paper an application to video-surveillance of a seaport entrance is presented, and more particularly, the different steps that make it possible to classify mobile shapes. This classification is based on a parameter measuring the degree of similarity between the shape under study and a set of reference shapes. The classification result describes the considered mobile object in terms of shape and speed.

  15. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  16. Deep learning for image classification

    Science.gov (United States)

    McCoppin, Ryan; Rizki, Mateen

    2014-06-01

    This paper provides an overview of deep learning and introduces the several subfields of deep learning including a specific tutorial of convolutional neural networks. Traditional methods for learning image features are compared to deep learning techniques. In addition, we present our preliminary classification results, our basic implementation of a convolutional restricted Boltzmann machine on the Mixed National Institute of Standards and Technology database (MNIST), and we explain how to use deep learning networks to assist in our development of a robust gender classification system.

  17. Facial aging: A clinical classification

    Directory of Open Access Journals (Sweden)

    Shiffman Melvin

    2007-01-01

    The purpose of this classification of facial aging is to have a simple clinical method to determine the severity of the aging process in the face. This allows a quick estimate as to the types of procedures that the patient would need to have the best results. Procedures that are presently used for facial rejuvenation include laser, chemical peels, suture lifts, fillers, modified facelift and full facelift. The physician is already using his best judgment to determine which procedure would be best for any particular patient. This classification may help to refine these decisions.

  18. Project implementation : classification of organic soils and classification of marls - training of INDOT personnel.

    Science.gov (United States)

    2012-09-01

    This is an implementation project for the research completed as part of the following projects: SPR3005 Classification of Organic Soils : and SPR3227 Classification of Marl Soils. The methods developed for the classification of both soi...

  19. FACET CLASSIFICATIONS OF E-LEARNING TOOLS

    Directory of Open Access Journals (Sweden)

    Olena Yu. Balalaieva

    2013-12-01

    The article deals with the classification of e-learning tools based on the facet method, which separates a set of objects in parallel into independent classification groups; no rigid classification structure or pre-built finite groups are assumed, and classification groups are formed by a combination of values taken from the relevant facets. An attempt to systematize the existing classifications of e-learning tools from the standpoint of classification theory is made for the first time. Modern Ukrainian and foreign facet classifications of e-learning tools are described; their positive and negative features compared to classifications based on a hierarchical method are analyzed. The author's original facet classification of e-learning tools is proposed.
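
    The idea that facet classification groups arise from combinations of values taken from independent facets can be sketched in a few lines. The facet names and values below are hypothetical and are not taken from the article; they only illustrate the mechanism.

```python
from itertools import product

# Hypothetical facets for e-learning tools; each facet is independent.
facets = {
    "cost": ["free", "paid"],
    "location": ["local", "cloud"],
}


def facet_classes(facets):
    """All classification groups: one per combination of facet values."""
    names = list(facets)
    return [dict(zip(names, combo)) for combo in product(*facets.values())]


def classify(item, facets):
    """An item's group is simply the tuple of its values on each facet."""
    return tuple(item[name] for name in facets)


classes = facet_classes(facets)
group = classify({"name": "tool_x", "cost": "free", "location": "cloud"}, facets)
print(len(classes), group)
```

    Adding a facet multiplies the number of possible groups without restructuring the existing ones, which is the main contrast with a hierarchical scheme.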

  20. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interface program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods, with a higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
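
    The lowest common ancestor (LCA) idea mentioned above can be sketched on a toy taxonomy: a query sequence is assigned to the deepest taxon shared by all of its best-matching reference sequences. This is a generic illustration, not CREST's implementation; the rank similarity criteria are not modelled, and the tiny taxonomy and function names are assumptions.

```python
def lineage(taxon, parent):
    """Path from the root down to `taxon`, following child -> parent links."""
    path = [taxon]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Deepest taxon that lies on the lineage of every taxon in `taxa`."""
    paths = [lineage(t, parent) for t in taxa]
    lca = None
    for nodes in zip(*paths):        # walk all lineages level by level
        if len(set(nodes)) != 1:     # lineages diverge here
            break
        lca = nodes[0]
    return lca


# Toy taxonomy as a child -> parent map (assumed for illustration only).
parent = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Alphaproteobacteria": "Proteobacteria",
    "Gammaproteobacteria": "Proteobacteria",
}

# Taxa of the reference sequences that matched two hypothetical queries.
close_hits = lowest_common_ancestor(
    ["Alphaproteobacteria", "Gammaproteobacteria"], parent)
distant_hits = lowest_common_ancestor(
    ["Alphaproteobacteria", "Firmicutes"], parent)
print(close_hits, distant_hits)
```

    Conflicting hits push the assignment up the tree, so an ambiguous query receives a less specific but safer label.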

  1. Definition and classification of epilepsy. Classification of epileptic seizures 2016

    Directory of Open Access Journals (Sweden)

    K. Yu. Mukhin

    2017-01-01

    Epilepsy is one of the most common neurological diseases, especially in childhood and adolescence. The incidence varies from 15 to 113 cases per 100 000 population, with the maximum among children under 1 year old. The prevalence of epilepsy is high, ranging from 5 to 8 cases (in some regions, 10 cases) per 1000 children under 15 years old. Classification of the disease is of great importance for diagnosis, treatment and prognosis. The article presents a novel strategy for classification of epileptic seizures, developed in 2016. It contains a number of brand new concepts, including a very important one: some seizures previously considered to be generalized only or focal only can, in fact, be both focal and generalized. They include tonic, atonic and myoclonic seizures and epileptic spasms. The term “secondarily generalized seizure” is replaced by the term “bilateral tonic-clonic seizure” (since it is not a separate type of epileptic seizure, and the term reflects the spread of discharge from any area of the cerebral cortex and the evolution of any type of focal seizure). The International League Against Epilepsy recommends abandoning the term “pseudo-epileptic seizures” and replacing it with the term “psychogenic non-epileptic seizures”. If a doctor is not sure that seizures are epileptic in nature, the term “paroxysmal event” should be used without specifying the disease. The concept of childhood epileptic encephalopathies, developed within this novel classification project, is one of the most significant achievements, since in this case not only the seizures but even epileptiform activity can induce severe disorders of higher mental functions. In addition to a detailed description of the new strategy for classification of epileptic seizures, the article contains a comprehensive review of the existing principles of epilepsy and epileptic seizure classification.

  2. carboxylate synthase gene family in Arabidopsis, rice, grapevine

    African Journals Online (AJOL)

    Yomi

    2012-01-16

    Jan 16, 2012 ... evolutionary relationships of ACS genes in the four plant species. Chromosomal .... classification was consistent with the report from. Jakubowicz et al. ..... Analysis of the genome sequence of the flowering plant Arabidopsis ...

  3. LEXICAL UNITS STARTING WITH THE LETTERS «Б» AND «B» MATCHING IN THE PLANE OF EXPRESSION (BASED ON THE RUSSIAN AND ENGLISH LANGUAGES)

    Directory of Open Access Journals (Sweden)

    Ms. Maria A. Ankudinova

    2016-09-01

    This article is devoted to lexical units starting with the letters «б» and «b» that match in the plane of expression, based on the Russian and English languages. Lexical units were chosen by random sampling, then classified and analyzed based on their spelling and pronunciation.

  4. Agriculture classification using POLSAR data

    DEFF Research Database (Denmark)

    Skriver, Henning; Dall, Jørgen; Ferro-Famil, Laurent

    2005-01-01

    of their components) show strongly preferred orientations, such as the stalks or ears of cereals. The importance of SAR polarimetry in crop classification arises principally because polarisation is sensitive to orientation. Hence it provides a means to distinguish crops with different canopy architectures. Detailed...

  5. Urogenital tuberculosis: definition and classification.

    Science.gov (United States)

    Kulchavenya, Ekaterina

    2014-10-01

    To improve the approach to the diagnosis and management of urogenital tuberculosis (UGTB), we need clear and unique classification. UGTB remains an important problem, especially in developing countries, but it is often an overlooked disease. As with any other infection, UGTB should be cured by antibacterial therapy, but because of late diagnosis it may often require surgery. Scientific literature dedicated to this problem was critically analyzed and juxtaposed with the author's own more than 30 years' experience in tuberculosis urology. The conception, terms and definition were consolidated into one system; classification stage by stage as well as complications are presented. Classification of any disease includes dispersion on forms and stages and exact definitions for each stage. Clinical features and symptoms significantly vary between different forms and stages of UGTB. A simple diagnostic algorithm was constructed. UGTB is multivariant disease and a standard unified approach to it is impossible. Clear definition as well as unique classification are necessary for real estimation of epidemiology and the optimization of therapy. The term 'UGTB' has insufficient information in order to estimate therapy, surgery and prognosis, or to evaluate the epidemiology.

  6. Real time automatic scene classification

    NARCIS (Netherlands)

    Verbrugge, R.; Israël, Menno; Taatgen, N.; van den Broek, Egon; van der Putten, Peter; Schomaker, L.; den Uyl, Marten J.

    2004-01-01

    This work has been done as part of the EU VICAR (IST) project and the EU SCOFI project (IAP). The aim of the first project was to develop a real time video indexing classification annotation and retrieval system. For our systems, we have adapted the approach of Picard and Minka [3], who categorized

  7. Unsupervised classification of variable stars

    Science.gov (United States)

    Valenzuela, Lucas; Pichara, Karim

    2018-03-01

    During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human effort. Every time data come from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.

  8. Automatic indexing, compiling and classification

    International Nuclear Information System (INIS)

    Andreewsky, Alexandre; Fluhr, Christian.

    1975-06-01

    A review of the principles of automatic indexing is followed by a comparison and summing-up of work by the authors and by a Soviet team from the Moscow INFORM-ELECTRO Institute. The mathematical and linguistic problems of the automatic building of thesauri and automatic classification are examined [fr

  9. Aphasia Classification Using Neural Networks

    DEFF Research Database (Denmark)

    Axer, H.; Jantzen, Jan; Berks, G.

    2000-01-01

    A web-based software model (http://fuzzy.iau.dtu.dk/aphasia.nsf) was developed as an example for classification of aphasia using neural networks. Two multilayer perceptrons were used to classify the type of aphasia (Broca, Wernicke, anomic, global) according to the results in some subtests...

  10. Classification Accuracy Is Not Enough

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    2013-01-01

    A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically...

  11. Functions in Biological Kind Classification

    Science.gov (United States)

    Lombrozo, Tania; Rehder, Bob

    2012-01-01

    Biological traits that serve functions, such as a zebra's coloration (for camouflage) or a kangaroo's tail (for balance), seem to have a special role in conceptual representations for biological kinds. In five experiments, we investigate whether and why functional features are privileged in biological kind classification. Experiment 1…

  12. Is classification necessary after Google?

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2012-01-01

    believe that the activity of “classification” is not worth the effort, as search engines can be improved without the heavy cost of providing metadata. Design/methodology/approach – The basic issue in classification is seen as providing criteria for deciding whether A should be classified as X...

  13. Data Augmentation for Plant Classification

    NARCIS (Netherlands)

    Pawara, Pornntiwa; Okafor, Emmanuel; Schomaker, Lambertus; Wiering, Marco

    2017-01-01

    Data augmentation plays a crucial role in increasing the number of training images, which often aids to improve classification performances of deep learning techniques for computer vision problems. In this paper, we employ the deep learning framework and determine the effects of several

  14. Climatic classification of the Karst

    International Nuclear Information System (INIS)

    Eslava Ramirez Jesus Antonio; Bahamon Ayala, Sandra Marcela; Lopez Romero Maria Ines

    2000-01-01

    Climate is one of the main factors in forming or modifying Karsts, or their resulting forms. The determining climatic elements of Karst characteristics are humidity, air circulation and temperature. Many Karstic processes show characteristics corresponding to a given climate sequence. In the present article we discuss the relation between climate and Karst as well as a climate classification based on the structure of the Karsts.

  15. CLASSIFICATION OF LEARNING MANAGEMENT SYSTEMS

    Directory of Open Access Journals (Sweden)

    Yu. B. Popova

    2016-01-01

    The use of information technologies and, in particular, learning management systems increases the opportunities for teachers and students to reach their educational goals. Such systems provide learning content, help organize and monitor training, collect progress statistics and take into account the individual characteristics of each user. Currently, there is a huge inventory of both paid and free systems, physically located either on college servers or in the cloud, offering different feature sets, licensing schemes and costs. This creates the problem of choosing the best system. This problem is partly due to the lack of a comprehensive classification of such systems. Analysis of more than 30 of the most common automated learning management systems has shown that such systems should be classified according to certain criteria under which systems of the same type can be considered. The classification features proposed by the author are: cost, functionality, modularity, meeting the customer’s requirements, integration of content, physical location of the system, and adaptability of training. Considering learning management systems within these classifications and taking into account current trends in their development, it is possible to identify the main requirements for them: functionality, reliability, ease of use, low cost, support for the SCORM standard or Tin Can API, modularity and adaptability. In line with these requirements, the Software Department of FITR BNTU has, under the guidance of the author, been developing, using and continuously improving its own learning management system since 2009.

  16. Crop Classification by Polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Svendsen, Morten Thougaard; Nielsen, Flemming

    1999-01-01

    Polarimetric SAR-data of agricultural fields have been acquired by the Danish polarimetric L- and C-band SAR (EMISAR) during a number of missions at the Danish agricultural test site Foulum during 1995. The data are used to study the classification potential of polarimetric SAR data using...

  17. Emotions Classification for Arabic Tweets

    African Journals Online (AJOL)

    pc

    2018-03-05

    Mar 5, 2018 ... learning methods for referring to all areas of detecting, analyzing, and classifying ... In this paper, an adaptive model is proposed for emotions classification of ... WEKA data mining tool is used to implement this model and evaluate the ... defined using vector representation, storing a numerical. "importance" ...

  18. Correlation Dimension Estimation for Classification

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2006-01-01

    Vol. 1, No. 3 (2006), pp. 547-557 ISSN 1895-8648 R&D Projects: GA MŠk(CZ) 1M0567 Institutional research plan: CEZ:AV0Z10300504 Keywords: correlation dimension * probability density estimation * classification * UCI MLR Subject RIV: BA - General Mathematics

  19. The 2015 World Health Organization Classification of Lung Tumors: Impact of Genetic, Clinical and Radiologic Advances Since the 2004 Classification.

    Science.gov (United States)

    Travis, William D; Brambilla, Elisabeth; Nicholson, Andrew G; Yatabe, Yasushi; Austin, John H M; Beasley, Mary Beth; Chirieac, Lucian R; Dacic, Sanja; Duhig, Edwina; Flieder, Douglas B; Geisinger, Kim; Hirsch, Fred R; Ishikawa, Yuichi; Kerr, Keith M; Noguchi, Masayuki; Pelosi, Giuseppe; Powell, Charles A; Tsao, Ming Sound; Wistuba, Ignacio

    2015-09-01

    The 2015 World Health Organization (WHO) Classification of Tumors of the Lung, Pleura, Thymus and Heart has just been published with numerous important changes from the 2004 WHO classification. The most significant changes in this edition involve (1) use of immunohistochemistry throughout the classification, (2) a new emphasis on genetic studies, in particular, integration of molecular testing to help personalize treatment strategies for advanced lung cancer patients, (3) a new classification for small biopsies and cytology similar to that proposed in the 2011 Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification, (4) a completely different approach to lung adenocarcinoma as proposed by the 2011 Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification, (5) restricting the diagnosis of large cell carcinoma only to resected tumors that lack any clear morphologic or immunohistochemical differentiation with reclassification of the remaining former large cell carcinoma subtypes into different categories, (6) reclassifying squamous cell carcinomas into keratinizing, nonkeratinizing, and basaloid subtypes with the nonkeratinizing tumors requiring immunohistochemistry proof of squamous differentiation, (7) grouping of neuroendocrine tumors together in one category, (8) adding NUT carcinoma, (9) changing the term sclerosing hemangioma to sclerosing pneumocytoma, (10) changing the name hamartoma to "pulmonary hamartoma," (11) creating a group of PEComatous tumors that include (a) lymphangioleiomyomatosis, (b) PEComa, benign (with clear cell tumor as a variant) and (c) PEComa, malignant, (12) introducing the entity pulmonary myxoid sarcoma with an EWSR1-CREB1 translocation, (13) adding the entities myoepithelioma and myoepithelial carcinomas, which can show EWSR1 gene rearrangements, (14) recognition of usefulness of WWTR1-CAMTA1 fusions in diagnosis of epithelioid

  20. Mining gene expression data of multiple sclerosis.

    Directory of Open Access Journals (Sweden)

    Pi Guo

    Microarrays produce large amounts of gene expression data with various biological implications. The challenge is to detect a panel of discriminative genes associated with disease. This study proposed a robust classification model for gene selection using gene expression data, and performed an analysis to identify disease-related genes using multiple sclerosis as an example. Gene expression profiles based on the transcriptome of peripheral blood mononuclear cells from a total of 44 samples, from 26 multiple sclerosis patients and 18 individuals with other neurological diseases (controls), were analyzed. Feature selection algorithms, including Support Vector Machine based on Recursive Feature Elimination, Receiver Operating Characteristic Curve, and Boruta algorithms, were jointly performed to select candidate genes associated with multiple sclerosis. Multiple classification models categorized samples into two different groups based on the identified genes. The models' performance was evaluated using cross-validation methods, and an optimal classifier for gene selection was determined. An overlapping feature set was identified consisting of 8 genes that were differentially expressed between the two phenotype groups. The genes were significantly associated with the pathways of apoptosis and cytokine-cytokine receptor interaction. TNFSF10 was significantly associated with multiple sclerosis. A Support Vector Machine model was established based on the featured genes and gave a practical accuracy of ∼86%. This binary classification model also outperformed the other models in terms of sensitivity, specificity, and F1 score. The combined analytical framework integrating feature ranking algorithms and a Support Vector Machine model could be used for selecting genes for other diseases.
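
    The two ideas in the abstract above that lend themselves to a small worked example are the overlapping feature set (genes picked by all selection methods) and the evaluation measures it names (sensitivity, specificity, F1 score). The following is a minimal sketch only: the gene identifiers, the per-method selections, and the confusion-matrix counts are hypothetical placeholders chosen so that the totals match 26 patients and 18 controls; none of them are results from the study.

```python
# Hypothetical gene selections from three feature-selection methods.
# The identifiers are placeholders, not genes reported by the study.
svm_rfe_genes = {"gene_01", "gene_02", "gene_03", "gene_05"}
roc_genes = {"gene_02", "gene_03", "gene_04", "gene_05"}
boruta_genes = {"gene_02", "gene_03", "gene_05", "gene_06"}

# The "overlapping feature set" is the intersection of all selections.
overlap = svm_rfe_genes & roc_genes & boruta_genes


def binary_metrics(tp, fp, tn, fn):
    """Return common measures for a binary classifier from its confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Illustrative counts only: 26 positives (tp + fn) and 18 negatives (tn + fp).
metrics = binary_metrics(tp=20, fp=3, tn=15, fn=6)
```

    With real data, the three sets would come from the respective selection algorithms and the counts from held-out or cross-validated predictions.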

  1. Whewell on classification and consilience.

    Science.gov (United States)

    Quinn, Aleta

    2017-08-01

    In this paper I sketch William Whewell's attempts to impose order on classificatory mineralogy, which was in Whewell's day (1794-1866) a confused science of uncertain prospects. Whewell argued that progress was impeded by the crude reductionist assumption that all macroproperties of crystals could be straightforwardly explained by reference to the crystals' chemical constituents. By comparison with biological classification, Whewell proposed methodological reforms that he claimed would lead to a natural classification of minerals, which in turn would support advances in causal understanding of the properties of minerals. Whewell's comparison to successful biological classification is particularly striking given that classificatory biologists did not share an understanding of the causal structure underlying the natural classification of life (the common descent with modification of all organisms). Whewell's key proposed methodological reform is consideration of multiple, distinct principles of classification. The most powerful evidence in support of a natural classificatory claim is the consilience of claims arrived at through distinct lines of reasoning, rooted in distinct conceptual approaches to the target objects. Mineralogists must consider not only elemental composition and chemical affinities, but also symmetry and polarity. Geometrical properties are central to what makes an individual mineral the type of mineral that it is. In Whewell's view, function and organization jointly define life, and so are the keys to understanding what makes an organism the type of organism that it is. I explain the relationship between Whewell's teleological account of life and his natural theology. I conclude with brief comments about the importance of Whewell's classificatory theory for the further development of his philosophy of science and in particular his account of consilience. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Functional Basis of Microorganism Classification.

    Science.gov (United States)

    Zhu, Chengsheng; Delmont, Tom O; Vogel, Timothy M; Bromberg, Yana

    2015-08-01

    Correctly identifying nearest "neighbors" of a given microorganism is important in industrial and clinical applications where close relationships imply similar treatment. Microbial classification based on similarity of physiological and genetic organism traits (polyphasic similarity) is experimentally difficult and, arguably, subjective. Evolutionary relatedness, inferred from phylogenetic markers, facilitates classification but does not guarantee functional identity between members of the same taxon or lack of similarity between different taxa. Using over thirteen hundred sequenced bacterial genomes, we built a novel function-based microorganism classification scheme, functional-repertoire similarity-based organism network (FuSiON; flattened to fusion). Our scheme is phenetic, based on a network of quantitatively defined organism relationships across the known prokaryotic space. It correlates significantly with the current taxonomy, but the observed discrepancies reveal both (1) the inconsistency of functional diversity levels among different taxa and (2) an (unsurprising) bias towards prioritizing, for classification purposes, relatively minor traits of particular interest to humans. Our dynamic network-based organism classification is independent of the arbitrary pairwise organism similarity cut-offs traditionally applied to establish taxonomic identity. Instead, it reveals natural, functionally defined organism groupings and is thus robust in handling organism diversity. Additionally, fusion can use organism meta-data to highlight the specific environmental factors that drive microbial diversification. Our approach provides a complementary view to cladistic assignments and holds important clues for further exploration of microbial lifestyles. Fusion is a more practical fit for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment and are less concerned with

  3. Functional Basis of Microorganism Classification

    Science.gov (United States)

    Zhu, Chengsheng; Delmont, Tom O.; Vogel, Timothy M.; Bromberg, Yana

    2015-01-01

    Correctly identifying nearest “neighbors” of a given microorganism is important in industrial and clinical applications where close relationships imply similar treatment. Microbial classification based on similarity of physiological and genetic organism traits (polyphasic similarity) is experimentally difficult and, arguably, subjective. Evolutionary relatedness, inferred from phylogenetic markers, facilitates classification but does not guarantee functional identity between members of the same taxon or lack of similarity between different taxa. Using over thirteen hundred sequenced bacterial genomes, we built a novel function-based microorganism classification scheme, functional-repertoire similarity-based organism network (FuSiON; flattened to fusion). Our scheme is phenetic, based on a network of quantitatively defined organism relationships across the known prokaryotic space. It correlates significantly with the current taxonomy, but the observed discrepancies reveal both (1) the inconsistency of functional diversity levels among different taxa and (2) an (unsurprising) bias towards prioritizing, for classification purposes, relatively minor traits of particular interest to humans. Our dynamic network-based organism classification is independent of the arbitrary pairwise organism similarity cut-offs traditionally applied to establish taxonomic identity. Instead, it reveals natural, functionally defined organism groupings and is thus robust in handling organism diversity. Additionally, fusion can use organism meta-data to highlight the specific environmental factors that drive microbial diversification. Our approach provides a complementary view to cladistic assignments and holds important clues for further exploration of microbial lifestyles. Fusion is a more practical fit for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment and are less concerned

  4. 2008 International Conference on Ectodermal Dysplasias Classification Conference Report

    Science.gov (United States)

    Salinas, Carlos F.; Jorgenson, Ronald J.; Wright, J. Timothy; DiGiovanna, John J.; Fete, Mary D.

    2009-01-01

    There are many ways to classify ectodermal dysplasia syndromes. Clinicians in practice use a list of syndromes from which to choose a potential diagnosis, paging through a volume, such as Freire-Maia and Pinheiro's corpus, matching their patient's findings to listed syndromes. Medical researchers may want a list of syndromes that share one (monothetic system) or several (polythetic system) traits in order to focus research on a narrowly defined group. Special interest groups may want a list from which they can choose constituencies, and insurance companies and government agencies may want a list to determine for whom to provide (or deny) health care coverage. Furthermore, various molecular biologists are now promoting classification systems based on gene mutation (e.g. TP63 associated syndromes) or common molecular pathways. The challenge will be to balance comprehensiveness within the classification with usability and accessibility so that the benefits truly serve the needs of researchers, health care providers and ultimately the individuals and families directly affected by ectodermal dysplasias. It is also recognized that a new classification approach is an ongoing process and will require periodical reviews or updates. Whatever scheme is developed, however, will have far-reaching application for other groups of disorders for which classification is complicated by the number of interested parties and advances in diagnostic acumen. Consensus among interested parties is necessary for optimizing communication among the diverse groups whether it be for equitable distribution of funds, correctness of diagnosis and treatment, or focusing research efforts. PMID:19681152
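
    The distinction drawn above between a monothetic list (syndromes sharing one trait) and a polythetic list (syndromes sharing several traits) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the syndrome names and trait assignments are hypothetical placeholders, and "at least `min_shared` of the listed traits" is only one possible reading of "several".

```python
# Hypothetical trait table; syndrome names and trait assignments are placeholders.
SYNDROMES = {
    "syndrome_A": {"hair", "teeth", "nails"},
    "syndrome_B": {"teeth", "sweat_glands"},
    "syndrome_C": {"hair"},
    "syndrome_D": {"nails", "sweat_glands", "teeth"},
}


def monothetic(table, trait):
    """List the syndromes that all share one required trait."""
    return sorted(name for name, traits in table.items() if trait in traits)


def polythetic(table, traits, min_shared):
    """List the syndromes that show at least `min_shared` of the given traits."""
    wanted = set(traits)
    return sorted(
        name for name, present in table.items()
        if len(present & wanted) >= min_shared
    )


teeth_group = monothetic(SYNDROMES, "teeth")
multi_trait_group = polythetic(SYNDROMES, {"hair", "teeth", "nails"}, min_shared=2)
```

    In the monothetic case a single trait is necessary and sufficient for membership; in the polythetic case no single trait is required, only enough overlap with the trait set.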

  5. Towards an integrated phylogenetic classification of the Tremellomycetes.

    Science.gov (United States)

    Liu, X-Z; Wang, Q-M; Göker, M; Groenewald, M; Kachalkin, A V; Lumbsch, H T; Millanes, A M; Wedin, M; Yurkov, A M; Boekhout, T; Bai, F-Y

    2015-06-01

    Families and genera assigned to Tremellomycetes have been mainly circumscribed by morphology and for the yeasts also by biochemical and physiological characteristics. This phenotype-based classification is largely in conflict with molecular phylogenetic analyses. Here a phylogenetic classification framework for the Tremellomycetes is proposed based on the results of phylogenetic analyses from a seven-gene dataset covering the majority of tremellomycetous yeasts and closely related filamentous taxa. Circumscriptions of the taxonomic units recognised at the order, family and genus levels were quantitatively assessed using the phylogenetic rank boundary optimisation (PRBO) and modified general mixed Yule coalescent (GMYC) tests. In addition, a comprehensive phylogenetic analysis on an expanded LSU rRNA (D1/D2 domains) gene sequence dataset covering as many of the available teleomorphic and filamentous taxa within Tremellomycetes as possible was performed to investigate the relationships between yeasts and filamentous taxa and to examine the stability of undersampled clades. Based on the results inferred from molecular data and morphological and physiochemical features, we propose an updated classification for the Tremellomycetes. We accept five orders, 17 families and 54 genera, including seven new families and 18 new genera. In addition, seven families and 17 genera are emended and one new species name and 185 new combinations are proposed. We propose to use the term pro tempore or pro tem. in abbreviation to indicate the species names that are temporarily maintained.

  6. Screening and classification of ceramic powders

    Science.gov (United States)

    Miwa, S.

    1983-01-01

    A summary is given of the classification technology of ceramic powders. Advantages and disadvantages of the wet and dry screening and classification methods are discussed. Improvements of wind force screening devices are described.

  7. 5 CFR 1312.3 - Classification requirements.

    Science.gov (United States)

    2010-01-01

    ..., DOWNGRADING, DECLASSIFICATION AND SAFEGUARDING OF NATIONAL SECURITY INFORMATION Classification and Declassification of National Security Information § 1312.3 Classification requirements. United States citizens must...; (5) Scientific, technological, or economic matters relating to the national security; (6) United...

  8. 14 CFR 1203.412 - Classification guides.

    Science.gov (United States)

    2010-01-01

    ... of the classification designations (i.e., Top Secret, Secret or Confidential) apply to the identified... writing by an official with original Top Secret classification authority; the identity of the official...

  9. Classification guide: Paralympic Games London 2012

    OpenAIRE

    2013-01-01

    The London 2012 Paralympic Games Classification Guide is designed to provide National Paralympic Committees (NPCs) and International Paralympic Sport Federations (IPSFs) with information about the classification policies and procedures that will apply to the London 2012 Paralympic Games.

  10. Classification guide: Sochi 2014 Paralympic Winter Games

    OpenAIRE

    2014-01-01

    The Sochi 2014 Paralympic Winter Games classification guide is designed to provide National Paralympic Committees (NPCs) and International Federations (IFs) with information about the classification policies and procedures that will apply to the Sochi 2014 Paralympic Winter Games.

  11. Improvement of Classification of Enterprise Circulating Funds

    OpenAIRE

    Rohanova Hanna O.

    2014-01-01

    The goal of the article lies in revelation of possibilities of increase of efficiency of managing enterprise circulating funds by means of improvement of their classification features. Having analysed approaches of many economists to classification of enterprise circulating funds, systemised and supplementing them, the article offers grouping classification features of enterprise circulating funds. In the result of the study the article offers an expanded classification of circulating funds, ...

  12. 10 CFR 61.55 - Waste classification.

    Science.gov (United States)

    2010-01-01

    ... REGULATORY COMMISSION (CONTINUED) LICENSING REQUIREMENTS FOR LAND DISPOSAL OF RADIOACTIVE WASTE Technical Requirements for Land Disposal Facilities § 61.55 Waste classification. (a) Classification of waste for near surface disposal—(1) Considerations. Determination of the classification of radioactive waste involves two...

  13. 6 CFR 7.26 - Derivative classification.

    Science.gov (United States)

    2010-01-01

    ... 6 Domestic Security 1 2010-01-01 2010-01-01 false Derivative classification. 7.26 Section 7.26 Domestic Security DEPARTMENT OF HOMELAND SECURITY, OFFICE OF THE SECRETARY CLASSIFIED NATIONAL SECURITY INFORMATION Classified Information § 7.26 Derivative classification. (a) Derivative classification is defined...

  14. 22 CFR 9.6 - Derivative classification.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Derivative classification. 9.6 Section 9.6 Foreign Relations DEPARTMENT OF STATE GENERAL SECURITY INFORMATION REGULATIONS § 9.6 Derivative classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating or...

  15. 46 CFR 76.50-5 - Classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 3 2010-10-01 2010-10-01 false Classification. 76.50-5 Section 76.50-5 Shipping COAST... Classification. (a) Hand portable fire extinguishers and semiportable fire extinguishing systems shall be... extinguishing systems are set forth in table 76.50-5(c). Table 76.50-5(c) Classification Type Size Soda acid and...

  16. 12 CFR 560.160 - Asset classification.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 5 2010-01-01 2010-01-01 false Asset classification. 560.160 Section 560.160... Lending and Investment Provisions Applicable to all Savings Associations § 560.160 Asset classification... consistent with, or reconcilable to, the asset classification system used by OTS in its Thrift Activities...

  17. 14 CFR 298.3 - Classification.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 4 2010-01-01 2010-01-01 false Classification. 298.3 Section 298.3... REGULATIONS EXEMPTIONS FOR AIR TAXI AND COMMUTER AIR CARRIER OPERATIONS General § 298.3 Classification. (a) There is hereby established a classification of air carriers, designated as “air taxi operators,” which...

  18. 6 CFR 7.30 - Classification challenges.

    Science.gov (United States)

    2010-01-01

    ... 6 Domestic Security 1 2010-01-01 2010-01-01 false Classification challenges. 7.30 Section 7.30... INFORMATION Classified Information § 7.30 Classification challenges. (a) Authorized holders of information... classified are encouraged and expected to challenge the classification status of that information pursuant to...

  19. 14 CFR 1203.701 - Classification.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Classification. 1203.701 Section 1203.701... Government Information § 1203.701 Classification. (a) Foreign government information that is classified by a foreign entity shall either retain its original classification designation or be marked with a United...

  20. 32 CFR 1602.7 - Classification.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Classification. 1602.7 Section 1602.7 National Defense Other Regulations Relating to National Defense SELECTIVE SERVICE SYSTEM DEFINITIONS § 1602.7 Classification. Classification is the exercise of the power to determine claims or questions with respect to...

  1. 32 CFR 644.426 - Classification.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 4 2010-07-01 2010-07-01 true Classification. 644.426 Section 644.426 National... HANDBOOK Disposal Disposal of Fee-Owned Real Property and Easement Interests § 644.426 Classification... required by the special acts, classification will be coordinated with the interested Federal agency. The...

  2. 46 CFR Sec. 18 - Group classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 8 2010-10-01 2010-10-01 false Group classification. Sec. 18 Section 18 Shipping... Sec. 18 Group classification. In the preparation of specifications, Job Orders, Supplemental Job... inserted thereon: Number Classification 41 Maintenance Repairs (deck, engine and stewards department...

  3. 10 CFR 1045.37 - Classification guides.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Classification guides. 1045.37 Section 1045.37 Energy DEPARTMENT OF ENERGY (GENERAL PROVISIONS) NUCLEAR CLASSIFICATION AND DECLASSIFICATION Generation and Review of Documents Containing Restricted Data and Formerly Restricted Data § 1045.37 Classification guides...

  4. 46 CFR 193.50-5 - Classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 7 2010-10-01 2010-10-01 false Classification. 193.50-5 Section 193.50-5 Shipping COAST... Details § 193.50-5 Classification. (a) Hand portable fire extinguishers and semiportable fire...) Classification Type Size Soda-acid and water, gals. Foam, gals. Carbon dioxide, lbs. Dry chemical, lbs. A II 21/2...

  5. Border Lakes land-cover classification

    Science.gov (United States)

    Marvin Bauer; Brian Loeffelholz; Doug. Shinneman

    2009-01-01

    This document contains metadata and description of land-cover classification of approximately 5.1 million acres of land bordering Minnesota, U.S.A. and Ontario, Canada. The classification focused on the separation and identification of specific forest-cover types. Some separation of the nonforest classes also was performed. The classification was derived from multi-...

  6. 22 CFR 9a.4 - Classification.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Classification. 9a.4 Section 9a.4 Foreign... ENERGY PROGRAMS; RELATED MATERIAL § 9a.4 Classification. (a) Section 1 of E.O. 11932, August 4, 1976.... If the officer determines that the information or material warrants classification, he shall assign...

  7. 75 FR 10529 - Mail Classification Change

    Science.gov (United States)

    2010-03-08

    ... POSTAL REGULATORY COMMISSION [Docket Nos. MC2010-19; Order No. 415] Mail Classification Change...-filed Postal Service request to make a minor modification to the Mail Classification Schedule. The.... concerning a change in classification which reflects a change in terminology from Bulk Mailing Center (BMC...

  8. 7 CFR 51.1903 - Size classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Size classification. 51.1903 Section 51.1903... STANDARDS) United States Consumer Standards for Fresh Tomatoes Size and Maturity Classification § 51.1903 Size classification. The following terms may be used for describing the size of the tomatoes in any lot...

  9. 33 CFR 154.1216 - Facility classification.

    Science.gov (United States)

    2010-07-01

    ... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false Facility classification. 154.1216... Vegetable Oils Facilities § 154.1216 Facility classification. (a) The Coast Guard classifies facilities that... classification of a facility that handles, stores, or transports animal fats or vegetable oils. The COTP may...

  10. 7 CFR 1794.31 - Classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 12 2010-01-01 2010-01-01 false Classification. 1794.31 Section 1794.31 Agriculture... Classification. (a) Electric and telecommunications programs. RUS will normally determine the proper environmental classification of projects based on its evaluation of the project description set forth in the...

  11. 76 FR 47614 - Mail Classification Change

    Science.gov (United States)

    2011-08-05

    ... POSTAL REGULATORY COMMISSION [Docket No. MC2011-27; Order No. 785] Mail Classification Change...-filed Postal Service request for a change in classification to the ``Reply Rides Free'' program. The... Service filed a notice of classification change pursuant to 39 CFR 3020.90 and 3020.91 concerning the...

  12. 32 CFR 1602.13 - Judgmental Classification.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Judgmental Classification. 1602.13 Section 1602.13 National Defense Other Regulations Relating to National Defense SELECTIVE SERVICE SYSTEM DEFINITIONS § 1602.13 Judgmental Classification. A classification action relating to a registrant's claim for...

  13. 7 CFR 51.1904 - Maturity classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Maturity classification. 51.1904 Section 51.1904... STANDARDS) United States Consumer Standards for Fresh Tomatoes Size and Maturity Classification § 51.1904 Maturity classification. Tomatoes which are characteristically red when ripe, but are not overripe or soft...

  14. Pattern Classification with Memristive Crossbar Circuits

    Science.gov (United States)

    2016-03-31

    Pattern Classification with Memristive Crossbar Circuits Dmitri B. Strukov Department of Electrical and Computer Engineering Department UC Santa...pattern classification; deep learning; convolutional neural network networks. Introduction Deep-learning convolutional neural networks (DLCNN), which...the best classification performances on a variety of benchmark tasks [1]. The major challenge in building fast and energy-efficient networks of this

  15. 46 CFR 132.210 - Classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 4 2010-10-01 2010-10-01 false Classification. 132.210 Section 132.210 Shipping COAST... Portable and Semiportable Fire Extinguishers § 132.210 Classification. (a) Each portable fire extinguisher... Classification Type Size Halon 1211, 1301, and 1211-1301 mixtures kgs. (lbs.) Foam, liters (gallons) Carbon...

  16. 32 CFR 2400.34 - Classification.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Classification. 2400.34 Section 2400.34 National... Government Information § 2400.34 Classification. (a) Foreign government information classified by a foreign government or international organization of governments shall retain its original classification designation...

  17. 7 CFR 51.1402 - Size classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Size classification. 51.1402 Section 51.1402... STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Size Classification § 51.1402 Size classification. Size of pecans may be specified in connection with the grade in accordance with one of the...

  18. Maxillectomy defects: a suggested classification scheme.

    Science.gov (United States)

    Akinmoladun, V I; Dosumu, O O; Olusanya, A A; Ikusika, O F

    2013-06-01

    The term "maxillectomy" has been used to describe a variety of surgical procedures for a spectrum of diseases involving a diverse anatomical site. Hence, classifications of maxillectomy defects have often made communication difficult. This article highlights this problem, emphasises the need for a uniform system of classification and suggests a classification system which is simple and comprehensive. Articles related to this subject, especially those with specified classifications of maxillary surgical defects were sourced from the internet through Google, Scopus and PubMed using the search terms maxillectomy defects classification. A manual search through available literature was also done. The review of the materials revealed many classifications and modifications of classifications from the descriptive, reconstructive and prosthodontic perspectives. No globally acceptable classification exists among practitioners involved in the management of diseases in the mid-facial region. There were over 14 classifications of maxillary defects found in the English literature. Attempts made to address the inadequacies of previous classifications have tended to result in cumbersome and relatively complex classifications. A single classification that is based on both surgical and prosthetic considerations is most desirable and is hereby proposed.

  19. Angle's Molar Classification Revisited

    Directory of Open Access Journals (Sweden)

    Devanshi Yadav

    2014-01-01

    Results: Of the 500 pretreatment study casts assessed, 52.4% were definitive Class I, 23.6% were Class II, 2.6% were Class III, and 21% were ambiguous cases. These could be easily classified with our method of classification. Conclusion: This improvised classification technique will help orthodontists make the classification of malocclusion accurate and simple.

  20. New guidelines for dam safety classification

    International Nuclear Information System (INIS)

    Dascal, O.

    1999-01-01

    Elements of recommended new guidelines for the safety classification of dams are outlined. Arguments are provided for the view that dam classification should rely on more than one system, as follows: (a) classification for selection of design criteria, operation procedures and emergency measures plans, based on potential consequences of a dam failure - the hazard classification of water retaining structures; (b) classification for establishment of surveillance activities and for safety evaluation of dams, based on the probability and consequences of failure - the risk classification of water retaining structures; and (c) classification for establishment of water management plans, for safety evaluation of the entire project, for preparation of emergency measures plans, for definition of the frequency and extent of maintenance operations, and for evaluation of changes and modifications required - the hazard classification of the project. The hazard classification of the dam considers, as consequences, mainly the loss of lives or persons in jeopardy and the property damage to third parties. The difficulty in determining the risk classification of the dam is that no tool exists to evaluate the probability of the dam's failure. To overcome this, the probability of failure can be replaced by a set of dam characteristics that express the failure potential of the dam and its foundation. The hazard classification of the entire project is based on the probable consequences of dam failure in terms of loss of life, persons in jeopardy, and property and environmental damage. The classification scheme is illustrated for dam-threatening events such as earthquakes and floods. 17 refs., 5 tabs

  1. Stellar Spectral Classification with Locality Preserving Projections ...

    Indian Academy of Sciences (India)

    With the help of computer tools and algorithms, automatic stellar spectral classification has become an area of current interest. The process of stellar spectral classification mainly includes two steps: dimension reduction and classification. As a popular dimensionality reduction technique, Principal Component Analysis (PCA) ...

  2. Classification of high resolution satellite images

    OpenAIRE

    Karlsson, Anders

    2003-01-01

    In this thesis the Support Vector Machine (SVM) is applied to the classification of high resolution satellite images. Several different measures for classification, including texture measures, 1st order statistics, and simple contextual information, were evaluated. Additionally, the image was segmented, using an enhanced watershed method, in order to improve the classification accuracy.

  3. The Classification of Romanian High-Schools

    Science.gov (United States)

    Ivan, Ion; Milodin, Daniel; Naie, Lucian

    2006-01-01

    The article tries to tackle the issue of high-schools classification from one city, district or from Romania. The classification criteria are presented. The National Database of Education is also presented and the application of criteria is illustrated. An algorithm for high-school multi-rang classification is proposed in order to build classes of…

  4. Hydropedological insights when considering catchment classification

    NARCIS (Netherlands)

    Bouma, J.; Droogers, P.; Sonneveld, M.P.W.; Ritsema, C.J.; Hunink, J.E.; Immerzeel, W.W.; Kauffman, S.

    2011-01-01

    Soil classification systems are analysed to explore the potential of developing classification systems for catchments. Soil classifications are useful to create systematic order in the overwhelming quantity of different soils in the world and to extrapolate data available for a given soil type to

  5. 28 CFR 524.73 - Classification procedures.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 2 2010-07-01 2010-07-01 false Classification procedures. 524.73 Section 524.73 Judicial Administration BUREAU OF PRISONS, DEPARTMENT OF JUSTICE INMATE ADMISSION, CLASSIFICATION, AND TRANSFER CLASSIFICATION OF INMATES Central Inmate Monitoring (CIM) System § 524.73...

  6. 22 CFR 9.4 - Original classification.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Original classification. 9.4 Section 9.4 Foreign Relations DEPARTMENT OF STATE GENERAL SECURITY INFORMATION REGULATIONS § 9.4 Original classification. (a) Definition. Original classification is the initial determination that certain information...

  7. 5 CFR 2500.3 - Original classification.

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Original classification. 2500.3 Section... SECURITY REGULATION § 2500.3 Original classification. No one in the Office of Administration has been granted authority for original classification of information. ...

  8. Congenital neutropenia in the era of genomics: classification, diagnosis, and natural history.

    Science.gov (United States)

    Donadieu, Jean; Beaupain, Blandine; Fenneteau, Odile; Bellanné-Chantelot, Christine

    2017-11-01

    This review focuses on the classification, diagnosis and natural history of congenital neutropenia (CN). CN encompasses a number of genetic disorders with chronic neutropenia and, for some, affecting other organ systems, such as the pancreas, central nervous system, heart, bone and skin. To date, 24 distinct genes have been associated with CN. The number of genes involved makes gene screening difficult. This can be solved by next-generation sequencing (NGS) of targeted gene panels. One of the major complications of CN is spontaneous leukaemia, which is preceded by clonal somatic evolution, and can be screened by a targeted NGS panel focused on somatic events. © 2017 John Wiley & Sons Ltd.

  9. Rough set soft computing cancer classification and network: one stone, two birds.

    Science.gov (United States)

    Zhang, Yue

    2010-07-15

    Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.

  10. Classification of symmetric toroidal orbifolds

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, Maximilian; Ratz, Michael; Torrado, Jesus [Technische Univ. Muenchen, Garching (Germany). Physik-Department; Vaudrevange, Patrick K.S. [Deutsches Elektronen-Synchrotron (DESY), Hamburg (Germany)

    2012-09-15

    We provide a complete classification of six-dimensional symmetric toroidal orbifolds which yield N{>=}1 supersymmetry in 4D for the heterotic string. Our strategy is based on a classification of crystallographic space groups in six dimensions. We find in total 520 inequivalent toroidal orbifolds, 162 of them with Abelian point groups such as Z{sub 3}, Z{sub 4}, Z{sub 6}-I etc. and 358 with non-Abelian point groups such as S{sub 3}, D{sub 4}, A{sub 4} etc. We also briefly explore the properties of some orbifolds with Abelian point groups and N=1, i.e. specify the Hodge numbers and comment on the possible mechanisms (local or non-local) of gauge symmetry breaking.

  11. Odor Classification using Agent Technology

    Directory of Open Access Journals (Sweden)

    Sigeru OMATU

    2014-03-01

    In order to measure and classify odors, a Quartz Crystal Microbalance (QCM) can be used. In the present study, seven QCM sensors and three different odors are used. The system has been developed as a virtual organization of agents using an agent platform called PANGEA (Platform for Automatic coNstruction of orGanizations of intElligent Agents). This is a platform for developing open multi-agent systems, specifically those including organizational aspects. The main reason for the use of agents is the scalability of the platform, i.e. the way in which it models the services. The system models functionalities as services inside the agents, or as Service Oriented Approach (SOA) architecture compliant services using Web Services. This allows the odor classification systems to be adapted with new algorithms, tools and classification techniques.

  12. Classification differences and maternal mortality

    DEFF Research Database (Denmark)

    Salanave, B; Bouvier-Colle, M H; Varnoux, N

    1999-01-01

    OBJECTIVES: To compare the ways maternal deaths are classified in national statistical offices in Europe and to evaluate the ways classification affects published rates. METHODS: Data on pregnancy-associated deaths were collected in 13 European countries. Cases were classified by a European panel....... This change was substantial in three countries (P statistical offices appeared to attribute fewer deaths to obstetric causes. In the other countries, no differences were detected. According to official published data, the aggregated maternal mortality rate for participating countries was 7.7 per...... of experts into obstetric or non-obstetric causes. An ICD-9 code (International Classification of Diseases) was attributed to each case. These were compared to the codes given in each country. Correction indices were calculated, giving new estimates of maternal mortality rates. SUBJECTS: There were...

  13. Critical Evaluation of Headache Classifications

    OpenAIRE

    ÖZGE, Aynur

    2013-01-01

    Transforming a subjective sense like headache into an objective state, and establishing a common language for this complaint, which can be both a symptom and a disease in itself, have kept investigators busy for years. Each recommendation proposed has brought along a set of patients who do not meet the criteria. While studies toward the most ideal and most comprehensive classification continued at this point, criticisms about withdrawing from daily practice came to the fore. I...

  14. Classification of simple current invariants

    CERN Document Server

    Gato-Rivera, Beatriz

    1992-01-01

    We summarize recent work on the classification of modular invariant partition functions that can be obtained with simple currents in theories with a center (Z_p)^k with p prime. New empirical results for other centers are also presented. Our observation that the total number of invariants is monodromy-independent for (Z_p)^k appears to be true in general as well. (Talk presented in the parallel session on string theory of the Lepton-Photon/EPS Conference, Geneva, 1991.)

  15. Collective Classification in Network Data

    OpenAIRE

    Sen, Prithviraj; Namata, Galileo; Bilgic, Mustafa; Getoor, Lise; University of Maryland; Galligher, Brian; Eliassi-Rad, Tina

    2008-01-01

    Many real-world applications produce networked data such as the world-wide web (hypertext documents connected via hyperlinks), social networks (for example, people connected by friendship links), communication networks (computers connected via communication links) and biological networks (for example, protein interaction networks). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such networks. In this a...

  16. Texture classification using autoregressive filtering

    Science.gov (United States)

    Lawton, W. M.; Lee, M.

    1984-01-01

    A general theory of image texture models is proposed and its applicability to the problem of scene segmentation using texture classification is discussed. An algorithm is described that is based on half-plane autoregressive filtering and optimally utilizes second order statistics to discriminate between texture classes represented by arbitrary wide sense stationary random fields. Empirical results of applying this algorithm to natural and synthesized scenes are presented and future research is outlined.

  17. Classification of posterior vitreous detachment

    OpenAIRE

    Kakehashi, Akihiro; Takezawa, Mikiko; Akiba, Jun

    2013-01-01

    Akihiro Kakehashi,1 Mikiko Takezawa,1 Jun Akiba2. 1Department of Ophthalmology, Jichi Medical University, Saitama Medical Center, Saitama; 2Kanjodori Eye Clinic, Asahikawa, Japan. Abstract: Diagnosing a posterior vitreous detachment (PVD) is important for predicting the prognosis and determining the indication for vitreoretinal surgery in many vitreoretinal diseases. This article presents both classifications of a PVD by slit-lamp biomicroscopy and of a shallow PVD by optical coherence tomography...

  18. A Classification Table for Achondrites

    Science.gov (United States)

    Chennaoui-Aoudjehane, H.; Larouci, N.; Jambon, A.; Mittlefehldt, D. W.

    2014-01-01

    Classifying chondrites is relatively easy and the criteria are well documented. It is based on mineral compositions, textural characteristics and, more recently, magnetic susceptibility. It can be more difficult to classify achondrites, especially those that are very similar to terrestrial igneous rocks, because mineralogical, textural and compositional properties can be quite variable. Achondrites contain essentially olivine, pyroxenes, plagioclases, oxides, sulphides and accessory minerals. Their origin is attributed to differentiated parent bodies: large asteroids (Vesta); planets (Mars); a satellite (the Moon); and numerous asteroids of unknown size. In most cases, achondrites are not eyewitnessed falls and some do not have fusion crust. Because some achondrites resemble terrestrial igneous rocks in mineralogy and magnetic susceptibility, it can be difficult for classifiers to confirm their extraterrestrial origin. We, as classifiers of meteorites, are confronted with this problem with every suspected achondrite we receive for identification. We are developing a "grid" of classification to provide an easier approach for initial classification. We use simple but reproducible criteria based on mineralogical, petrological and geochemical studies. We presented the classes: acapulcoites, lodranites, winonaites and Martian meteorites (shergottites, chassignites, nakhlites). In this work we are completing the classification table by including the groups: angrites, aubrites, brachinites, ureilites, HED (howardites, eucrites, and diogenites), lunar meteorites, pallasites and mesosiderites. Iron meteorites are not presented in this abstract.

  19. Fuzzy support vector machine for microarray imbalanced data classification

    Science.gov (United States)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarray data contain gene expression measurements with small sample sizes and a high number of features. Furthermore, class imbalance is a common problem in microarray data. It occurs when a dataset is dominated by a class that has significantly more instances than the minority classes. Therefore, a classification method is needed that solves the problem of high-dimensional and imbalanced data. Support Vector Machine (SVM) is a classification method capable of handling large or small samples, nonlinearity, high dimensionality, overlearning and local-minimum issues. SVM has been widely applied to DNA microarray data classification, and it has been shown that SVM provides the best performance among machine learning methods. However, imbalanced data are a problem because SVM treats all samples as equally important, so the results are biased with respect to the minority class. To overcome the imbalance, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM such that different input points make different contributions to the classifier. The minority classes have large fuzzy memberships, so FSVM can pay more attention to the samples with larger fuzzy membership. Because DNA microarray data are high dimensional with a very large number of features, feature selection is first performed using the Fast Correlation-Based Filter (FCBF). In this study, the data are analyzed by SVM and FSVM, each with and without FCBF, and the classification performance of the methods is compared. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.
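
    The membership idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the fuzzy memberships are computed, so the rule used here (membership equal to the smallest class size divided by the sample's own class size) is an assumption, and the function name fuzzy_memberships is hypothetical. Under that rule, minority-class samples receive membership 1.0 and majority-class samples receive less.

```python
from collections import Counter


def fuzzy_memberships(labels):
    """Return one membership value in (0, 1] per sample.

    Assumption (not taken from the abstract): membership is
    n_min / n_class, where n_min is the size of the smallest class
    and n_class is the size of the sample's own class.
    """
    counts = Counter(labels)
    n_min = min(counts.values())
    return [n_min / counts[y] for y in labels]


# Toy imbalanced label vector: one minority sample, four majority samples.
labels = ["tumor", "normal", "normal", "normal", "normal"]
memberships = fuzzy_memberships(labels)
```

    Values of this kind could then be supplied as per-sample weights to any SVM trainer that accepts them, which is one common way to approximate the reformulated objective.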

  20. [Landscape classification: research progress and development trend].

    Science.gov (United States)

    Liang, Fa-Chao; Liu, Li-Ming

    2011-06-01

    Landscape classification is the basis of research on landscape structure, process, and function, and also the prerequisite for landscape evaluation, planning, protection, and management, directly affecting the precision and practicability of landscape research. This paper reviewed the research progress on the landscape classification system, theory, and methodology, and summarized the key problems and deficiencies of current research. Some major landscape classification systems, e.g., LANMAP and MUFIC, were introduced and discussed. It was suggested that a qualitative and quantitative comprehensive classification based on the ideology of functional structure shape and on the integral consideration of landscape classification utility, landscape function, landscape structure, physiogeographical factors, and human disturbance intensity should be the major research direction in the future. The integration of mapping, 3S technology, quantitative mathematics modeling, computer artificial intelligence, and professional knowledge to enhance the precision of landscape classification would be the key issue and the development trend in landscape classification research.

  1. Classifications of Patterned Hair Loss: A Review.

    Science.gov (United States)

    Gupta, Mrinal; Mysore, Venkataram

    2016-01-01

    Patterned hair loss is the most common cause of hair loss seen in both the sexes after puberty. Numerous classification systems have been proposed by various researchers for grading purposes. These systems vary from the simpler systems based on recession of the hairline to the more advanced multifactorial systems based on the morphological and dynamic parameters that affect the scalp and the hair itself. Most of these preexisting systems have certain limitations. Currently, the Hamilton-Norwood classification system for males and the Ludwig system for females are most commonly used to describe patterns of hair loss. In this article, we review the various classification systems for patterned hair loss in both the sexes. Relevant articles were identified through searches of MEDLINE and EMBASE. Search terms included but were not limited to androgenic alopecia classification, patterned hair loss classification, male pattern baldness classification, and female pattern hair loss classification. Further publications were identified from the reference lists of the reviewed articles.

  2. Classification of huminite-ICCP System 1994

    Energy Technology Data Exchange (ETDEWEB)

    Sykorova, I. [Institute of Rock Structure and Mechanics, Academy of Science of the Czech Republic, V Holesovicka 41, 182 09 Prague 8 (Czech Republic); Pickel, W. [Coal and Organic Petrology Services Pty Ltd, 23/80 Box Road, Taren Point, NSW 2229 (Australia); Christanis, K. [Department of Geology, University of Patras, 26500 Rio-Patras (Greece); Wolf, M. [Mergelskull 29, 47802 Krefeld (Germany); Taylor, G.H. [15 Hawkesbury Cres, Farrer Act 2607 (Australia); Flores, D. [Departamento de Geologia, Faculdade de Ciencias do Porto, Praca de Gomes Teixeira, 4099-002 Porto (Portugal)

    2005-04-12

    In the new classification (ICCP System 1994), the maceral group huminite has been revised from the previous classification (ICCP, 1971. Int. Handbook Coal Petr., suppl. to 2nd ed.) to accommodate the nomenclature to changes in the other maceral groups, especially the changes in the vitrinite classification (ICCP, 1998. The new vitrinite classification (ICCP System 1994). Fuel 77, 349-358.). The vitrinite and huminite systems have been correlated so that down to the level of sub-maceral groups, the two systems can be used in parallel. At the level of macerals and for finer classifications, the analyst now has, according to the nature of the coal and the purpose of the analysis, a choice of using either of the two classification systems for huminite and vitrinite. This is in accordance with the new ISO Coal Classification that covers low rank coals as well and allows for the simultaneous use of the huminite and vitrinite nomenclature for low rank coals.

  3. Protist classification and the kingdoms of organisms.

    Science.gov (United States)

    Whittaker, R H; Margulis, L

    1978-04-01

    Traditional classification imposed a division into plant-like and animal-like forms on the unicellular eukaryotes, or protists; in a current view the protists are a diverse assemblage of plant-, animal- and fungus-like groups. Classification of these into phyla is difficult because of their relatively simple structure and limited geological record, but study of ultrastructure and other characteristics is providing new insight on protist classification. Possible classifications are discussed, and a summary classification of the living world into kingdoms (Monera, Protista, Fungi, Animalia, Plantae) and phyla is suggested. This classification also suggests groupings of phyla into superphyla and form-superphyla, and a broadened kingdom Protista (including green algae, oomycotes and slime molds but excluding red and brown algae). The classification thus seeks to offer a compromise between the protist and protoctist kingdoms of Whittaker and Margulis and to combine a full listing of phyla with grouping of these for synoptic treatment.

  4. Homeobox genes and melatonin synthesis

    DEFF Research Database (Denmark)

    Rohde, Kristian; Møller, Morten; Rath, Martin Fredensborg

    2014-01-01

    Nocturnal synthesis of melatonin in the pineal gland is controlled by a circadian rhythm in arylalkylamine N-acetyltransferase (AANAT) enzyme activity. In the rodent, Aanat gene expression displays a marked circadian rhythm; release of norepinephrine in the gland at night causes a cAMP-based induction of Aanat transcription. However, additional transcriptional control mechanisms exist. Homeobox genes, which are generally known to encode transcription factors controlling developmental processes, are also expressed in the mature rodent pineal gland. Among these, the cone-rod homeobox (CRX) transcription factor is believed to control pineal-specific Aanat expression. Based on recent advances in our understanding of Crx in the rodent pineal gland, we here suggest that homeobox genes play a role in adult pineal physiology both by ensuring pineal-specific Aanat expression and by facilitating c...

  5. Congenital muscular dystrophies--problems of classification.

    Science.gov (United States)

    Lenard, H G

    1991-04-01

    The classification of congenital muscular dystrophies (CMD), based on perceived clinical and morphological similarities or differences, is controversial. CMD without cerebral involvement has sometimes been divided into a mild and a severe form. This distinction is, however, arbitrary and not uncontested. Whether Ullrich's disease, formerly called atonic-sclerotic dystrophy, is a disease entity and if so, whether it is a primary muscle disorder, is uncertain. CMD without cerebral involvement is inherited in an autosomal recessive fashion in the great majority of cases. CMDs with cerebral involvement are usually classified into at least three forms: the Fukuyama type of CMD, occurring almost exclusively in Japanese patients; CMD with hypomyelination, sometimes also called the occidental type of cerebromuscular dystrophy; and Walker-Warburg syndrome. Muscle-eye-brain disease, described in a number of Finnish patients, may or may not belong in this last category. In CMD with cerebral involvement inheritance is also autosomal recessive. It is possible that single sporadic cases are phenocopies due to infectious or other exogenous causes. Reports of clinical and morphological findings from an increasing number of patients show a high degree of variability within and, on the other hand, certain similarities between the forms of CMD with cerebral involvement. In addition, neuroradiological changes are also found with increasing frequency in CMD patients without clinical neuropsychological abnormalities. It is not unreasonable to speculate that molecular genetic techniques will reveal in the near future a variable defect in one gene locus or defects in a few gene loci as the cause of the various clinical forms of CMDs.

  6. The reliability and reproducibility of the Hertel classification for comminuted proximal humeral fractures compared with the Neer classification

    NARCIS (Netherlands)

    Iordens, Gijs I. T.; Mahabier, Kiran C.; Buisman, Florian E.; Schep, Niels W. L.; Muradin, Galied S. R.; Beenen, Ludo F. M.; Patka, Peter; van Lieshout, Esther M. M.; den Hartog, Dennis

    2016-01-01

    The Neer classification is the most commonly used fracture classification system for proximal humeral fractures. Inter- and intra-observer agreement is limited, especially for comminuted fractures. A possibly more straightforward and reliable classification system is the Hertel classification. The

  7. A novel classification system for aging theories

    Directory of Open Access Journals (Sweden)

    Lucas Siqueira Trindade

    2013-03-01

    Theories of lifespan evolution are a source of confusion amongst aging researchers. After a century of aging research the dispute over whether the aging process is active or passive persists and a comprehensive and universally accepted theoretical model remains elusive. Evolutionary aging theories primarily dispute whether the aging process is exclusively adapted to favor the kin or exclusively non-adapted to favor the individual. Interestingly, contradictory data and theories supporting both exclusively programmed and exclusively non-programmed theories continue to grow. However, this is a false dichotomy; natural selection favors traits resulting in efficient reproduction whether they benefit the individual or the kin. Thus, to understand the evolution of aging, first we must understand the environment-dependent balance between the advantages and disadvantages of extended lifespan in the process of spreading genes. As described by distinct theories, different niches and environmental conditions confer on extended lifespan a range of fitness values varying from highly beneficial to highly detrimental. Here, we consider the range of fitness values for extended lifespan and develop a fitness-based framework for categorizing existing theories. We show that all theories can be classified into four basic types: secondary (beneficial), maladaptive (neutral), assisted death (detrimental) and senemorphic aging (varying from beneficial to detrimental). We anticipate that this classification system will assist with understanding and interpreting aging/death by providing a way of considering theories as members of one of these classes rather than considering their individual details.

  8. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The high-dimension, low-sample-size nature of gene expression data makes the classification task challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. However, most feature selection methods suffer from several problems, such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent (higher positive and lower negative) threshold conditions reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.
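
    The cluster-scoring step described in this abstract can be sketched roughly as follows. The abstract only says that expression levels are summed per cluster and compared against thresholds and a percentage, so the exact rule below is an assumption: a cluster is kept when the fraction of samples whose summed expression is at or above a positive threshold, or at or below a negative threshold, reaches a minimum fraction. The names select_clusters, pos_thr, neg_thr and min_fraction are hypothetical, and the data are toy values.

```python
def select_clusters(clusters, expression, pos_thr, neg_thr, min_fraction):
    """Keep gene clusters whose summed expression is extreme in enough samples.

    clusters:   dict mapping cluster name -> list of gene names
    expression: dict mapping gene name -> list of per-sample expression values
    The keep/eliminate rule is an illustrative assumption, not the paper's rule.
    """
    kept = []
    for name, genes in clusters.items():
        n_samples = len(expression[genes[0]])
        # Per-sample sum of expression over all genes in the cluster.
        sums = [sum(expression[g][i] for g in genes) for i in range(n_samples)]
        extreme = sum(1 for s in sums if s >= pos_thr or s <= neg_thr)
        if extreme / n_samples >= min_fraction:
            kept.append(name)
    return kept


# Toy data: two clusters over four samples.
expression = {
    "g1": [2.0, 2.5, -2.0, -2.5],
    "g2": [1.5, 1.0, -1.5, -1.0],
    "g3": [0.1, -0.2, 0.0, 0.1],
}
clusters = {"c1": ["g1", "g2"], "c2": ["g3"]}
kept = select_clusters(clusters, expression,
                       pos_thr=3.0, neg_thr=-3.0, min_fraction=0.75)
```

    Raising pos_thr and lowering neg_thr in such a sketch mirrors the "more stringent" second pass mentioned in the abstract, since fewer clusters then satisfy the condition.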

  9. A Coupled k-Nearest Neighbor Algorithm for Multi-Label Classification

    Science.gov (United States)

    2015-05-22

    classification, an image may contain several concepts simultaneously, such as beach, sunset and kangaroo. Such tasks are usually denoted as multi-label...informatics, a gene can belong to both metabolism and transcription classes; and in music categorization, a song may be labeled as Mozart and sad. In the

  10. Is overall similarity classification less effortful than single-dimension classification?

    Science.gov (United States)

    Wills, Andy J; Milton, Fraser; Longmore, Christopher A; Hester, Sarah; Robinson, Jo

    2013-01-01

    It is sometimes argued that the implementation of an overall similarity classification is less effortful than the implementation of a single-dimension classification. In the current article, we argue that the evidence securely in support of this view is limited, and report additional evidence in support of the opposite proposition--overall similarity classification is more effortful than single-dimension classification. Using a match-to-standards procedure, Experiments 1A, 1B and 2 demonstrate that concurrent load reduces the prevalence of overall similarity classification, and that this effect is robust to changes in the concurrent load task employed, the level of time pressure experienced, and the short-term memory requirements of the classification task. Experiment 3 demonstrates that participants who produced overall similarity classifications from the outset have larger working memory capacities than those who produced single-dimension classifications initially, and Experiment 4 demonstrates that instructions to respond meticulously increase the prevalence of overall similarity classification.

  11. Radiological classification of mandibular fractures

    International Nuclear Information System (INIS)

    Mihailova, H.

    2009-01-01

    Mandibular fractures account for the largest share (up to 97%) of facial bone fractures. The method of choice for diagnosing mandibular fractures is conventional radiography. The aim of this article is to present a unified radiological classification of mandibular fractures for clinical practice. This classification includes only those clinical symptoms of mandibular fracture that can be objectified radiologically: exact anatomical localization (F1-F6), teeth in the fracture line (Ta, Tb), grade of dislocation (D I, D II), and occlusal disturbances (O(+), O(-)). Radiological symptoms, expressed by letter and number symbols, are systematized in a formula, FTDO, for mandibular fractures, similar to the TNM formula for tumours. The FTDO formula expresses the radiological diagnosis of each mandibular fracture, but it includes neither the side (left or right) of the fracture nor the kind and number of fractures. To express the topography and number of fractures, the radiological formula is transformed into a fraction. The symbols (FTD) of a right-side mandibular fracture are written in the numerator and those of the left side in the denominator. For double and multiple fractures, a '+' is placed between the symbols for each fracture. The symbols for occlusal disturbances are written opposite the fraction line. In this way the topographic-anatomical formula (FTD/FTD)xO is formed, which expresses the whole radiological information for unilateral, bilateral, single or multiple fractures of the mandible. The information in this formula, and hence in the unified topographic-anatomical classification, ensures a quick and exact X-ray diagnosis of mandibular fracture. It thereby makes the X-ray diagnostic process for mandibular fractures better, easier and faster, which is a precondition for preventing delayed diagnosis of mandibular fracture. (author)
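
    As an illustration of how the (FTD/FTD)xO notation could be assembled, here is a minimal sketch. The abstract gives the symbols and the numerator/denominator convention but not the exact typography, so the concatenation of symbols without separators, the spelling of the dislocation grades as "DI" and "DII", and the "-" placeholder for an uninjured side are all assumptions; the function name ftdo_formula is hypothetical.

```python
def ftdo_formula(right, left, occlusion):
    """Build a (FTD/FTD)xO string.

    right, left: lists of (F, T, D) symbol tuples, one tuple per fracture
                 on that side of the mandible
    occlusion:   "O(+)" or "O(-)"
    """
    def side(fractures):
        # Assumption: the abstract does not say how an uninjured side is
        # written, so "-" is used here purely as a placeholder.
        if not fractures:
            return "-"
        # Multiple fractures on one side are joined with '+', as described.
        return "+".join("".join(symbols) for symbols in fractures)

    # Right side goes in the numerator, left side in the denominator.
    return f"({side(right)}/{side(left)})x{occlusion}"


formula = ftdo_formula(
    right=[("F2", "Ta", "DI")],
    left=[("F5", "Tb", "DII"), ("F6", "Ta", "DI")],
    occlusion="O(+)",
)
```

    Here a single right-side fracture and a double left-side fracture with occlusal disturbance are combined into one string.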

  12. A Classification of BPEL Extensions

    Directory of Open Access Journals (Sweden)

    Oliver Kopp

    2011-10-01

    The Business Process Execution Language (BPEL) has emerged as the de facto standard for implementing business processes. The language is designed to be extensible so that additional valuable features can be included in a standardized manner. A number of BPEL extensions are available. They are, however, neither classified nor evaluated with respect to their compliance with the BPEL standard. This article fills this gap by providing a framework for classifying BPEL extensions, a classification of existing extensions, and a guideline for designing BPEL extensions.

  13. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2016-01-01

    Decision trees have been widely used to discover patterns in consistent data sets. But if the data set is inconsistent, that is, it contains groups of examples with equal values of the conditional attributes but different labels, then discovering the essential patterns or knowledge in the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results of the three approaches. The many-valued decision approach outperforms the other approaches, and the M_ws_entM greedy algorithm is faster and gives better prediction accuracy.
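The three ways of handling inconsistent rows can be sketched on a toy decision table. The abstract does not define the approaches, so the definitions below follow how they are commonly described in the decision-table literature and should be read as assumptions. The function names and the tie-breaking rule are hypothetical.

```python
from collections import Counter, defaultdict


def group_rows(rows):
    """Collect the labels of all rows that share the same attribute values."""
    groups = defaultdict(list)
    for attrs, label in rows:
        groups[tuple(attrs)].append(label)
    return groups


def many_valued(rows):
    """Many-valued decision: keep the whole set of labels seen for a row."""
    return {attrs: frozenset(labels) for attrs, labels in group_rows(rows).items()}


def most_common(rows):
    """Most common decision: keep the most frequent label of each group."""
    result = {}
    for attrs, labels in group_rows(rows).items():
        counts = Counter(labels)
        best = max(counts.values())
        # Assumption: ties are broken by taking the smallest label.
        result[attrs] = min(lab for lab, c in counts.items() if c == best)
    return result


def generalized(rows):
    """Generalized decision: encode each distinct label set as one new label."""
    codes = {}
    result = {}
    label_sets = many_valued(rows)
    for attrs in sorted(label_sets):
        key = tuple(sorted(label_sets[attrs]))
        codes.setdefault(key, len(codes))
        result[attrs] = codes[key]
    return result


if __name__ == "__main__":
    table = [((0, 1), "a"), ((0, 1), "b"), ((0, 1), "a"), ((1, 1), "b")]
    print(many_valued(table))
    print(most_common(table))
    print(generalized(table))
```

Each function turns the inconsistent table into a consistent one, so that an ordinary decision-tree learner can be applied afterwards.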

  14. [Classification of local anesthesia methods].

    Science.gov (United States)

    Petricas, A Zh; Medvedev, D V; Olkhovskaya, E B

    The traditional classification of dental local anesthesia methods must be modified. In this paper we prove that the vascular mechanism is the leading component of spongy injection. The high effectiveness and relative safety of spongy anesthesia must be taken into account, as well as its versatility, ease of implementation and growing prevalence worldwide. The essence of the proposed modification is to divide the methods into diffusive (including surface anesthesia, infiltration and conductive anesthesia) and vascular-diffusive (including intraosseous, intraligamentary, intraseptal and intrapulpal anesthesia). For the last four methods the common term «spongy (intraosseous) anesthesia» may be used.

  15. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad

    2016-09-28

    Decision trees have been widely used to discover patterns in consistent data sets. But if the data set is inconsistent, that is, it contains groups of examples with equal values of the conditional attributes but different labels, then discovering the essential patterns or knowledge in the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results of the three approaches. The many-valued decision approach outperforms the other approaches, and the M_ws_entM greedy algorithm is faster and gives better prediction accuracy.

  16. Classification of Magnetic Nanoparticle Systems

    DEFF Research Database (Denmark)

    Bogren, Sara; Fornara, Andrea; Ludwig, Frank

    2015-01-01

    This study presents classification of different magnetic single- and multi-core particle systems using their measured dynamic magnetic properties together with their nanocrystal and particle sizes. The dynamic magnetic properties are measured with AC (dynamical) susceptometry and magnetorelaxometry... and the size parameters are determined from electron microscopy and dynamic light scattering. Using these methods, we also show that the nanocrystal size and particle morphology determine the dynamic magnetic properties for both single- and multi-core particles. The presented results are obtained from... the four-year EU NMP FP7 project, NanoMag, which is focused on standardization of analysis methods for magnetic nanoparticles.

  17. Waste classification: a management approach

    International Nuclear Information System (INIS)

    Wickham, L.E.

    1984-01-01

    A waste classification system designed to quantify the total hazard of a waste has been developed by the Low-Level Waste Management Program. As originally conceived, the system was designed to deal with mixed radioactive waste. The methodology has been developed and successfully applied to radiological and chemical wastes, both individually and mixed together. Management options to help evaluate the financial and safety trade-offs between waste segregation, waste treatment, container types, and site factors are described. Using the system provides a very simple and cost effective way of making quick assessments of a site's capabilities to contain waste materials. 3 references

  18. Gene expression

    International Nuclear Information System (INIS)

    Hildebrand, C.E.; Crawford, B.D.; Walters, R.A.; Enger, M.D.

    1983-01-01

    We prepared probes for isolating functional pieces of the metallothionein locus. The probes enabled a variety of experiments, eventually revealing two mechanisms for metallothionein gene expression, the order of the DNA coding units at the locus, and the location of the gene site in its chromosome. Once the switch regulating metallothionein synthesis was located, it could be joined by recombinant DNA methods to other, unrelated genes, then reintroduced into cells by gene-transfer techniques. The expression of these recombinant genes could then be induced by exposing the cells to Zn²⁺ or Cd²⁺. We would thus take advantage of the clearly defined switching properties of the metallothionein gene to manipulate the expression of other, perhaps normally constitutive, genes. Already, despite an incomplete understanding of how the regulatory switch of the metallothionein locus operates, such experiments have been performed successfully.

  19. Clinical use of dental classification.

    Science.gov (United States)

    Jones, Gordon

    2008-01-01

    The Dental Classification system used by the uniformed services is supposed to predict the incidence of dental emergencies in the operational setting, at least on the unit level. Since most Sailors and Marines are deployed without close dental support, the sea services have adopted a policy of early treatment of class 3 dental conditions during recruit training. The other services are beginning to do the same. Recently, two factors have emerged that are affecting this early dental class 3 treatment. These factors must be considered when planning to provide early dental treatment. First, changing population and dentist provider demographics in the civilian sector are beginning to affect the class 3 treatment needs of incoming military recruits. Second, attrition from recruit training results in treatment provided to recruits who leave military service before finishing their training. Some view this as a waste of resources, others as a cost of doing business. As operational jointness increases, the three services must develop and use a single dental classification terminology, as well as unified standards and guidelines, both for better research in this area and for the readiness and well-being of our patients.

  20. Classification of Meteorites and Micrometeorites

    Science.gov (United States)

    Maurette, Michel

    Archeologists only started to trace back successfully the advance of the Roman legions, trade patterns and the evolution of manufacturing techniques in Roman times once they found an efficient classification scheme for the fragments of amphorae used to transport wine for the soldiers. Similarly, the classification of meteorites and micrometeorites is an essential step in the exploitation of this extraterrestrial debris. We recall that one of the main objectives of meteoriticists over the last 30 years was to find the most primitive objects of the solar system, which have been the least reprocessed since the formation of the early solar nebula, with a view to exploiting them as reliable archives of our distant past. This section outlines some of the methods used to classify meteorites and Antarctic micrometeorites. It also summarizes some of the key features of the surprisingly simple relationship between micrometeorites and a relatively rare group of stony meteorites, the hydrous carbonaceous CM-type chondrites, which was only confirmed recently after the study of the Concordia micrometeorites collected in central Antarctica in January 2002. A more technical discussion of this relationship presented in Sect. 25 will allow its extension to the smaller micrometeorites collected by NASA in the stratosphere. The book of Wasson (1985) is still one of the best monographs about meteorites.

  1. Classification of diabetic foot ulcers.

    Science.gov (United States)

    Game, Frances

    2016-01-01

    It is known that the relative importance of factors involved in the development of diabetic foot problems can vary in both their presence and severity between patients and lesions. This may be one of the reasons why outcomes seem to vary centre to centre and why some treatments may seem more effective in some people than others. There is a need therefore to classify and describe lesions of the foot in patients with diabetes in a manner that is agreed across all communities but is simple to use in clinical practice. No single system is currently in widespread use, although a number have been published. Not all are well validated outside the system from which they were derived, and it has not always been made clear the clinical purposes to which such classifications should be put to use, whether that be for research, clinical description in routine clinical care or audit. Here the currently published classification systems, their validation in clinical practice, whether they were designed for research, audit or clinical care, and the strengths and weaknesses of each are explored. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Gender classification under extended operating conditions

    Science.gov (United States)

    Rude, Howard N.; Rizki, Mateen

    2014-06-01

    Gender classification is a critical component of a robust image security system. Many techniques exist to perform gender classification using facial features. In contrast, this paper explores gender classification using body features extracted from clothed subjects. Several of the most effective types of features for gender classification identified in literature were implemented and applied to the newly developed Seasonal Weather And Gender (SWAG) dataset. SWAG contains video clips of approximately 2000 samples of human subjects captured over a period of several months. The subjects are wearing casual business attire and outer garments appropriate for the specific weather conditions observed in the Midwest. The results from a series of experiments are presented that compare the classification accuracy of systems that incorporate various types and combinations of features applied to multiple looks at subjects at different image resolutions to determine a baseline performance for gender classification.

  3. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have recently been applied to English text classification, but they have never been used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a dataset widely used in Arabic text classification: the Al-Jazeera News dataset. We chose this dataset to enable direct comparison of the performance of the Polynomial Networks classifier with that of other well-known classifiers reported on this dataset in the Arabic text classification literature. The experimental results show that the Polynomial Networks classifier is competitive with state-of-the-art algorithms in the field of Arabic text classification.

  4. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
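For context, the Fisher-style baseline that the abstract contrasts LEGO with can be sketched as a one-sided hypergeometric tail test. This is not LEGO itself, which additionally uses network-based gene weights. The function and parameter names below are assumptions chosen for illustration.

```python
from math import comb


def ora_pvalue(universe_size, set_size, list_size, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    X follows a hypergeometric distribution: `list_size` genes are drawn
    from a universe of `universe_size` genes, of which `set_size` belong
    to the gene set being tested. Every gene counts equally here.
    """
    upper = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes in total, 4 in the pathway, 3 in the query list,
    # 2 of which fall in the pathway.
    print(ora_pvalue(10, 4, 3, 2))
```

Because every gene in the list contributes the same count, this test cannot reflect the functional non-equivalence of genes or gene-gene interactions that the abstract describes.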

  5. A Classification of Feminist Theories

    Directory of Open Access Journals (Sweden)

    Karen Wendling

    2008-09-01

    In this paper I criticize Alison Jaggar’s descriptions of feminist political theories. I propose an alternative classification of feminist theories that I think more accurately reflects the multiplication of feminist theories and philosophies. There are two main categories, “street theory” and academic theories, each with two sub-divisions: political spectrum and “differences” under street theory, and directly and indirectly political analyses under academic theories. My view explains why there are no radical feminists outside of North America and why there are so few socialist feminists inside North America. I argue, controversially, that radical feminism is a radical version of liberalism. I argue that “difference” feminist theories – theory by and about feminists of colour, queer feminists, feminists with disabilities and so on – belong in a separate sub-category of street theory, because they’ve had profound effects on feminist activism not tracked by traditional left-to-right classifications. Finally, I argue that, while academic feminist theories such as feminist existentialism or feminist sociological theory are generally unconnected to movement activism, they provide important feminist insights that may become important ... by showing the advantages of my classification over Jaggar’s views. A critical analysis of the description of feminist political theories shows that an alternative classification to Jaggar’s would catalogue more adequately the different feminist currents that have evolved over recent decades. The new mapping we propose comprises two families of feminism: activist and academic. This new way of locating and situating feminisms helps explain why there is no radical feminism outside North America and also why there are so few socialist feminists in North America.

  6. Blind Signal Classification via Sparse Coding

    Science.gov (United States)

    2016-04-10

    Blind Signal Classification via Sparse Coding Youngjune Gwon MIT Lincoln Laboratory gyj@ll.mit.edu Siamak Dastangoo MIT Lincoln Laboratory sia...achieve blind signal classification with no prior knowledge about signals (e.g., MCS, pulse shaping) in an arbitrary RF channel. Since modulated RF...classification method. Our results indicate that we can separate different classes of digitally modulated signals from blind sampling with 70.3% recall and 24.6

  7. 46 CFR 95.50-5 - Classification.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 4 2010-10-01 2010-10-01 false Classification. 95.50-5 Section 95.50-5 Shipping COAST... Details § 95.50-5 Classification. (a) Hand portable fire extinguishers and semiportable fire extinguishing... extinguishing systems are set forth in Table 95.50-5(c). Table 95.50-5(c) Classification Type Size Soda-acid and...

  8. A Classification Scheme for Production System Processes

    DEFF Research Database (Denmark)

    Sørensen, Daniel Grud Hellerup; Brunø, Thomas Ditlev; Nielsen, Kjeld

    2018-01-01

    Manufacturing companies often have difficulties developing production platforms, partly due to the complexity of many production systems and difficulty determining which processes constitute a platform. Understanding production processes is an important step to identifying candidate processes...... for a production platform based on existing production systems. Reviewing a number of existing classifications and taxonomies, a consolidated classification scheme for processes in production of discrete products has been outlined. The classification scheme helps ensure consistency during mapping of existing...

  9. Approche historique des classifications en psychiatrie

    OpenAIRE

    Garrabé , J.

    2011-01-01

    Summary: From the middle of the 19th century, the question of the criteria for classifying diseases arose. For mental illnesses, various classifications were then proposed by French (Morel) and German (Kahlbaum, Kraepelin) authors. From the end of the 19th century, the International Bureau of Statistics (Paris) published an International Classification of Diseases, revised every ten years (J. Bertillon). This task was continued between the wars by...

  10. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) is one of the most successful algorithms for classification problems. SVM learns the decision boundary from two classes (for binary classification) of training points. However, the training points sometimes include less meaningful samples, called outliers, which are corrupted by noise or lie on the wrong side. These outliers affect the margin and classification performance, and the machine would do better to discard them. SVM as a popular and widely used cl...

  11. Effective Exchange Rate Classifications and Growth

    OpenAIRE

    Justin M. Dubas; Byung-Joo Lee; Nelson C. Mark

    2005-01-01

    We propose an econometric procedure for obtaining de facto exchange rate regime classifications which we apply to study the relationship between exchange rate regimes and economic growth. Our classification method models the de jure regimes as outcomes of a multinomial logit choice problem conditional on the volatility of a country's effective exchange rate, a bilateral exchange rate and international reserves. An 'effective' de facto exchange rate regime classification is then obtained by as...
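The multinomial logit choice model named in the abstract has a standard textbook form, sketched below. The notation is an assumption made for illustration and is not taken from the paper: $r_{it}$ is the de jure regime of country $i$ in period $t$, $x_{it}$ collects the conditioning variables (volatility of the effective exchange rate, of a bilateral exchange rate and of international reserves), $\beta_j$ is the coefficient vector for regime $j$, and $J$ is the number of regimes.

```latex
\Pr(r_{it} = j \mid x_{it})
  = \frac{\exp\!\left(x_{it}^{\top}\beta_j\right)}
         {\sum_{k=1}^{J} \exp\!\left(x_{it}^{\top}\beta_k\right)},
  \qquad j = 1, \dots, J,
  \qquad \beta_1 = 0 \ \text{(identifying normalization)}
```

The normalization $\beta_1 = 0$ is the usual way to pin down the coefficients, since adding the same vector to every $\beta_j$ leaves the probabilities unchanged.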

  12. Studies on the Roles of PDGFRA and EGFR in the Classification and Identification of Therapeutic Targets for Human Gliomas

    OpenAIRE

    Chen, Dongfeng

    2013-01-01

    Glioma is the most common type of primary tumor in the adult central nervous system (CNS). However, the current classification of gliomas is highly subjective and even inaccurate in some cases, which leads to clinical confusion and hinders the development of targeted therapies. EGFR and PDGFRA play crucial roles in glia development and glioma pathogenesis. In this thesis we aim to establish a glial genesis-guided molecular classification scheme for gliomas based on the genes co-expressed with...

  13. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E.; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops. PMID:29672525

  14. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm.

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder; Murphy, Denis J

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops.

  15. Discriminant forest classification method and system

    Science.gov (United States)

    Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.

    2012-11-06

    A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique which uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or Andersen-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.
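A single node split of the kind described, where a linear discriminant rather than a one-feature threshold divides the data, might look like the sketch below. It is a minimal two-class, two-feature Fisher LDA written in plain Python. The function names, the equal-prior midpoint threshold and the restriction to 2-D points are assumptions for illustration, not the patented method.

```python
def lda_split(class0, class1):
    """Fit a two-class Fisher LDA on 2-D points and return (w, threshold)."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    m0, m1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    # Pooled within-class scatter matrix [[a, b], [b, c]].
    a, b, c = a0 + a1, b0 + b1, c0 + c1
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = S_w^{-1} (m1 - m0), using the closed-form 2x2 inverse.
    w = ((c * dx - b * dy) / det, (-b * dx + a * dy) / det)
    # Assumption: equal priors, so the threshold is the projected midpoint.
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def goes_right(point, w, threshold):
    """Route a sample to the right child if its projection exceeds the threshold."""
    return w[0] * point[0] + w[1] * point[1] > threshold


if __name__ == "__main__":
    left_class = [(0, 0), (1, 0), (0, 1), (1, 1)]
    right_class = [(4, 4), (5, 4), (4, 5), (5, 5)]
    w, t = lda_split(left_class, right_class)
    print(w, t, goes_right((5, 5), w, t))
```

In a forest, such a split would be fitted at every node on the samples reaching that node, and the resulting trees would vote on the class of a new sample.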

  16. Interagency Security Classification Appeals Panel (ISCAP) Decisions

    Data.gov (United States)

    National Archives and Records Administration — This online collection includes documents decided upon by the Interagency Security Classification Appeals Panel (ISCAP) starting in Fiscal Year 2012. The documents...

  17. Recognition Using Classification and Segmentation Scoring

    National Research Council Canada - National Science Library

    Kimball, Owen; Ostendorf, Mari; Rohlicek, Robin

    1992-01-01

    .... We describe an approach to connected word recognition that allows the use of segmental information through an explicit decomposition of the recognition criterion into classification and segmentation scoring...

  18. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks, is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results...... show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on terrorism domain email dataset and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based...

  19. Morphological classification of plant cell deaths

    DEFF Research Database (Denmark)

    van Doorn, W.G.; Beers, E.P.; Dangl, J.L.

    2011-01-01

    ... the classification of PCD in plants. Here we suggest a classification based on morphological criteria. According to this classification, the use of the term 'apoptosis' is not justified in plants, but at least two classes of PCD can be distinguished: vacuolar cell death and necrosis. During vacuolar cell death... , which can express features of both necrosis and vacuolar cell death, PCD in starchy cereal endosperm and during self-incompatibility. The present classification is not static, but will be subject to further revision, especially when specific biochemical pathways are better defined.

  20. A Classification Methodology and Retrieval Model to Support Software Reuse

    Science.gov (United States)

    1988-01-01

    Dewey Decimal Classification (DDC 18), an enumerative scheme, occupies 40 pages [Buchanan 1979]. Langridge [1973] states that the facets listed in the...sense of historical importance or widespread use. The schemes are: Dewey Decimal Classification (DDC), Universal Decimal Classification (UDC...

  1. The impact of classification of interest on predictive toxicogenomics

    Directory of Open Access Journals (Sweden)

    Pierre R. Bushel

    2012-02-01

    The era of toxicogenomics has introduced a new way of monitoring the effect of environmental stressors and toxicants on biological systems via quantification of changes in gene expression. Because the liver is one of the major organs for synthesis and secretion of substances which metabolize endogenous and exogenous materials, there has been a great deal of interest in elucidating predictive and mechanistic genomic markers of hepatotoxicity. This mini-review will bring context to a limited number of toxicogenomics studies which used genomics to evaluate the transcriptional changes in blood and liver in response to acetaminophen (APAP) or other liver toxicants, but differed according to the classification of interest (COI), i.e. the partitioning of the samples a priori according to a common toxicological characteristic. The toxicogenomics studies highlighted are characterized by a classification of either no/low vs. high APAP dose exposure, none vs. observed necrosis, and severity of necrosis. The overlap or lack thereof between the gene classifiers and the modulated biological processes that are elucidated will be discussed to enhance the understanding of the effect of the particular COI model and experimental design used for prediction.

  2. A clinical classification acknowledging neuropsychiatric and cognitive impairment in Huntington's disease

    DEFF Research Database (Denmark)

    Vinther-Jensen, Tua; Larsen, Ida U; Hjermind, Lena E

    2014-01-01

    based on the presence of involuntary movements and a positive genetic test for the HD CAG repeat expansion. After investigating the frequencies of the triad manifestations in a large outpatient clinical cohort of HD gene-expansion carriers, we propose a new clinical classification. Methods: In this cross...... medication, and cognitive impairment. Results: Among the motor manifest HD gene-expansion carriers, 51.8% presented with the full symptom triad, 25.0% were defined as cognitively impaired in addition to motor symptoms, and 14.3% had neuropsychiatric symptoms along with motor symptoms. Only 8.9% had isolated...... terms, suggesting that the current clinical classification is neither necessarily suitable nor helpful for this patient group. Some premanifest gene-expansion carriers may have psychiatric and/or cognitive symptoms caused by reactive stress or other pathology than HD. Acknowledging this fact we, however...

  3. Classification of hydrocephalus: critical analysis of classification categories and advantages of "Multi-categorical Hydrocephalus Classification" (Mc HC).

    Science.gov (United States)

    Oi, Shizuo

    2011-10-01

    Hydrocephalus is a complex pathophysiology with disturbed cerebrospinal fluid (CSF) circulation. Numerous classification attempts have been published, focusing on various criteria such as associated anomalies/underlying lesions, CSF circulation/intracranial pressure patterns, clinical features, and other categories. However, no definitive classification exists that comprehensively covers the variety of these aspects. The new classification of hydrocephalus, "Multi-categorical Hydrocephalus Classification" (Mc HC), was invented and developed to cover all aspects of hydrocephalus with all classification items and categories worth considering. The ten "Mc HC" categories are I: onset (age, phase), II: cause, III: underlying lesion, IV: symptomatology, V: pathophysiology 1-CSF circulation, VI: pathophysiology 2-ICP dynamics, VII: chronology, VIII: post-shunt, IX: post-endoscopic third ventriculostomy, and X: others. From a 100-year search of publications related to the classification of hydrocephalus, 14 representative publications were reviewed and divided into the 10 categories. The Baumkuchen classification graph made from the round o'clock classification demonstrated the historical tendency of deviation toward the categories in pathophysiology, either CSF or ICP dynamics. In the preliminary clinical application, it was concluded that "Mc HC" is extremely effective in expressing the individual state with various categories in the past and present condition or among the compatible cases of hydrocephalus along with the possible chronological change in the future.

  4. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  5. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
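
    The abstract describes scoring a binary 0-1 gene subset by leave-one-out cross validation. A minimal Python sketch of that scoring step follows. It is an illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, the BQPSO search itself is not shown, and the function names and toy data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`.

    Stand-in classifier (assumption): the paper couples the search with an SVM.
    """
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1

    def sq_dist(label):
        # Squared Euclidean distance from the sample to the class centroid.
        return sum((s / counts[label] - v) ** 2
                   for s, v in zip(sums[label], sample))

    return min(sums, key=sq_dist)


def loocv_accuracy(samples, labels, mask):
    """Leave-one-out accuracy using only the genes where mask[k] == 1."""
    selected = [[v for v, keep in zip(row, mask) if keep] for row in samples]
    hits = 0
    for i in range(len(selected)):
        # Hold out sample i, train on all the others.
        train_x = selected[:i] + selected[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, selected[i]) == labels[i]:
            hits += 1
    return hits / len(selected)


# Hypothetical toy data: 4 samples x 3 genes; only gene 0 separates the classes.
samples = [
    [0.1, 5.0, 0.9],
    [0.2, 1.0, 1.1],
    [0.9, 4.0, 1.0],
    [1.0, 2.0, 0.8],
]
labels = ["normal", "normal", "tumor", "tumor"]

good = loocv_accuracy(samples, labels, [1, 0, 0])  # keep the informative gene
bad = loocv_accuracy(samples, labels, [0, 1, 0])   # keep an uninformative gene
```

    A binary optimizer such as the one described could use `loocv_accuracy` as (part of) its fitness function, with each particle position being one mask.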

  6. Gene therapy for hemophilia

    Science.gov (United States)

    Rogers, Geoffrey L.; Herzog, Roland W.

    2015-01-01

    Hemophilia is an X-linked inherited bleeding disorder consisting of two classifications, hemophilia A and hemophilia B, depending on the underlying mutation. Although the disease is currently treatable with intravenous delivery of replacement recombinant clotting factor, this approach represents a significant cost both monetarily and in terms of quality of life. Gene therapy is an attractive alternative approach to the treatment of hemophilia that would ideally provide life-long correction of clotting activity with a single injection. In this review, we will discuss the multitude of approaches that have been explored for the treatment of both hemophilia A and B, including both in vivo and ex vivo approaches with viral and nonviral delivery vectors. PMID:25553466

  7. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  8. Classification of positive blood cultures

    DEFF Research Database (Denmark)

    Gradel, Kim Oren; Knudsen, Jenny Dahl; Arpi, Magnus

    2012-01-01

    . For each classification, we tabulated episodes derived by the physicians' assessment and the computer algorithm and compared 30-day mortality between concordant and discrepant groups with adjustment for age, gender, and comorbidity. RESULTS: Physicians derived 9,482 reference episodes from 21,705 positive......- vs. hospital-onset, whereas there were no material differences within the other comparison groups. CONCLUSIONS: Using data from health administrative registries, we found high agreement between the computer algorithms and the physicians' assessments as regards contamination vs. bloodstream infection......ABSTRACT: BACKGROUND: Information from blood cultures is utilized for infection control, public health surveillance, and clinical outcome research. This information can be enriched by physicians' assessments of positive blood cultures, which are, however, often available from selected patient groups...

  9. Textural features for image classification

    Science.gov (United States)

    Haralick, R. M.; Dinstein, I.; Shanmugam, K.

    1973-01-01

    Description of some easily computable textural features based on gray-tone spatial dependencies, and illustration of their application in category-identification tasks of three different kinds of image data - namely, photomicrographs of five kinds of sandstones, 1:20,000 panchromatic aerial photographs of eight land-use categories, and ERTS multispectral imagery containing several land-use categories. Two kinds of decision rules are used - one for which the decision regions are convex polyhedra (a piecewise-linear decision rule), and one for which the decision regions are rectangular parallelepipeds (a min-max decision rule). In each experiment the data set was divided into two parts, a training set and a test set. Test set identification accuracy is 89% for the photomicrographs, 82% for the aerial photographic imagery, and 83% for the satellite imagery. These results indicate that the easily computable textural features probably have a general applicability for a wide variety of image-classification applications.
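
    Features based on gray-tone spatial dependencies are commonly computed from a gray-level co-occurrence matrix. The Python sketch below illustrates that idea under assumptions the abstract does not state: the symmetric, normalized form of the matrix, the choice of two features (angular second moment and contrast), and the small 4-level test image are all illustrative choices, and the function names are hypothetical.

```python
def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each neighbor pair in both directions
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared entries; larger for more homogeneous textures."""
    return sum(x * x for row in p for x in row)


def contrast(p):
    """Entries weighted by squared gray-tone difference; larger for local variation."""
    return sum((i - j) ** 2 * x
               for i, row in enumerate(p)
               for j, x in enumerate(row))


# Illustrative 4x4 image with gray tones 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(image, dx=1, dy=0, levels=4)  # horizontal neighbors at distance 1
features = (angular_second_moment(p), contrast(p))
```

    Feature vectors of this kind, computed for several offsets, are what a decision rule such as the piecewise-linear or min-max rule would then operate on.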

  10. Biological signals classification and analysis

    CERN Document Server

    Kiasaleh, Kamran

    2015-01-01

    This authored monograph presents key aspects of signal processing analysis in the biomedical arena. Unlike wireless communication systems, biological entities produce signals with underlying nonlinear, chaotic nature that elude classification using the standard signal processing techniques, which have been developed over the past several decades for dealing primarily with standard communication systems. This book separates what is random from that which appears to be random, and yet is truly deterministic with random appearance. At its core, this work gives the reader a perspective on biomedical signals and the means to classify and process such signals. In particular, a review of random processes along with means to assess the behavior of random signals is also provided. The book also includes a general discussion of biological signals in order to demonstrate the inefficacy of the well-known techniques to correctly extract meaningful information from such signals. Finally, a thorough discussion of recently ...

  11. Classification With Truncated Distance Kernel.

    Science.gov (United States)

    Huang, Xiaolin; Suykens, Johan A K; Wang, Shuning; Hornegger, Joachim; Maier, Andreas

    2018-05-01

    This brief proposes a truncated distance (TL1) kernel, which results in a classifier that is nonlinear in the global region but is linear in each subregion. With this kernel, the subregion structure can be trained using all the training data and local linear classifiers can be established simultaneously. The TL1 kernel has good adaptiveness to nonlinearity and is suitable for problems which require different nonlinearities in different areas. Though the TL1 kernel is not positive semidefinite, some classical kernel learning methods are still applicable, which means that the TL1 kernel can be directly used in standard toolboxes by replacing the kernel evaluation. In numerical experiments, the TL1 kernel with a pregiven parameter achieves similar or better performance than the radial basis function kernel with the parameter tuned by cross validation, implying that the TL1 kernel is a promising nonlinear kernel for classification tasks.
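
    The kernel evaluation mentioned above can be sketched in a few lines of Python. The abstract does not give the formula; this sketch assumes the truncated L1 form K(u, v) = max(rho - ||u - v||_1, 0), with `rho` standing for the pregiven parameter. The function names and sample points are hypothetical.

```python
def tl1_kernel(u, v, rho):
    """Truncated L1 distance kernel (assumed form): max(rho - ||u - v||_1, 0)."""
    l1 = sum(abs(a - b) for a, b in zip(u, v))
    return max(rho - l1, 0.0)


def gram_matrix(points, rho):
    """Pairwise kernel matrix, as a precomputed-kernel learner would consume it."""
    return [[tl1_kernel(p, q, rho) for q in points] for p in points]


points = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0)]
K = gram_matrix(points, rho=2.0)
# Points farther apart than rho in L1 distance get a kernel value of exactly 0,
# which is what makes the resulting classifier locally linear in each subregion.
```

    Because the matrix is not guaranteed to be positive semidefinite, a learner that accepts a precomputed kernel must tolerate indefinite input, as the abstract notes.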

  12. ASIST SIG/CR Classification Workshop 2000: Classification for User Support and Learning.

    Science.gov (United States)

    Soergel, Dagobert

    2001-01-01

    Reports on papers presented at the 62nd Annual Meeting of ASIST (American Society for Information Science and Technology) for the Special Interest Group in Classification Research (SIG/CR). Topics include types of knowledge; developing user-oriented classifications, including domain analysis; classification in the user interface; and automatic…

  13. Conformal radiotherapy: principles and classification

    International Nuclear Information System (INIS)

    Rosenwald, J.C.; Gaboriaud, G.; Pontvert, D.

    1999-01-01

    'Conformal radiotherapy' is the name fixed by usage and given to a new form of radiotherapy resulting from the technological improvements observed during the last ten years. While this terminology is now widely used, no precise definition can be found in the literature. Conformal radiotherapy refers to an approach in which the dose distribution is more closely 'conformed' or adapted to the actual shape of the target volume. However, the achievement of a consensus on a more specific definition is hampered by various difficulties, namely in characterizing the degree of 'conformality'. We have therefore suggested a classification scheme be established on the basis of the tools and the procedures actually used for all steps of the process, i.e., from prescription to treatment completion. Our classification consists of four levels: schematically, at level 0, there is no conformation (rectangular fields); at level 1, a simple conformation takes place, on the basis of conventional 2D imaging; at level 2, a 3D reconstruction of the structures is used for a more accurate conformation; and level 3 includes research and advanced dynamic techniques. We have used our personal experience, contacts with colleagues and data from the literature to analyze all the steps of the planning process, and to define the tools and procedures relevant to a given level. The corresponding tables have been discussed and approved at the European level within the Dynarad concerted action. It is proposed that the term 'conformal radiotherapy' be restricted to procedures where all steps are at least at level 2. (author)

  14. Disease gene characterization through large-scale co-expression analysis.

    Directory of Open Access Journals (Sweden)

    Allen Day

    2009-12-01

    In the post genome era, a major goal of biology is the identification of specific roles for individual genes. We report a new genomic tool for gene characterization, the UCLA Gene Expression Tool (UGET). Celsius, the largest co-normalized microarray dataset of Affymetrix based gene expression, was used to calculate the correlation between all possible gene pairs on all platforms, and generate stored indexes in a web searchable format. The size of Celsius makes UGET a powerful gene characterization tool. Using a small seed list of known cartilage-selective genes, UGET extended the list of known genes by identifying 32 new highly cartilage-selective genes. Of these, 7 of 10 tested were validated by qPCR including the novel cartilage-specific genes SDK2 and FLJ41170. In addition, we retrospectively tested UGET and other gene expression based prioritization tools to identify disease-causing genes within known linkage intervals. We first demonstrated this utility with UGET using genetically heterogeneous disorders such as Joubert syndrome, microcephaly, neuropsychiatric disorders and type 2 limb girdle muscular dystrophy (LGMD2) and then compared UGET to other gene expression based prioritization programs which use small but discrete and well annotated datasets. Finally, we observed a significantly higher gene correlation shared between genes in disease networks associated with similar complex or Mendelian disorders. UGET is an invaluable resource for a geneticist that permits the rapid inclusion of expression criteria from one to hundreds of genes in genomic intervals linked to disease. By using thousands of arrays UGET annotates and prioritizes genes better than other tools especially with rare tissue disorders or complex multi-tissue biological processes. This information can be critical in prioritization of candidate genes for sequence analysis.
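
    The core computation described above, correlating every gene pair and storing a searchable index, can be sketched in Python as follows. This is an illustration only: the abstract does not say which correlation measure UGET uses, so Pearson correlation is an assumption here, and the gene names, expression values, and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_index(expression):
    """For each gene, list all other genes sorted by decreasing correlation."""
    index = {gene: [] for gene in expression}
    for g1, g2 in combinations(expression, 2):
        r = pearson(expression[g1], expression[g2])
        index[g1].append((g2, r))
        index[g2].append((g1, r))
    for partners in index.values():
        partners.sort(key=lambda item: item[1], reverse=True)
    return index


# Hypothetical expression values for three genes across four arrays.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}

index = coexpression_index(expression)
top_partner, top_r = index["GENE_A"][0]
```

    Querying the index with a seed gene returns its most strongly co-expressed partners first, which mirrors how a seed list could be extended to new candidate genes.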

  15. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Cancer classification by doctors and radiologists was formerly based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has enabled the concurrent monitoring of thousands of gene expressions on a single chip, which has stimulated progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique that combines the discrete wavelet transform (DWT) and the moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. The proposed approach serves as an automated system for the classification of cancer that doctors can apply in real cases, to the benefit of the medical community. This work further reduces the misclassification of cancers, which is unacceptable in cancer detection.

  16. Molecular Classification and Correlates in Colorectal Cancer

    OpenAIRE

    Ogino, Shuji; Goel, Ajay

    2008-01-01

    Molecular classification of colorectal cancer is evolving. As our understanding of colorectal carcinogenesis improves, we are incorporating new knowledge into the classification system. In particular, global genomic status [microsatellite instability (MSI) status and chromosomal instability (CIN) status] and epigenomic status [CpG island methylator phenotype (CIMP) status] play a significant role in determining clinical, pathological and biological characteristics of colorectal cancer. In thi...

  17. 28 CFR 17.26 - Derivative classification.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 1 2010-07-01 2010-07-01 false Derivative classification. 17.26 Section 17.26 Judicial Administration DEPARTMENT OF JUSTICE CLASSIFIED NATIONAL SECURITY INFORMATION AND ACCESS TO CLASSIFIED INFORMATION Classified Information § 17.26 Derivative classification. (a) Persons...

  18. 17 CFR 200.506 - Derivative classification.

    Science.gov (United States)

    2010-04-01

    ... 17 Commodity and Securities Exchanges 2 2010-04-01 2010-04-01 false Derivative classification. 200.506 Section 200.506 Commodity and Securities Exchanges SECURITIES AND EXCHANGE COMMISSION ORGANIZATION; CONDUCT AND ETHICS; AND INFORMATION AND REQUESTS Classification and Declassification of National Security...

  19. 5 CFR 2500.5 - Derivative classification.

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Derivative classification. 2500.5 Section 2500.5 Administrative Personnel OFFICE OF ADMINISTRATION, EXECUTIVE OFFICE OF THE PRESIDENT INFORMATION SECURITY REGULATION § 2500.5 Derivative classification. The Office of Administration serves only as the...

  20. 46 CFR 503.55 - Derivative classification.

    Science.gov (United States)

    2010-10-01

    ... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 12958 and directives of the Information Security Oversight Office, the incorporation, paraphrasing... 46 Shipping 9 2010-10-01 2010-10-01 false Derivative classification. 503.55 Section 503.55...