WorldWideScience

Sample records for similarity analysis based

  1. Similarity-based pattern analysis and recognition

    CERN Document Server

    Pelillo, Marcello

    2013-01-01

    This accessible text/reference presents a coherent overview of the emerging field of non-Euclidean similarity learning. The book presents a broad range of perspectives on similarity-based pattern analysis and recognition methods, from purely theoretical challenges to practical, real-world applications. The coverage includes both supervised and unsupervised learning paradigms, as well as generative and discriminative models. Topics and features: explores the origination and causes of non-Euclidean (dis)similarity measures, and how they influence the performance of traditional classification alg

  2. Similar words analysis based on POS-CBOW language model

    Directory of Open Access Journals (Sweden)

    Dongru RUAN

    2015-10-01

    Full Text Available Similar words analysis is one of the important aspects of natural language processing, with significant research and application value in text classification, machine translation and information recommendation. Focusing on the features of Sina Weibo's short texts, this paper presents a language model named POS-CBOW, a continuous bag-of-words language model with a filtering layer and a part-of-speech tagging layer. The proposed approach adjusts the similarity of word vectors according to both their cosine similarity and their part-of-speech metrics, and filters the resulting set of similar words on the basis of a statistical analysis model. Experimental results show that the similar words analysis algorithm based on the proposed POS-CBOW language model outperforms the one based on the traditional CBOW language model.
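
    The combination of word-vector cosine similarity with a part-of-speech check described above can be illustrated with a minimal Python sketch. The vectors are assumed to come from a trained CBOW-style embedding, and the pos_penalty weight is a hypothetical stand-in for the paper's POS metric, which is not reproduced here.

        import numpy as np

        def cosine_similarity(u, v):
            """Cosine similarity between two word vectors."""
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

        def pos_adjusted_similarity(u, v, pos_u, pos_v, pos_penalty=0.5):
            """Scale the cosine similarity down when the part-of-speech tags differ.
            pos_penalty is an illustrative weight, not the paper's POS metric."""
            sim = cosine_similarity(u, v)
            return sim if pos_u == pos_v else sim * pos_penalty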

  3. Phishing Detection: Analysis of Visual Similarity Based Approaches

    Directory of Open Access Journals (Sweden)

    Ankit Kumar Jain

    2017-01-01

    Full Text Available Phishing is one of the major problems faced by the cyber-world and leads to financial losses for both industries and individuals. Detecting phishing attacks with high accuracy has always been a challenging issue. At present, visual similarity based techniques are very useful for detecting phishing websites efficiently. A phishing website looks very similar in appearance to its corresponding legitimate website in order to deceive users into believing that they are browsing the correct website. Visual similarity based phishing detection techniques utilise features such as text content, text format, HTML tags, Cascading Style Sheets (CSS), images, and so forth, to make the decision. These approaches compare the suspicious website with the corresponding legitimate website by using various features, and if the similarity is greater than a predefined threshold value the site is declared phishing. This paper presents a comprehensive analysis of phishing attacks and their exploitation, surveys some of the recent visual similarity based approaches for phishing detection, and provides a comparative study. Our survey provides a better understanding of the problem, the current solution space, and the scope of future research to deal with phishing attacks efficiently using visual similarity based approaches.
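
    A minimal sketch of the threshold decision described above, assuming the per-feature visual similarities (text content, text format, HTML tags, CSS, images) have already been computed and normalised to [0, 1]; the weights and the 0.8 threshold are illustrative placeholders rather than values from any surveyed technique.

        def is_phishing(feature_similarities, weights, threshold=0.8):
            """Weighted combination of per-feature similarities between a suspicious
            page and its claimed legitimate counterpart; the page is declared
            phishing when the score exceeds the predefined threshold."""
            total = sum(weights.values())
            score = sum(weights[f] * feature_similarities[f] for f in weights) / total
            return score >= threshold

        # hypothetical usage
        print(is_phishing({"text": 0.9, "css": 0.85, "images": 0.7},
                          {"text": 0.5, "css": 0.3, "images": 0.2}))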

  4. Fast Depiction Invariant Visual Similarity for Content Based Image Retrieval Based on Data-driven Visual Similarity using Linear Discriminant Analysis

    Science.gov (United States)

    Wihardi, Y.; Setiawan, W.; Nugraha, E.

    2018-01-01

    In this research we build a content-based image retrieval system (CBIRS) based on a learned distance/similarity function using Linear Discriminant Analysis (LDA) and Histogram of Oriented Gradients (HoG) features. Our method is invariant to the depiction style of an image, handling image-to-image, sketch-to-image, and painting-to-image similarity. LDA decreases execution time compared to the state-of-the-art method, but it still needs improvement in terms of accuracy. The inaccuracy in our experiment arises because we did not perform a sliding-window search and because of the low number of negative samples of natural-world images.
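
    A sketch of the retrieval pipeline outlined above, assuming HoG descriptors have already been extracted into fixed-length vectors; scikit-learn's LinearDiscriminantAnalysis stands in for the learned distance/similarity function, and retrieval is a nearest-neighbour search in the projected space. The function and variable names are illustrative, not taken from the paper.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        def retrieve(hog_train, labels, hog_query, top_k=10):
            """hog_train: (n_images, n_hog_dims) HoG features of the database images,
            labels: semantic class per image, hog_query: HoG features of a query
            photo, sketch or painting. Returns indices of the most similar images."""
            lda = LinearDiscriminantAnalysis()
            z_train = lda.fit_transform(hog_train, labels)   # learn the projection
            z_query = lda.transform(hog_query.reshape(1, -1))
            dists = np.linalg.norm(z_train - z_query, axis=1)
            return np.argsort(dists)[:top_k]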

  5. Multicriteria Similarity-Based Anomaly Detection Using Pareto Depth Analysis.

    Science.gov (United States)

    Hsiao, Ko-Jen; Xu, Kevin S; Calder, Jeff; Hero, Alfred O

    2016-06-01

    We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. Similarity-based anomaly detection algorithms detect abnormally large amounts of similarity or dissimilarity, e.g., as measured by the nearest neighbor Euclidean distances between a test sample and the training samples. In many application domains, there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such cases, multiple dissimilarity measures can be defined, including nonmetric measures, and one can test for anomalies by scalarizing using a nonnegative linear combination of them. If the relative importance of the different dissimilarity measures is not known in advance, as in many anomaly detection applications, the anomaly detection algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we propose a method for similarity-based anomaly detection using a novel multicriteria dissimilarity measure, the Pareto depth. The proposed Pareto depth analysis (PDA) anomaly detection algorithm uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach is provably better than using linear combinations of the criteria, and shows superior performance on experiments with synthetic and real data sets.
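
    The Pareto-optimality idea at the heart of PDA can be illustrated with a short non-dominated sorting sketch: multi-criteria dissimilarity vectors are peeled into successive Pareto fronts, and a deeper front indicates a more anomalous combination of dissimilarities. This is only the depth computation; the paper's full training and test procedure over dyads of samples is not reproduced here.

        import numpy as np

        def pareto_depths(points):
            """Assign each multi-criteria dissimilarity vector its Pareto depth:
            depth 1 for the non-dominated front, depth 2 for the front that becomes
            non-dominated once the first is removed, and so on.
            points: (n, d) array; smaller means less dissimilar in every criterion."""
            n = len(points)
            depths = np.zeros(n, dtype=int)
            remaining = set(range(n))
            depth = 0
            while remaining:
                depth += 1
                front = []
                for i in remaining:
                    dominated = any(
                        np.all(points[j] <= points[i]) and np.any(points[j] < points[i])
                        for j in remaining if j != i)
                    if not dominated:
                        front.append(i)
                for i in front:
                    depths[i] = depth
                remaining -= set(front)
            return depths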

  6. Composites Similarity Analysis Method Based on Knowledge Set in Composites Quality Control

    OpenAIRE

    Li Haifeng

    2016-01-01

    Composites similarity analysis is an important part of composites review: it not only supports rechecking during composites review, but also helps composites applicants keep track of relevant research progress promptly and avoid duplication. This paper mainly studies the composites similarity model in composites review. Drawing on practical experience of composites management and based on the author's knowledge set theory, the paper analyzes in depth the knowledge set representation of composites knowledge, impr...

  7. A new similarity index for nonlinear signal analysis based on local extrema patterns

    Science.gov (United States)

    Niknazar, Hamid; Motie Nasrabadi, Ali; Shamsollahi, Mohammad Bagher

    2018-02-01

    Common similarity measures of time domain signals such as cross-correlation and Symbolic Aggregate approximation (SAX) are not appropriate for nonlinear signal analysis. This is because of the high sensitivity of nonlinear systems to initial points. Therefore, a similarity measure for nonlinear signal analysis must be invariant to initial points and quantify the similarity by considering the main dynamics of signals. The statistical behavior of local extrema (SBLE) method was previously proposed to address this problem. The SBLE similarity index uses quantized amplitudes of local extrema to quantify the dynamical similarity of signals by considering patterns of sequential local extrema. By adding time information of local extrema as well as fuzzifying the quantized values, this work proposes a new similarity index for nonlinear and long-term signal analysis, which extends the SBLE method. These new features provide more information about the signals, and the fuzzification reduces noise sensitivity. A number of practical tests were performed to demonstrate the ability of the method in nonlinear signal clustering and classification on synthetic data. In addition, epileptic seizure detection based on electroencephalography (EEG) signal processing was performed using the proposed similarity index to demonstrate the potential of the method as a real-world application tool.

  8. Generalized sample entropy analysis for traffic signals based on similarity measure

    Science.gov (United States)

    Shang, Du; Xu, Mengjia; Shang, Pengjian

    2017-05-01

    Sample entropy is a prevailing method used to quantify the complexity of a time series. In this paper a modified method of generalized sample entropy and surrogate data analysis is proposed as a new measure to assess the complexity of a complex dynamical system such as traffic signals. The method, based on a similarity distance, offers a different way of matching signal patterns and reveals distinct complexity behaviors. Simulations are conducted on synthetic data and traffic signals to provide a comparative study that shows the power of the new method. Compared with previous sample entropy and surrogate data analysis, the new method has two main advantages. The first is that it overcomes the limitation concerning the relationship between the dimension parameter and the length of the series. The second is that the modified sample entropy functions can be used to quantitatively distinguish time series from different complex systems via the similarity measure.
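
    For reference, a minimal sketch of standard sample entropy, the measure the abstract generalizes; the paper's similarity-distance-based matching criterion is not reproduced, and the m and r values below are just common default choices.

        import numpy as np

        def sample_entropy(x, m=2, r=0.2):
            """Standard sample entropy SampEn(m, r) of a 1-D series x;
            r is given as a fraction of the series' standard deviation."""
            x = np.asarray(x, dtype=float)
            tol = r * np.std(x)

            def count_matches(length):
                templates = np.array([x[i:i + length] for i in range(len(x) - length)])
                count = 0
                for i in range(len(templates)):
                    # Chebyshev distance from template i to all later templates
                    d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
                    count += np.sum(d <= tol)
                return count

            b = count_matches(m)       # template matches of length m
            a = count_matches(m + 1)   # template matches of length m + 1
            return -np.log(a / b)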

  9. Chromatographic fingerprint similarity analysis for pollutant source identification

    International Nuclear Information System (INIS)

    Xie, Juan-Ping; Ni, Hong-Gang

    2015-01-01

    In the present study, a similarity analysis method was proposed to evaluate the source-sink relationships among environmental media for polybrominated diphenyl ethers (PBDEs), which were taken as the representative contaminants. Chromatographic fingerprint analysis has been widely used in the fields of natural products chemistry and forensic chemistry, but its application to environmental science has been limited. We established a library of various sources of media containing contaminants (e.g., plastics), recognizing that the establishment of a more comprehensive library allows for a better understanding of the sources of contamination. We then compared an environmental complex mixture (e.g., sediment, soil) with the profiles in the library. These comparisons could be used as the first step in source tracking. The cosine similarities between plastic and soil or sediment ranged from 0.53 to 0.68, suggesting that plastic in electronic waste is an important source of PBDEs in the environment, but it is not the only source. A similarity analysis between soil and sediment indicated that they have a source-sink relationship. Generally, the similarity analysis method can encompass more relevant information about complex mixtures in the environment than a profile-based approach that only focuses on target pollutants. There is an inherent advantage to creating a data matrix containing all peaks and their relative levels after matching the peaks based on retention times and peak areas. This data matrix can be used for source identification via a similarity analysis without quantitative or qualitative analysis of all chemicals in a sample. - Highlights: • Chromatographic fingerprint analysis can be used as the first step in source tracking. • The similarity analysis method can encompass more relevant information about pollution. • The fingerprints strongly depend on the chromatographic conditions. • A more effective and robust method for identifying similarities is required
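
    The data matrix construction described above can be sketched as follows; the retention-time tolerance, the greedy peak grouping and the dictionary input format are illustrative choices, not the procedure used in the study.

        import numpy as np

        def build_fingerprint_matrix(chromatograms, rt_tolerance=0.1):
            """Align peaks across samples by retention time to build a sample-by-peak
            data matrix of peak areas. chromatograms: list of dicts mapping
            retention_time -> peak_area; rt_tolerance (minutes) is illustrative."""
            # collect reference retention times by greedy grouping within tolerance
            all_rts = sorted(rt for chrom in chromatograms for rt in chrom)
            ref_rts = []
            for rt in all_rts:
                if not ref_rts or rt - ref_rts[-1] > rt_tolerance:
                    ref_rts.append(rt)
            # fill the matrix with areas (zero when the peak is absent in a sample)
            matrix = np.zeros((len(chromatograms), len(ref_rts)))
            for i, chrom in enumerate(chromatograms):
                for rt, area in chrom.items():
                    j = int(np.argmin([abs(rt - ref) for ref in ref_rts]))
                    if abs(rt - ref_rts[j]) <= rt_tolerance:
                        matrix[i, j] += area
            return ref_rts, matrix

    Cosine similarity between two rows of the resulting matrix then yields source-sink similarity scores of the kind reported above (e.g. 0.53 to 0.68 between plastic and soil or sediment).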

  10. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

    Directory of Open Access Journals (Sweden)

    Jiang Hualiang

    2010-01-01

    Full Text Available Abstract Background Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of the sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After the database search, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of the hit list from the database search.

  11. Dimensionality Reduction of Hyperspectral Image with Graph-Based Discriminant Analysis Considering Spectral Similarity

    Directory of Open Access Journals (Sweden)

    Fubiao Feng

    2017-03-01

    Full Text Available Recently, graph embedding has drawn great attention for dimensionality reduction in hyperspectral imagery. For example, locality preserving projection (LPP) utilizes typical Euclidean distance in a heat kernel to create an affinity matrix and projects the high-dimensional data into a lower-dimensional space. However, the Euclidean distance is not sufficiently correlated with intrinsic spectral variation of a material, which may result in inappropriate graph representation. In this work, a graph-based discriminant analysis with spectral similarity (denoted as GDA-SS) measurement is proposed, which fully considers the description of curve changes among spectral bands. Experimental results based on real hyperspectral images demonstrate that the proposed method is superior to traditional methods, such as supervised LPP, and the state-of-the-art sparse graph-based discriminant analysis (SGDA).
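
    A sketch of the LPP-style heat-kernel affinity matrix that the abstract contrasts with; the spectral-similarity replacement proposed by the paper is not reproduced, and the neighbourhood size k and kernel width t are illustrative.

        import numpy as np

        def heat_kernel_affinity(X, k=10, t=1.0):
            """Affinity matrix W used by locality preserving projection:
            W[i, j] = exp(-||x_i - x_j||^2 / t) when j is among the k nearest
            neighbours of i (symmetrized), 0 otherwise.
            X: (n_pixels, n_bands) spectra."""
            n = X.shape[0]
            d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
            W = np.zeros((n, n))
            for i in range(n):
                neighbours = np.argsort(d2[i])[1:k + 1]   # skip the point itself
                W[i, neighbours] = np.exp(-d2[i, neighbours] / t)
            return np.maximum(W, W.T)                     # symmetrize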

  12. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. A Similarity-Based Approach for Audiovisual Document Classification Using Temporal Relation Analysis

    Directory of Open Access Journals (Sweden)

    Ferrane Isabelle

    2011-01-01

    Full Text Available Abstract We propose a novel approach for video classification that is based on the analysis of the temporal relationships between the basic events in audiovisual documents. Starting from basic segmentation results, we define a new representation method called the Temporal Relation Matrix (TRM). Each document is then described by a set of TRMs, the analysis of which makes events of a higher level stand out. This representation was first designed to analyze any audiovisual document in order to find events that may well characterize its content and its structure. The aim of this work is to use this representation to compute a similarity measure between two documents. Approaches for audiovisual document classification are presented and discussed. Experiments are conducted on a set of 242 video documents and the results show the efficiency of our proposals.

  14. Binary similarity measures for fingerprint analysis of qualitative metabolomic profiles.

    Science.gov (United States)

    Rácz, Anita; Andrić, Filip; Bajusz, Dávid; Héberger, Károly

    2018-01-01

    Contemporary metabolomic fingerprinting is based on multiple spectrometric and chromatographic signals, used either alone or combined with structural and chemical information of metabolic markers at the qualitative and semiquantitative level. However, signal shifting, convolution, and matrix effects may compromise metabolomic patterns. The recent increase in the use of qualitative metabolomic data, described by the presence (1) or absence (0) of particular metabolites, demonstrates great potential in the field of metabolomic profiling and fingerprint analysis. The aim of this study is a comprehensive evaluation of binary similarity measures for the elucidation of patterns among samples of different botanical origin and various metabolomic profiles. Nine qualitative metabolomic data sets covering a wide range of natural products and metabolomic profiles were applied to assess 44 binary similarity measures for the fingerprinting of plant extracts and natural products. The measures were analyzed by the novel sum of ranking differences method (SRD), searching for the most promising candidates. Baroni-Urbani-Buser (BUB) and Hawkins-Dotson (HD) similarity coefficients were selected as the best measures by SRD and analysis of variance (ANOVA), while Dice (Di1), Yule, Russel-Rao, and Consonni-Todeschini 3 ranked the worst. ANOVA revealed that concordantly and intermediately symmetric similarity coefficients are better candidates for metabolomic fingerprinting than the asymmetric and correlation based ones. The fingerprint analysis based on the BUB and HD coefficients and qualitative metabolomic data performed as well as the quantitative metabolomic profile analysis. Fingerprint analysis based on the qualitative metabolomic profiles and binary similarity measures proved to be a reliable way of finding the same/similar patterns in metabolomic data as those extracted from quantitative data.
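
    Under their standard definitions from a 2x2 presence/absence contingency table (a = present in both fingerprints, b and c = present in only one, d = absent in both), the two best-ranked coefficients can be sketched as below. The exact formulas are an assumption based on the usual definitions of these coefficients, and the remaining 42 measures from the study are not reproduced.

        def contingency(x, y):
            """Counts a, b, c, d for two binary (0/1) fingerprints of equal length."""
            a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
            b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
            c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
            d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
            return a, b, c, d

        def baroni_urbani_buser(x, y):
            a, b, c, d = contingency(x, y)
            root = (a * d) ** 0.5
            return (root + a) / (root + a + b + c)

        def hawkins_dotson(x, y):
            a, b, c, d = contingency(x, y)
            return 0.5 * (a / (a + b + c) + d / (d + b + c))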

  15. Multicriteria decision-making method based on a cosine similarity ...

    African Journals Online (AJOL)

    The cosine similarity measure is often used in information retrieval, citation analysis, and automatic classification. However, it scarcely deals with trapezoidal fuzzy information and multicriteria decision-making problems. For this purpose, a cosine similarity measure between trapezoidal fuzzy numbers is proposed based on ...

  16. Machine Fault Detection Based on Filter Bank Similarity Features Using Acoustic and Vibration Analysis

    Directory of Open Access Journals (Sweden)

    Mauricio Holguín-Londoño

    2016-01-01

    Full Text Available Vibration and acoustic analysis actively support the nondestructive and noninvasive fault diagnostics of rotating machines at early stages. Nonetheless, the acoustic signal is less used because of its vulnerability to external interferences, hindering an efficient and robust analysis for condition monitoring (CM). This paper presents a novel methodology to characterize different failure signatures from rotating machines using either acoustic or vibration signals. Firstly, the signal is decomposed into several narrow-band spectral components by applying different filter bank methods such as empirical mode decomposition, wavelet packet transform, and Fourier-based filtering. Secondly, a feature set is built using a proposed similarity measure termed the cumulative spectral density index and used to estimate the mutual statistical dependence between each bandwidth-limited component and the raw signal. Finally, a classification scheme is carried out to distinguish the different types of faults. The methodology is tested in two laboratory experiments, including turbine blade degradation and rolling element bearing faults. The robustness of our approach is validated by contaminating the signal with several levels of additive white Gaussian noise, obtaining high-performance outcomes that make the usage of vibration, acoustic, and vibroacoustic measurements in different applications comparable. As a result, the proposed fault detection based on filter bank similarity features is a promising methodology to implement in CM of rotating machinery, even using measurements with low signal-to-noise ratio.

  17. A Profile-Based Framework for Factorial Similarity and the Congruence Coefficient.

    Science.gov (United States)

    Hartley, Anselma G; Furr, R Michael

    2017-01-01

    We present a novel profile-based framework for understanding factorial similarity in the context of exploratory factor analysis in general, and for understanding the congruence coefficient (a commonly used index of factor similarity) specifically. First, we introduce the profile-based framework articulating factorial similarity in terms of 3 intuitive components: general saturation similarity, differential saturation similarity, and configural similarity. We then articulate the congruence coefficient in terms of these components, along with 2 additional profile-based components, and we explain how these components resolve ambiguities that can be, and are, found when using the congruence coefficient. Finally, we present secondary analyses revealing that profile-based components of factorial similarity are indeed linked to experts' actual evaluations of factorial similarity. Overall, the profile-based approach we present offers new insights into the ways in which researchers can examine factor similarity and holds the potential to enhance researchers' ability to understand the congruence coefficient.
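
    For reference, the congruence coefficient itself (Tucker's phi) between two columns of factor loadings is simply a normalized cross-product; the sketch below is a direct implementation of that textbook formula, not of the profile-based components introduced in the article.

        import numpy as np

        def congruence_coefficient(loadings_a, loadings_b):
            """Tucker's congruence coefficient between two factor loading vectors:
            phi = sum(a_i * b_i) / sqrt(sum(a_i^2) * sum(b_i^2))."""
            a = np.asarray(loadings_a, dtype=float)
            b = np.asarray(loadings_b, dtype=float)
            return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))

    Because phi is unchanged when either loading vector is rescaled by a positive constant, a high value can mask differences in overall saturation, which is one kind of ambiguity the profile-based components are intended to separate out.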

  18. A study of concept-based similarity approaches for recommending program examples

    Science.gov (United States)

    Hosseini, Roya; Brusilovsky, Peter

    2017-07-01

    This paper investigates a range of concept-based example recommendation approaches that we developed to provide example-based problem-solving support in the domain of programming. The goal of these approaches is to offer students a set of the most relevant remedial examples when they have trouble solving a code comprehension problem in which students examine program code to determine its output or the final value of a variable. In this paper, we use the ideas of semantic-level similarity-based linking developed in the area of intelligent hypertext to generate examples for the given problem. To determine the best-performing approach, we explored two groups of similarity approaches for selecting examples: non-structural approaches focusing on examples that are similar to the problem in terms of concept coverage, and structural approaches focusing on examples that are similar to the problem by the structure of the content. We also explored the value of personalized example recommendation based on the student's knowledge levels and the learning goal of the exercise. The paper presents the concept-based similarity approaches that we developed, explains the data collection studies and reports the results of a comparative analysis. The results of our analysis showed better ranking performance of the personalized structural variant of the cosine similarity approach.

  19. Musical structure analysis using similarity matrix and dynamic programming

    Science.gov (United States)

    Shiu, Yu; Jeong, Hong; Kuo, C.-C. Jay

    2005-10-01

    Automatic music segmentation and structure analysis from audio waveforms based on a three-level hierarchy is examined in this research, where the three-level hierarchy includes notes, measures and parts. The pitch class profile (PCP) feature is first extracted at the note level. Then, a similarity matrix is constructed at the measure level, where a dynamic time warping (DTW) technique is used to enhance the similarity computation by taking the temporal distortion of similar audio segments into account. By processing the similarity matrix, we can obtain a coarse-grain music segmentation result. Finally, dynamic programming is applied to the coarse-grain segments so that a song can be decomposed into several major parts such as intro, verse, chorus, bridge and outro. The performance of the proposed music structure analysis system is demonstrated for pop and rock music.
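
    A minimal sketch of the dynamic time warping step used when filling the measure-level similarity matrix; the per-frame cost (Euclidean distance between PCP vectors) and the length normalization are illustrative choices rather than the exact settings of the paper.

        import numpy as np

        def dtw_distance(seq_a, seq_b):
            """Dynamic time warping distance between two sequences of PCP
            (pitch class profile) vectors, one vector per analysis frame."""
            n, m = len(seq_a), len(seq_b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(np.asarray(seq_a[i - 1]) - np.asarray(seq_b[j - 1]))
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m] / (n + m)   # length-normalized warping cost

    A similarity matrix entry for two measures can then be derived from this distance, for instance via exp(-d).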

  20. Protein structure similarity from principle component correlation analysis

    Directory of Open Access Journals (Sweden)

    Chou James

    2006-01-01

    Full Text Available Abstract Background Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum

  1. SHOP: scaffold hopping by GRID-based similarity searches

    DEFF Research Database (Denmark)

    Bergmann, Rikke; Linusson, Anna; Zamora, Ismael

    2007-01-01

    A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known...... scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures...

  2. Linear-fitting-based similarity coefficient map for tissue dissimilarity analysis in -w magnetic resonance imaging

    International Nuclear Information System (INIS)

    Yu Shao-De; Wu Shi-Bin; Xie Yao-Qin; Wang Hao-Yu; Wei Xin-Hua; Chen Xin; Pan Wan-Long; Hu Jiani

    2015-01-01

    Similarity coefficient mapping (SCM) aims to improve the morphological evaluation of weighted magnetic resonance imaging. However, how to interpret the generated SCM map remains an open question. Moreover, is it possible to extract tissue dissimilarity information based on the theory behind SCM? The primary purpose of this paper is to address these two questions. First, the theory of SCM was interpreted from the perspective of linear fitting. Then, a term was embedded for tissue dissimilarity information. Finally, our method was validated with sixteen human brain image series from multi-echo acquisitions. Generated maps were investigated in terms of signal-to-noise ratio (SNR) and perceived visual quality, and then interpreted from intra- and inter-tissue intensity. Experimental results show that both the perceptibility of anatomical structures and the tissue contrast are improved. More importantly, tissue similarity or dissimilarity can be quantified and cross-validated from pixel intensity analysis. This method benefits image enhancement, tissue classification, malformation detection and morphological evaluation. (paper)

  3. Concurrence of rule- and similarity-based mechanisms in artificial grammar learning.

    Science.gov (United States)

    Opitz, Bertram; Hofmann, Juliane

    2015-03-01

    A current theoretical debate regards whether rule-based or similarity-based learning prevails during artificial grammar learning (AGL). Although the majority of findings are consistent with a similarity-based account of AGL, it has been argued that these results were obtained only after limited exposure to study exemplars, and performance on subsequent grammaticality judgment tests has often been barely above chance level. In three experiments, the conditions were investigated under which rule- and similarity-based learning could be applied. Participants were exposed to exemplars of an artificial grammar under different (implicit and explicit) learning instructions. The analysis of receiver operating characteristics (ROC) during a final grammaticality judgment test revealed that explicit but not implicit learning led to rule knowledge. It also demonstrated that this knowledge base is built up gradually while similarity knowledge governed the initial state of learning. Together these results indicate that rule- and similarity-based mechanisms concur during AGL. Moreover, it could be speculated that two different rule processes might operate in parallel; bottom-up learning via gradual rule extraction and top-down learning via rule testing. Crucially, the latter is facilitated by performance feedback that encourages explicit hypothesis testing. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Similarity Analysis of Cable Insulations by Chemical Test

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jong Seog [Central Research Institute of Korea Hydro and Nuclear Power Co., Daejeon (Korea, Republic of)

    2013-10-15

    As a result of this experiment, it was found that the FT-IR test for material composition and the TGA test for aging trend are applicable for similarity analysis of cable materials. OIT is recommended as an option if TGA does not show a good trend. Qualification of a new insulation by the EQ report of an old insulation should be based on the activation energy of the new insulation being higher than that of the old one, in the interest of conservatism. In old nuclear power plants, it is easy to find black cables that carry no marking of cable information such as manufacturer, material name and voltage. If a type test is required for qualification of these cables, how could I select a representative cable? How could I determine the similarity of these cables? If a manufacturer qualified a cable for nuclear power plants more than a decade ago and the composition of the cable material has since been changed to a similar one, is it acceptable to use the old EQ report for the recently manufactured cable? The FT-IR method is well known as a way to determine the similarity of cable materials. Infrared spectroscopy is an easy tool for comparing the compositions of materials, but it is not suitable for comparing their aging trends. A study of similarity analysis of cable insulation by chemical tests is described herein. To study a similarity evaluation method for polymer materials, FT-IR, TGA and OIT tests were performed on two cable insulations (old and new) supplied by the same manufacturer. FT-IR gives good results for comparing material compositions, while TGA and OIT give good results for comparing the aging characteristics of the materials.

  5. Similarity Analysis of Cable Insulations by Chemical Test

    International Nuclear Information System (INIS)

    Kim, Jong Seog

    2013-01-01

    As a result of this experiment, it was found that the FT-IR test for material composition and the TGA test for aging trend are applicable for similarity analysis of cable materials. OIT is recommended as an option if TGA does not show a good trend. Qualification of a new insulation by the EQ report of an old insulation should be based on the activation energy of the new insulation being higher than that of the old one, in the interest of conservatism. In old nuclear power plants, it is easy to find black cables that carry no marking of cable information such as manufacturer, material name and voltage. If a type test is required for qualification of these cables, how could I select a representative cable? How could I determine the similarity of these cables? If a manufacturer qualified a cable for nuclear power plants more than a decade ago and the composition of the cable material has since been changed to a similar one, is it acceptable to use the old EQ report for the recently manufactured cable? The FT-IR method is well known as a way to determine the similarity of cable materials. Infrared spectroscopy is an easy tool for comparing the compositions of materials, but it is not suitable for comparing their aging trends. A study of similarity analysis of cable insulation by chemical tests is described herein. To study a similarity evaluation method for polymer materials, FT-IR, TGA and OIT tests were performed on two cable insulations (old and new) supplied by the same manufacturer. FT-IR gives good results for comparing material compositions, while TGA and OIT give good results for comparing the aging characteristics of the materials.

  6. Gender similarities and differences in brain activation strategies: Voxel-based meta-analysis on fMRI studies.

    Science.gov (United States)

    AlRyalat, Saif Aldeen

    2017-01-01

    Gender similarities and differences have long been a matter of debate in almost all human research, especially when the discussion reaches brain functions. This large-scale meta-analysis was performed on functional MRI studies. It included more than 700 active brain foci from more than 70 different experiments to study gender-related similarities and differences in brain activation strategies for three of the main brain functions: visual-spatial cognition, memory, and emotion. Areas that are significantly activated by both genders (i.e. core areas) for the tested brain function are mentioned, whereas those areas significantly activated exclusively in one gender are the gender-specific areas. During visual-spatial cognition tasks, and in addition to the core areas, males significantly activated their left superior frontal gyrus, compared with the left superior parietal lobule in females. For memory tasks, several different brain areas were activated by each gender, but females significantly activated two areas from the limbic system during memory retrieval tasks. For emotional tasks, males tended to recruit their bilateral prefrontal regions, whereas females tended to recruit their bilateral amygdalae. This meta-analysis provides an overview, based on functional MRI studies, of how males and females use their brains.

  7. Self-similar analysis of the spherical implosion process

    International Nuclear Information System (INIS)

    Ishiguro, Yukio; Katsuragi, Satoru.

    1976-07-01

    The implosion process caused by laser-heating ablation has been studied by self-similarity analysis. Attention is paid to the possible existence of a self-similar solution which reproduces the implosion process of high compression. Details of the self-similar analysis are reproduced and conclusions are drawn quantitatively on the gas compression by a single shock. The compression process by a sequence of shocks is discussed in terms of self-similarity. The gas motion followed by a homogeneous isentropic compression is represented by a self-similar motion. (auth.)

  8. Case-based reasoning diagnostic technique based on multi-attribute similarity

    Energy Technology Data Exchange (ETDEWEB)

    Makoto, Takahashi [Tohoku University, Miyagi (Japan); Akio, Gofuku [Okayama University, Okayamaa (Japan)

    2014-08-15

    A case-based diagnostic technique has been developed based on multi-attribute similarity. A specific feature of the developed system is the use of multiple attributes of process signals for similarity evaluation, in order to retrieve a similar case stored in a case base. The present technique has been applied to the measurement data from Monju with some simulated anomalies. The results of numerical experiments showed that the present technique can be utilized as one of the methods for a hybrid-type diagnosis system.

  9. A method for rapid similarity analysis of RNA secondary structures

    Directory of Open Access Journals (Sweden)

    Liu Na

    2006-11-01

    Full Text Available Abstract Background Owing to the rapid expansion of RNA structure databases in recent years, efficient methods for structure comparison are in demand for function prediction and evolutionary analysis. Usually, the similarity of RNA secondary structures is evaluated based on tree models and dynamic programming algorithms. We present here a new method for the similarity analysis of RNA secondary structures. Results Three sets of real data have been used as input for the example applications. Set I includes the structures from 5S rRNAs. Set II includes the secondary structures from RNase P and RNase MRP. Set III includes the structures from 16S rRNAs. Reasonable phylogenetic trees are derived for these three sets of data by using our method. Moreover, our program runs faster as compared to some existing ones. Conclusion The famous Lempel-Ziv algorithm can efficiently extract the information on repeated patterns encoded in RNA secondary structures and makes our method an alternative to analyze the similarity of RNA secondary structures. This method will also be useful to researchers who are interested in evolutionary analysis.
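
    The Lempel-Ziv idea described above can be illustrated with a short sketch: each secondary structure is encoded as a string (for example dot-bracket notation), a simple left-to-right parsing counts distinct phrases, and a normalized distance compares how many new phrases one structure adds to the other. Both the parsing variant and the normalization below are simplified assumptions, not the exact formulation of the paper.

        def lz_phrase_count(s):
            """Number of distinct phrases in a simple left-to-right Lempel-Ziv parsing."""
            phrases, phrase = set(), ""
            for ch in s:
                phrase += ch
                if phrase not in phrases:
                    phrases.add(phrase)
                    phrase = ""
            return len(phrases)

        def lz_distance(s, q):
            """Normalized LZ-based distance: small when appending q to s (and s to q)
            adds few new phrases, i.e. when the structures share repeated patterns."""
            cs, cq = lz_phrase_count(s), lz_phrase_count(q)
            csq, cqs = lz_phrase_count(s + q), lz_phrase_count(q + s)
            return max(csq - cs, cqs - cq) / max(cs, cq)

        # hypothetical usage on two dot-bracket encoded structures
        print(lz_distance("((..((...))..))", "((..((....))..))"))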

  10. Link-Based Similarity Measures Using Reachability Vectors

    Directory of Open Access Journals (Sweden)

    Seok-Ho Yoon

    2014-01-01

    Full Text Available We present a novel approach for computing link-based similarities among objects accurately by utilizing the link information pertaining to the objects involved. We discuss the problems with previous link-based similarity measures and propose a novel approach for computing link based similarities that does not suffer from these problems. In the proposed approach each target object is represented by a vector. Each element of the vector corresponds to all the objects in the given data, and the value of each element denotes the weight for the corresponding object. As for this weight value, we propose to utilize the probability of reaching from the target object to the specific object, computed using the “Random Walk with Restart” strategy. Then, we define the similarity between two objects as the cosine similarity of the two vectors. In this paper, we provide examples to show that our approach does not suffer from the aforementioned problems. We also evaluate the performance of the proposed methods in comparison with existing link-based measures, qualitatively and quantitatively, with respect to two kinds of data sets, scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures.
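
    A sketch of the reachability-vector construction described above: the Random Walk with Restart distribution for each target object is computed by power iteration on a column-normalized adjacency matrix, and two objects are compared by the cosine of their vectors. The restart probability and convergence settings are illustrative.

        import numpy as np

        def rwr_vector(adjacency, target, restart=0.15, tol=1e-8, max_iter=1000):
            """Reachability vector of a target node computed by Random Walk with
            Restart on a (possibly weighted) adjacency matrix."""
            A = np.asarray(adjacency, dtype=float)
            col_sums = A.sum(axis=0)
            col_sums[col_sums == 0] = 1.0          # avoid division by zero for sinks
            P = A / col_sums                        # column-stochastic transitions
            e = np.zeros(A.shape[0])
            e[target] = 1.0
            p = e.copy()
            for _ in range(max_iter):
                p_next = (1 - restart) * P @ p + restart * e
                if np.linalg.norm(p_next - p, 1) < tol:
                    break
                p = p_next
            return p

        def link_similarity(adjacency, u, v):
            """Cosine similarity of the two reachability vectors."""
            pu, pv = rwr_vector(adjacency, u), rwr_vector(adjacency, v)
            return float(np.dot(pu, pv) / (np.linalg.norm(pu) * np.linalg.norm(pv)))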

  11. Similarity-based search of model organism, disease and drug effect phenotypes

    KAUST Repository

    Hoehndorf, Robert

    2015-02-19

    Background: Semantic similarity measures over phenotype ontologies have been demonstrated to provide a powerful approach for the analysis of model organism phenotypes, the discovery of animal models of human disease, novel pathways, gene functions, druggable therapeutic targets, and determination of pathogenicity. Results: We have developed PhenomeNET 2, a system that enables similarity-based searches over a large repository of phenotypes in real-time. It can be used to identify strains of model organisms that are phenotypically similar to human patients, diseases that are phenotypically similar to model organism phenotypes, or drug effect profiles that are similar to the phenotypes observed in a patient or model organism. PhenomeNET 2 is available at http://aber-owl.net/phenomenet. Conclusions: Phenotype-similarity searches can provide a powerful tool for the discovery and investigation of molecular mechanisms underlying an observed phenotypic manifestation. PhenomeNET 2 facilitates user-defined similarity searches and allows researchers to analyze their data within a large repository of human, mouse and rat phenotypes.

  12. An Energy-Based Similarity Measure for Time Series

    Directory of Open Access Journals (Sweden)

    Pierre Brunagel

    2007-11-01

    Full Text Available A new similarity measure, called SimilB, for time series analysis, based on the cross-ΨB-energy operator (2004), is introduced. ΨB is a nonlinear measure which quantifies the interaction between two time series. Compared to the Euclidean distance (ED) or the Pearson correlation coefficient (CC), SimilB includes the temporal information and relative changes of the time series using the first and second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularly those presenting discontinuities. Some new properties of ΨB are presented. Particularly, we show that ΨB as a similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CC and the ED measures.

  13. Algorithm Research of Individualized Travelling Route Recommendation Based on Similarity

    Directory of Open Access Journals (Sweden)

    Xue Shan

    2015-01-01

    Full Text Available Although commercial recommendation systems have made certain achievements in travelling route development, they are facing a series of challenges because of people's increasing interest in travelling. The core content of a recommendation system is its recommendation algorithm, and the strength of the recommendation algorithm has a great effect on the recommendation system. Based on this, this paper first analyses the traditional collaborative filtering algorithm and illustrates its deficiencies, such as rating unicity and rating matrix sparsity. It then proposes an improved algorithm combining a user-based multi-similarity algorithm with a user-based element similarity algorithm, so as to compensate for the deficiencies of the traditional algorithm within a controllable range. Experimental results show that the improved algorithm has obvious advantages over the traditional one and a clear effect in remedying rating matrix sparsity and rating unicity.

  14. Domain similarity based orthology detection.

    Science.gov (United States)

    Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

    2015-05-13

    Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .

  15. Measuring user similarity using electric circuit analysis: application to collaborative filtering.

    Science.gov (United States)

    Yang, Joonhyuk; Kim, Jinwook; Kim, Wonjoon; Kim, Young Hwan

    2012-01-01

    We propose a new technique of measuring user similarity in collaborative filtering using electric circuit analysis. Electric circuit analysis is used to measure the potential differences between nodes on an electric circuit. In this paper, by applying this method to transaction networks comprising users and items, i.e., user-item matrix, and by using the full information about the relationship structure of users in the perspective of item adoption, we overcome the limitations of one-to-one similarity calculation approach, such as the Pearson correlation, Tanimoto coefficient, and Hamming distance, in collaborative filtering. We found that electric circuit analysis can be successfully incorporated into recommender systems and has the potential to significantly enhance predictability, especially when combined with user-based collaborative filtering. We also propose four types of hybrid algorithms that combine the Pearson correlation method and electric circuit analysis. One of the algorithms exceeds the performance of the traditional collaborative filtering by 37.5% at most. This work opens new opportunities for interdisciplinary research between physics and computer science and the development of new recommendation systems.

  16. Measuring user similarity using electric circuit analysis: application to collaborative filtering.

    Directory of Open Access Journals (Sweden)

    Joonhyuk Yang

    Full Text Available We propose a new technique of measuring user similarity in collaborative filtering using electric circuit analysis. Electric circuit analysis is used to measure the potential differences between nodes on an electric circuit. In this paper, by applying this method to transaction networks comprising users and items, i.e., user-item matrix, and by using the full information about the relationship structure of users in the perspective of item adoption, we overcome the limitations of one-to-one similarity calculation approach, such as the Pearson correlation, Tanimoto coefficient, and Hamming distance, in collaborative filtering. We found that electric circuit analysis can be successfully incorporated into recommender systems and has the potential to significantly enhance predictability, especially when combined with user-based collaborative filtering. We also propose four types of hybrid algorithms that combine the Pearson correlation method and electric circuit analysis. One of the algorithms exceeds the performance of the traditional collaborative filtering by 37.5% at most. This work opens new opportunities for interdisciplinary research between physics and computer science and the development of new recommendation systems.

  17. Inference-Based Similarity Search in Randomized Montgomery Domains for Privacy-Preserving Biometric Identification.

    Science.gov (United States)

    Wang, Yi; Wan, Jianwu; Guo, Jun; Cheung, Yiu-Ming; C Yuen, Pong

    2017-07-14

    Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, but the associated cost is reduced by orders of magnitude compared to cryptographic primitives.

  18. DOSim: An R package for similarity between diseases based on Disease Ontology

    Science.gov (United States)

    2011-01-01

    Background The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, which has expanded our understanding of the relationships between different diseases and made it possible to classify them. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. Results We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease feature of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. Conclusions DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and

  19. Weighted similarity-based clustering of chemical structures and bioactivity data in early drug discovery.

    Science.gov (United States)

    Perualila-Tan, Nolen Joy; Shkedy, Ziv; Talloen, Willem; Göhlmann, Hinrich W H; Moerbeke, Marijke Van; Kasim, Adetayo

    2016-08-01

    The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical strategy in compound selection involves compound clustering based on chemical similarity to obtain representative chemically diverse compounds (not incorporating potency information). In this paper, we propose an integrative clustering approach that makes use of both biological (compound efficacy) and chemical (structural features) data sources for the purpose of discovering a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, which serves as a generic input to any clustering algorithm. This new analysis workflow is a semi-supervised method since, after the determination of clusters, a secondary analysis is performed that finds differentially expressed genes associated with the derived integrated cluster(s) to further explain the compound-induced biological effects inside the cell. In this paper, datasets from two drug development oncology projects are used to illustrate the usefulness of the weighted similarity-based clustering approach to integrate multi-source high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using this proposed integrative approach.
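
    A minimal sketch of the similarity-level integration step, assuming the chemical (structural) and biological (efficacy) compound-by-compound similarity matrices are already available and scaled to [0, 1]; the complementary weight and the hierarchical clustering choice are illustrative, not the exact analysis workflow of the paper.

        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster

        def integrated_clusters(sim_chemical, sim_biological, weight=0.5, n_clusters=5):
            """Combine two similarity matrices with complementary weights and
            cluster the compounds on the resulting weighted similarity."""
            S = weight * np.asarray(sim_chemical) + (1 - weight) * np.asarray(sim_biological)
            D = 1.0 - S                                  # convert similarity to distance
            # condensed distance vector (upper triangle, row-major) expected by scipy
            iu = np.triu_indices_from(D, k=1)
            Z = linkage(D[iu], method="average")
            return fcluster(Z, t=n_clusters, criterion="maxclust")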

  20. Uncovering highly obfuscated plagiarism cases using fuzzy semantic-based similarity model

    Directory of Open Access Journals (Sweden)

    Salha M. Alzahrani

    2015-07-01

    Full Text Available Highly obfuscated plagiarism cases contain unseen and obfuscated texts, which pose difficulties when using existing plagiarism detection methods. A fuzzy semantic-based similarity model for uncovering obfuscated plagiarism is presented and compared with five state-of-the-art baselines. Semantic relatedness between words is studied based on part-of-speech (POS) tags and WordNet-based similarity measures. Fuzzy-based rules are introduced to assess the semantic distance between source and suspicious texts of short lengths, which implement the semantic relatedness between words as a membership function to a fuzzy set. In order to minimize the number of false positives and false negatives, a learning method that combines a permission threshold and a variation threshold is used to decide true plagiarism cases. The proposed model and the baselines are evaluated on 99,033 ground-truth annotated cases extracted from different datasets, including 11,621 (11.7%) handmade paraphrases, 54,815 (55.4%) artificial plagiarism cases, and 32,578 (32.9%) plagiarism-free cases. We conduct extensive experimental verifications, including the study of the effects of different segmentation schemes and parameter settings. Results are assessed using precision, recall, F-measure and granularity on stratified 10-fold cross-validation data. The statistical analysis using paired t-tests shows that the proposed approach is statistically significant in comparison with the baselines, which demonstrates the competence of the fuzzy semantic-based model to detect plagiarism cases beyond literal plagiarism. Additionally, the analysis of variance (ANOVA) statistical test shows the effectiveness of different segmentation schemes used with the proposed approach.

  1. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One then speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases the computational costs are the same as in the general family-free case, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expressive powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.

  2. Similarity-based Polymorphic Shellcode Detection

    Directory of Open Access Journals (Sweden)

    Denis Yurievich Gamayunov

    2013-02-01

    Full Text Available In this work, a method for polymorphic shellcode detection based on a set of known shellcodes is proposed. The method's main idea is to sequentially apply deobfuscating transformations to the analyzed data and then recognize its similarity to known malware samples. The method has been tested on sets of shellcodes generated using Metasploit Framework v.4.1.0 and PELock Obfuscator and shows 87% precision with a zero false-positive rate.

  3. Western classical music development: a statistical analysis of composers similarity, differentiation and evolution.

    Science.gov (United States)

    Georges, Patrick

    2017-01-01

    This paper proposes a statistical analysis that captures similarities and differences between classical music composers with the eventual aim to understand why particular composers 'sound' different even if their 'lineages' (influences network) are similar or why they 'sound' alike if their 'lineages' are different. In order to do this we use statistical methods and measures of association or similarity (based on presence/absence of traits such as specific 'ecological' characteristics and personal musical influences) that have been developed in biosystematics, scientometrics, and bibliographic coupling. This paper also represents a first step towards a more ambitious goal of developing an evolutionary model of Western classical music.

  4. A Framework for Analysis of Music Similarity Measures

    DEFF Research Database (Denmark)

    Jensen, Jesper Højvang; Christensen, Mads G.; Jensen, Søren Holdt

    2007-01-01

    To analyze specific properties of music similarity measures that the commonly used genre classification evaluation procedure does not reveal, we introduce a MIDI based test framework for music similarity measures. We introduce the framework by example and thus outline an experiment to analyze the...

  5. Appropriate Similarity Measures for Author Cocitation Analysis

    NARCIS (Netherlands)

    N.J.P. van Eck (Nees Jan); L. Waltman (Ludo)

    2007-01-01

    textabstractWe provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of

  6. Natural texture retrieval based on perceptual similarity measurement

    Science.gov (United States)

    Gao, Ying; Dong, Junyu; Lou, Jianwen; Qi, Lin; Liu, Jun

    2018-04-01

    A typical texture retrieval system performs feature comparison and might not be able to make human-like judgments of image similarity. Meanwhile, it is commonly known that perceptual texture similarity is difficult to describe with traditional image features. In this paper, we propose a new texture retrieval scheme based on perceptual texture similarity. The key idea of the proposed scheme is that the prediction of perceptual similarity is performed by learning a non-linear mapping from the image feature space to the perceptual texture space using Random Forest. We test the method on a natural texture dataset and apply it to a new wallpaper dataset. Experimental results demonstrate that the proposed texture retrieval scheme with perceptual similarity improves the retrieval performance over traditional image features.
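
    A minimal sketch of this kind of learned mapping is shown below; the feature vectors, the training ratings, and the use of absolute feature differences as pair descriptors are placeholders for illustration, not the authors' actual features or dataset.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(1)
        feat = rng.random((100, 32))                        # hypothetical image feature vectors
        pairs = rng.integers(0, 100, size=(500, 2))         # training pairs of texture indices
        X = np.abs(feat[pairs[:, 0]] - feat[pairs[:, 1]])   # pairwise feature-difference descriptor
        y = rng.random(500)                                 # stand-in for human similarity ratings

        model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

        query = feat[0]
        scores = model.predict(np.abs(feat - query))        # predicted similarity to each texture
        print(np.argsort(scores)[::-1][:5])                 # indices of the top-5 retrievals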

  7. PHOG analysis of self-similarity in aesthetic images

    Science.gov (United States)

    Amirshahi, Seyed Ali; Koch, Michael; Denzler, Joachim; Redies, Christoph

    2012-03-01

    In recent years, there have been efforts to define the statistical properties of aesthetic photographs and artworks using computer vision techniques. However, it is still an open question how to distinguish aesthetic from non-aesthetic images with a high recognition rate. This is possibly because aesthetic perception is also influenced by a large number of cultural variables. Nevertheless, the search for statistical properties of aesthetic images has not been futile. For example, we have shown that the radially averaged power spectrum of monochrome artworks of Western and Eastern provenance falls off according to a power law with increasing spatial frequency (1/f^2 characteristics). This finding implies that this particular subset of artworks possesses a Fourier power spectrum that is self-similar across different scales of spatial resolution. Other types of aesthetic images, such as cartoons, comics and mangas, also display this type of self-similarity, as do photographs of complex natural scenes. Since the human visual system is adapted to encode images of natural scenes in a particularly efficient way, we have argued that artists imitate these statistics in their artworks. In support of this notion, we presented results showing that artists portray human faces with the self-similar Fourier statistics of complex natural scenes, although real-world photographs of faces are not self-similar. In view of these previous findings, we investigated other statistical measures of self-similarity to characterize aesthetic and non-aesthetic images. In the present work, we propose a novel measure of self-similarity that is based on the Pyramid Histogram of Oriented Gradients (PHOG). For every image, we first calculate PHOG up to pyramid level 3. The similarity between the histogram of each section at a particular level and that of its parent section at the previous level (or the histogram at the ground level) is then calculated. The proposed approach is tested on datasets of aesthetic and

  8. Similarity measurement method of high-dimensional data based on normalized net lattice subspace

    Institute of Scientific and Technical Information of China (English)

    Li Wenfa; Wang Gongming; Li Ke; Huang Su

    2017-01-01

    The performance of conventional similarity measurement methods is seriously affected by the curse of dimensionality of high-dimensional data. The reason is that the data difference between sparse and noisy dimensionalities occupies a large proportion of the similarity, leading to dissimilarity in the results. A similarity measurement method for high-dimensional data based on a normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding intervals. Only components in the same or adjacent interval are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0, 1], which is fit for similarity analysis after dimensionality reduction.
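
    The interval-mapping idea can be sketched as follows; the equal-width binning, the per-dimension closeness score, and the averaging are illustrative assumptions rather than the exact formulation in the record.

        import numpy as np

        def lattice_similarity(x, y, n_intervals=10):
            lo = min(x.min(), y.min())
            hi = max(x.max(), y.max())
            edges = np.linspace(lo, hi, n_intervals + 1)
            ix = np.clip(np.digitize(x, edges) - 1, 0, n_intervals - 1)
            iy = np.clip(np.digitize(y, edges) - 1, 0, n_intervals - 1)
            close = np.abs(ix - iy) <= 1                    # same or adjacent interval only
            if not close.any():
                return 0.0
            width = (hi - lo) / n_intervals
            # Per-dimension closeness in [0, 1], averaged over the retained dimensions.
            return float(np.mean(1.0 - np.abs(x[close] - y[close]) / (2 * width)))

        rng = np.random.default_rng(2)
        a, b = rng.random(1000), rng.random(1000)
        print(lattice_similarity(a, b))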

  9. Similarity analysis between quantum images

    Science.gov (United States)

    Zhou, Ri-Gui; Liu, XingAo; Zhu, Changming; Wei, Lai; Zhang, Xiafen; Ian, Hou

    2018-06-01

    Similarity analysis between quantum images is essential in quantum image processing, since it provides a foundation for other fields such as quantum image matching and quantum pattern recognition. In this paper, a quantum scheme based on a novel quantum image representation and the quantum amplitude amplification algorithm is proposed. At the end of the paper, three examples and simulation experiments show that the measurement result must be 0 when two images are the same, and has a high probability of being 1 when two images are different.

  10. A Model-Based Approach to Constructing Music Similarity Functions

    Directory of Open Access Journals (Sweden)

    Lamere Paul

    2007-01-01

    Full Text Available Several authors have presented systems that estimate the audio similarity of two pieces of music through the calculation of a distance metric, such as the Euclidean distance, between spectral features calculated from the audio, related to the timbre or pitch of the signal. These features can be augmented with other, temporally or rhythmically based features such as zero-crossing rates, beat histograms, or fluctuation patterns to form a more well-rounded music similarity function. It is our contention that perceptual or cultural labels, such as the genre, style, or emotion of the music, are also very important features in the perception of music. These labels help to define complex regions of similarity within the available feature spaces. We demonstrate a machine-learning-based approach to the construction of a similarity metric, which uses this contextual information to project the calculated features into an intermediate space where a music similarity function that incorporates some of the cultural information may be calculated.

  11. A Model-Based Approach to Constructing Music Similarity Functions

    Science.gov (United States)

    West, Kris; Lamere, Paul

    2006-12-01

    Several authors have presented systems that estimate the audio similarity of two pieces of music through the calculation of a distance metric, such as the Euclidean distance, between spectral features calculated from the audio, related to the timbre or pitch of the signal. These features can be augmented with other, temporally or rhythmically based features such as zero-crossing rates, beat histograms, or fluctuation patterns to form a more well-rounded music similarity function. It is our contention that perceptual or cultural labels, such as the genre, style, or emotion of the music, are also very important features in the perception of music. These labels help to define complex regions of similarity within the available feature spaces. We demonstrate a machine-learning-based approach to the construction of a similarity metric, which uses this contextual information to project the calculated features into an intermediate space where a music similarity function that incorporates some of the cultural information may be calculated.

  12. Personalized recommendation with corrected similarity

    International Nuclear Information System (INIS)

    Zhu, Xuzhen; Tian, Hui; Cai, Shimin

    2014-01-01

    Personalized recommendation has attracted a surge of interdisciplinary research. In particular, similarity-based methods have achieved great success in real recommendation systems. However, the computed similarities are often overestimated or underestimated, in particular because of the defective strategy of unidirectional similarity estimation. In this paper, we address this drawback by leveraging the mutual correction of forward and backward similarity estimations, and propose a new personalized recommendation index, the corrected similarity based inference (CSI). Through extensive experiments on four benchmark datasets, the results show a clear improvement of CSI over mainstream baselines. A detailed analysis is presented to unveil and understand the origin of the difference between CSI and mainstream indices. (paper)
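
    The general idea of correcting a unidirectional similarity can be sketched as follows; the toy rating matrix, the degree normalization, and the geometric-mean correction are assumptions for illustration and not the paper's exact CSI formula.

        import numpy as np

        rng = np.random.default_rng(3)
        R = (rng.random((6, 12)) < 0.3).astype(float)      # toy user-item adjacency matrix

        overlap = R @ R.T                                  # items shared by each pair of users
        deg = np.maximum(R.sum(axis=1), 1.0)               # user degrees
        forward = overlap / deg[:, None]                   # similarity estimated from user i's side
        backward = overlap / deg[None, :]                  # similarity estimated from user j's side
        corrected = np.sqrt(forward * backward)            # mutual correction of the two estimates
        print(corrected.round(2))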

  13. Immunoinformatics and Similarity Analysis of House Dust Mite Tropomyosin

    Directory of Open Access Journals (Sweden)

    Mohammad Mehdi Ranjbar

    2015-10-01

    Full Text Available Background: Dermatophagoides farinae and Dermatophagoides pteronyssinus are house dust mites (HDM) that cause severe asthma and allergic symptoms. The tropomyosin protein plays an important role in the immune and allergic reactions to HDMs. Here, tropomyosin from Dermatophagoides spp. was comprehensively screened in silico for its allergenicity, antigenicity and similarity/conservation. Materials and Methods: The amino acid sequences of D. farinae, D. pteronyssinus and other mite tropomyosins were retrieved. We performed alignments and evaluated conserved/variable regions along the sequences, constructed their phylogenetic tree and estimated overall mean distances. This was followed by prediction of linear B-cell epitopes based on different approaches, together with in silico evaluation of IgE epitope allergenicity (by SVMc, IgE epitope, ARPs BLAST, MAST and a hybrid method). Finally, a comparative analysis of the results from the different approaches was made. Results: Alignment results revealed near-complete identity between D. farinae and D. pteronyssinus members, and there was also close similarity among Dermatophagoides spp. Most of the variation among the mites' tropomyosins was located at approximately amino acids 23 to 80, 108 to 120, 142 to 153 and 220 to 230. The topology of the tree showed close relationships among mites in tropomyosin protein sequence, although the sequences of D. farinae, D. pteronyssinus and Psoroptes ovis are more similar to each other and clustered together. Dermanyssus gallinae (AC: Q2WBI0) is less closely related to the other mites, being located in a separate branch. Hydrophilicity and flexibility plots revealed that many parts of this protein have the potential to be hydrophilic and flexible. Surface accessibility analysis represented 7 different epitopes. Beta-turns in this protein occur with high probability in the middle part and at its two terminals. Kolaskar and Tongaonkar method analysis represented 11 immunogenic epitopes between amino acids 7-16. From

  14. Comparative Analysis of Mass Spectral Similarity Measures on Peak Alignment for Comprehensive Two-Dimensional Gas Chromatography Mass Spectrometry

    Science.gov (United States)

    2013-01-01

    Peak alignment is a critical procedure in mass spectrometry-based biomarker discovery in metabolomics. One of the peak alignment approaches for comprehensive two-dimensional gas chromatography mass spectrometry (GC×GC-MS) data is peak matching-based alignment. A key to peak matching-based alignment is the calculation of mass spectral similarity scores. Various mass spectral similarity measures have been developed, mainly for compound identification, but the effect of these spectral similarity measures on the performance of peak matching-based alignment still remains unknown. Therefore, we selected five mass spectral similarity measures, cosine correlation, Pearson's correlation, Spearman's correlation, partial correlation, and part correlation, and examined their effects on peak alignment using two sets of experimental GC×GC-MS data. The results show that the spectral similarity measure does not affect the alignment accuracy significantly in the analysis of data from less complex samples, while the partial correlation performs much better than the other spectral similarity measures when analyzing experimental data acquired from complex biological samples. PMID:24151524
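
    Two of the simpler measures mentioned above can be computed directly from intensity vectors on a common m/z grid; the spectra below are made-up toy values used only to show the calculation.

        import numpy as np

        def cosine_similarity(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

        def pearson_similarity(a, b):
            return float(np.corrcoef(a, b)[0, 1])

        # Hypothetical intensity vectors over a common m/z grid.
        spec1 = np.array([0.0, 12.0, 300.0, 45.0, 0.0, 7.0])
        spec2 = np.array([1.0, 10.0, 280.0, 60.0, 2.0, 5.0])

        print(cosine_similarity(spec1, spec2), pearson_similarity(spec1, spec2))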

  15. Color-Based Image Retrieval from High-Similarity Image Databases

    DEFF Research Database (Denmark)

    Hansen, Michael Adsetts Edberg; Carstensen, Jens Michael

    2003-01-01

    Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce a method for HSID retrieval using a similarity measure based on a linear combination of Jeffreys-Matusita (JM) distances between distributions of color (and color derivatives) estimated from a set of automatically extracted image regions. The weight coefficients are estimated based on optimal retrieval performance. Experimental results on the difficult task of visually identifying clones of fungal colonies grown in a petri dish and categorization of pelts show a high retrieval accuracy of the method when combined with standardized sample preparation and image acquisition.

  16. A general framework for regularized, similarity-based image restoration.

    Science.gov (United States)

    Kheradmand, Amin; Milanfar, Peyman

    2014-12-01

    Any image can be represented as a function defined on a weighted graph, in which the underlying structure of the image is encoded in kernel similarity and associated Laplacian matrices. In this paper, we develop an iterative graph-based framework for image restoration based on a new definition of the normalized graph Laplacian. We propose a cost function, which consists of a new data fidelity term and a regularization term derived from the specific definition of the normalized graph Laplacian. The normalizing coefficients used in the definition of the Laplacian and the associated regularization term are obtained using fast symmetry-preserving matrix balancing. This results in some desired spectral properties for the normalized Laplacian, such as being symmetric, positive semidefinite, and returning the zero vector when applied to a constant image. Our algorithm comprises outer and inner iterations, where in each outer iteration the similarity weights are recomputed using the previous estimate and the updated objective function is minimized using inner conjugate gradient iterations. This procedure improves the performance of the algorithm for image deblurring, where we do not have access to a good initial estimate of the underlying image. In addition, the specific form of the cost function allows us to carry out a spectral analysis of the solutions of the corresponding linear equations. Moreover, the proposed approach is general in the sense that we have shown its effectiveness for different restoration problems, including deblurring, denoising, and sharpening. Experimental results verify the effectiveness of the proposed algorithm on both synthetic and real examples.
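
    A small sketch of the kind of graph construction involved is given below; it builds the standard symmetric normalized Laplacian from Gaussian kernel similarity weights on a toy image, whereas the paper uses a different normalization obtained by matrix balancing.

        import numpy as np

        rng = np.random.default_rng(4)
        img = rng.random((8, 8))
        x = img.reshape(-1, 1)                                  # pixel intensities
        coords = np.indices(img.shape).reshape(2, -1).T         # pixel coordinates

        # Gaussian kernel on intensity difference, restricted to nearby pixels.
        d_int = (x - x.T) ** 2
        d_pos = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d_int / 0.05) * (d_pos <= 2)

        deg = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
        L = np.eye(W.shape[0]) - D_inv_sqrt @ W @ D_inv_sqrt    # symmetric normalized Laplacian
        print(L.shape, bool(np.allclose(L, L.T)))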

  17. Sample similarity analysis of angles of repose based on experimental results for DEM calibration

    Science.gov (United States)

    Tan, Yuan; Günthner, Willibald A.; Kessler, Stephan; Zhang, Lu

    2017-06-01

    As a fundamental material property, the particle-particle friction coefficient is usually calculated based on the angle of repose, which can be obtained experimentally. In the present study, the bottomless cylinder test was carried out to investigate this friction coefficient for a biomass material, willow chips. Because of their irregular shape and varying particle size distribution, calculation of the angle becomes less applicable and decisive. In previous studies, only one section of these uneven slopes is chosen in most cases, although standard methods for defining a representative section are barely found. Hence, we present an efficient and reliable method based on 3D scanning, which is used to digitize the surface of the heaps and generate their point clouds. Two tangential lines of any selected section are then calculated through linear least-squares regression (LLSR), such that the left and right angles of repose of a pile can be derived. As the next step, a certain number of sections were stochastically selected, and the calculations were repeated correspondingly in order to obtain a sample of angles, which was plotted in Cartesian coordinates as a scatter diagram. Subsequently, different samples were acquired through various selections of sections. By applying similarity and difference analysis to these samples, the reliability of the proposed method was verified. These results provide a realistic criterion for reducing the deviation between experiment and simulation caused by the random selection of a single angle, and will be compared with simulation results in the future.

  18. Sample similarity analysis of angles of repose based on experimental results for DEM calibration

    Directory of Open Access Journals (Sweden)

    Tan Yuan

    2017-01-01

    Full Text Available As a fundamental material property, the particle-particle friction coefficient is usually calculated based on the angle of repose, which can be obtained experimentally. In the present study, the bottomless cylinder test was carried out to investigate this friction coefficient for a biomass material, willow chips. Because of their irregular shape and varying particle size distribution, calculation of the angle becomes less applicable and decisive. In previous studies, only one section of these uneven slopes is chosen in most cases, although standard methods for defining a representative section are barely found. Hence, we present an efficient and reliable method based on 3D scanning, which is used to digitize the surface of the heaps and generate their point clouds. Two tangential lines of any selected section are then calculated through linear least-squares regression (LLSR), such that the left and right angles of repose of a pile can be derived. As the next step, a certain number of sections were stochastically selected, and the calculations were repeated correspondingly in order to obtain a sample of angles, which was plotted in Cartesian coordinates as a scatter diagram. Subsequently, different samples were acquired through various selections of sections. By applying similarity and difference analysis to these samples, the reliability of the proposed method was verified. These results provide a realistic criterion for reducing the deviation between experiment and simulation caused by the random selection of a single angle, and will be compared with simulation results in the future.
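
    The LLSR step can be illustrated on a synthetic cross-section profile; the triangular pile shape, noise level, and flank selection below are made-up values used only to show how the left and right angles of repose follow from the fitted slopes.

        import numpy as np

        rng = np.random.default_rng(5)
        x = np.linspace(-1.0, 1.0, 201)
        z = 0.6 - 0.6 * np.abs(x) + rng.normal(0, 0.01, x.size)   # noisy triangular pile profile

        left, right = x < -0.1, x > 0.1                           # points on each flank
        slope_l = np.polyfit(x[left], z[left], 1)[0]
        slope_r = np.polyfit(x[right], z[right], 1)[0]

        angle_left = np.degrees(np.arctan(abs(slope_l)))
        angle_right = np.degrees(np.arctan(abs(slope_r)))
        print(round(angle_left, 1), round(angle_right, 1))        # both close to 31 degrees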

  19. Effective Results Analysis for the Similar Software Products’ Orthogonality

    OpenAIRE

    Ion Ivan; Daniel Milodin

    2009-01-01

    The concept of similar software is defined. Conditions for archiving the software components are established. The orthogonality evaluation is carried out, and the correlation between the orthogonality and the complexity of the homogeneous software components is analyzed. Groups of similar software products, belonging to the orthogonality intervals, are then built. The results of the analysis are presented in graphical form. Aspects of the functioning of the software product allocated for the orthogonality are detailed.

  20. A grammar-based semantic similarity algorithm for natural language sentences.

    Science.gov (United States)

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  1. A similarity based agglomerative clustering algorithm in networks

    Science.gov (United States)

    Liu, Zhiyuan; Wang, Xiujuan; Ma, Yinghong

    2018-04-01

    The detection of clusters is beneficial for understanding the organization and function of networks. Clusters, or communities, are usually groups of nodes densely interconnected but sparsely linked with other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The proposed method initially calculates the similarity between each pair of nodes and forms pre-partitions according to the principle that each node is in the same community as its most similar neighbor. After that, each partition is checked for whether it satisfies the community criterion. Pre-partitions that do not satisfy it are merged with the partitions to which they have the biggest attraction, until no further changes occur. To measure the attraction ability of a partition, we propose an attraction index based on the importance of the linked nodes in the network. Therefore, our proposed method can better exploit the nodes' properties and the network's structure. To test the performance of our algorithm, both synthetic and empirical networks of different scales are tested. Simulation results show that the proposed algorithm obtains superior clustering results compared with six other widely used community detection algorithms.
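
    The pre-partition step can be sketched with a simple node-similarity rule; the Jaccard similarity on neighbourhoods and the toy karate-club graph below are illustrative choices, not necessarily the similarity measure or data used in the paper.

        import networkx as nx

        G = nx.karate_club_graph()

        def jaccard(G, u, v):
            nu, nv = set(G[u]) | {u}, set(G[v]) | {v}
            return len(nu & nv) / len(nu | nv)

        # Each node joins the community of its most similar neighbour.
        links = [(u, max(G[u], key=lambda v: jaccard(G, u, v))) for u in G]

        # Connected components of these links form the pre-partitions.
        pre_partitions = list(nx.connected_components(nx.Graph(links)))
        print(len(pre_partitions), sorted(pre_partitions[0]))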

  2. A Feature-Based Structural Measure: An Image Similarity Measure for Face Recognition

    Directory of Open Access Journals (Sweden)

    Noor Abdalrazak Shnain

    2017-08-01

    Full Text Available Facial recognition is one of the most challenging and interesting problems within the field of computer vision and pattern recognition. During the last few years, it has gained special attention due to its importance in relation to current issues such as security, surveillance systems and forensics analysis. Despite this high level of attention to facial recognition, the success is still limited by certain conditions; there is no method which gives reliable results in all situations. In this paper, we propose an efficient similarity index that resolves the shortcomings of the existing measures of feature and structural similarity. This measure, called the Feature-Based Structural Measure (FSM), combines the best features of the well-known SSIM (structural similarity index measure) and FSIM (feature similarity index measure) approaches, striking a balance between performance for similar and dissimilar images of human faces. In addition to the statistical structural properties provided by SSIM, edge detection is incorporated in FSM as a distinctive structural feature. Its performance is tested for a wide range of PSNR (peak signal-to-noise ratio), using the ORL (Olivetti Research Laboratory, now AT&T Laboratory Cambridge) and FEI (Faculty of Industrial Engineering, São Bernardo do Campo, São Paulo, Brazil) databases. The proposed measure is tested under conditions of Gaussian noise; simulation results show that the proposed FSM outperforms the well-known SSIM and FSIM approaches in its efficiency of similarity detection and recognition of human faces.

  3. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    Science.gov (United States)

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  4. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    Directory of Open Access Journals (Sweden)

    Ming Che Lee

    2014-01-01

    Full Text Available This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  5. Similarity and uncertainty analysis of the ALLEGRO MOX core

    International Nuclear Information System (INIS)

    Vrban, B.; Hascik, J.; Necas, V.; Slugen, V.

    2015-01-01

    The similarity and uncertainty analysis of the ESNII+ ALLEGRO MOX core has identified specific problems and challenges in the field of neutronic calculations. The similarity assessment identified 9 partly comparable experiments, of which only one reached ck and E values over 0.9. However, the global integral index G remains low (0.75) and cannot be judged as sufficient. The total uncertainty of the calculated k eff induced by XS data is, according to our calculation, 1.04%. The main contributors to this uncertainty are the 239 Pu nubar and 238 U inelastic scattering. The additional margin from uncovered sensitivities was determined to be 0.28%. The identified low number of similar experiments prevents the use of advanced XS adjustment and bias estimation methods. More experimental data are needed, and the presented results may serve as a basic step in the development of the necessary critical assemblies. Although exact data are not presented in the paper, the faster 44-energy-group calculation gives almost the same results in the similarity analysis as the more complex 238-group calculation. Finally, it was demonstrated that the TSUNAMI-IP utility can play a significant role in future fast reactor development in Slovakia and in the Visegrad region. Clearly, further research and development and a strong effort are needed in order to arrive at a more complete methodology consisting of more plausible covariance data and related quantities. (authors)

  6. Human-based percussion and self-similarity detection in electroacoustic music

    Science.gov (United States)

    Mills, John Anderson, III

    Electroacoustic music is music that uses electronic technology for the compositional manipulation of sound, and is a unique genre of music for many reasons. Analyzing electroacoustic music requires special measures, some of which are integrated into the design of a preliminary percussion analysis tool set for electroacoustic music. This tool set is designed to incorporate the human processing of music and sound. Models of the human auditory periphery are used as a front end to the analysis algorithms. The audio properties of percussivity and self-similarity are chosen as the focus because these properties are computable and informative. A collection of human judgments about percussion was undertaken to acquire clearly specified sound-event dimensions that humans use as percussive cues. A total of 29 participants were asked to make judgments about the percussivity of 360 pairs of synthesized snare-drum sounds. The grouped results indicate that, of the dimensions tested, rise time is the strongest cue for percussivity. String resonance also has a strong effect, but because of the complex nature of string resonance, it is not a fundamental dimension of a sound event. Gross spectral filtering also has an effect on the judgment of percussivity, but the effect is weaker than for rise time and string resonance. Gross spectral filtering also has less effect when the stronger cue of rise time is modified simultaneously. A percussivity-profile algorithm (PPA) is designed to identify those instants in pieces of music that humans would also identify as percussive. The PPA is implemented using a time-domain, channel-based approach and psychoacoustic models. The input parameters are tuned to maximize performance at matching participants' choices in the percussion-judgment collection. After the PPA is tuned, it is used to analyze pieces of electroacoustic music. Real electroacoustic music introduces new challenges for the PPA, though those same challenges might affect

  7. Extending the similarity-based XML multicast approach with digital signatures

    DEFF Research Database (Denmark)

    Azzini, Antonia; Marrara, Stefania; Jensen, Meiko

    2009-01-01

    This paper investigates the interplay between similarity-based SOAP message aggregation and digital signature application. An overview of the approaches resulting from the different orderings of the tasks of signature application, verification, similarity aggregation and splitting is provided. Depending on the intersection between similarity-aggregated and signed SOAP message parts, the paper discusses three different cases of signature application, and sketches their applicability and performance implications.

  8. Effective Results Analysis for the Similar Software Products’ Orthogonality

    Directory of Open Access Journals (Sweden)

    Ion Ivan

    2009-10-01

    Full Text Available The concept of similar software is defined. Conditions for archiving the software components are established. The orthogonality evaluation is carried out, and the correlation between the orthogonality and the complexity of the homogeneous software components is analyzed. Groups of similar software products, belonging to the orthogonality intervals, are then built. The results of the analysis are presented in graphical form. Aspects of the functioning of the software product allocated for the orthogonality are detailed.

  9. Computational prediction of drug-drug interactions based on drugs functional similarities.

    Science.gov (United States)

    Ferdousi, Reza; Safdari, Reza; Omidi, Yadollah

    2017-06-01

    Therapeutic activities of drugs are often influenced by co-administration of drugs that may cause inevitable drug-drug interactions (DDIs) and inadvertent side effects. Prediction and identification of DDIs are extremely vital for patient safety and the success of treatment modalities. A number of computational methods have been employed for the prediction of DDIs based on drugs' structures and/or functions. Here, we report on a computational method for DDI prediction based on the functional similarity of drugs. The model was built on key biological elements, including carriers, transporters, enzymes and targets (CTET). The model was applied to 2189 approved drugs. For each drug, all the associated CTETs were collected, and the corresponding binary vectors were constructed to determine the DDIs. Various similarity measures were evaluated for detecting DDIs. Of the examined similarity methods, the inner product-based similarity measures (IPSMs) were found to provide improved prediction values. Altogether, 2,394,766 potential drug pair interactions were studied. The model was able to predict over 250,000 unknown potential DDIs. Based on our findings, we propose the current method as a robust, yet simple and fast, universal in silico approach for the identification of DDIs. We envision that this proposed method can be used as a practical technique for the detection of possible DDIs based on the functional similarities of drugs. Copyright © 2017. Published by Elsevier Inc.
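
    The binary CTET representation and an inner-product-style similarity can be sketched as follows; the element labels, drug annotations, and cosine scoring are hypothetical illustrations, not the study's actual data or its exact similarity measure.

        import numpy as np

        elements = ["CYP3A4", "CYP2D6", "P-gp", "OATP1B1", "ALB", "HTR2A"]  # assumed CTET labels
        drugs = {
            "drugA": {"CYP3A4", "P-gp", "HTR2A"},
            "drugB": {"CYP3A4", "CYP2D6", "P-gp"},
            "drugC": {"OATP1B1", "ALB"},
        }

        def vec(annotations):
            return np.array([1.0 if e in annotations else 0.0 for e in elements])

        def cosine(u, v):
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

        names = list(drugs)
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                s = cosine(vec(drugs[names[i]]), vec(drugs[names[j]]))
                print(names[i], names[j], round(s, 2))   # higher score -> more likely interaction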

  10. Dimensional analysis, similarity, analogy, and the simulation theory

    International Nuclear Information System (INIS)

    Davis, A.A.

    1978-01-01

    Dimensional analysis, similarity, analogy, and cybernetics are shown to be four consecutive steps in the application of simulation theory. This paper introduces the classes of phenomena which follow the same formal mathematical equations as models of natural laws, and the interior sphere of restraints that groups phenomena for which one can introduce simplified nondimensional mathematical equations. Simulation by similarity within a specific field of physics, by analogy across two or more different fields of physics, and by cybernetics in nature across two or more fields of mathematics, physics, biology, economics, politics, sociology, etc., appears as a unified theory which permits one to transfer the results of experiments from models, conveniently selected to meet the conditions of research, construction, and measurement in the laboratory, to the originals which are the primary objectives of the research. Some interesting conclusions which cannot be avoided in the use of simplified nondimensional mathematical equations as models of natural laws are presented. Interesting limitations on the use of simulation theory based on assumed simplifications are recognized. This paper shows that it is necessary, in scientific research, to write mathematical models of general laws which can be applied to nature in its entirety. The paper proposes the extension of the second law of thermodynamics as a generalized law of entropy to model life and its activities. This paper shows that the physical studies and philosophical interpretations of phenomena and natural laws cannot be separated in scientific work; they are interconnected, and one cannot be put above the others.

  11. Self-similar cosmological solutions with dark energy. I. Formulation and asymptotic analysis

    International Nuclear Information System (INIS)

    Harada, Tomohiro; Maeda, Hideki; Carr, B. J.

    2008-01-01

    Based on the asymptotic analysis of ordinary differential equations, we classify all spherically symmetric self-similar solutions to the Einstein equations which are asymptotically Friedmann at large distances and contain a perfect fluid with equation of state p=(γ-1)μ with 0 1). However, in the latter case there is an additional parameter associated with the weak discontinuity at the sonic point and the solutions are only asymptotically 'quasi-Friedmann', in the sense that they exhibit an angle deficit at large distances. In the 0<γ<2/3 case, there is no sonic point and there exists a one-parameter family of solutions which are genuinely asymptotically Friedmann at large distances. We find eight classes of asymptotic behavior: Friedmann or quasi-Friedmann or quasistatic or constant-velocity at large distances, quasi-Friedmann or positive-mass singular or negative-mass singular at small distances, and quasi-Kantowski-Sachs at intermediate distances. The self-similar asymptotically quasistatic and quasi-Kantowski-Sachs solutions are analytically extendible and of great cosmological interest. We also investigate their conformal diagrams. The results of the present analysis are utilized in an accompanying paper to obtain and physically interpret numerical solutions

  12. Random walk-based similarity measure method for patterns in complex object

    Directory of Open Access Journals (Sweden)

    Liu Shihu

    2017-04-01

    Full Text Available This paper discusses the similarity of patterns in complex objects. A complex object is composed of both the attribute information of patterns and the relational information between patterns. Bearing in mind the specificity of complex objects, a random walk-based similarity measurement method for patterns is constructed. In this method, the reachability of any two patterns with respect to the relational information is fully studied, so that the similarity of patterns with respect to the relational information can be calculated. On this basis, an integrated similarity measurement method is proposed, and Algorithms 1 and 2 show the calculation procedure. One can see that this method makes full use of both the attribute information and the relational information. Finally, a synthetic example shows that the proposed similarity measurement method is valid.
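
    A compact sketch of this idea is given below; the toy adjacency and attribute matrices, the 1-to-3-step walk horizon, and the mixing weight alpha are assumptions for illustration rather than the paper's Algorithms 1 and 2.

        import numpy as np

        A = np.array([[0, 1, 1, 0],        # adjacency of the relational information
                      [1, 0, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
        X = np.array([[1.0, 0.2],          # attribute information of the four patterns
                      [0.9, 0.3],
                      [0.2, 0.8],
                      [0.1, 0.9]])

        P = A / A.sum(axis=1, keepdims=True)            # one-step transition matrix
        walk = sum(np.linalg.matrix_power(P, t) for t in range(1, 4)) / 3   # 1..3-step reachability
        rel_sim = (walk + walk.T) / 2                   # symmetrized relational similarity

        diff = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        attr_sim = 1.0 / (1.0 + diff)                   # attribute similarity

        alpha = 0.5                                     # assumed mixing weight
        integrated = alpha * attr_sim + (1 - alpha) * rel_sim
        print(integrated.round(2))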

  13. Efficient Algorithm for Computing Link-based Similarity in Real World Networks

    DEFF Research Database (Denmark)

    Cai, Yuanzhe; Cong, Gao; Xu, Jia

    2009-01-01

    Similarity calculation has many applications, such as information retrieval and collaborative filtering, among many others. It has been shown that link-based similarity measures, such as SimRank, are very effective in characterizing object similarities in networks, such as the Web, by exploiti

  14. Similarity search processing. Parallelization and indexing technologies.

    Directory of Open Access Journals (Sweden)

    Eder Dos Santos

    2015-08-01

    This scientific-technical report addresses similarity search and the implementation of metric structures in parallel environments. It also presents the state of the art related to similarity search on metric structures and parallelism technologies. Comparative analyses are also proposed, seeking to identify the behavior of a set of metric spaces and metric structures on multicore-based and GPU-based processing platforms.

  15. Calculating the knowledge-based similarity of functional groups using crystallographic data

    Science.gov (United States)

    Watson, Paul; Willett, Peter; Gillet, Valerie J.; Verdonk, Marcel L.

    2001-09-01

    A knowledge-based method for calculating the similarity of functional groups is described and validated. The method is based on experimental information derived from small molecule crystal structures. These data are used in the form of scatterplots that show the likelihood of a non-bonded interaction being formed between functional group A (the 'central group') and functional group B (the 'contact group' or 'probe'). The scatterplots are converted into three-dimensional maps that show the propensity of the probe at different positions around the central group. Here we describe how to calculate the similarity of a pair of central groups based on these maps. The similarity method is validated using bioisosteric functional group pairs identified in the Bioster database and Relibase. The Bioster database is a critical compilation of thousands of bioisosteric molecule pairs, including drugs, enzyme inhibitors and agrochemicals. Relibase is an object-oriented database containing structural data about protein-ligand interactions. The distributions of the similarities of the bioisosteric functional group pairs are compared with the similarities for all the possible pairs in IsoStar, and are found to be significantly different. Enrichment factors are also calculated, showing that the similarity method is statistically significantly better than random in predicting bioisosteric functional group pairs.

  16. Image magnification based on similarity analogy

    International Nuclear Information System (INIS)

    Chen Zuoping; Ye Zhenglin; Wang Shuxun; Peng Guohua

    2009-01-01

    Aiming at the high time complexity of the decoding phase in traditional image enlargement methods based on fractal coding, a novel image magnification algorithm is proposed in this paper which has the advantage of iteration-free decoding, by using the similarity analogy between an image and its zoom-out and zoom-in. A new pixel selection technique is also presented to further improve the performance of the proposed method. Furthermore, by combining some existing fractal zooming techniques, an efficient image magnification algorithm is obtained, which provides image quality as good as the state of the art while greatly decreasing the time complexity of the decoding phase.

  17. Improved personalized recommendation based on a similarity network

    Science.gov (United States)

    Wang, Ximeng; Liu, Yun; Xiong, Fei

    2016-08-01

    A recommender system helps individual users find preferred items rapidly and has attracted extensive attention in recent years. Many successful recommendation algorithms are designed on bipartite networks, such as network-based inference or heat conduction. However, most of these algorithms define the resource-allocation method as an average allocation. That is not reasonable, because an average allocation cannot reflect user choice preferences or the influence between users, which leads to non-personalized recommendation results. We propose a personalized recommendation approach that combines a similarity function and the bipartite network to generate a similarity network that improves the resource-allocation process. Our model introduces user influence into the recommender system and shows that user influence can make the resource-allocation process more reasonable. We use four different metrics to evaluate our algorithms on three benchmark data sets. Experimental results show that the improved recommendation on a similarity network can obtain better accuracy and diversity than some competing approaches.

  18. Patient Similarity in Prediction Models Based on Health Data: A Scoping Review

    Science.gov (United States)

    Sharafoddini, Anis; Dubin, Joel A

    2017-01-01

    Background Physicians and health policy makers are required to make predictions during their decision making in various medical problems. Many advances have been made in predictive modeling toward outcome prediction, but these innovations target an average patient and are insufficiently adjustable for individual patients. One developing idea in this field is individualized predictive analytics based on patient similarity. The goal of this approach is to identify patients who are similar to an index patient and derive insights from the records of similar patients to provide personalized predictions. Objective The aim is to summarize and review published studies describing computer-based approaches for predicting patients’ future health status based on health data and patient similarity, identify gaps, and provide a starting point for related future research. Methods The method involved (1) conducting the review by performing automated searches in Scopus, PubMed, and ISI Web of Science, selecting relevant studies by first screening titles and abstracts and then analyzing full texts, and (2) documenting by extracting publication details and information on context, predictors, missing data, modeling algorithm, outcome, and evaluation methods into a matrix table, synthesizing data, and reporting results. Results After duplicate removal, 1339 articles were screened in abstracts and titles and 67 were selected for full-text review. In total, 22 articles met the inclusion criteria. Within included articles, hospitals were the main source of data (n=10). Cardiovascular disease (n=7) and diabetes (n=4) were the dominant patient diseases. Most studies (n=18) used neighborhood-based approaches in devising prediction models. Two studies showed that patient similarity-based modeling outperformed population-based predictive methods. Conclusions Interest in patient similarity-based predictive modeling for diagnosis and prognosis has been growing. In addition to raw/coded health

  19. Anomalous Traffic Detection and Self-Similarity Analysis in the Environment of ATMSim

    Directory of Open Access Journals (Sweden)

    Hae-Duck J. Jeong

    2017-12-01

    Full Text Available Internet utilisation has steadily increased, predominantly due to the rapid recent development of information and communication networks and the widespread distribution of smartphones. As a result of this increase in Internet consumption, various types of services, including web services, social networking services (SNS), Internet banking, and remote processing systems have been created. These services have significantly enhanced global quality of life. However, as a negative side-effect of this rapid development, serious information security problems have also surfaced, which has led to serious Internet privacy invasions and network attacks. In an attempt to contribute to addressing these problems, this paper proposes a process to detect anomalous traffic using self-similarity analysis in the Anomaly Teletraffic detection Measurement analysis Simulator (ATMSim) environment as a research method. Simulations were performed to measure normal and anomalous traffic. First, normal traffic for each attack, including Address Resolution Protocol (ARP) and distributed denial-of-service (DDoS) attacks, was measured for 48 h over 10 iterations. Hadoop was used to facilitate processing of the large amount of collected data, after which MapReduce was utilised after storing the data in the Hadoop Distributed File System (HDFS). A new platform on Hadoop, the detection system ATMSim, was used to identify anomalous traffic, after which a comparative analysis of the normal and anomalous traffic was performed through a self-similarity analysis. There were four categories of collected traffic, divided according to the attack methods used: normal local area network (LAN) traffic, DDoS attack, and ARP spoofing, as well as a combined DDoS and ARP attack. ATMSim, the anomaly traffic detection system, was used to determine whether real attacks could be identified effectively. To achieve this, ATMSim was used in simulations for each scenario to test its ability to

  20. A Similarity Analysis of Audio Signal to Develop a Human Activity Recognition Using Similarity Networks

    Directory of Open Access Journals (Sweden)

    Alejandra García-Hernández

    2017-11-01

    Full Text Available Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the ability to recognize human activities. This paper proposes a novel approach for HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound patterns. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a better reference for classifying the human activities than the location.

  1. Anomaly Detection in Nanofibrous Materials by CNN-Based Self-Similarity

    Directory of Open Access Journals (Sweden)

    Paolo Napoletano

    2018-01-01

    Full Text Available Automatic detection and localization of anomalies in nanofibrous materials help to reduce the cost of the production process and the time of the post-production visual inspection process. Amongst all the monitoring methods, those exploiting Scanning Electron Microscope (SEM) imaging are the most effective. In this paper, we propose a region-based method for the detection and localization of anomalies in SEM images, based on Convolutional Neural Networks (CNNs) and self-similarity. The method evaluates the degree of abnormality of each subregion of an image under consideration by computing a CNN-based visual similarity with respect to a dictionary of anomaly-free subregions belonging to a training set. The proposed method outperforms the state of the art.
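
    A rough sketch of this kind of dictionary-based similarity scoring is shown below; the ResNet-18 backbone, the file names, and the decision threshold are assumptions made for illustration, not the network or parameters used in the paper.

        import torch
        import torchvision.models as models
        import torchvision.transforms as T
        from PIL import Image

        device = "cuda" if torch.cuda.is_available() else "cpu"
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = torch.nn.Identity()          # use pooled features as patch descriptors
        backbone.eval().to(device)

        prep = T.Compose([T.Resize((224, 224)), T.ToTensor()])

        def describe(path):
            x = prep(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
            with torch.no_grad():
                return torch.nn.functional.normalize(backbone(x), dim=1)

        # Hypothetical file paths for anomaly-free training patches and a test patch.
        dictionary = torch.cat([describe(p) for p in ["normal_01.png", "normal_02.png"]])
        test = describe("test_patch.png")

        similarity = (test @ dictionary.T).max().item()   # cosine similarity to nearest normal patch
        print("anomalous" if similarity < 0.8 else "normal", round(similarity, 3))  # assumed threshold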

  2. Similarity-Based Interference and the Acquisition of Adjunct Control

    Directory of Open Access Journals (Sweden)

    Juliana Gerard

    2017-10-01

    Full Text Available Previous research on the acquisition of adjunct control has observed non-adultlike behavior for sentences like “John bumped Mary after tripping on the sidewalk.” While adults only allow a subject control interpretation for these sentences (that John tripped on the sidewalk), preschool-aged children have been reported to allow a much wider range of interpretations. A number of different tasks have been used with the aim of identifying a grammatical source of children’s errors. In this paper, we consider the role of extragrammatical factors. In two comprehension experiments, we demonstrate that error rates go up when the similarity increases between an antecedent and a linearly intervening noun phrase, first with similarity in gender, and next with similarity in number marking. This suggests that difficulties with adjunct control are to be explained (at least in part) by the sentence processing mechanisms that underlie similarity-based interference in adults.

  3. A Multi-Model Stereo Similarity Function Based on Monogenic Signal Analysis in Poisson Scale Space

    Directory of Open Access Journals (Sweden)

    Jinjun Li

    2011-01-01

    Full Text Available A stereo similarity function based on local multi-model monogenic image feature descriptors (LMFD) is proposed to match interest points and estimate the disparity map for stereo images. Local multi-model monogenic image features include the local orientation and instantaneous phase of the gray monogenic signal, the local color phase of the color monogenic signal, and local mean colors in the multiscale color monogenic signal framework. The gray monogenic signal, which is the extension of the analytic signal to gray level images using the Dirac operator and Laplace equation, consists of the local amplitude, local orientation, and instantaneous phase of a 2D image signal. The color monogenic signal is the extension of the monogenic signal to color images based on Clifford algebras. The local color phase can be estimated by computing the geometric product between the color monogenic signal and a unit reference vector in RGB color space. Experimental results on synthetic and natural stereo images show the performance of the proposed approach.

  4. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

    Directory of Open Access Journals (Sweden)

    Yunyun Liang

    2015-01-01

    Full Text Available Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant for prediction of protein structural class, and it mainly uses the protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.

  5. Measuring time series regularity using nonlinear similarity-based sample entropy

    International Nuclear Information System (INIS)

    Xie Hongbo; He Weixing; Liu Hui

    2008-01-01

    Sample Entropy (SampEn), a measure quantifying regularity and complexity, is believed to be an effective method for analyzing diverse settings that include both deterministic chaotic and stochastic processes, and it is particularly useful in the analysis of physiological signals that involve relatively small amounts of data. However, the similarity definition of vectors is based on the Heaviside function, whose boundary is discontinuous and hard, which may cause problems in the validity and accuracy of SampEn. The Sigmoid function is a smoothed and continuous version of the Heaviside function. To overcome the problems SampEn encounters, a modified SampEn (mSampEn) based on the nonlinear Sigmoid function was proposed. The performance of mSampEn was tested on independent identically distributed (i.i.d.) uniform random numbers, the MIX stochastic model, the Rössler map, and the Hénon map. The results showed that mSampEn was superior to SampEn in several aspects, including giving an entropy definition in the case of small parameters, better relative consistency, robustness to noise, and greater independence of record length when characterizing time series generated from either deterministic or stochastic systems with different regularities.
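
    The key change described above is replacing the hard Heaviside match criterion of SampEn with a smooth sigmoid weight. The Python sketch below illustrates that idea; the default embedding dimension, tolerance, and sigmoid slope are illustrative assumptions, not values taken from the paper.

    import numpy as np

    def msampen(x, m=2, r=0.2, slope=0.05):
        """Sigmoid-weighted Sample Entropy (mSampEn) sketch."""
        x = np.asarray(x, dtype=float)
        r = r * np.std(x)            # tolerance relative to the signal's spread
        slope = slope * np.std(x)    # softness of the sigmoid boundary

        def weighted_matches(dim):
            # Sum sigmoid-weighted template matches for templates of length `dim`.
            n = len(x) - m           # same template count for dimensions m and m + 1
            templates = np.array([x[i:i + dim] for i in range(n)])
            total = 0.0
            for i in range(n):
                d = np.max(np.abs(templates - templates[i]), axis=1)   # Chebyshev distance
                w = 1.0 / (1.0 + np.exp((d - r) / slope))              # smooth similarity
                total += np.sum(w) - w[i]                              # drop the self-match
            return total

        return -np.log(weighted_matches(m + 1) / weighted_matches(m))

    # A regular sine wave should score lower (more regular) than white noise.
    t = np.linspace(0, 20 * np.pi, 500)
    print(msampen(np.sin(t)))
    print(msampen(np.random.default_rng(0).standard_normal(500)))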

  6. A Study of Wavelet Analysis and Data Extraction from Second-Order Self-Similar Time Series

    Directory of Open Access Journals (Sweden)

    Leopoldo Estrada Vargas

    2013-01-01

    Full Text Available Statistical analysis and synthesis of self-similar discrete time signals are presented. The analysis equation is formally defined through a special family of basis functions of which the simplest case matches the Haar wavelet. The original discrete time series is synthesized without loss by a linear combination of the basis functions after some scaling, displacement, and phase shift. The decomposition is then used to synthesize a new second-order self-similar signal with a different Hurst index than the original. The components are also used to describe the behavior of the estimated mean and variance of self-similar discrete time series. It is shown that the sample mean, although it is unbiased, provides less information about the process mean as its Hurst index is higher. It is also demonstrated that the classical variance estimator is biased and that the widely accepted aggregated variance-based estimator of the Hurst index is biased not due to its nature (it is unbiased and has minimal variance) but due to flaws in its implementation. Using the proposed decomposition, the correct estimation of the Variance Plot is described, as well as its close association with the popular Logscale Diagram.
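
    The Variance Plot mentioned above relates the variance of block means to the block size on a log-log scale, with slope 2H - 2. The Python sketch below shows the classical aggregated-variance estimate of the Hurst index; the block sizes are illustrative, and the sketch does not include the corrected estimation proposed in the paper.

    import numpy as np

    def hurst_aggregated_variance(x, block_sizes=(4, 8, 16, 32, 64, 128)):
        """Classical aggregated-variance (Variance Plot) estimate of the Hurst index."""
        x = np.asarray(x, dtype=float)
        log_m, log_var = [], []
        for m in block_sizes:
            n_blocks = len(x) // m
            if n_blocks < 2:
                continue
            block_means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
            log_m.append(np.log(m))
            log_var.append(np.log(np.var(block_means)))
        slope, _ = np.polyfit(log_m, log_var, 1)   # slope of the Variance Plot
        return 1.0 + slope / 2.0                   # H = 1 + slope / 2

    # For i.i.d. noise the estimate should be close to H = 0.5.
    print(hurst_aggregated_variance(np.random.default_rng(1).standard_normal(10000)))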

  7. Information filtering based on transferring similarity.

    Science.gov (United States)

    Sun, Duo; Zhou, Tao; Liu, Jian-Guo; Liu, Run-Ran; Jia, Chun-Xiao; Wang, Bing-Hong

    2009-07-01

    In this Brief Report, we propose an index of user similarity, namely, the transferring similarity, which involves all high-order similarities between users. Accordingly, we design a modified collaborative filtering algorithm, which provides remarkably more accurate predictions than standard collaborative filtering. More interestingly, we find that the algorithmic performance approaches its optimal value when the parameter contained in the definition of transferring similarity gets close to its critical value, before which the series expansion of transferring similarity is convergent and after which it is divergent. Our study is complementary to the one reported in [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)], and is relevant to the missing link prediction problem.
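
    One plausible reading of a similarity that "involves all high-order similarities" is a damped sum of powers of the base user-similarity matrix, which converges only while the damping parameter stays below the reciprocal of the matrix's largest eigenvalue (the critical value mentioned above). The Python sketch below follows that reading with a truncated series; the parameter value, truncation depth, and toy matrix are assumptions, not the authors' exact definition.

    import numpy as np

    def transferring_similarity(s, lam=0.1, depth=10):
        """Truncated damped series S + lam*S^2 + lam^2*S^3 + ... over a base similarity matrix S."""
        s = np.asarray(s, dtype=float)
        term = s.copy()
        total = s.copy()
        for _ in range(depth - 1):
            term = lam * (term @ s)     # next higher-order similarity term
            total += term
        return total

    # Hypothetical 3-user base similarity matrix.
    s0 = np.array([[1.0, 0.3, 0.0],
                   [0.3, 1.0, 0.5],
                   [0.0, 0.5, 1.0]])
    print(transferring_similarity(s0))  # users 0 and 2 gain indirect similarity via user 1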

  8. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
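
    As a toy illustration of the workflow described above (loading similarity-search hits into a relational database and querying them together with sequence annotations), the Python/SQLite sketch below uses a hypothetical two-table schema; it does not reproduce the seqdb_demo or search_demo schemas from the unit.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE protein (acc TEXT PRIMARY KEY, organism TEXT, description TEXT);
    CREATE TABLE hit (query_acc TEXT, subject_acc TEXT, evalue REAL, identity REAL);
    """)
    # Toy records for illustration only.
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
        ("P0A7G6", "Escherichia coli", "RecA protein"),
        ("Q06609", "Homo sapiens", "RAD51 recombinase"),
    ])
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?, ?)", [
        ("P0A7G6", "Q06609", 1e-40, 30.0),
    ])
    # Summarize, per organism, how many significant homologs each query has.
    for row in conn.execute("""
        SELECT h.query_acc, p.organism, COUNT(*) AS n_hits
        FROM hit h JOIN protein p ON p.acc = h.subject_acc
        WHERE h.evalue < 1e-3
        GROUP BY h.query_acc, p.organism
    """):
        print(row)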

  9. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

    Science.gov (United States)

    Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

    2018-05-15

    Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared to other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package and are freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). yueljiang@163.com. Supplementary data are available at Bioinformatics online.
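
    To make the underlying idea concrete, the Python sketch below computes a Kendall rank correlation between the k-mer count profiles of two DNA sequences; the k-mer length and the use of SciPy's kendalltau are illustrative assumptions and do not reproduce the exact K2 statistic.

    from itertools import product
    from scipy.stats import kendalltau

    def kmer_counts(seq, k=3):
        """Count every possible DNA k-mer, in a fixed order shared by both sequences."""
        kmers = ["".join(p) for p in product("ACGT", repeat=k)]
        counts = dict.fromkeys(kmers, 0)
        for i in range(len(seq) - k + 1):
            sub = seq[i:i + k]
            if sub in counts:
                counts[sub] += 1
        return [counts[m] for m in kmers]

    def kendall_similarity(seq_a, seq_b, k=3):
        tau, _ = kendalltau(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
        return tau   # in [-1, 1]; higher means more similar k-mer rank profiles

    print(kendall_similarity("ACGTACGTACGGTTAACG", "ACGTACGTTCGGTTAACC"))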

  10. Kovasznay modes in the linear stability analysis of self-similar ablation flows

    International Nuclear Information System (INIS)

    Lombard, V.

    2008-12-01

    Exact self-similar solutions of gas dynamics equations with nonlinear heat conduction for semi-infinite slabs of perfect gases are used for studying the stability of ablative flows in inertial confinement fusion, when a shock wave propagates in front of a thermal front. Both the similarity solutions and their linear perturbations are numerically computed with a dynamical multi-domain Chebyshev pseudo-spectral method. Laser-imprint results, showing that maximum amplification occurs for a laser-intensity modulation of zero transverse wavenumber, have thus been obtained (Abeguile et al. (2006); Clarisse et al. (2008)). Here we pursue this approach by proceeding for the first time to an analysis of perturbations in terms of Kovasznay modes. Based on the analysis of two compressible and incompressible flows, evolution equations of vorticity, acoustic and entropy modes are proposed for each flow region and mode couplings are assessed. For short times, perturbations are transferred from the external surface to the ablation front by diffusion and propagate as acoustic waves up to the shock wave. For long times, the shock region is governed by the free propagation of acoustic waves. A study of perturbations and associated sources allows us to identify strong mode couplings in the conduction and ablation regions. Moreover, the maximum instability depends on compressibility. Finally, a comparison with experiments of flows subjected to initial surface defects is initiated. (author)

  11. Comparative analysis of chemical similarity methods for modular natural products with a hypothetical structure enumeration algorithm.

    Science.gov (United States)

    Skinnider, Michael A; Dejong, Chris A; Franczak, Brian C; McNicholas, Paul D; Magarvey, Nathan A

    2017-08-16

    Natural products represent a prominent source of pharmaceutically and industrially important agents. Calculating the chemical similarity of two molecules is a central task in cheminformatics, with applications at multiple stages of the drug discovery pipeline. Quantifying the similarity of natural products is a particularly important problem, as the biological activities of these molecules have been extensively optimized by natural selection. The large and structurally complex scaffolds of natural products distinguish their physical and chemical properties from those of synthetic compounds. However, no analysis of the performance of existing methods for molecular similarity calculation specific to natural products has been reported to date. Here, we present LEMONS, an algorithm for the enumeration of hypothetical modular natural product structures. We leverage this algorithm to conduct a comparative analysis of molecular similarity methods within the unique chemical space occupied by modular natural products using controlled synthetic data, and comprehensively investigate the impact of diverse biosynthetic parameters on similarity search. We additionally investigate a recently described algorithm for natural product retrobiosynthesis and alignment, and find that when rule-based retrobiosynthesis can be applied, this approach outperforms conventional two-dimensional fingerprints, suggesting it may represent a valuable approach for the targeted exploration of natural product chemical space and microbial genome mining. Our open-source algorithm is an extensible method of enumerating hypothetical natural product structures with diverse potential applications in bioinformatics.

  12. Toward translational incremental similarity-based reasoning in breast cancer grading

    Science.gov (United States)

    Tutac, Adina E.; Racoceanu, Daniel; Leow, Wee-Keng; Müller, Henning; Putti, Thomas; Cretu, Vladimir

    2009-02-01

    One of the fundamental issues in bridging the gap between the proliferation of Content-Based Image Retrieval (CBIR) systems in the scientific literature and the deficiency of their usage in the medical community is the characteristic of CBIR of accessing information by images and/or text only. Yet, the way physicians reason about patients leads intuitively to a case representation. Hence, a proper solution to overcome this gap is to consider a CBIR approach inspired by Case-Based Reasoning (CBR), which naturally introduces medical knowledge structured by cases. Moreover, in a CBR system, the knowledge is incrementally added and learned. The purpose of this study is to initiate a translational solution from CBIR algorithms to clinical practice, using a CBIR/CBR hybrid approach. Therefore, we advance the idea of a translational incremental similarity-based reasoning (TISBR), using combined CBIR and CBR characteristics: incremental learning of medical knowledge, a medical case-based structure of the knowledge (CBR), image usage to retrieve similar cases (CBIR), and the similarity concept (central to both paradigms). For this purpose, three major axes are explored: the indexing, the case retrieval, and the search refinement, applied to Breast Cancer Grading (BCG), a powerful breast cancer prognosis exam. The effectiveness of this strategy is currently evaluated, for the indexing, over cases provided by the Pathology Department of Singapore National University Hospital. With its current accuracy, TISBR opens interesting perspectives for complex reasoning in future medical research, paving the way to better knowledge traceability and a better acceptance rate of computer-aided diagnosis assistance among practitioners.

  13. Similar speaker recognition using nonlinear analysis

    International Nuclear Information System (INIS)

    Seo, J.P.; Kim, M.S.; Baek, I.C.; Kwon, Y.H.; Lee, K.S.; Chang, S.W.; Yang, S.I.

    2004-01-01

    Speech features in conventional speaker identification systems are usually obtained by linear methods in spectral space. However, these methods have the drawback that speakers with similar voices cannot be distinguished, because the characteristics of their voices are also similar in spectral space. To overcome this difficulty of linear methods, we propose to use the correlation exponent in nonlinear space as a new feature vector for speaker identification among persons with similar voices. We show that our proposed method surprisingly reduces the error rate of the speaker identification system for speakers with similar voices.

  14. K-Line Patterns’ Predictive Power Analysis Using the Methods of Similarity Match and Clustering

    Directory of Open Access Journals (Sweden)

    Lv Tao

    2017-01-01

    Full Text Available Stock price prediction based on K-line patterns is the essence of candlestick technical analysis. However, there are some disputes in academia on whether K-line patterns have predictive power. To help resolve the debate, this paper uses the data mining methods of pattern recognition, pattern clustering, and pattern knowledge mining to research the predictive power of K-line patterns. A similarity match model and a nearest neighbor-clustering algorithm are proposed for solving the problems of similarity match and clustering of K-line series, respectively. The experiment includes testing the predictive power of the Three Inside Up pattern and the Three Inside Down pattern with a testing dataset of the K-line series data of Shanghai 180 index component stocks over the latest 10 years. Experimental results show that (1) the predictive power of a pattern varies a great deal for different shapes and (2) each of the existing K-line patterns requires further classification based on the shape feature for improving the prediction performance.
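
    As a rough illustration of matching K-line (candlestick) series by shape, the Python sketch below normalizes two OHLC series by their first opening price and converts the remaining distance into a similarity score; the normalization and scoring are illustrative assumptions rather than the paper's similarity match model.

    import numpy as np

    def kline_similarity(series_a, series_b):
        """Each series is an (n, 4) array of open/high/low/close prices.
        Normalizing by the first open compares shape rather than price level."""
        a = np.asarray(series_a, dtype=float)
        b = np.asarray(series_b, dtype=float)
        a, b = a / a[0, 0], b / b[0, 0]
        return 1.0 / (1.0 + np.linalg.norm(a - b))   # 1.0 means identical shapes

    pattern   = [[10.0, 10.5,  9.8, 10.2], [10.2, 10.8, 10.1, 10.7], [10.7, 11.0, 10.5, 10.9]]
    candidate = [[20.0, 21.0, 19.7, 20.5], [20.5, 21.5, 20.3, 21.3], [21.3, 22.0, 21.0, 21.8]]
    print(kline_similarity(pattern, candidate))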

  15. Similarity-based recommendation of new concepts to a terminology

    NARCIS (Netherlands)

    Chandar, Praveen; Yaman, Anil; Hoxha, Julia; He, Zhe; Weng, Chunhua

    2015-01-01

    Terminologies can suffer from poor concept coverage due to delays in addition of new concepts. This study tests a similarity-based approach to recommending concepts from a text corpus to a terminology. Our approach involves extraction of candidate concepts from a given text corpus, which are

  16. Structure modulates similarity-based interference in sluicing: An eye tracking study.

    Directory of Open Access Journals (Sweden)

    Jesse A. Harris

    2015-12-01

    Full Text Available In cue-based content-addressable approaches to memory, a target and its competitors are retrieved in parallel from memory via a fast, associative cue-matching procedure under a severely limited focus of attention. Such a parallel matching procedure could in principle ignore the serial order or hierarchical structure characteristic of linguistic relations. I present an eye tracking while reading experiment that investigates whether the sentential position of a potential antecedent modulates the strength of similarity-based interference, a well-studied effect in which increased similarity in features between a target and its competitors results in slower and less accurate retrieval overall. The manipulation trades on an independently established Locality bias in sluiced structures to associate a wh-remnant (which ones) in clausal ellipsis with the most local correlate (some wines), as in “The tourists enjoyed some wines, but I don’t know which ones.” The findings generally support cue-based parsing models of sentence processing that are subject to similarity-based interference in retrieval, and provide additional support to the growing body of evidence that retrieval is sensitive to both the structural position of a target antecedent and its competitors, and the specificity of retrieval cues.

  17. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measures of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, we first design a similarity measure between the context information to take word co-occurrences and phrase chunks around the features into account. Then we introduce the similarity of context information into the importance measure of the features to substitute for document and term frequency. On this basis, we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  18. Molecular basis sets - a general similarity-based approach for representing chemical spaces.

    Science.gov (United States)

    Raghavendra, Akshay S; Maggiora, Gerald M

    2007-01-01

    A new method, based on generalized Fourier analysis, is described that utilizes the concept of "molecular basis sets" to represent chemical space within an abstract vector space. The basis vectors in this space are abstract molecular vectors. Inner products among the basis vectors are determined using an ansatz that associates molecular similarities between pairs of molecules with their corresponding inner products. Moreover, the fact that similarities between pairs of molecules are, in essentially all cases, nonzero implies that the abstract molecular basis vectors are nonorthogonal, but since the similarity of a molecule with itself is unity, the molecular vectors are normalized to unity. A symmetric orthogonalization procedure, which optimally preserves the character of the original set of molecular basis vectors, is used to construct appropriate orthonormal basis sets. Molecules can then be represented, in general, by sets of orthonormal "molecule-like" basis vectors within a proper Euclidean vector space. However, the dimension of the space can become quite large. Thus, the work presented here assesses the effect of basis set size on a number of properties, including the average squared error and average norm of molecular vectors represented in the space: the results clearly show the expected reduction in average squared error and increase in average norm as the basis set size is increased. Several distance-based statistics are also considered. These include the distribution of distances and their differences with respect to basis sets of differing size, and several comparative distance measures such as Spearman rank correlation and Kruskal stress. All of the measures show that, even though the dimension can be high, the chemical spaces they represent nonetheless behave in a well-controlled and reasonable manner. Other abstract vector spaces analogous to that described here can also be constructed, providing that the appropriate inner products can be directly determined.
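
    The symmetric orthogonalization step described above amounts to transforming the nonorthogonal molecular basis by S^(-1/2), where S is the similarity (Gram) matrix. A minimal numerical sketch in Python follows; the 3-molecule similarity matrix is hypothetical.

    import numpy as np

    def symmetric_orthogonalize(s):
        """Return S^(-1/2), which maps the nonorthogonal molecular basis vectors
        (inner products = pairwise similarities) to an orthonormal set while
        changing the original vectors as little as possible."""
        vals, vecs = np.linalg.eigh(s)              # S is symmetric positive definite
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    # Hypothetical similarity matrix: unit self-similarity, nonzero off-diagonals.
    s = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])
    x = symmetric_orthogonalize(s)
    print(np.allclose(x @ s @ x, np.eye(3)))        # transformed basis is orthonormal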

  19. A Similarity Analysis of Audio Signal to Develop a Human Activity Recognition Using Similarity Networks.

    Science.gov (United States)

    García-Hernández, Alejandra; Galván-Tejada, Carlos E; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio

    2017-11-21

    Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the ability to recognize human activities. This paper proposes a novel approach for HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a good reference for classifying human activities compared with the location.

  20. Scalar Similarity for Relaxed Eddy Accumulation Methods

    Science.gov (United States)

    Ruppert, Johannes; Thomas, Christoph; Foken, Thomas

    2006-07-01

    The relaxed eddy accumulation (REA) method allows the measurement of trace gas fluxes when no fast sensors are available for eddy covariance measurements. The flux parameterisation used in REA is based on the assumption of scalar similarity, i.e., similarity of the turbulent exchange of two scalar quantities. In this study changes in scalar similarity between carbon dioxide, sonic temperature and water vapour were assessed using scalar correlation coefficients and spectral analysis. The influence on REA measurements was assessed by simulation. The evaluation is based on observations over grassland, an irrigated cotton plantation and spruce forest. Scalar similarity between carbon dioxide, sonic temperature and water vapour showed a distinct diurnal pattern and change within the day. Poor scalar similarity was found to be linked to dissimilarities in the energy contained in the low frequency part of the turbulent spectra.

  1. RNA-Seq-based toxicogenomic assessment of fresh frozen and formalin-fixed tissues yields similar mechanistic insights.

    Science.gov (United States)

    Auerbach, Scott S; Phadke, Dhiral P; Mav, Deepak; Holmgren, Stephanie; Gao, Yuan; Xie, Bin; Shin, Joo Heon; Shah, Ruchir R; Merrick, B Alex; Tice, Raymond R

    2015-07-01

    Formalin-fixed, paraffin-embedded (FFPE) pathology specimens represent a potentially vast resource for transcriptomic-based biomarker discovery. We present here a comparison of results from a whole transcriptome RNA-Seq analysis of RNA extracted from fresh frozen and FFPE livers. The samples were derived from rats exposed to aflatoxin B1 (AFB1) and a corresponding set of control animals. Principal components analysis indicated that samples were separated in the two groups representing presence or absence of chemical exposure, both in fresh frozen and FFPE sample types. Sixty-five percent of the differentially expressed transcripts (AFB1 vs. controls) in fresh frozen samples were also differentially expressed in FFPE samples (overlap significance: P < 0.0001). Genomic signature and gene set analysis of AFB1 differentially expressed transcript lists indicated highly similar results between fresh frozen and FFPE at the level of chemogenomic signatures (i.e., single chemical/dose/duration elicited transcriptomic signatures), mechanistic and pathology signatures, biological processes, canonical pathways and transcription factor networks. Overall, our results suggest that similar hypotheses about the biological mechanism of toxicity would be formulated from fresh frozen and FFPE samples. These results indicate that phenotypically anchored archival specimens represent a potentially informative resource for signature-based biomarker discovery and mechanistic characterization of toxicity. Copyright © 2014 John Wiley & Sons, Ltd.

  2. Analysis of the human diseasome using phenotype similarity between common, genetic, and infectious diseases

    KAUST Repository

    Hoehndorf, Robert

    2015-06-08

    Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 6,000 diseases. We evaluate our text-mined phenotypes by demonstrating that they can correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that have similar signs and symptoms cluster together, and we use this network to identify closely related diseases based on common etiological, anatomical as well as physiological underpinnings.

  3. Analysis of the human diseasome using phenotype similarity between common, genetic, and infectious diseases

    Science.gov (United States)

    Hoehndorf, Robert; Schofield, Paul N.; Gkoutos, Georgios V.

    2015-06-01

    Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 6,000 diseases. We evaluate our text-mined phenotypes by demonstrating that they can correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that have similar signs and symptoms cluster together, and we use this network to identify closely related diseases based on common etiological, anatomical as well as physiological underpinnings.

  4. Measure of Node Similarity in Multilayer Networks

    DEFF Research Database (Denmark)

    Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

    2016-01-01

    university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest

  5. A similarity-based data warehousing environment for medical images.

    Science.gov (United States)

    Teixeira, Jefferson William; Annibal, Luana Peixoto; Felipe, Joaquim Cezar; Ciferri, Ricardo Rodrigues; Ciferri, Cristina Dutra de Aguiar

    2015-11-01

    A core issue of the decision-making process in the medical field is to support the execution of analytical (OLAP) similarity queries over images in data warehousing environments. In this paper, we focus on this issue. We propose imageDWE, a non-conventional data warehousing environment that enables the storage of intrinsic features taken from medical images in a data warehouse and supports OLAP similarity queries over them. To comply with this goal, we introduce the concept of perceptual layer, which is an abstraction used to represent an image dataset according to a given feature descriptor in order to enable similarity search. Based on this concept, we propose the imageDW, an extended data warehouse with dimension tables specifically designed to support one or more perceptual layers. We also detail how to build an imageDW and how to load image data into it. Furthermore, we show how to process OLAP similarity queries composed of a conventional predicate and a similarity search predicate that encompasses the specification of one or more perceptual layers. Moreover, we introduce an index technique to improve the OLAP query processing over images. We carried out performance tests over a data warehouse environment that consolidated medical images from exams of several modalities. The results demonstrated the feasibility and efficiency of our proposed imageDWE to manage images and to process OLAP similarity queries. The results also demonstrated that the use of the proposed index technique guaranteed a great improvement in query processing. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Observations and analysis of self-similar branching topology in glacier networks

    Science.gov (United States)

    Bahr, D.B.; Peckham, S.D.

    1996-01-01

    Glaciers, like rivers, have a branching structure which can be characterized by topological trees or networks. Probability distributions of various topological quantities in the networks are shown to satisfy the criterion for self-similarity, a symmetry structure which might be used to simplify future models of glacier dynamics. Two analytical methods of describing river networks, Shreve's random topology model and deterministic self-similar trees, are applied to the six glaciers of south central Alaska studied in this analysis. Self-similar trees capture the topological behavior observed for all of the glaciers, and most of the networks are also reasonably approximated by Shreve's theory. Copyright 1996 by the American Geophysical Union.

  7. Authentication of commercial spices based on the similarities between gas chromatographic fingerprints.

    Science.gov (United States)

    Matsushita, Takaya; Zhao, Jing Jing; Igura, Noriyuki; Shimoda, Mitsuya

    2018-06-01

    A simple and solvent-free method was developed for the authentication of commercial spices. The similarities between gas chromatographic fingerprints were measured using similarity indices and multivariate data analyses, as morphological differentiation between dried powders and small spice particles was challenging. The volatile compounds present in 11 spices (i.e. allspice, anise, black pepper, caraway, clove, coriander, cumin, dill, fennel, star anise, and white pepper) were extracted by headspace solid-phase microextraction, and analysed by gas chromatography-mass spectrometry. The largest 10 peaks were selected from each total ion chromatogram, and a total of 65 volatiles were tentatively identified. The similarity indices (i.e. the congruence coefficients) were calculated using the data matrices of the identified compound relative peak areas to differentiate between two sets of fingerprints. Where pairs of similar fingerprints produced high congruence coefficients (>0.80), distinctive volatile markers were employed to distinguish between these samples. In addition, hierarchical cluster analysis and principal component analysis were performed to visualise the similarity among fingerprints, and the analysed spices were grouped and characterised according to their distinctive major components. This method is suitable for screening unknown spices, and can therefore be employed to evaluate the quality and authenticity of various spices. © 2017 Society of Chemical Industry.
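
    The congruence coefficient used above to compare pairs of fingerprints is an uncentered correlation between two vectors of relative peak areas. A short Python sketch follows; the example vectors are made up for illustration.

    import numpy as np

    def congruence_coefficient(x, y):
        """Uncentered correlation; values near 1 (e.g., above the 0.80 threshold
        used above) indicate very similar volatile profiles."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

    anise      = [0.45, 0.20, 0.10, 0.05, 0.20]   # hypothetical relative peak areas
    star_anise = [0.43, 0.22, 0.08, 0.07, 0.20]
    print(congruence_coefficient(anise, star_anise))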

  8. Similarity-based search of model organism, disease and drug effect phenotypes

    KAUST Repository

    Hoehndorf, Robert; Gruenberger, Michael; Gkoutos, Georgios V; Schofield, Paul N

    2015-01-01

    Background: Semantic similarity measures over phenotype ontologies have been demonstrated to provide a powerful approach for the analysis of model organism phenotypes, the discovery of animal models of human disease, novel pathways, gene functions

  9. MAC/FAC: A Model of Similarity-Based Retrieval

    Science.gov (United States)

    1994-10-01

    Kenneth D. Forbus, Dedre Gentner, Keith Law. MAC/FAC: A Model of Similarity-Based Retrieval. Technical Report #59, The Institute for the Learning Sciences, Northwestern University, October 1994.

  10. Experimental Study of Dowel Bar Alternatives Based on Similarity Model Test

    Directory of Open Access Journals (Sweden)

    Chichun Hu

    2017-01-01

    Full Text Available In this study, a small-scaled accelerated loading test based on similarity theory and the Accelerated Pavement Analyzer was developed to evaluate dowel bars with different materials and cross-sections. A jointed concrete specimen consisting of one dowel was designed as a scaled model for the test, and each specimen was subjected to 864 thousand loading cycles. Deflections between jointed slabs were measured with dial indicators, and strains of the dowel bars were monitored with strain gauges. The load transfer efficiency, differential deflection, and dowel-concrete bearing stress for each case were calculated from these measurements. The test results indicated that the effect of the dowel modulus on load transfer efficiency can be characterized based on the similarity model test developed in the study. Moreover, the round steel dowel was found to have similar performance to the larger FRP dowel, and the elliptical dowel can be preferentially considered in practice.

  11. Correlation between social proximity and mobility similarity.

    Science.gov (United States)

    Fan, Chao; Liu, Yiding; Huang, Junming; Rong, Zhihai; Zhou, Tao

    2017-09-20

    Human behaviors exhibit ubiquitous correlations in many aspects, such as individual and collective levels, temporal and spatial dimensions, content, social and geographical layers. With rich Internet data of online behaviors becoming available, it attracts academic interest to explore human mobility similarity from the perspective of social network proximity. Existing analyses show a strong correlation between online social proximity and offline mobility similarity, namely, mobile records between friends are significantly more similar than those between strangers, and those between friends with common neighbors are even more similar. We argue the importance of the number and diversity of common friends, with a counterintuitive finding that the number of common friends has no positive impact on mobility similarity while the diversity plays a key role, disagreeing with previous studies. Our analysis provides a novel view for better understanding the coupling between human online and offline behaviors, and will help model and predict human behaviors based on social proximity.

  12. Network similarity and statistical analysis of earthquake seismic data

    OpenAIRE

    Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban

    2016-01-01

    We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We cal...

  13. Extension of frequency-based dissimilarity for retrieving similar plasma waveforms

    International Nuclear Information System (INIS)

    Hochin, Teruhisa; Koyama, Katsumasa; Nakanishi, Hideya; Kojima, Mamoru

    2008-01-01

    Some computer-aided assistance in finding the waveforms similar to a given waveform has become indispensable for accelerating data analysis in plasma experiments. For slowly-varying waveforms and those having time-sectional oscillation patterns, methods using the Fourier series coefficients of waveforms in calculating the dissimilarity have successfully improved the performance in retrieving similar waveforms. This paper treats severely-varying waveforms, and proposes two extensions to the dissimilarity of waveforms. The first extension is to capture the difference in the importance of the Fourier series coefficients of waveforms with respect to frequency. The second extension is to consider the outlines of waveforms. The correctness of the extended dissimilarity is experimentally evaluated by using the metrics used in evaluating information retrieval, i.e., precision and recall. The experimental results show that the extended dissimilarity could improve the correctness of the similarity retrieval of plasma waveforms.
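
    The first extension described above weights the Fourier series coefficients differently across frequency. The Python sketch below computes a frequency-weighted dissimilarity between two waveforms from their leading Fourier coefficients; the specific weighting (down-weighting higher frequencies) is an illustrative assumption, not the authors' exact scheme.

    import numpy as np

    def fourier_dissimilarity(x, y, n_coeffs=32):
        """Weighted distance between the leading Fourier coefficients of two waveforms."""
        cx = np.fft.rfft(np.asarray(x, dtype=float))[:n_coeffs]
        cy = np.fft.rfft(np.asarray(y, dtype=float))[:n_coeffs]
        weights = 1.0 / (1.0 + np.arange(len(cx)))   # emphasize low-frequency content
        return float(np.sum(weights * np.abs(cx - cy)))

    t = np.linspace(0.0, 1.0, 1024)
    w1 = np.sin(2 * np.pi * 5 * t)
    w2 = np.sin(2 * np.pi * 5 * t) + 0.2 * np.sin(2 * np.pi * 40 * t)
    print(fourier_dissimilarity(w1, w2))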

  14. Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity

    Directory of Open Access Journals (Sweden)

    Hugo Gonçalo Oliveira

    2018-02-01

    Full Text Available Identifying similar and related words is not only key in natural language understanding but also a suitable task for assessing the quality of computational resources that organise words and meanings of a language, compiled by different means. This paper, which aims to be a reference for those interested in computing word similarity in Portuguese, presents several approaches for this task and is motivated by the recent availability of state-of-the-art distributional models of Portuguese words, which add to several lexical knowledge bases (LKBs) for this language, available for a longer time. These resources were exploited to answer word similarity tests, which also became recently available for Portuguese. We conclude that there are several valid approaches for this task, but not one that outperforms all the others in every single test. Distributional models seem to capture relatedness better, while LKBs are better suited for computing genuine similarity, but, in general, better results are obtained when knowledge from different sources is combined.

  15. The Evolution of Facultative Conformity Based on Similarity.

    Science.gov (United States)

    Efferson, Charles; Lalive, Rafael; Cacault, Maria Paula; Kistler, Deborah

    2016-01-01

    Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner's optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one's social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their similarity to

  16. The Evolution of Facultative Conformity Based on Similarity.

    Directory of Open Access Journals (Sweden)

    Charles Efferson

    Full Text Available Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner's optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one's social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their

  17. The Evolution of Facultative Conformity Based on Similarity

    Science.gov (United States)

    Efferson, Charles; Lalive, Rafael; Cacault, Maria Paula; Kistler, Deborah

    2016-01-01

    Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner’s optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one’s social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their similarity to

  18. Dimensional analysis and self-similarity methods for engineers and scientists

    CERN Document Server

    Zohuri, Bahman

    2015-01-01

    This ground-breaking reference provides an overview of key concepts in dimensional analysis, and then pushes well beyond traditional applications in fluid mechanics to demonstrate how powerful this tool can be in solving complex problems across many diverse fields. Of particular interest is the book's coverage of dimensional analysis and self-similarity methods in nuclear and energy engineering. Numerous practical examples of dimensional problems are presented throughout, allowing readers to link the book's theoretical explanations and step-by-step mathematical solutions to practical implementations.

  19. Adaptive Spatial Filter Based on Similarity Indices to Preserve the Neural Information on EEG Signals during On-Line Processing

    Directory of Open Access Journals (Sweden)

    Denis Delisle-Rodriguez

    2017-11-01

    Full Text Available This work presents a new on-line adaptive filter, which is based on a similarity analysis between standard electrode locations, in order to reduce artifacts and common interferences throughout electroencephalography (EEG) signals while preserving the useful information. The standard deviation and the Concordance Correlation Coefficient (CCC) between target electrodes and their corresponding neighbor electrodes are analyzed on sliding windows to select those neighbors that are highly correlated. Afterwards, a model based on the CCC is applied to give higher weights to those correlated electrodes with lower similarity to the target electrode. The approach was applied to brain-computer interfaces (BCIs) based on Canonical Correlation Analysis (CCA) to recognize 40 targets of steady-state visual evoked potentials (SSVEP), providing an accuracy (ACC) of 86.44 ± 2.81%. In addition, also using this approach, features of low frequency were selected in the pre-processing stage of another BCI to recognize gait planning. In this case, the recognition was significantly (p < 0.01) improved for most of the subjects (ACC ≥ 74.79%), when compared with other BCIs based on Common Spatial Pattern, Filter Bank-Common Spatial Pattern, and Riemannian Geometry.
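
    The Concordance Correlation Coefficient used above scores how closely a neighbor electrode tracks the target electrode, penalizing differences in mean and scale as well as poor correlation. The Python sketch below implements the standard population form of the CCC on synthetic signals; it is only an illustration of the similarity measure, not of the full adaptive filter.

    import numpy as np

    def concordance_correlation(x, y):
        """Lin's CCC: 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        mx, my = x.mean(), y.mean()
        cov = np.mean((x - mx) * (y - my))
        return float(2.0 * cov / (x.var() + y.var() + (mx - my) ** 2))

    t = np.linspace(0.0, 1.0, 512)
    target   = np.sin(2 * np.pi * 10 * t)
    neighbor = 0.9 * target + 0.05 * np.random.default_rng(0).standard_normal(512)
    print(concordance_correlation(target, neighbor))   # near 1 -> neighbor gets a high weight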

  20. An Enhanced Rule-Based Web Scanner Based on Similarity Score

    Directory of Open Access Journals (Sweden)

    LEE, M.

    2016-08-01

    Full Text Available This paper proposes an enhanced rule-based web scanner in order to achieve better accuracy in detecting web vulnerabilities than existing tools, which have a relatively high false alarm rate when web pages are installed in unconventional directory paths. Using the proposed matching method based on a similarity score, the proposed scheme can determine whether two pages have the same vulnerabilities or not. With this method, the proposed scheme is able to figure out whether the target web pages are vulnerable by comparing them to web pages that are known to have vulnerabilities. We show that the proposed scanner reduces the false alarm rate by 12% compared to an existing well-known scanner through performance evaluation via various experiments. The proposed scheme is especially helpful in detecting vulnerabilities of web applications which are derived from well-known open-source web applications after small customization, which happens frequently in many small-sized companies.

  1. Design of chemical space networks using a Tanimoto similarity variant based upon maximum common substructures.

    Science.gov (United States)

    Zhang, Bijun; Vogt, Martin; Maggiora, Gerald M; Bajorath, Jürgen

    2015-10-01

    Chemical space networks (CSNs) have recently been introduced as an alternative to other coordinate-free and coordinate-based chemical space representations. In CSNs, nodes represent compounds and edges pairwise similarity relationships. In addition, nodes are annotated with compound property information such as biological activity. CSNs have been applied to view biologically relevant chemical space in comparison to random chemical space samples and found to display well-resolved topologies at low edge density levels. The way in which molecular similarity relationships are assessed is an important determinant of CSN topology. Previous CSN versions were based on numerical similarity functions or the assessment of substructure-based similarity. Herein, we report a new CSN design that is based upon combined numerical and substructure similarity evaluation. This has been facilitated by calculating numerical similarity values on the basis of maximum common substructures (MCSs) of compounds, leading to the introduction of MCS-based CSNs (MCS-CSNs). This CSN design combines advantages of continuous numerical similarity functions with a robust and chemically intuitive substructure-based assessment. Compared to earlier versions of CSNs, MCS-CSNs are characterized by a further improved organization of local compound communities, as exemplified by the delineation of drug-like subspaces in regions of biologically relevant chemical space.
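
    A Tanimoto-style coefficient computed on the maximum common substructure of two molecules can be sketched as below in Python, assuming RDKit is available; the heavy-atom form of the coefficient and the default FindMCS settings are assumptions and do not necessarily match the exact variant used for MCS-CSNs.

    from rdkit import Chem
    from rdkit.Chem import rdFMCS

    def mcs_tanimoto(smiles_a, smiles_b):
        """Tanimoto-like similarity on heavy-atom counts: n_mcs / (n_a + n_b - n_mcs)."""
        mol_a = Chem.MolFromSmiles(smiles_a)
        mol_b = Chem.MolFromSmiles(smiles_b)
        mcs = rdFMCS.FindMCS([mol_a, mol_b])        # maximum common substructure search
        n_mcs = mcs.numAtoms
        n_a, n_b = mol_a.GetNumHeavyAtoms(), mol_b.GetNumHeavyAtoms()
        return n_mcs / (n_a + n_b - n_mcs)

    print(mcs_tanimoto("c1ccccc1O", "c1ccccc1N"))   # phenol vs. aniline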

  2. Two Phase Non-Rigid Multi-Modal Image Registration Using Weber Local Descriptor-Based Similarity Metrics and Normalized Mutual Information

    Directory of Open Access Journals (Sweden)

    Feng Yang

    2013-06-01

    Full Text Available Non-rigid multi-modal image registration plays an important role in medical image processing and analysis. Existing image registration methods based on similarity metrics such as mutual information (MI) and the sum of squared differences (SSD) cannot achieve either high registration accuracy or high registration efficiency. To address this problem, we propose a novel two-phase non-rigid multi-modal image registration method by combining Weber local descriptor (WLD) based similarity metrics with the normalized mutual information (NMI) using the diffeomorphic free-form deformation (FFD) model. The first phase aims at recovering the large deformation component using the WLD based non-local SSD (wldNSSD) or weighted structural similarity (wldWSSIM). Based on the output of the former phase, the second phase is focused on obtaining accurate transformation parameters related to the small deformation using the NMI. Extensive experiments on T1, T2 and PD weighted MR images demonstrate that the proposed wldNSSD-NMI or wldWSSIM-NMI method outperforms the registration methods based on the NMI, the conditional mutual information (CMI), the SSD on entropy images (ESSD), and the ESSD-NMI in terms of registration accuracy and computation efficiency.

  3. Comparative Analysis of Networks of Phonologically Similar Words in English and Spanish

    Directory of Open Access Journals (Sweden)

    Michael S. Vitevitch

    2010-03-01

    Full Text Available Previous network analyses of several languages revealed a unique set of structural characteristics. One of these characteristics—the presence of many smaller components (referred to as islands)—was further examined with a comparative analysis of the island constituents. The results showed that Spanish words in the islands tended to be phonologically and semantically similar to each other, but English words in the islands tended only to be phonologically similar to each other. The results of this analysis yielded hypotheses about language processing that can be tested with psycholinguistic experiments, and offer insight into cross-language differences in processing that have been previously observed.

  4. A behavioral similarity measure between labeled Petri nets based on principal transition sequences

    NARCIS (Netherlands)

    Wang, J.; He, T.; Wen, L.; Wu, N.; Hofstede, ter A.H.M.; Su, J.; Meersman, R.; Dillon, T.S.; Herrero, P.

    2010-01-01

    Being able to determine the degree of similarity between process models is important for management, reuse, and analysis of business process models. In this paper we propose a novel method to determine the degree of similarity between process models, which exploits their semantics. Our approach is

  5. Scaling Analysis of the Single-Phase Natural Circulation: the Hydraulic Similarity

    International Nuclear Information System (INIS)

    Yu, Xin-Guo; Choi, Ki-Yong

    2015-01-01

    These passive safety systems all rely on natural circulation to cool down the reactor cores during an accident. Thus, a robust and accurate scaling methodology must be developed and employed to both assist in the design of a scaled-down test facility and guide the tests in order to mimic the natural circulation flow of its prototype. The natural circulation system generally consists of a heat source, the connecting pipes and several heat sinks. Although many applauded scaling methodologies have been proposed during the last several decades, few works have been dedicated to systematically analyzing and exactly preserving the hydraulic similarity. In the present study, the hydraulic similarity analyses are performed at both the system and local levels. By this means, the scaling criteria for exact hydraulic similarity in a full-pressure model have been sought. In other words, not only the system-level but also the local-level hydraulic similarities are pursued. As the hydraulic characteristics of a fluid system are governed by the momentum equation, the scaling analysis starts with it. A dimensionless integral loop momentum equation is derived to obtain the dimensionless numbers. In the dimensionless momentum equation, two dimensionless numbers, the dimensionless flow resistance number and the dimensionless gravitational force number, are identified along with a unique hydraulic time scale, characterizing the system hydraulic response. A full-height full-pressure model is also made to see which model among the full-height model and reduced-height model can preserve the hydraulic behavior of the prototype. From the dimensionless integral momentum equation, a unique hydraulic time scale, which characterizes the hydraulic response of a single-phase natural circulation system, is identified along with two dimensionless parameters: the dimensionless flow resistance number and the dimensionless gravitational force number. By satisfying the equality of both dimensionless numbers

  6. Scaling Analysis of the Single-Phase Natural Circulation: the Hydraulic Similarity

    Energy Technology Data Exchange (ETDEWEB)

    Yu, Xin-Guo; Choi, Ki-Yong [KAERI, Daejeon (Korea, Republic of)]

    2015-05-15

    These passive safety systems all rely on natural circulation to cool down the reactor cores during an accident. Thus, a robust and accurate scaling methodology must be developed and employed to both assist in the design of a scaled-down test facility and guide the tests in order to mimic the natural circulation flow of its prototype. The natural circulation system generally consists of a heat source, the connecting pipes and several heat sinks. Although many applauded scaling methodologies have been proposed during the last several decades, few works have been dedicated to systematically analyzing and exactly preserving the hydraulic similarity. In the present study, the hydraulic similarity analyses are performed at both the system and local levels. By this means, the scaling criteria for exact hydraulic similarity in a full-pressure model have been sought. In other words, not only the system-level but also the local-level hydraulic similarities are pursued. As the hydraulic characteristics of a fluid system are governed by the momentum equation, the scaling analysis starts with it. A dimensionless integral loop momentum equation is derived to obtain the dimensionless numbers. In the dimensionless momentum equation, two dimensionless numbers, the dimensionless flow resistance number and the dimensionless gravitational force number, are identified along with a unique hydraulic time scale, characterizing the system hydraulic response. A full-height full-pressure model is also made to see which model among the full-height model and reduced-height model can preserve the hydraulic behavior of the prototype. From the dimensionless integral momentum equation, a unique hydraulic time scale, which characterizes the hydraulic response of a single-phase natural circulation system, is identified along with two dimensionless parameters: the dimensionless flow resistance number and the dimensionless gravitational force number. By satisfying the equality of both dimensionless numbers

  7. An Analysis of Looking Back Method in Problem-Based Learning: Case Study on Congruence and Similarity in Junior High School

    Science.gov (United States)

    Kosasih, U.; Wahyudin, W.; Prabawanto, S.

    2017-09-01

    This study aims to understand how learners look back at their ideas when solving problems. The research is based on a qualitative approach with a case study design. Participants were xx students of a junior high school who were studying congruence and similarity. The supporting instruments in this research are a test and an interview sheet. The data obtained were analyzed by coding and constant comparison. The analysis finds that students review their problem-solving ideas in three ways: 1) by comparing their answers with the solution steps exemplified by learning resources; 2) by examining the logical relationship between the solution and the problem; and 3) by confirming the solution against their prior knowledge. This happens because most students learn in a mechanistic way. The study concludes that how students validate their problem-solving ideas is influenced by teacher explanations, learning resources, and prior knowledge. Therefore, teacher explanations and learning resources contribute to students' success or failure in solving problems.

  8. Neural Substrates of Similarity and Rule-based Strategies in Judgment

    Directory of Open Access Journals (Sweden)

    Bettina eVon Helversen

    2014-10-01

    Full Text Available Making accurate judgments is a core human competence and a prerequisite for success in many areas of life. Plenty of evidence exists that people can employ different judgment strategies to solve identical judgment problems. In categorization, it has been demonstrated that similarity-based and rule-based strategies are associated with activity in different brain regions. Building on this research, the present work tests whether solving two identical judgment problems recruits different neural substrates depending on people's judgment strategies. Combining cognitive modeling of judgment strategies at the behavioral level with functional magnetic resonance imaging (fMRI), we compare brain activity when using two archetypal judgment strategies: a similarity-based exemplar strategy and a rule-based heuristic strategy. Using an exemplar-based strategy should recruit areas involved in long-term memory processes to a larger extent than a heuristic strategy. In contrast, using a heuristic strategy should recruit areas involved in the application of rules to a larger extent than an exemplar-based strategy. Largely consistent with our hypotheses, we found that using an exemplar-based strategy led to relatively higher BOLD activity in the anterior prefrontal and inferior parietal cortex, presumably related to retrieval and selective attention processes. In contrast, using a heuristic strategy led to relatively higher activity in areas in the dorsolateral prefrontal and the temporal-parietal cortex associated with cognitive control and information integration. Thus, even when people solve identical judgment problems, different neural substrates can be recruited depending on the judgment strategy involved.

  9. A Model for Comparative Analysis of the Similarity between Android and iOS Operating Systems

    Directory of Open Access Journals (Sweden)

    Lixandroiu R.

    2014-12-01

    Full Text Available Due to the recent expansion of mobile devices, this article analyzes two of the most widely used mobile operating systems (OSs). The analysis is based on the calculation of Jaccard's similarity coefficient. To complete the analysis, we developed a hierarchy of factors for evaluating OSs. The analysis shows that the two OSs are similar in terms of functionality, but a number of factors, once weighted, make a difference.
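    The record above reports a Jaccard-coefficient comparison over a weighted hierarchy of factors. A minimal sketch of that kind of computation is given below; the feature names, weights, and the weighted variant are illustrative assumptions, not data or code from the article.

```python
# Minimal sketch (not the article's code): Jaccard similarity between two OS
# feature sets, plus a weighted variant driven by a per-factor importance dict.
def jaccard(features_a: set, features_b: set) -> float:
    """Plain Jaccard coefficient |A intersect B| / |A union B|."""
    union = features_a | features_b
    return len(features_a & features_b) / len(union) if union else 1.0

def weighted_jaccard(features_a: set, features_b: set, weights: dict) -> float:
    """Weighted variant: each factor contributes its (assumed) weight instead of 1."""
    union = features_a | features_b
    if not union:
        return 1.0
    inter = sum(weights.get(f, 1.0) for f in features_a & features_b)
    total = sum(weights.get(f, 1.0) for f in union)
    return inter / total

# Hypothetical usage with made-up feature names.
android = {"multitasking", "widgets", "nfc", "voice_assistant"}
ios = {"multitasking", "nfc", "voice_assistant", "video_calls"}
print(jaccard(android, ios))                       # 0.6
print(weighted_jaccard(android, ios, {"nfc": 2}))  # weighting shifts the score
```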

  10. Thai Language Sentence Similarity Computation Based on Syntactic Structure and Semantic Vector

    Science.gov (United States)

    Wang, Hongbin; Feng, Yinhan; Cheng, Liang

    2018-03-01

    Sentence similarity computation plays an increasingly important role in text mining, Web page retrieval, machine translation, speech recognition and question answering systems. Thai is a resource-scarce language; unlike Chinese, it has no resources such as HowNet or CiLin, so research on Thai sentence similarity faces particular challenges. To address this problem, this paper proposes a novel method to compute the similarity of Thai sentences based on syntactic structure and semantic vectors. The method first uses Part-of-Speech (POS) dependencies to calculate the syntactic structure similarity of two sentences, and then uses word vectors to calculate their semantic similarity. Finally, the two scores are combined into an overall similarity of the two Thai sentences. The proposed method considers not only semantics but also sentence syntactic structure. The experimental results show that the method is feasible for Thai sentence similarity computation.
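    As an illustration of how a syntactic score and a semantic score can be blended into one sentence similarity, the sketch below combines a POS-bigram overlap with the cosine similarity of averaged word vectors; the blending weight alpha, the bigram-overlap choice, and the toy vectors are assumptions of this sketch rather than the paper's formula.

```python
# Minimal sketch (assumed combination scheme, not the paper's exact method).
import numpy as np

def semantic_similarity(tokens_a, tokens_b, vectors):
    """Cosine similarity of mean word vectors (assumes every token has a vector)."""
    va = np.mean([vectors[t] for t in tokens_a], axis=0)
    vb = np.mean([vectors[t] for t in tokens_b], axis=0)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

def syntactic_similarity(pos_a, pos_b):
    """Crude structural score: Jaccard overlap of POS-tag bigrams."""
    bigrams = lambda seq: {tuple(seq[i:i + 2]) for i in range(len(seq) - 1)}
    a, b = bigrams(pos_a), bigrams(pos_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def sentence_similarity(tokens_a, pos_a, tokens_b, pos_b, vectors, alpha=0.5):
    """alpha weights the syntactic term against the semantic term."""
    return (alpha * syntactic_similarity(pos_a, pos_b)
            + (1 - alpha) * semantic_similarity(tokens_a, tokens_b, vectors))
```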

  11. Mapping Rice Cropping Systems in Vietnam Using an NDVI-Based Time-Series Similarity Measurement Based on DTW Distance

    Directory of Open Access Journals (Sweden)

    Xudong Guan

    2016-01-01

    Full Text Available The Normalized Difference Vegetation Index (NDVI) derived from Moderate Resolution Imaging Spectroradiometer (MODIS) time-series data has been widely used in the fields of crop and rice classification. The cloudy and rainy weather characteristics of the monsoon season greatly reduce the likelihood of obtaining high-quality optical remote sensing images. In addition, the diverse crop-planting system in Vietnam also hinders the comparison of NDVI among different crop stages. To address these problems, we apply a Dynamic Time Warping (DTW) distance-based similarity measure approach and use the entire yearly NDVI time series to reduce the inaccuracy of classification using a single image. We first de-noise the NDVI time series using Savitzky-Golay (S-G) filtering based on the TIMESAT software. Then, a standard NDVI time-series base for rice growth is established based on field survey data and Google Earth sample data. NDVI time-series data for each pixel are constructed and the DTW distance to the standard rice growth NDVI time series is calculated. Thresholds are then applied to extract rice growing areas. A qualitative assessment using statistical data and a spatial assessment using sampled data from the rice-cropping map reveal a high mapping accuracy at the national scale against the statistical data, with the corresponding R2 being as high as 0.809; however, the mapped rice accuracy decreased at the provincial scale due to the reduced number of rice planting areas per province. An analysis of the results indicates that the 500-m resolution MODIS data are limited in terms of mapping scattered rice parcels. The results demonstrate that the DTW-based similarity measure of the NDVI time series can be effectively used to map large-area rice cropping systems with diverse cultivation processes.
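    The core of the record above is a DTW comparison between each pixel's yearly NDVI series and a reference rice-growth curve. The sketch below shows the textbook dynamic-programming DTW distance and an illustrative thresholding step; the threshold value and the random series are assumptions, not the paper's data.

```python
# Minimal sketch (not the paper's implementation): classic DTW distance plus an
# illustrative rice/non-rice decision by thresholding the distance.
import numpy as np

def dtw_distance(series_a, series_b):
    n, m = len(series_a), len(series_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(series_a[i - 1] - series_b[j - 1])
            # best of insertion, deletion, and match moves
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# Hypothetical usage: 23-point yearly NDVI composites for one pixel and the reference.
pixel_ndvi = np.random.rand(23)
rice_reference = np.random.rand(23)
is_rice = dtw_distance(pixel_ndvi, rice_reference) < 2.0   # illustrative threshold
```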

  12. Determination of optimal samples for robot calibration based on error similarity

    Directory of Open Access Journals (Sweden)

    Tian Wei

    2015-06-01

    Full Text Available Industrial robots are used for automatic drilling and riveting. The absolute position accuracy of an industrial robot is one of the key performance indexes in aircraft assembly, and can be improved through error compensation to meet aircraft assembly requirements. The achievable accuracy and the difficulty of implementing accuracy compensation are closely related to the choice of sampling points. Therefore, based on the error-similarity error compensation method, a method for choosing sampling points on a uniform grid is proposed. A simulation is conducted to analyze the influence of the sample point locations on error compensation. In addition, the grid steps of the sampling points are optimized using a statistical analysis method. The method is used to generate grids and optimize the grid steps for a Kuka KR-210 robot. The experimental results show that the method for planning sampling data can effectively optimize the sampling grid. After error compensation, the position accuracy of the robot meets the position accuracy requirements.

  13. Image-based metal artifact reduction in x-ray computed tomography utilizing local anatomical similarity

    Science.gov (United States)

    Dong, Xue; Yang, Xiaofeng; Rosenfield, Jonathan; Elder, Eric; Dhabaan, Anees

    2017-03-01

    X-ray computed tomography (CT) has been widely used in radiation therapy treatment planning in recent years. However, metal implants such as dental fillings and hip prostheses can cause severe bright and dark streaking artifacts in reconstructed CT images. These artifacts decrease image contrast and degrade HU accuracy, leading to inaccuracies in target delineation and dose calculation. In this work, a metal artifact reduction method is proposed based on the intrinsic anatomical similarity between neighboring CT slices. Neighboring CT slices from the same patient exhibit similar anatomical features. Exploiting this anatomical similarity, a gamma map is calculated as a weighted summation of relative HU error and distance error for each pixel in an artifact-corrupted CT image relative to a neighboring, artifact-free image. The minimum value in the gamma map for each pixel is used to identify an appropriate pixel from the artifact-free CT slice to replace the corresponding artifact-corrupted pixel. With the proposed method, the mean CT HU error was reduced from 360 HU and 460 HU to 24 HU and 34 HU on head and pelvis CT images, respectively. Dose calculation accuracy also improved, as the dose difference was reduced from greater than 20% to less than 4%. Using 3%/3mm criteria, the gamma analysis failure rate was reduced from 23.25% to 0.02%. An image-based metal artifact reduction method is proposed that replaces corrupted image pixels with pixels from neighboring CT slices free of metal artifacts. This method is shown to be capable of suppressing streaking artifacts, thereby improving HU and dose calculation accuracy.
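    To make the gamma-map idea concrete, the sketch below scores candidate pixels in a neighboring artifact-free slice by a weighted sum of HU error and spatial distance and substitutes the best-scoring candidate; the weights, normalizations, and search radius are assumptions of this sketch, not the values used in the study.

```python
# Minimal sketch (assumed parameters, not the authors' code): replace each
# artifact-corrupted pixel with the candidate from a neighboring artifact-free
# slice that minimizes a gamma-like weighted sum of HU error and distance error.
import numpy as np

def replace_corrupted_pixels(corrupt, clean, mask, w_hu=1.0, w_dist=0.2, radius=3):
    """corrupt, clean: 2D HU arrays; mask: True where pixels are artifact-corrupted."""
    out = corrupt.copy()
    rows, cols = corrupt.shape
    for r, c in zip(*np.nonzero(mask)):
        best_gamma, best_val = np.inf, corrupt[r, c]
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    hu_err = abs(clean[rr, cc] - corrupt[r, c]) / 1000.0  # HU error (illustrative scale)
                    dist = np.hypot(dr, dc) / radius                      # normalized distance error
                    gamma = w_hu * hu_err + w_dist * dist
                    if gamma < best_gamma:
                        best_gamma, best_val = gamma, clean[rr, cc]
        out[r, c] = best_val
    return out
```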

  14. Privacy-Preserving Patient Similarity Learning in a Federated Environment: Development and Analysis.

    Science.gov (United States)

    Lee, Junghye; Sun, Jimeng; Wang, Fei; Wang, Shuang; Jun, Chi-Hyuck; Jiang, Xiaoqian

    2018-04-13

    There is an urgent need for the development of global analytic frameworks that can perform analyses in a privacy-preserving federated environment across multiple institutions without privacy leakage. A few studies on the topic of federated medical analysis have been conducted recently with the focus on several algorithms. However, none of them have solved similar patient matching, which is useful for applications such as cohort construction for cross-institution observational studies, disease surveillance, and clinical trials recruitment. The aim of this study was to present a privacy-preserving platform in a federated setting for patient similarity learning across institutions. Without sharing patient-level information, our model can find similar patients from one hospital to another. We proposed a federated patient hashing framework and developed a novel algorithm to learn context-specific hash codes to represent patients across institutions. The similarities between patients can be efficiently computed using the resulting hash codes of corresponding patients. To avoid security attack from reverse engineering on the model, we applied homomorphic encryption to patient similarity search in a federated setting. We used sequential medical events extracted from the Multiparameter Intelligent Monitoring in Intensive Care-III database to evaluate the proposed algorithm in predicting the incidence of five diseases independently. Our algorithm achieved averaged area under the curves of 0.9154 and 0.8012 with balanced and imbalanced data, respectively, in κ-nearest neighbor with κ=3. We also confirmed privacy preservation in similarity search by using homomorphic encryption. The proposed algorithm can help search similar patients across institutions effectively to support federated data analysis in a privacy-preserving manner. ©Junghye Lee, Jimeng Sun, Fei Wang, Shuang Wang, Chi-Hyuck Jun, Xiaoqian Jiang. Originally published in JMIR Medical Informatics (http
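    Once patients are represented by binary hash codes, the cross-institution similarity search described above reduces to comparing codes. The sketch below shows that comparison step only, in the clear; the code length, the bit-agreement score, and the random data are assumptions, and the homomorphic-encryption layer used in the study is deliberately omitted.

```python
# Minimal sketch (assumed mechanics, not the authors' framework): k-nearest
# patient search by bit agreement between binary hash codes.
import numpy as np

def hamming_similarity(code_a: np.ndarray, code_b: np.ndarray) -> float:
    """Fraction of matching bits between two equal-length binary codes."""
    return float(np.mean(code_a == code_b))

def top_k_similar(query_code, candidate_codes, k=3):
    """Indices of the k candidates with the highest bit agreement."""
    scores = np.array([hamming_similarity(query_code, c) for c in candidate_codes])
    return np.argsort(-scores)[:k]

# Hypothetical usage: random 64-bit codes for 100 patients at a remote site.
rng = np.random.default_rng(0)
remote_codes = rng.integers(0, 2, size=(100, 64))
query_code = rng.integers(0, 2, size=64)
print(top_k_similar(query_code, remote_codes, k=3))
```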

  15. Visual reconciliation of alternative similarity spaces in climate modeling

    Science.gov (United States)

    J Poco; A Dasgupta; Y Wei; William Hargrove; C.R. Schwalm; D.N. Huntzinger; R Cook; E Bertini; C.T. Silva

    2015-01-01

    Visual data analysis often requires grouping of data objects based on their similarity. In many application domains researchers use algorithms and techniques like clustering and multidimensional scaling to extract groupings from data. While extracting these groups using a single similarity criteria is relatively straightforward, comparing alternative criteria poses...

  16. Measure of Node Similarity in Multilayer Networks

    DEFF Research Database (Denmark)

    Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

    2016-01-01

    The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large... university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest...

  17. 3D Facial Similarity Measure Based on Geodesic Network and Curvatures

    Directory of Open Access Journals (Sweden)

    Junli Zhao

    2014-01-01

    Full Text Available Automated 3D facial similarity measure is a challenging and valuable research topic in anthropology and computer graphics. It is widely used in various fields, such as criminal investigation, kinship confirmation, and face recognition. This paper proposes a 3D facial similarity measure method based on a combination of geodesic and curvature features. Firstly, a geodesic network is generated for each face with geodesics and iso-geodesics determined, and these network points are adopted as the correspondence across face models. Then, four metrics associated with curvatures, that is, the mean curvature, Gaussian curvature, shape index, and curvedness, are computed for each network point by using a weighted average of its neighborhood points. Finally, correlation coefficients according to these metrics are computed, respectively, as the similarity measures between two 3D face models. Experiments on 3D facial models of different persons and on different 3D facial models of the same person are carried out and compared with a subjective face similarity study. The results show that the geodesic network plays an important role in 3D facial similarity measure. The similarity measure defined by the shape index is largely consistent with human subjective evaluation, and it measures 3D face similarity more objectively than the other indices.
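    The final step of the record above, turning per-point curvature metrics into a similarity score via correlation coefficients, can be sketched in a few lines; the use of a single descriptor (the shape index) and the synthetic data are simplifying assumptions of this sketch.

```python
# Minimal sketch (simplified to one curvature descriptor): score two faces by the
# Pearson correlation of a descriptor sampled at corresponding geodesic-network points.
import numpy as np

def facial_similarity(shape_index_a: np.ndarray, shape_index_b: np.ndarray) -> float:
    """Pearson correlation between descriptor vectors of two corresponding networks."""
    return float(np.corrcoef(shape_index_a, shape_index_b)[0, 1])

# Hypothetical usage: 500 network points per face.
rng = np.random.default_rng(1)
face_a = rng.random(500)
face_b = face_a + 0.05 * rng.standard_normal(500)   # a geometrically similar face
print(facial_similarity(face_a, face_b))
```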

  18. Classification of Unknown Thermocouple Types Using Similarity Factor Measurement

    Directory of Open Access Journals (Sweden)

    Seshu K. DAMARLA

    2011-01-01

    Full Text Available In contrast to classification using the PCA method, a new methodology is proposed for type identification of unknown thermocouples. The new methodology is based on calculating the degree of similarity between two multivariate datasets using two types of similarity factors. One similarity factor is based on principal component analysis and the angles between the principal component subspaces, while the other is based on the Mahalanobis distance between the datasets. Datasets containing thermo-emfs against given temperature ranges, formed by experimentation for each type of thermocouple (e.g. J, K, S, T, R, E, B and N type), are considered as reference datasets. Datasets corresponding to the unknown type are then captured. The similarity factors between the unknown-type dataset and each known-type dataset are compared, and the unknown type is assigned to the class of the known type with the maximum similarity factor.
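    The two similarity factors named above, one from principal-component subspace angles and one from the Mahalanobis distance, can be sketched as follows; the number of retained components and the mapping of the Mahalanobis distance onto a (0, 1] score are assumptions of this sketch, not necessarily the definitions used in the paper.

```python
# Minimal sketch (standard formulations, not the paper's code).
import numpy as np

def pca_similarity_factor(X, Y, k=2):
    """Krzanowski-style factor: trace(Lx' Ly Ly' Lx) / k for top-k principal directions."""
    def top_components(Z, k):
        Zc = Z - Z.mean(axis=0)
        _, _, vt = np.linalg.svd(Zc, full_matrices=False)
        return vt[:k].T                       # columns = principal directions
    Lx, Ly = top_components(X, k), top_components(Y, k)
    return float(np.trace(Lx.T @ Ly @ Ly.T @ Lx)) / k

def mahalanobis_similarity(X, Y):
    """Map the Mahalanobis distance between dataset means into a (0, 1] score."""
    pooled = np.cov(np.vstack([X - X.mean(axis=0), Y - Y.mean(axis=0)]).T)
    diff = X.mean(axis=0) - Y.mean(axis=0)
    d2 = diff @ np.linalg.inv(pooled) @ diff
    return float(np.exp(-0.5 * d2))           # illustrative monotone mapping
```

A dataset from an unknown thermocouple would then be compared against every reference dataset, and the type with the largest factor would be assigned, as the abstract describes.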

  19. Genetic similarity among commercial oil palm materials based on microsatellite markers

    Directory of Open Access Journals (Sweden)

    Diana Arias

    2012-08-01

    Full Text Available Microsatellite markers are used to determine genetic similarities among individuals and might be used in various applications in breeding programs. For example, knowing the genetic similarity relationships of commercial planting materials helps to better understand their responses to environmental, agronomic and plant health factors. This study assessed 17 microsatellite markers in 9 crosses (D x P) of Elaeis guineensis Jacq. from various commercial companies in Malaysia, France, Costa Rica and Colombia, in order to find possible genetic differences and/or similarities. Seventy-seven alleles were obtained, with an average of 4.5 alleles per primer and a range of 2-8 amplified alleles. The results show a significant reduction of alleles compared to the number of alleles reported for wild oil palm populations. The obtained dendrogram shows the formation of two groups based on their genetic similarity. Group A, with ~76% similarity, contains the commercial material of 3 codes of Deli x La Mé crosses produced in France and Colombia, and group B, with ~66% genetic similarity, includes all the materials produced by commercial companies in Malaysia, France, Costa Rica and Colombia.

  20. Gender similarities and differences.

    Science.gov (United States)

    Hyde, Janet Shibley

    2014-01-01

    Whether men and women are fundamentally different or similar has been debated for more than a century. This review summarizes major theories designed to explain gender differences: evolutionary theories, cognitive social learning theory, sociocultural theory, and expectancy-value theory. The gender similarities hypothesis raises the possibility of theorizing gender similarities. Statistical methods for the analysis of gender differences and similarities are reviewed, including effect sizes, meta-analysis, taxometric analysis, and equivalence testing. Then, relying mainly on evidence from meta-analyses, gender differences are reviewed in cognitive performance (e.g., math performance), personality and social behaviors (e.g., temperament, emotions, aggression, and leadership), and psychological well-being. The evidence on gender differences in variance is summarized. The final sections explore applications of intersectionality and directions for future research.

  1. Novel Agent Based-approach for Industrial Diagnosis: A Combined use Between Case-based Reasoning and Similarity Measure

    Directory of Open Access Journals (Sweden)

    Fatima Zohra Benkaddour

    2016-12-01

    Full Text Available In the spunlace nonwovens industry, the maintenance task is very complex; it requires collaboration between experts and operators. In this paper, we propose a new approach integrating agent-based modelling with case-based reasoning that utilizes similarity measures and a preferences module. The main purpose of our study is to compare and evaluate the most suitable similarity measure for our case. Furthermore, operators, who are usually geographically dispersed, have to collaborate and negotiate to achieve mutual agreements, especially when their proposals (diagnoses) lead to a conflicting situation. The experimentation shows that the suggested agent-based approach is very interesting and efficient for operators and experts who collaborate in the INOTIS enterprise.

  2. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh

    2010-05-28

    Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (histone genes for short), an important class of genes that participate in various critical cellular processes, ii) use the model so developed to identify regions across the human genome that have similar structure as promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes, and iii) identify in this way genes that have a high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on a leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with similar structure as histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with their structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that

  3. Similarity Analysis for Reactor Flow Distribution Test and Its Validation

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Soon Joon; Ha, Jung Hui [Heungdeok IT Valley, Yongin (Korea, Republic of); Lee, Taehoo; Han, Ji Woong [KAERI, Daejeon (Korea, Republic of)

    2015-05-15

    facility. It was clearly found in Hong et al. In this study the feasibility of the similarity analysis of Hong et al. was examined. The similarity analysis was applied to SFR which has been designed in KAERI (Korea Atomic Energy Research Institute) in order to design the reactor flow distribution test. The length scale was assumed to be 1/5, and the velocity scale 1/2, which bounds the square root of the length scale (1/√5). The CFX calculations for both prototype and model were carried out and the flow field was compared.

  4. Measure of Node Similarity in Multilayer Networks.

    Directory of Open Access Journals (Sweden)

    Anders Mollgaard

    Full Text Available The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographics variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest links. We finally analyze the overlap between layers in the network for different levels of acquaintanceships.

  5. A path-based measurement for human miRNA functional similarities using miRNA-disease associations

    Science.gov (United States)

    Ding, Pingjian; Luo, Jiawei; Xiao, Qiu; Chen, Xiangtao

    2016-09-01

    Compared with sequence and expression similarity, miRNA functional similarity is particularly important for biological research and for many applications such as miRNA clustering, miRNA function prediction, miRNA synergism identification and disease miRNA prioritization. However, existing methods usually calculate miRNA functional similarity from predicted miRNA targets, which suffer from high false positive and false negative rates. Meanwhile, it is difficult to achieve high reliability of miRNA functional similarity with miRNA-disease associations. Therefore, improved measurements of miRNA functional similarity are increasingly needed. In this study, we develop a novel path-based calculation method of miRNA functional similarity based on miRNA-disease associations, called MFSP. Compared with other methods, our method obtains higher average functional similarity for selected intra-family and intra-cluster groups, and lower average functional similarity for inter-family and inter-cluster miRNA pairs. In addition, smaller p-values are achieved when applying the Wilcoxon rank-sum test and the Kruskal-Wallis test to different miRNA groups. The relationship between miRNA functional similarity and other information sources is exhibited. Furthermore, the miRNA functional network constructed based on MFSP is a scale-free and small-world network. Moreover, the higher AUC for miRNA-disease prediction indicates the ability of MFSP to uncover miRNA functional similarity.

  6. Virtual drug screen schema based on multiview similarity integration and ranking aggregation.

    Science.gov (United States)

    Kang, Hong; Sheng, Zhen; Zhu, Ruixin; Huang, Qi; Liu, Qi; Cao, Zhiwei

    2012-03-26

    The current drug virtual screen (VS) methods mainly fall into two categories, i.e., ligand/target structure-based virtual screening, and methods utilizing protein-ligand interaction fingerprint information derived from the large number of complex structures. Since the former focuses on one-sided information while the latter focuses on the whole complex structure, they are complementary and can be boosted by each other. However, a common problem is how to present a comprehensive understanding and evaluation of the various virtual screen results derived from different VS methods. Furthermore, there is still an urgent need for an efficient approach to fully integrate various VS methods from a comprehensive multiview perspective. In this study, our virtual screen schema based on multiview similarity integration and ranking aggregation was tested comprehensively with statistical evaluations, providing several novel and useful clues on how to perform drug VS from multiple heterogeneous data sources. (1) 18 complex structures of HIV-1 protease with ligands from the PDB were curated as a test data set, and the VS was performed with five different drug representations. Ritonavir (1HXW) was selected as the query, and the weighted ranks of the query results were aggregated from multiple views through four similarity integration approaches. (2) Further, one of the ranking aggregation methods was used to integrate the similarity ranks calculated by gene ontology (GO) fingerprint and structural fingerprint on the data set from the connectivity map, and two typical HDAC and HSP90 inhibitors were chosen as the queries. The results show that rank aggregation can enhance the result of similarity searching in VS when two or more descriptions are involved and provide a more reasonable similarity rank result. Our study shows that integrated VS based on multiple data fusion can achieve a remarkably better performance compared to that from individual ones and
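    Ranking aggregation, the second ingredient of the schema above, can be illustrated with a simple weighted Borda count over per-view rankings; this is one generic aggregation scheme, and the compound IDs and view weights are invented for illustration, so it should not be read as the paper's exact procedure.

```python
# Minimal sketch (one simple rank-aggregation scheme, not necessarily the paper's):
# combine several per-view similarity rankings into a consensus ranking.
def aggregate_ranks(rankings, weights=None):
    """rankings: list of lists, each an ordered list of compound IDs (best first)."""
    weights = weights or [1.0] * len(rankings)
    scores = {}
    for ranking, w in zip(rankings, weights):
        n = len(ranking)
        for pos, compound in enumerate(ranking):
            scores[compound] = scores.get(compound, 0.0) + w * (n - pos)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical usage with three views (e.g., 2D fingerprint, shape, GO fingerprint).
view_a = ["cpd3", "cpd1", "cpd2", "cpd4"]
view_b = ["cpd1", "cpd3", "cpd4", "cpd2"]
view_c = ["cpd3", "cpd4", "cpd1", "cpd2"]
print(aggregate_ranks([view_a, view_b, view_c], weights=[1.0, 1.0, 0.5]))
```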

  7. Protein-protein interaction network-based detection of functionally similar proteins within species.

    Science.gov (United States)

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, and convergent evolution and others which cannot be obtained by detection of functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a method of modified shortest path to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot detected by sequence alignment were identified. By analyzing the results, we found that, sometimes, it is difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.

  8. Neutrosophic Similarity Score Based Weighted Histogram for Robust Mean-Shift Tracking

    Directory of Open Access Journals (Sweden)

    Keli Hu

    2017-10-01

    Full Text Available Visual object tracking is a critical task in computer vision, and challenges always exist when an object needs to be tracked. For instance, background clutter is one of the most challenging problems. The mean-shift tracker is quite popular because of its efficiency and performance in a range of conditions. However, background clutter also disturbs its performance. In this article, we propose a novel weighted histogram based on a neutrosophic similarity score to help the mean-shift tracker discriminate the target from the background. The neutrosophic set (NS) is a new branch of philosophy for dealing with incomplete, indeterminate, and inconsistent information. In this paper, we utilize the single-valued neutrosophic set (SVNS), a subclass of NS, to improve the mean-shift tracker. First, two criteria are considered, the object feature similarity and the background feature similarity, and each bin of the weight histogram is represented in the SVNS domain via three membership functions T (Truth), I (Indeterminacy), and F (Falsity). Second, the neutrosophic similarity score function is introduced to fuse those two criteria and to build the final weight histogram. Finally, a novel neutrosophic weighted mean-shift tracker is proposed. The proposed tracker is compared with several mean-shift based trackers on a dataset of 61 public sequences. The results reveal that our method outperforms other trackers, especially when confronting background clutter.

  9. Study on force mechanism for therapeutic effect of pushing manipulation with one-finger meditation base on similarity analysis of force and waveform.

    Science.gov (United States)

    Fang, Lei; Fang, Min; Guo, Min-Min

    2016-12-27

    To reveal the force mechanism underlying the therapeutic effect of pushing manipulation with one-finger meditation. A total of 15 participants were recruited in this study and assigned to an expert group, a skilled group and a novice group, with 5 participants in each group. Mechanical signals were collected from a biomechanical testing platform, and these data were further examined via similarity analysis and cluster analysis. Comparing the force waveforms of the manipulation revealed that the manipulation forces were similar between the expert group and the skilled group (P>0.05). The mean value of vertical force was 9.8 N, with a 95% CI ranging from 6.37 to 14.70 N, but there were significant differences compared with the novice group (P<0.05). Pushing manipulation with one-finger meditation is a kind of light stimulation manipulation on the acupoint, and force characteristics of double waveforms continuously alternated during manual operation.

  10. Generating "fragment-based virtual library" using pocket similarity search of ligand-receptor complexes.

    Science.gov (United States)

    Khashan, Raed S

    2015-01-01

    As the number of available ligand-receptor complexes is increasing, researchers are becoming more dedicated to mine these complexes to aid in the drug design and development process. We present free software which is developed as a tool for performing similarity search across ligand-receptor complexes for identifying binding pockets which are similar to that of a target receptor. The search is based on 3D-geometric and chemical similarity of the atoms forming the binding pocket. For each match identified, the ligand's fragment(s) corresponding to that binding pocket are extracted, thus forming a virtual library of fragments (FragVLib) that is useful for structure-based drug design. The program provides a very useful tool to explore available databases.

  11. Collaborative Filtering Recommendation Based on Trust Model with Fused Similar Factor

    Directory of Open Access Journals (Sweden)

    Ye Li

    2017-01-01

    Full Text Available Recommender systems are beneficial to e-commerce sites, providing customers with product information and recommendations, and they are currently widely used in many fields. In an era of information explosion, the key challenge of a recommender system is to obtain valid information from the tremendous amount of available information and produce high quality recommendations. However, when facing a large amount of information, the traditional collaborative filtering algorithm usually suffers from a high degree of sparseness, which ultimately leads to low-accuracy recommendations. To tackle this issue, we propose a novel algorithm named Collaborative Filtering Recommendation Based on Trust Model with Fused Similar Factor, which is based on the trust model and is combined with user similarity. The novel algorithm takes into account the degree of interest overlap between two users and results in a superior performance to the recommendation based on the Trust Model in terms of Precision, Recall, Diversity and Coverage. Additionally, the proposed model can effectively improve the efficiency of the collaborative filtering algorithm and achieve high performance.

  12. Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments

    Science.gov (United States)

    Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Storrs, Katherine R.; Mur, Marieke

    2017-01-01

    Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs’ performance compares to that of non-computational “conceptual” models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., “eye”) and category labels (e.g., “animal”) for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. Models derived from visual features
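    The model-fitting step described above follows the usual representational similarity analysis recipe: build a dissimilarity matrix from each candidate representation and compare it with the matrix implied by the human judgments. The sketch below shows that generic recipe with a rank correlation; the distance metric, the use of Spearman correlation, and the random data are assumptions of this sketch, not the authors' exact pipeline.

```python
# Minimal sketch (generic RSA scoring, not the authors' code).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    """Condensed representational dissimilarity matrix (1 - Pearson r per image pair)."""
    return pdist(features, metric="correlation")

def rsa_score(model_features, behavioral_rdm):
    """Spearman correlation between the model RDM and the behavioral RDM."""
    return spearmanr(rdm(model_features), behavioral_rdm).correlation

# Hypothetical usage: 92 images, 4096-d layer activations, and a judgment-derived RDM.
layer_activations = np.random.rand(92, 4096)
judgment_rdm = np.random.rand(92 * 91 // 2)
print(rsa_score(layer_activations, judgment_rdm))
```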

  13. Average is Boring: How Similarity Kills a Meme's Success

    Science.gov (United States)

    Coscia, Michele

    2014-09-01

    Every day we are exposed to different ideas, or memes, competing with each other for our attention. Previous research explained popularity and persistence heterogeneity of memes by assuming them in competition for limited attention resources, distributed in a heterogeneous social network. Little has been said about what characteristics make a specific meme more likely to be successful. We propose a similarity-based explanation: memes with higher similarity to other memes have a significant disadvantage in their potential popularity. We employ a meme similarity measure based on semantic text analysis and computer vision to prove that a meme is more likely to be successful and to thrive if its characteristics make it unique. Our results show that indeed successful memes are located in the periphery of the meme similarity space and that our similarity measure is a promising predictor of a meme success.

  14. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods to this issue have their limitations, and computational approaches have attracted more and more attention from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, and then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features. The experimental results show that the best performance is obtained by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    Science.gov (United States)

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  16. Activation analysis. A basis for chemical similarity and classification

    Energy Technology Data Exchange (ETDEWEB)

    Beeck, J OP de [Ghent Rijksuniversiteit (Belgium). Instituut voor Kernwetenschappen

    1977-01-01

    It is shown that activation analysis is especially suited to serve as a basis for determining the chemical similarity between samples defined by their trace-element concentration patterns. The general problem of classification and identification is discussed. The nature of possible classification structures and their appropriate clustering strategies is considered. A practical computer method is suggested and its application as well as the graphical representation of classification results are given. The possibility for classification using information theory is mentioned. Classification of chemical elements is discussed and practically realized after Hadamard transformation of the concentration variation patterns in a series of samples.

  17. Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.

    Science.gov (United States)

    Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan

    2015-11-01

    Wireless sensor networks are engaged in various data gathering applications. The major bottleneck in wireless data gathering systems is the finite energy of the sensor nodes. By conserving the on-board energy, the life span of a wireless sensor network can be extended considerably. Since data communication is the dominant energy-consuming activity of a wireless sensor network, data reduction is an effective way to conserve nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce data communications. Forming clusters of nodes with similar data is an effective way to exploit spatial correlation among neighboring sensors, while sending only a subset of the data and estimating the rest from this subset is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data-similar iso-clusters with minimal communication overhead. The intra-cluster communication is reduced using an adaptive normalized least-mean-squares-based dual prediction framework. The cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both the intra-cluster and the inter-cluster communications, while maintaining the accuracy of the collected data. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
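    The dual-prediction idea mentioned above can be sketched as a pair of synchronized adaptive predictors: the node transmits only when its own prediction misses the reading by more than a tolerance, and otherwise both ends substitute the prediction. The filter order, step size, and tolerance below are assumptions of this sketch, not the paper's parameters.

```python
# Minimal sketch (assumed NLMS dual-prediction scheme, not the paper's exact design).
import numpy as np

class NLMSPredictor:
    def __init__(self, order=4, mu=0.5, eps=1e-6):
        self.w = np.zeros(order)          # adaptive filter taps
        self.history = np.zeros(order)    # most recent samples
        self.mu, self.eps = mu, eps

    def predict(self):
        return float(self.w @ self.history)

    def update(self, value):
        err = value - self.predict()
        norm = self.history @ self.history + self.eps
        self.w += self.mu * err * self.history / norm    # normalized LMS update
        self.history = np.roll(self.history, 1)
        self.history[0] = value

def run_sensor(readings, tol=0.1):
    """Return indices the node transmits; node and cluster head stay synchronized."""
    node, head = NLMSPredictor(), NLMSPredictor()
    sent = []
    for i, x in enumerate(readings):
        pred = node.predict()
        if abs(pred - x) > tol:
            sent.append(i)                 # transmit the true reading
            node.update(x); head.update(x)
        else:                              # both sides fall back on the prediction
            node.update(pred); head.update(pred)
    return sent
```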

  18. Neutrosophic Refined Similarity Measure Based on Cosine Function

    Directory of Open Access Journals (Sweden)

    Said Broumi

    2014-12-01

    Full Text Available In this paper, the cosine similarity measure of neutrosophic refined (multi-) sets is proposed and its properties are studied. This cosine similarity measure of neutrosophic refined sets is an extension of the improved cosine similarity measure of single-valued neutrosophic sets. Finally, using this cosine similarity measure of neutrosophic refined sets, an application to medical diagnosis is presented.
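    One commonly cited form of the single-valued neutrosophic cosine similarity, averaged over elements, is sketched below; it is given as an assumed formulation for illustration, not as the paper's exact definition, and the diagnosis data are invented.

```python
# Minimal sketch (assumed formulation): per-element cosine similarity of
# (T, I, F) triples, averaged over the elements of the two sets.
import numpy as np

def svns_cosine(a, b):
    """a, b: arrays of shape (n, 3) holding (truth, indeterminacy, falsity) per element."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))

# Hypothetical usage: neutrosophic evaluations of three symptoms for a patient
# and a disease profile; scores closer to 1 indicate greater similarity.
patient = np.array([[0.8, 0.2, 0.1], [0.4, 0.3, 0.5], [0.6, 0.1, 0.2]])
disease = np.array([[0.7, 0.2, 0.2], [0.3, 0.4, 0.6], [0.9, 0.1, 0.1]])
print(svns_cosine(patient, disease))
```

For refined (multi-) sets, the same score would additionally be averaged over each element's sub-components.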

  19. Molecular similarity measures.

    Science.gov (United States)

    Maggiora, Gerald M; Shanmugasundaram, Veerabahu

    2011-01-01

    Molecular similarity is a pervasive concept in chemistry. It is essential to many aspects of chemical reasoning and analysis and is perhaps the fundamental assumption underlying medicinal chemistry. Dissimilarity, the complement of similarity, also plays a major role in a growing number of applications of molecular diversity in combinatorial chemistry, high-throughput screening, and related fields. How molecular information is represented, called the representation problem, is important to the type of molecular similarity analysis (MSA) that can be carried out in any given situation. In this work, four types of mathematical structure are used to represent molecular information: sets, graphs, vectors, and functions. Molecular similarity is a pairwise relationship that induces structure into sets of molecules, giving rise to the concept of chemical space. Although all three concepts - molecular similarity, molecular representation, and chemical space - are treated in this chapter, the emphasis is on molecular similarity measures. Similarity measures, also called similarity coefficients or indices, are functions that map pairs of compatible molecular representations that are of the same mathematical form into real numbers usually, but not always, lying on the unit interval. This chapter presents a somewhat pedagogical discussion of many types of molecular similarity measures, their strengths and limitations, and their relationship to one another. An expanded account of the material on chemical spaces presented in the first edition of this book is also provided. It includes a discussion of the topography of activity landscapes and the role that activity cliffs in these landscapes play in structure-activity studies.

  20. Automated dating of the world’s language families based on lexical similarity

    OpenAIRE

    Holman, E.; Brown, C.; Wichmann, S.; Müller, A.; Velupillai, V.; Hammarström, H.; Sauppe, S.; Jung, H.; Bakker, D.; Brown, P.; Belyaev, O.; Urban, M.; Mailhammer, R.; List, J.; Egorov, D.

    2011-01-01

    This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus is more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Leve...

  1. Density-based similarity measures for content based search

    Energy Technology Data Exchange (ETDEWEB)

    Hush, Don R [Los Alamos National Laboratory; Porter, Reid B [Los Alamos National Laboratory; Ruggiero, Christy E [Los Alamos National Laboratory

    2009-01-01

    We consider the query by multiple example problem, where the goal is to identify database samples whose content is similar to a collection of query samples. To assess the similarity we use a relative content density, which quantifies the relative concentration of the query distribution to the database distribution. If the database distribution is a mixture of the query distribution and a background distribution, then it can be shown that database samples whose relative content density is greater than a particular threshold ρ are more likely to have been generated by the query distribution than the background distribution. We describe an algorithm for predicting samples with relative content density greater than ρ that is computationally efficient and possesses strong performance guarantees. We also show empirical results for applications in computer network monitoring and image segmentation.
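    The prediction task above, flagging database samples whose relative content density exceeds ρ, can be imitated with a plain kernel-density ratio; the bandwidth defaults, the guard against division by zero, and the 2-D toy data are assumptions of this sketch, which stands in for (and is much simpler than) the authors' algorithm.

```python
# Minimal sketch (a KDE density-ratio stand-in, not the authors' algorithm).
import numpy as np
from scipy.stats import gaussian_kde

def flag_by_relative_density(query, database, rho=1.0):
    """query, database: arrays of shape (n_features, n_samples), as gaussian_kde expects."""
    q_density = gaussian_kde(query)
    d_density = gaussian_kde(database)
    ratio = q_density(database) / np.maximum(d_density(database), 1e-12)
    return np.nonzero(ratio > rho)[0]   # indices likely drawn from the query distribution

# Hypothetical usage in 2-D: a tight query cluster near (2, 2) hidden in a diffuse database.
rng = np.random.default_rng(0)
query = rng.normal(loc=2.0, scale=0.3, size=(2, 50))
database = np.hstack([rng.normal(size=(2, 200)),
                      rng.normal(loc=2.0, scale=0.3, size=(2, 20))])
print(flag_by_relative_density(query, database, rho=1.0))
```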

  2. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. In order to reduce the interference of the uncertainty of natural language, a similarity measurement between normal cloud models is applied to text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently accomplish the conversion between qualitative concepts and quantitative data. Through the conversion from a text set to a text information table based on the VSM model, the qualitative text concept extracted from each category is jumped up into a whole category concept. According to the cloud similarity between a test text and each category concept, the test text is assigned to the most similar category. A comparison among different text classifiers over different feature selection sets fully proves that CCJU-TC not only has a strong ability to adapt to different text features, but also achieves better classification performance than traditional classifiers.

  3. Similarities and differences in coatings for magnesium-based stents and orthopaedic implants

    Directory of Open Access Journals (Sweden)

    Jun Ma

    2014-07-01

    Full Text Available Magnesium (Mg-based biodegradable materials are promising candidates for the new generation of implantable medical devices, particularly cardiovascular stents and orthopaedic implants. Mg-based cardiovascular stents represent the most innovative stent technology to date. However, these products still do not fully meet clinical requirements with regards to fast degradation rates, late restenosis, and thrombosis. Thus various surface coatings have been introduced to protect Mg-based stents from rapid corrosion and to improve biocompatibility. Similarly, different coatings have been used for orthopaedic implants, e.g., plates and pins for bone fracture fixation or as an interference screw for tendon-bone or ligament-bone insertion, to improve biocompatibility and corrosion resistance. Metal coatings, nanoporous inorganic coatings and permanent polymers have been proved to enhance corrosion resistance; however, inflammation and foreign body reactions have also been reported. By contrast, biodegradable polymers are more biocompatible in general and are favoured over permanent materials. Drugs are also loaded with biodegradable polymers to improve their performance. The key similarities and differences in coatings for Mg-based stents and orthopaedic implants are summarized.

  4. Face Recognition Performance Improvement using a Similarity Score of Feature Vectors based on Probabilistic Histograms

    Directory of Open Access Journals (Sweden)

    SRIKOTE, G.

    2016-08-01

    Full Text Available This paper proposes an improved face recognition algorithm that identifies mismatched face pairs in cases of incorrect decisions. The primary feature of this method is to deploy a similarity score with respect to Gaussian components between two previously unseen faces. Unlike conventional classical vector distance measurements, our algorithms also consider the plot of the summation of the similarity index versus the face feature vector distance. A mixture of Gaussian models of labeled faces is also widely applicable to different biometric system parameters. Comparative evaluations show that the efficiency of the proposed algorithm is superior to that of the conventional algorithm by an average accuracy of up to 1.15% and 16.87% when compared with 3x3 Multi-Region Histogram (MRH) direct-bag-of-features and Principal Component Analysis (PCA)-based face recognition systems, respectively. The experimental results show that similarity score consideration is more discriminative for face recognition than feature distance. Experimental results on the Labeled Faces in the Wild (LFW) data set demonstrate that our algorithms are suitable for real probe-to-gallery identification applications in face recognition systems. Moreover, this proposed method can also be applied to other recognition systems and therefore additionally improves recognition scores.

  5. Analysis of HIV-1 intersubtype recombination breakpoints suggests region with high pairing probability may be a more fundamental factor than sequence similarity affecting HIV-1 recombination.

    Science.gov (United States)

    Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun

    2016-09-21

    With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. Our

  6. Similarity Evaluation of Different Origins and Species of Dendrobiums by GC-MS and FTIR Analysis of Polysaccharides

    Directory of Open Access Journals (Sweden)

    Nai-Dong Chen

    2015-01-01

    Full Text Available A GC-MS method combined with FTIR techniques for the analysis of polysaccharides was applied to evaluate the similarity between wild (W) and tissue-cultured (TC) Dendrobium huoshanense (DHS), Dendrobium officinale (DO), and Dendrobium moniliforme (DM), as well as 3 wild Dendrobium spp.: Dendrobium henanense (DHN), Dendrobium loddigesii (DL), and Dendrobium crepidatum (DC). Eight monosaccharides, involving xylose, arabinose, rhamnose, glucose, mannose, fructose, galactose, and galacturonic acid, were identified in the polysaccharide from each Dendrobium sample, while the contents of the monosugars varied remarkably across origins and species. Further similarity evaluation based on the GC-MS data showed that the rcor values between different origins of DHS, DO, and DM were 0.831, 0.865, and 0.884, respectively, while the rcor values ranged from 0.475 to 0.837 across species. FTIR profiles of the polysaccharides revealed that the similarity coefficients between W and TC DHS, DO, and DM were 88.7%, 86.8%, and 88.5%, respectively, in contrast to similarity coefficients varying from 57.4% to 82.6% across species. These results suggested that the similarity of polysaccharide structures between different origins of the investigated Dendrobiums might be higher than what we had supposed.

  7. Oscillatory flow at the end of parallel-plate stacks: phenomenological and similarity analysis

    International Nuclear Information System (INIS)

    Mao Xiaoan; Jaworski, Artur J

    2010-01-01

    This paper addresses the physics of the oscillatory flow in the vicinity of a series of parallel plates forming geometrically identical channels. This type of flow is particularly relevant to thermoacoustic engines and refrigerators, where a reciprocating flow is responsible for the desirable energy transfer, but it is also of interest to general fluid mechanics of oscillatory flows past bluff bodies. In this paper, the physics of an acoustically induced flow past a series of plates in an isothermal condition is studied in detail using the data provided by PIV imaging. Particular attention is given to the analysis of the wake flow during the ejection part of the flow cycle, where either closed recirculating vortices or alternating vortex shedding can be observed. This is followed by a similarity analysis of the governing Navier-Stokes equations in order to derive the similarity criteria governing the wake flow behaviour. To this end, similarity numbers including two types of Reynolds number, the Keulegan-Carpenter number and a non-dimensional stack configuration parameter, d/h, are considered and their influence on the phenomena are discussed.
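
    As a point of reference for the similarity criteria named above, the short Python sketch below evaluates two of them in their standard textbook forms; the reference length (plate spacing versus plate thickness) and the example values are illustrative assumptions, not the paper's exact choice of scales.

        # Dimensionless similarity numbers for oscillatory flow past a plate stack.
        # Minimal sketch: the reference length and velocity scale are assumptions.

        def reynolds_number(u_max, length, nu):
            """Re = U_max * L / nu, with U_max the velocity amplitude."""
            return u_max * length / nu

        def keulegan_carpenter_number(u_max, freq, length):
            """One common definition: KC = U_max / (f * L), i.e. the ratio of the
            fluid displacement amplitude to the body length scale."""
            return u_max / (freq * length)

        # Example: air (nu ~ 1.5e-5 m^2/s) oscillating at 50 Hz past 5 mm channels
        u_amp, f, d, nu_air = 2.0, 50.0, 5e-3, 1.5e-5
        print(reynolds_number(u_amp, d, nu_air))        # ~667
        print(keulegan_carpenter_number(u_amp, f, d))   # 8.0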

  8. Examining Similarity Structure: Multidimensional Scaling and Related Approaches in Neuroimaging

    Directory of Open Access Journals (Sweden)

    Svetlana V. Shinkareva

    2013-01-01

    Full Text Available This paper covers similarity analyses, a subset of multivariate pattern analysis techniques that are based on similarity spaces defined by multivariate patterns. These techniques offer several advantages and complement other methods for brain data analyses, as they allow for comparison of representational structure across individuals, brain regions, and data acquisition methods. Particular attention is paid to multidimensional scaling and related approaches that yield spatial representations or provide methods for characterizing individual differences. We highlight unique contributions of these methods by reviewing recent applications to functional magnetic resonance imaging data and emphasize areas of caution in applying and interpreting similarity analysis methods.
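
    A minimal Python sketch of the core step such analyses share: turning a condition-by-condition dissimilarity matrix (here 1 minus the Pearson correlation between toy activation patterns) into a low-dimensional similarity-space layout with metric multidimensional scaling. The data and the use of scikit-learn's MDS are illustrative choices, not the paper's pipeline.

        import numpy as np
        from sklearn.manifold import MDS

        rng = np.random.default_rng(0)
        patterns = rng.normal(size=(6, 200))      # 6 conditions x 200 voxels (toy data)
        rdm = 1.0 - np.corrcoef(patterns)         # representational dissimilarity matrix

        mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
        coords = mds.fit_transform(rdm)           # 2-D coordinates approximating the RDM
        print(coords)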

  9. Similarity of trajectories taking into account geographic context

    Directory of Open Access Journals (Sweden)

    Maike Buchin

    2014-12-01

    Full Text Available The movements of animals, people, and vehicles are embedded in a geographic context. This context influences the movement and may cause the formation of certain behavioral responses. Thus, it is essential to include context parameters in the study of movement and the development of movement pattern analytics. Advances in sensor technologies and positioning devices provide valuable data not only of moving agents but also of the circumstances embedding the movement in space and time. Developing knowledge discovery methods to investigate the relation between movement and its surrounding context is a major challenge in movement analysis today. In this paper we show how to integrate geographic context into the similarity analysis of movement data. For this, we discuss models for geographic context of movement data. Based on this we develop simple but efficient context-aware similarity measures for movement trajectories, which combine a spatial and a contextual distance. These are based on well-known similarity measures for trajectories, such as the Hausdorff, Fréchet, or equal time distance. We validate our approach by applying these measures to movement data of hurricanes and albatross.
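
    A hedged Python sketch of the idea: a spatial Hausdorff distance between two trajectories is combined with a simple contextual distance (here the fraction of time steps whose context labels differ), weighted by a parameter alpha. The labels, the equal-length assumption and the weighting are illustrative choices, not the paper's exact measures.

        import numpy as np

        def directed_hausdorff(a, b):
            # max over points of a of the distance to the nearest point of b
            d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
            return d.min(axis=1).max()

        def hausdorff(a, b):
            return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

        def context_distance(ctx_a, ctx_b):
            # fraction of (time-aligned) samples whose context labels differ
            return float(np.mean(np.array(ctx_a) != np.array(ctx_b)))

        def context_aware_distance(a, b, ctx_a, ctx_b, alpha=0.5):
            return alpha * hausdorff(a, b) + (1.0 - alpha) * context_distance(ctx_a, ctx_b)

        t1 = np.array([[0, 0], [1, 1], [2, 2]], dtype=float)
        t2 = np.array([[0, 1], [1, 2], [2, 3]], dtype=float)
        print(context_aware_distance(t1, t2, ["sea", "sea", "land"], ["sea", "land", "land"]))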

  10. Similarity Measure of Graphs

    Directory of Open Access Journals (Sweden)

    Amine Labriji

    2017-07-01

    Full Text Available Measuring the similarity of graphs is an important research topic in the semantic Web, artificial intelligence, shape recognition, and information retrieval. One of the fundamental problems for graph databases is finding the graphs most similar to a query graph. Existing approaches to this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not recognized as contributing to the similarity of two graphs in cases such as two graphs without common concepts, measures based on the union of the two graphs, measures based on the notion of maximum common sub-graph (SCM), or the graph edit distance. This leads to inadequate results in the context of information retrieval. To overcome this problem, we propose a new measure of similarity between graphs based on the similarity measure of Wu and Palmer. We show that this new measure satisfies the properties of a similarity measure, and we apply it to examples. The results show that our measure runs faster than existing approaches. In addition, a comparison of the relevance of the similarity values obtained shows that this new graph measure is advantageous and offers a contribution to solving the problem mentioned above.
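
    For readers unfamiliar with the underlying measure, the Python sketch below implements the classic Wu-Palmer similarity on a toy concept taxonomy: sim(c1, c2) = 2·depth(lcs) / (depth(c1) + depth(c2)), where lcs is the least common subsumer. The taxonomy is invented for illustration; the paper's extension of the measure to whole graphs is not reproduced here.

        # Toy is-a taxonomy: child -> parent (the root has parent None)
        parent = {"animal": None, "mammal": "animal", "bird": "animal",
                  "dog": "mammal", "cat": "mammal", "sparrow": "bird"}

        def ancestors(c):
            chain = []
            while c is not None:
                chain.append(c)          # from c (deepest) up to the root
                c = parent[c]
            return chain

        def depth(c):
            return len(ancestors(c))     # the root has depth 1

        def wu_palmer(c1, c2):
            anc2 = set(ancestors(c2))
            lcs = next(a for a in ancestors(c1) if a in anc2)  # deepest common ancestor
            return 2.0 * depth(lcs) / (depth(c1) + depth(c2))

        print(wu_palmer("dog", "cat"))       # 0.667 (shared parent "mammal")
        print(wu_palmer("dog", "sparrow"))   # 0.333 (only the root is shared)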

  11. EKF-GPR-Based Fingerprint Renovation for Subset-Based Indoor Localization with Adjusted Cosine Similarity.

    Science.gov (United States)

    Yang, Junhua; Li, Yong; Cheng, Wei; Liu, Yang; Liu, Chenxi

    2018-01-22

    Received Signal Strength Indicator (RSSI) localization using fingerprints has become a prevailing approach for indoor localization. However, fingerprint collection is repetitive and time-consuming, and once the original fingerprint radio map is built, it is laborious to keep it up to date. In this paper, we describe a Fingerprint Renovation System (FRS) based on crowdsourcing, which avoids the use of manual labour to obtain the up-to-date fingerprint status. The Extended Kalman Filter (EKF) and Gaussian Process Regression (GPR) in FRS are combined to calculate the current state from the original fingerprint radio map. In this system, a subset-acquisition method also helps to reduce the heavy computation caused by the large number of reference points (RPs). Meanwhile, adjusted cosine similarity (ACS) is employed in the online phase to address the outliers produced by plain cosine similarity. Both experiments and analytical simulation in a real Wireless Fidelity (Wi-Fi) environment show that our system yields significant performance improvements: FRS improves the accuracy by 19.6% in the surveyed area compared to the un-renovated radio map, and the proposed subset algorithm requires less computation.
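
    A minimal Python sketch of an adjusted cosine similarity between an online RSSI reading and a stored fingerprint: each vector is centred by its own mean before the cosine is taken, which damps constant offsets that plain cosine similarity tends to ignore. The exact centring used by the authors may differ, so treat this as an illustration of the idea rather than their formula.

        import numpy as np

        def adjusted_cosine(x, y):
            xc, yc = x - x.mean(), y - y.mean()          # remove per-vector offsets
            return float(xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))

        online      = np.array([-48.0, -60.0, -75.0, -55.0])   # RSSI from 4 access points
        fingerprint = np.array([-50.0, -62.0, -70.0, -58.0])   # stored reference point
        print(adjusted_cosine(online, fingerprint))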

  12. EKF–GPR-Based Fingerprint Renovation for Subset-Based Indoor Localization with Adjusted Cosine Similarity

    Science.gov (United States)

    Yang, Junhua; Li, Yong; Cheng, Wei; Liu, Yang; Liu, Chenxi

    2018-01-01

    Received Signal Strength Indicator (RSSI) localization using fingerprints has become a prevailing approach for indoor localization. However, fingerprint collection is repetitive and time-consuming, and once the original fingerprint radio map is built, it is laborious to keep it up to date. In this paper, we describe a Fingerprint Renovation System (FRS) based on crowdsourcing, which avoids the use of manual labour to obtain the up-to-date fingerprint status. The Extended Kalman Filter (EKF) and Gaussian Process Regression (GPR) in FRS are combined to calculate the current state from the original fingerprint radio map. In this system, a subset-acquisition method also helps to reduce the heavy computation caused by the large number of reference points (RPs). Meanwhile, adjusted cosine similarity (ACS) is employed in the online phase to address the outliers produced by plain cosine similarity. Both experiments and analytical simulation in a real Wireless Fidelity (Wi-Fi) environment show that our system yields significant performance improvements: FRS improves the accuracy by 19.6% in the surveyed area compared to the un-renovated radio map, and the proposed subset algorithm requires less computation. PMID:29361805

  13. Distance and Density Similarity Based Enhanced k-NN Classifier for Improving Fault Diagnosis Performance of Bearings

    Directory of Open Access Journals (Sweden)

    Sharif Uddin

    2016-01-01

    Full Text Available An enhanced k-nearest neighbor (k-NN) classification algorithm is presented, which uses a density-based similarity measure in addition to a distance-based similarity measure to improve the diagnostic performance in bearing fault diagnosis. Due to its use of a distance-based similarity measure alone, the classification accuracy of traditional k-NN deteriorates in the case of overlapping samples and outliers, and is highly susceptible to the neighborhood size, k. This study addresses these limitations by proposing the use of both distance- and density-based measures of similarity between training and test samples. The proposed k-NN classifier is used to enhance the diagnostic performance of a bearing fault diagnosis scheme, which classifies different fault conditions based upon hybrid feature vectors extracted from acoustic emission (AE) signals. Experimental results demonstrate that the proposed scheme, which uses the enhanced k-NN classifier, yields better diagnostic performance and is more robust to variations in the neighborhood size, k.
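
    An illustrative Python sketch (not the authors' exact formulation) of a k-NN vote in which each neighbour is weighted by both its distance to the test sample and the local density around it, so isolated outliers in the training set carry less weight.

        import numpy as np
        from collections import defaultdict

        def knn_predict(X_train, y_train, x, k=5):
            d = np.linalg.norm(X_train - x, axis=1)
            neighbours = np.argsort(d)[:k]
            votes = defaultdict(float)
            for i in neighbours:
                # density of a neighbour = inverse mean distance to its own k nearest peers
                di = np.linalg.norm(X_train - X_train[i], axis=1)
                density = 1.0 / (np.sort(di)[1:k + 1].mean() + 1e-12)
                votes[y_train[i]] += density / (d[i] + 1e-12)   # distance- and density-weighted vote
            return max(votes, key=votes.get)

        X = np.array([[0, 0], [0.1, 0.2], [0.2, 0.1], [3, 3], [3.1, 2.9], [10, 10]], dtype=float)
        y = np.array([0, 0, 0, 1, 1, 1])
        print(knn_predict(X, y, np.array([0.15, 0.15]), k=3))   # -> 0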

  14. SDL: Saliency-Based Dictionary Learning Framework for Image Similarity.

    Science.gov (United States)

    Sarkar, Rituparna; Acton, Scott T

    2018-02-01

    In image classification, obtaining adequate data to learn a robust classifier has often proven to be difficult in several scenarios. Classification of histological tissue images for health care analysis is a notable application in this context due to the necessity of surgery, biopsy or autopsy. To adequately exploit limited training data in classification, we propose a saliency guided dictionary learning method and subsequently an image similarity technique for histo-pathological image classification. Salient object detection from images aids in the identification of discriminative image features. We leverage the saliency values for the local image regions to learn a dictionary and respective sparse codes for an image, such that the more salient features are reconstructed with smaller error. The dictionary learned from an image gives a compact representation of the image itself and is capable of representing images with similar content, with comparable sparse codes. We employ this idea to design a similarity measure between a pair of images, where local image features of one image, are encoded with the dictionary learned from the other and vice versa. To effectively utilize the learned dictionary, we take into account the contribution of each dictionary atom in the sparse codes to generate a global image representation for image comparison. The efficacy of the proposed method was evaluated using three tissue data sets that consist of mammalian kidney, lung and spleen tissue, breast cancer, and colon cancer tissue images. From the experiments, we observe that our methods outperform the state of the art with an increase of 14.2% in the average classification accuracy over all data sets.

  15. Assessing semantic similarity of texts - Methods and algorithms

    Science.gov (United States)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which provide for obtaining structured representative model of the documents in a corpus by means of extracting and selecting the features, characterizing their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at syntactical and semantic level. An important text-mining method and similarity measure is latent semantic analysis (LSA). It provides for reducing the dimensionality of the document vector space and better capturing the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
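
    A compact Python sketch of the LSA pipeline the abstract describes: a TF-IDF term-document model is reduced with truncated SVD to a small number of latent concepts, and documents are then compared by cosine similarity in that reduced space. The corpus and the number of components are toy choices.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD
        from sklearn.metrics.pairwise import cosine_similarity

        docs = [
            "the cat sat on the mat",
            "a cat lay on a rug",
            "stock markets fell sharply today",
            "shares dropped as markets tumbled",
        ]

        tfidf = TfidfVectorizer().fit_transform(docs)        # documents x terms
        lsa = TruncatedSVD(n_components=2, random_state=0)   # 2 latent concepts
        doc_vectors = lsa.fit_transform(tfidf)

        print(cosine_similarity(doc_vectors))                # pairwise document similarity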

  16. Similarity analysis and scaling criteria for LWRs under single-phase and two-phase natural circulation

    International Nuclear Information System (INIS)

    Ishii, M.; Kataoka, I.

    1983-03-01

    Scaling criteria for a natural circulation loop under single phase and two-phase flow conditions have been derived. For a single phase case the continuity, integral momentum, and energy equations in one-dimensional area average forms have been used. From this, the geometrical similarity groups, friction number, Richardson number, characteristic time constant ratio, Biot number, and heat source number are obtained. The Biot number involves the heat transfer coefficient which may cause some difficulties in simulating the turbulent flow regime. For a two-phase flow case, the similarity groups obtained from a perturbation analysis based on the one-dimensional drift-flux model have been used. The physical significance of the phase change number, subcooling number, drift-flux number, friction number are discussed and conditions imposed by these groups are evaluated. In the two-phase flow case, the critical heat flux is one of the most important transients which should be simulated in a scale model. The above results are applied to the LOFT facility in case of a natural circulation simulation. Some preliminary conclusions on the feasibility of the facility have been obtained
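
    For orientation, the short Python sketch below evaluates two of the single-phase similarity groups named above in their standard textbook forms; the reference scales and numerical values are illustrative assumptions, not those of the Ishii-Kataoka derivation or the LOFT analysis.

        def richardson(g, beta, delta_t, length, velocity):
            """Ri = g * beta * dT * L / u^2 (buoyancy relative to inertia)."""
            return g * beta * delta_t * length / velocity ** 2

        def biot(h, char_length, k_solid):
            """Bi = h * L / k_s (surface heat transfer relative to conduction in the solid)."""
            return h * char_length / k_solid

        # toy water loop: 30 K rise over a 2 m section at 0.3 m/s; 5 mm steel structure wall
        print(richardson(9.81, 3.0e-4, 30.0, 2.0, 0.3))   # ~1.96
        print(biot(1000.0, 0.005, 16.0))                  # ~0.31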

  17. Similarity analysis and scaling criteria for LWRs under single-phase and two-phase natural circulation

    Energy Technology Data Exchange (ETDEWEB)

    Ishii, M.; Kataoka, I.

    1983-03-01

    Scaling criteria for a natural circulation loop under single phase and two-phase flow conditions have been derived. For a single phase case the continuity, integral momentum, and energy equations in one-dimensional area average forms have been used. From this, the geometrical similarity groups, friction number, Richardson number, characteristic time constant ratio, Biot number, and heat source number are obtained. The Biot number involves the heat transfer coefficient which may cause some difficulties in simulating the turbulent flow regime. For a two-phase flow case, the similarity groups obtained from a perturbation analysis based on the one-dimensional drift-flux model have been used. The physical significance of the phase change number, subcooling number, drift-flux number, friction number are discussed and conditions imposed by these groups are evaluated. In the two-phase flow case, the critical heat flux is one of the most important transients which should be simulated in a scale model. The above results are applied to the LOFT facility in case of a natural circulation simulation. Some preliminary conclusions on the feasibility of the facility have been obtained.

  18. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    National Research Council Canada - National Science Library

    Ortega-Binderberger, Michael

    2002-01-01

    ... as a critical area of research. This thesis explores how to enhance database systems with content based search over arbitrary abstract data types in a similarity based framework with query refinement...

  19. On the Power and Limits of Sequence Similarity Based Clustering of Proteins Into Families

    DEFF Research Database (Denmark)

    Wiwie, Christian; Röttger, Richard

    2017-01-01

    Over the last decades, we have observed an ongoing tremendous growth of available sequencing data fueled by the advancements in wet-lab technology. The sequencing information is only the beginning of the actual understanding of how organisms survive and prosper. It is, for instance, equally important to also unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We used the data to investigate the behavior of the tools' parameters underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters...

  20. Expanding the boundaries of local similarity analysis.

    Science.gov (United States)

    Durno, W Evan; Hanson, Niels W; Konwar, Kishori M; Hallam, Steven J

    2013-01-01

    Pairwise comparison of time series data for both local and time-lagged relationships is a computationally challenging problem relevant to many fields of inquiry. The Local Similarity Analysis (LSA) statistic identifies the existence of local and lagged relationships, but determining significance through a p-value has been algorithmically cumbersome due to an intensive permutation test, shuffling rows and columns and repeatedly calculating the statistic. Furthermore, this p-value is calculated with the assumption of normality -- a statistical luxury dissociated from most real world datasets. To improve the performance of LSA on big datasets, an asymptotic upper bound on the p-value calculation was derived without the assumption of normality. This change in the bound calculation markedly improved computational speed from O(pm²n) to O(m²n), where p is the number of permutations in a permutation test, m is the number of time series, and n is the length of each time series. The bounding process is implemented as a computationally efficient software package, FASTLSA, written in C and optimized for threading on multi-core computers, improving its practical computation time. We computationally compare our approach to previous implementations of LSA, demonstrate broad applicability by analyzing time series data from public health, microbial ecology, and social media, and visualize resulting networks using the Cytoscape software. The FASTLSA software package expands the boundaries of LSA allowing analysis on datasets with millions of co-varying time series. Mapping metadata onto force-directed graphs derived from FASTLSA allows investigators to view correlated cliques and explore previously unrecognized network relationships. The software is freely available for download at: http://www.cmde.science.ubc.ca/hallam/fastLSA/.
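
    A hedged Python sketch of the core Local Similarity statistic for two normalised series: a dynamic programme accumulates positively (and negatively) associated stretches, resetting at zero, over alignments within a maximum lag. The normalisation details and, in particular, the paper's permutation-free p-value bound are omitted here.

        import numpy as np

        def local_similarity(x, y, max_lag=3):
            x = (x - x.mean()) / x.std()
            y = (y - y.mean()) / y.std()
            n = len(x)
            pos = np.zeros((n + 1, n + 1))   # positively associated local score
            neg = np.zeros((n + 1, n + 1))   # negatively associated local score
            best = 0.0
            for i in range(1, n + 1):
                for j in range(max(1, i - max_lag), min(n, i + max_lag) + 1):
                    pos[i, j] = max(0.0, pos[i - 1, j - 1] + x[i - 1] * y[j - 1])
                    neg[i, j] = max(0.0, neg[i - 1, j - 1] - x[i - 1] * y[j - 1])
                    best = max(best, pos[i, j], neg[i, j])
            return best / n

        rng = np.random.default_rng(1)
        a = rng.normal(size=50)
        b = np.roll(a, 2) + 0.3 * rng.normal(size=50)   # lagged, noisy copy of a
        print(local_similarity(a, b, max_lag=3))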

  1. Quality assessment of protein model-structures based on structural and functional similarities.

    Science.gov (United States)

    Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata

    2012-09-21

    Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. A gap between number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model it from a sequence or partial structural information, e.g. contact maps. Consequently, biologists have the ability to generate automatically a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA--Gene Ontology-Based Assessment is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and

  2. Identification among morphologically similar Argyreia (Convolvulaceae) based on leaf anatomy and phenetic analyses.

    Science.gov (United States)

    Traiperm, Paweena; Chow, Janene; Nopun, Possathorn; Staples, G; Swangpol, Sasivimon C

    2017-12-01

    The genus Argyreia Lour. is one of the species-rich Asian genera in the family Convolvulaceae. Several species complexes were recognized in which taxon delimitation was imprecise, especially when examining herbarium materials without fully developed open flowers. The main goal of this study is to investigate and describe leaf anatomy for some morphologically similar Argyreia using epidermal peeling, leaf and petiole transverse sections, and scanning electron microscopy. Phenetic analyses including cluster analysis and principal component analysis were used to investigate the similarity of these morpho-types. Anatomical differences observed between the morpho-types include epidermal cell walls and the trichome types on the leaf epidermis. Additional differences in the leaf and petiole transverse sections include the epidermal cell shape of the adaxial leaf blade, the leaf margins, and the petiole transverse sectional outline. The phenogram from cluster analysis using the UPGMA method represented four groups with an R value of 0.87. Moreover, the important quantitative and qualitative leaf anatomical traits of the four groups were confirmed by the principal component analysis of the first two components. The results from phenetic analyses confirmed the anatomical differentiation between the morpho-types. Leaf anatomical features regarded as particularly informative for morpho-type differentiation can be used to supplement macro morphological identification.

  3. Dynamic Time Warping Distance Method for Similarity Test of Multipoint Ground Motion Field

    Directory of Open Access Journals (Sweden)

    Yingmin Li

    2010-01-01

    Full Text Available The reasonableness of artificial multi-point ground motions and the identification of abnormal records in seismic array observations are two important issues in the application and analysis of multi-point ground motion fields. Based on the dynamic time warping (DTW) distance method, this paper discusses the application of similarity measurement to the similarity analysis of simulated multi-point ground motions and actual seismic array records. The results show that the DTW distance method not only quantitatively reflects the similarity of a simulated ground motion field, but also offers advantages in clustering analysis and singularity recognition for actual multi-point ground motion fields.
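
    A minimal Python implementation of the DTW distance between two 1-D records, of the kind used to compare ground-motion time histories; no warping-window constraint or path normalisation is applied in this sketch.

        import numpy as np

        def dtw_distance(x, y):
            n, m = len(x), len(y)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = abs(x[i - 1] - y[j - 1])
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        t = np.linspace(0, 2 * np.pi, 100)
        print(dtw_distance(np.sin(t), np.sin(t + 0.3)))   # small: similar, time-shifted records
        print(dtw_distance(np.sin(t), np.cos(3 * t)))     # larger: dissimilar records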

  4. Evaluation of GO-based functional similarity measures using S. cerevisiae protein interaction and expression profile data

    Directory of Open Access Journals (Sweden)

    Du LinFang

    2008-11-01

    Full Text Available Abstract Background Researchers interested in analysing the expression patterns of functionally related genes usually hope to improve the accuracy of their results beyond the boundaries of currently available experimental data. Gene ontology (GO) data provides a novel way to measure the functional relationship between gene products. Many approaches have been reported for calculating the similarities between two GO terms, known as semantic similarities. However, biologists are more interested in the relationship between gene products than in the scores linking the GO terms. To highlight the relationships among genes, recent studies have focused on functional similarities. Results In this study, we evaluated five functional similarity methods using both protein-protein interaction (PPI) and expression data of S. cerevisiae. The receiver operating characteristics (ROC) and correlation coefficient analysis of these methods showed that the maximum method outperformed the other methods. Statistical comparison of multiple- and single-term annotated proteins in biological process ontology indicated that genes with multiple GO terms may be more reliable for separating true positives from noise. Conclusion This study demonstrated the reliability of current approaches that elevate the similarity of GO terms to the similarity of proteins. Suggestions for further improvements in functional similarity analysis are also provided.

  5. Static Analysis for Event-Based XML Processing

    DEFF Research Database (Denmark)

    Møller, Anders

    2008-01-01

    Event-based processing of XML data - as exemplified by the popular SAX framework - is a powerful alternative to using W3C's DOM or similar tree-based APIs. The event-based approach processes documents in a streaming fashion with minimal memory consumption. This paper discusses challenges for creating program analyses for SAX applications. In particular, we consider the problem of statically guaranteeing that a given SAX program always produces only well-formed and valid XML output. We propose an analysis technique based on existing analyses of Servlets, string operations, and XML graphs.
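
    For context, a tiny event-based transformation written against Python's xml.sax API, of the kind whose output well-formedness such a static analysis aims to guarantee; the analysis itself is not sketched here, and the handler is an invented example.

        import xml.sax

        class UppercaseTitles(xml.sax.ContentHandler):
            """Streams the document through, upper-casing text inside <title> elements."""
            def __init__(self):
                super().__init__()
                self.out = []
                self.in_title = False

            def startElement(self, name, attrs):
                self.in_title = (name == "title")
                self.out.append(f"<{name}>")

            def characters(self, content):
                self.out.append(content.upper() if self.in_title else content)

            def endElement(self, name):
                self.in_title = False
                self.out.append(f"</{name}>")

        handler = UppercaseTitles()
        xml.sax.parseString(b"<doc><title>sax demo</title><p>body</p></doc>", handler)
        print("".join(handler.out))   # tags are emitted in matched pairs, so the output stays well-formed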

  6. Dynamics based alignment of proteins: an alternative approach to quantify dynamic similarity

    Directory of Open Access Journals (Sweden)

    Lyngsø Rune

    2010-04-01

    Full Text Available Abstract Background The dynamic motions of many proteins are central to their function. It therefore follows that the dynamic requirements of a protein are evolutionarily constrained. In order to assess and quantify this, one needs to compare the dynamic motions of different proteins. Comparing the dynamics of distinct proteins may also provide insight into how protein motions are modified by variations in sequence and, consequently, by structure. The optimal way of comparing complex molecular motions is, however, far from trivial. The majority of comparative molecular dynamics studies performed to date relied upon prior sequence or structural alignment to define which residues were equivalent in 3-dimensional space. Results Here we discuss an alternative methodology for comparative molecular dynamics that does not require any prior alignment information. We show it is possible to align proteins based solely on their dynamics and that we can use these dynamics-based alignments to quantify the dynamic similarity of proteins. Our method was tested on 10 representative members of the PDZ domain family. Conclusions As a result of creating pair-wise dynamics-based alignments of PDZ domains, we have found evolutionarily conserved patterns in their backbone dynamics. The dynamic similarity of PDZ domains is highly correlated with their structural similarity as calculated with Dali. However, significant differences in their dynamics can be detected, indicating that sequence has a more refined role to play in protein dynamics than just dictating the overall fold. We suggest that the method should be generally applicable.

  7. Pulse shape analysis based on similarity and neural network with digital-analog fusion method

    International Nuclear Information System (INIS)

    Mardiyanto, M.P.; Uritani, A.; Sakai, H.; Kawarabayashi, J.; Iguchi, T.

    2000-01-01

    Through measurements of ²²Na γ-rays, it has been demonstrated that the correction process works well when the similarity values are fused with the pulse heights measured by the analog system; at least four improvements in the energy spectrum characteristics were recognized, i.e., an increased peak-to-valley ratio, a larger photopeak area, sharper photopeaks without discarding any events, and the appearance of the 1,275 keV γ-ray photopeak. The use of a slow digitizer was the main limitation of this method, but it can easily be overcome with a faster digitizer. The fusion method was also applied to beta-gamma mixed spectra separation: mixed beta-gamma spectra of a ¹³⁷Cs-⁹⁰Sr mixed source could be separated well. We compared the energy spectrum of ¹³⁷Cs obtained from an independent measurement with the result of the separation; the FWHM values agreed quite well, although there was a slight difference between the two spectra in the peak-to-valley ratio. This separation method is simple and useful, so it can be applied to many other similar applications. (S.Y.)

  8. 2-gram-based Phonetic Feature Generation for Convolutional Neural Network in Assessment of Trademark Similarity

    OpenAIRE

    Ko, Kyung Pyo; Lee, Kwang Hee; Jang, Mi So; Park, Gun Hong

    2018-01-01

    A trademark is a mark used to identify various commodities. If the same or a similar trademark is registered for the same or a similar commodity, purchasers of the goods may be confused. Therefore, during trademark registration examination, the examiner judges whether a trademark is the same as, or similar to, other applied-for or registered trademarks. Confusion between trademarks arises from the visual, phonetic or conceptual similarity of the marks. In this paper, we focus specifically o...

  9. IntelliGO: a new vector-based semantic similarity measure including annotation origin

    Directory of Open Access Journals (Sweden)

    Devignes Marie-Dominique

    2010-12-01

    previously published measures. Conclusions The IntelliGO similarity measure provides a customizable and comprehensive method for quantifying gene similarity based on GO annotations. It also displays a robust set-discriminating power which suggests it will be useful for functional clustering. Availability An on-line version of the IntelliGO similarity measure is available at: http://bioinfo.loria.fr/Members/benabdsi/intelligo_project/

  10. Measuring structural similarity in large online networks.

    Science.gov (United States)

    Shi, Yongren; Macy, Michael

    2016-09-01

    Structural similarity based on bipartite graphs can be used to detect meaningful communities, but the networks have been tiny compared to massive online networks. Scalability is important in applications involving tens of millions of individuals with highly skewed degree distributions. Simulation analysis holding underlying similarity constant shows that two widely used measures - Jaccard index and cosine similarity - are biased by the distribution of out-degree in web-scale networks. However, an alternative measure, the Standardized Co-incident Ratio (SCR), is unbiased. We apply SCR to members of Congress, musical artists, and professional sports teams to show how massive co-following on Twitter can be used to map meaningful affiliations among cultural entities, even in the absence of direct connections to one another. Our results show how structural similarity can be used to map cultural alignments and demonstrate the potential usefulness of social media data in the study of culture, politics, and organizations across the social and behavioral sciences. Copyright © 2016 Elsevier Inc. All rights reserved.
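
    For reference, the two degree-biased measures discussed above, computed in Python on co-follower sets; the Standardized Co-incident Ratio itself is not reproduced, since its standardisation procedure is specific to the paper.

        def jaccard(a, b):
            a, b = set(a), set(b)
            return len(a & b) / len(a | b)

        def cosine(a, b):
            a, b = set(a), set(b)
            return len(a & b) / (len(a) ** 0.5 * len(b) ** 0.5)

        followers_x = {1, 2, 3, 4, 5}          # accounts following entity X
        followers_y = {3, 4, 5, 6, 7, 8}       # accounts following entity Y
        print(jaccard(followers_x, followers_y))   # 0.375
        print(cosine(followers_x, followers_y))    # ~0.548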

  11. An inter-comparison of similarity-based methods for organisation and classification of groundwater hydrographs

    Science.gov (United States)

    Haaf, Ezra; Barthel, Roland

    2018-04-01

    Classification and similarity based methods, which have recently received major attention in the field of surface water hydrology, namely through the PUB (prediction in ungauged basins) initiative, have not yet been applied to groundwater systems. However, it can be hypothesised, that the principle of "similar systems responding similarly to similar forcing" applies in subsurface hydrology as well. One fundamental prerequisite to test this hypothesis and eventually to apply the principle to make "predictions for ungauged groundwater systems" is efficient methods to quantify the similarity of groundwater system responses, i.e. groundwater hydrographs. In this study, a large, spatially extensive, as well as geologically and geomorphologically diverse dataset from Southern Germany and Western Austria was used, to test and compare a set of 32 grouping methods, which have previously only been used individually in local-scale studies. The resulting groupings are compared to a heuristic visual classification, which serves as a baseline. A performance ranking of these classification methods is carried out and differences in homogeneity of grouping results were shown, whereby selected groups were related to hydrogeological indices and geological descriptors. This exploratory empirical study shows that the choice of grouping method has a large impact on the object distribution within groups, as well as on the homogeneity of patterns captured in groups. The study provides a comprehensive overview of a large number of grouping methods, which can guide researchers when attempting similarity-based groundwater hydrograph classification.

  12. Music Retrieval based on Melodic Similarity

    NARCIS (Netherlands)

    Typke, R.

    2007-01-01

    This thesis introduces a method for measuring melodic similarity for notated music such as MIDI files. This music search algorithm views music as sets of notes that are represented as weighted points in the two-dimensional space of time and pitch. Two point sets can be compared by calculating how

  13. Advanced Models and Algorithms for Self-Similar IP Network Traffic Simulation and Performance Analysis

    Science.gov (United States)

    Radev, Dimitar; Lokshina, Izabella

    2010-11-01

    The paper examines self-similar (or fractal) properties of real communication network traffic data over a wide range of time scales. These self-similar properties are very different from the properties of traditional models based on Poisson and Markov-modulated Poisson processes. Advanced fractal models of sequential generators and fixed-length sequence generators, together with efficient algorithms used to simulate the self-similar behavior of IP network traffic data, are developed and applied. Numerical examples are provided, and simulation results are obtained and analyzed.

  14. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

    Science.gov (United States)

    Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

    2017-01-21

    RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step in understanding and interpreting their functional relationship. The majority of functional RNAs show conserved secondary structures rather than sequence conservation, so algorithms relying on sequence-based features alone usually have limited prediction performance. Hence, integrating RNA structure features is critical for RNA analysis. Existing algorithms mainly fall into two categories, alignment-based and alignment-free, with the alignment-free algorithms usually having lower time complexity. An alignment-free RNA comparison algorithm is proposed, in which a novel numerical representation, RNA-TVcurve (triple vector curve representation), of an RNA sequence and its corresponding secondary structure features is provided. A multi-scale similarity score of two given RNAs is then designed based on wavelet decomposition of their numerical representations. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was built on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of the numerical representation of RNA secondary structure; 2) detection of single-point mutations based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The web server requires RNA primary sequences as input, while the corresponding secondary structures are optional. Given primary sequences alone, the web server can compute the secondary structures using the free-energy minimization algorithm of the RNAfold tool from the Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNA structure comparison. The comparison results with two popular RNA

  15. Analysis of pulse thermography using similarities between wave and diffusion propagation

    Science.gov (United States)

    Gershenson, M.

    2017-05-01

    Pulse thermography, or thermal wave imaging, is commonly used as a nondestructive evaluation (NDE) method. While the technique has evolved over time, its theoretical interpretation is lagging and still relies on curve fitting on a log-log scale. A new approach based directly on the governing differential equation is introduced. Using relationships between wave propagation and the diffusive propagation of thermal excitation, it is shown that one can transform solutions of one type of propagation into the other. The method is based on the similarities between the Laplace transforms of the diffusion equation and the wave equation: for diffusive propagation the Laplace variable s appears to the first power, while for wave propagation similar equations occur with s². For discrete time, the transformation between the domains is performed by multiplying the temperature data vector by a matrix; the transform is local. The performance of the technique is tested on synthetic data, and the application of common back-projection techniques used in the processing of wave data is also demonstrated. The combined use of the transform and back projection makes it possible to improve both the depth and lateral resolution of transient thermography.
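
    For reference, a minimal LaTeX statement of the correspondence the abstract appeals to, written for the one-dimensional homogeneous case with zero initial conditions; α (thermal diffusivity) and c (wave speed) are generic symbols, not values from the paper.

        \[
          \mathcal{L}\bigl[\partial_t T = \alpha\,\partial_{xx} T\bigr]:\;
            s\,\widetilde{T}(x,s) = \alpha\,\partial_{xx}\widetilde{T}(x,s),
          \qquad
          \mathcal{L}\bigl[\partial_{tt} u = c^{2}\,\partial_{xx} u\bigr]:\;
            s^{2}\,\widetilde{u}(x,s) = c^{2}\,\partial_{xx}\widetilde{u}(x,s).
        \]

    Both reduce to \(\partial_{xx} f = q^{2} f\), with \(q^{2} = s/\alpha\) for diffusion and \(q^{2} = s^{2}/c^{2}\) for the wave, so a solution of one problem evaluated at a rescaled Laplace variable (roughly s → √s, up to constant factors) solves the other, which is consistent with the discrete-time matrix multiplication described above.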

  16. Density-based retrieval from high-similarity image databases

    DEFF Research Database (Denmark)

    Hansen, Michael Edberg; Carstensen, Jens Michael

    2004-01-01

    Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce a me...

  17. Lung Cancer Signature Biomarkers: tissue specific semantic similarity based clustering of Digital Differential Display (DDD data

    Directory of Open Access Journals (Sweden)

    Srivastava Mousami

    2012-11-01

    Full Text Available Abstract Background The tissue-specific Unigene Sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used the ‘Gene Ontology semantic similarity score’ to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. This semantic similarity score matrix is represented, via hierarchical clustering, as a dendrogram. Cluster stability in the dendrogram was assessed by multiple bootstrapping, which also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Results Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panels 1–4). Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia regulated biomarkers (panel 3) and lung extracellular matrix biomarkers (panel 4). Conclusions Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1

  18. A Two-Stage Composition Method for Danger-Aware Services Based on Context Similarity

    Science.gov (United States)

    Wang, Junbo; Cheng, Zixue; Jing, Lei; Ota, Kaoru; Kansen, Mizuo

    Context-aware systems detect a user's physical and social contexts based on sensor networks and provide services that adapt to the user accordingly. Representing, detecting, and managing contexts are important issues in such systems. Composition of contexts is a useful method for these tasks, since it can detect a context by automatically composing small pieces of information to discover a service. Danger-aware services are a kind of context-aware service that requires describing the relations between a user and his/her surrounding objects, and between users. However, when existing composition methods are applied to danger-aware services, they show the following shortcomings: (1) they provide no explicit method for representing the composition of multiple users' contexts, and (2) they offer no flexible reasoning mechanism based on context similarity, so they can only provide services that exactly follow predefined context reasoning rules. Therefore, in this paper, we propose a two-stage composition method based on context similarity to solve these problems. The first stage composes the relevant information to represent the context of a single user. The second stage composes multiple users' contexts to provide services by considering the relations between users. Finally, the danger degree of the detected context is computed using the context similarity between the detected context and a predefined context. Context is dynamically represented based on two-stage composition rules and a Situation-theory-based Ontology, which combines the advantages of Ontology and Situation theory. We implement the system in an indoor ubiquitous environment and evaluate it through two experiments with human subjects. The experimental results show that the method is effective and that the accuracy of danger detection is acceptable for a danger-aware system.

  19. Object recognition based on Google's reverse image search and image similarity

    Science.gov (United States)

    Horváth, András.

    2015-12-01

    Image classification is one of the most challenging tasks in computer vision, and a general multiclass classifier could solve many different tasks in image processing. Classification is usually done by shallow learning over predefined objects, which is a difficult task and very different from human vision; human vision is based on continuous learning of object classes, and it takes years to learn a large taxonomy of objects that are neither disjoint nor independent. In this paper I present a system based on Google's image similarity algorithm and Google's image database, which can classify a large set of different objects in a human-like manner, identifying related classes and taxonomies.

  20. Behavioral similarity measurement based on image processing for robots that use imitative learning

    Science.gov (United States)

    Sterpin B., Dante G.; Martinez S., Fernando; Jacinto G., Edwar

    2017-02-01

    In the field of artificial societies, particularly those based on memetics, imitative behavior is essential for the development of cultural evolution. Applying this concept to robotics, a robot can acquire behavioral patterns from another robot through imitative learning. Assuming that the learning process requires an instructor and at least one apprentice, obtaining a quantitative measurement of their behavioral similarity would be potentially useful, especially in artificial social systems focused on cultural evolution. In this paper, the motor behavior of both kinds of robots, for two simple tasks, is represented by 2D binary images, which are processed in order to measure their behavioral similarity. The results shown here were obtained by comparing several similarity measurement methods for binary images.

  1. Similarity-based multi-model ensemble approach for 1-15-day advance prediction of monsoon rainfall over India

    Science.gov (United States)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati

    2018-04-01

    The southwest (SW) monsoon season (June, July, August and September) is the major period of rainfall over the Indian region. The present study focuses on the development of a new multi-model ensemble approach based on a similarity criterion (SMME) for the prediction of SW monsoon rainfall in the extended range. The approach rests on the assumption that training on similar conditions may provide better forecasts than the sequential training used in conventional MME approaches. In this approach, the training dataset is selected by matching the present-day conditions to an archived dataset; the days with the most similar conditions are identified and used for training the model, and the coefficients thus generated are used for the rainfall prediction. The precipitation forecasts from four general circulation models (GCMs), viz. the European Centre for Medium-Range Weather Forecasts (ECMWF), the United Kingdom Meteorological Office (UKMO), the National Centre for Environmental Prediction (NCEP) and the China Meteorological Administration (CMA), have been used for developing the SMME forecasts. Forecasts for 1-5, 6-10 and 11-15 days were generated using the newly developed approach for each pentad of June-September during the years 2008-2013, and the skill of the model was analysed using verification scores, viz. the equitable threat score (ETS), mean absolute error (MAE), Pearson's correlation coefficient and the Nash-Sutcliffe model efficiency index. Statistical analysis of the SMME forecasts shows superior forecast skill compared to the conventional MME and the individual models for all three forecast ranges, viz. 1-5, 6-10 and 11-15 days.
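
    Two of the verification scores named above, in their standard definitions, as a short Python sketch; the observed and forecast values are toy numbers, and the ETS (which requires a rain/no-rain contingency table) is omitted.

        import numpy as np

        def mean_absolute_error(obs, fcst):
            obs, fcst = np.asarray(obs, float), np.asarray(fcst, float)
            return float(np.mean(np.abs(obs - fcst)))

        def nash_sutcliffe(obs, fcst):
            obs, fcst = np.asarray(obs, float), np.asarray(fcst, float)
            return 1.0 - np.sum((obs - fcst) ** 2) / np.sum((obs - obs.mean()) ** 2)

        obs  = [12.0, 3.5, 0.0, 25.1, 8.2]    # observed pentad rainfall (toy values)
        fcst = [10.5, 5.0, 1.2, 22.0, 9.0]    # ensemble forecast (toy values)
        print(mean_absolute_error(obs, fcst), nash_sutcliffe(obs, fcst))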

  2. Reducing 4D CT artifacts using optimized sorting based on anatomic similarity.

    Science.gov (United States)

    Johnston, Eric; Diehn, Maximilian; Murphy, James D; Loo, Billy W; Maxim, Peter G

    2011-05-01

    Four-dimensional (4D) computed tomography (CT) has been widely used as a tool to characterize respiratory motion in radiotherapy. The two most commonly used 4D CT algorithms sort images by the associated respiratory phase or displacement into a predefined number of bins, and are prone to image artifacts at transitions between bed positions. The purpose of this work is to demonstrate a method of reducing motion artifacts in 4D CT by incorporating anatomic similarity into phase or displacement based sorting protocols. Ten patient datasets were retrospectively sorted using both the displacement and phase based sorting algorithms. Conventional sorting methods allow selection of only the nearest-neighbor image in time or displacement within each bin. In our method, for each bed position either the displacement or the phase defines the center of a bin range about which several candidate images are selected. The two dimensional correlation coefficients between slices bordering the interface between adjacent couch positions are then calculated for all candidate pairings. Two slices have a high correlation if they are anatomically similar. Candidates from each bin are then selected to maximize the slice correlation over the entire data set using the Dijkstra's shortest path algorithm. To assess the reduction of artifacts, two thoracic radiation oncologists independently compared the resorted 4D datasets pairwise with conventionally sorted datasets, blinded to the sorting method, to choose which had the least motion artifacts. Agreement between reviewers was evaluated using the weighted kappa score. Anatomically based image selection resulted in 4D CT datasets with significantly reduced motion artifacts with both displacement (P = 0.0063) and phase sorting (P = 0.00022). There was good agreement between the two reviewers, with complete agreement 34 times and complete disagreement 6 times. Optimized sorting using anatomic similarity significantly reduces 4D CT motion

  3. Towards Personalized Medicine: Leveraging Patient Similarity and Drug Similarity Analytics

    Science.gov (United States)

    Zhang, Ping; Wang, Fei; Hu, Jianying; Sorrentino, Robert

    2014-01-01

    The rapid adoption of electronic health records (EHR) provides a comprehensive source for exploratory and predictive analytics to support clinical decision-making. In this paper, we investigate how to utilize EHR to tailor treatments to individual patients based on their likelihood of responding to a therapy. We construct a heterogeneous graph which includes two domains (patients and drugs) and encodes three relationships (patient similarity, drug similarity, and patient-drug prior associations). We describe a novel approach for performing a label propagation procedure to spread the label information representing the effectiveness of different drugs for different patients over this heterogeneous graph. The proposed method has been applied to a real-world EHR dataset to help identify personalized treatments for hypercholesterolemia. The experimental results demonstrate the effectiveness of the approach and suggest that the combination of appropriate patient similarity and drug similarity analytics could lead to actionable insights for personalized medicine. Particularly, by leveraging drug similarity in combination with patient similarity, our method could perform well even on new or rarely used drugs for which there are few records of known past performance. PMID:25717413

  4. Similar digit-based working memory in deaf signers and hearing non-signers despite digit span differences

    Directory of Open Access Journals (Sweden)

    Josefine eAndin

    2013-12-01

    Full Text Available Similar working memory (WM) for lexical items has been demonstrated for signers and non-signers while short-term memory (STM) is regularly poorer in deaf than hearing individuals. In the present study, we investigated digit-based WM and STM in Swedish and British deaf signers and hearing non-signers. To maintain good experimental control we used printed stimuli throughout and held response mode constant across groups. We showed that deaf signers have similar digit-based WM performance, despite shorter digit spans, compared to well-matched hearing non-signers. We found no difference between signers and non-signers on STM span for letters chosen to minimize phonological similarity or in the effects of recall direction. This set of findings indicates that similar WM for signers and non-signers can be generalized from lexical items to digits and suggests that poorer STM in deaf signers compared to hearing non-signers may be due to differences in phonological similarity across the language modalities of sign and speech.

  5. Contributions to the stability analysis of self-similar supersonic heat waves related to inertial confinement fusion

    International Nuclear Information System (INIS)

    Dastugue, Laurent

    2013-01-01

    Exact self-similar solutions of gas dynamics equations with nonlinear heat conduction for semi-infinite slabs of perfect gases are used for studying the stability of flows in inertial confinement fusion. Both the similarity solutions and their linear perturbations are computed with a multi domain Chebyshev pseudo-spectral method, allowing us to account for, without any other approximation, compressibility and unsteadiness. Following previous results (Clarisse et al., 2008; Lombard, 2008) representative of the early ablation of a target by a nonuniform laser flux (electronic conduction, subsonic heat front downstream of a quasi-perfect shock front), we explore here other configurations. For this early ablation phase, but for a nonuniform incident X-radiation (radiative conduction), we study a compressible and a weakly compressible flow. In both cases, we recover the behaviours obtained for compressible flows with electronic heat conduction with a maximal instability for a zero wavenumber. Besides, the spectral method is extended to compute similarity solutions taking into account the supersonic heat wave ahead of the shock front. Based on an analysis of the reduced equations singularities (infinitely stiff front), this method allows us to describe the supersonic heat wave regime proper to the initial irradiation of the target and to recover the ablative solutions which were obtained under a negligible fore-running heat wave approximation. (author) [fr

  6. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment.

    Science.gov (United States)

    Ferragina, Paolo; Giancarlo, Raffaele; Greco, Valentina; Manzini, Giovanni; Valiente, Gabriel

    2007-07-13

    Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness is tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC (Receiver Operating Curve) analysis, aims at
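
    A minimal Python sketch of one of the three approximations discussed, the Normalized Compression Dissimilarity, using zlib as the stand-in compressor; the paper evaluates many compressors, and zlib is just a convenient default here.

        import zlib

        def c(s: bytes) -> int:
            return len(zlib.compress(s, 9))          # compressed size as a complexity proxy

        def ncd(x: bytes, y: bytes) -> float:
            cx, cy, cxy = c(x), c(y), c(x + y)
            return (cxy - min(cx, cy)) / max(cx, cy)

        seq_a = b"ACGTACGTACGTACGTACGT" * 20
        seq_b = b"ACGTACGAACGTACGTACGT" * 20          # near-identical to seq_a
        seq_c = b"TTGCAGGCTTAAGGCCATGC" * 20          # unrelated composition
        print(ncd(seq_a, seq_b), ncd(seq_a, seq_c))   # the first value should be smaller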

  7. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

    Directory of Open Access Journals (Sweden)

    Manzini Giovanni

    2007-07-01

    Full Text Available Abstract Background Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity, and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness are tested on various data sets, yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available, and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, which naturally complements the many theoretical and preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC
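
    The compression dissimilarities described in the two records above are easy to prototype. Below is a minimal, hypothetical Python sketch of NCD using zlib as one stand-in compressor (the studies benchmark many compressors; the function names here are illustrative, not the authors' code):

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Compressed size in bytes, using zlib at maximum compression."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Dissimilarity: near 0 for similar strings, near 1 for unrelated ones."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

print(ncd(b"ACGTACGTACGTACGT", b"ACGTACGAACGTACGT"))  # similar sequences: small value
print(ncd(b"ACGTACGTACGTACGT", b"TTGGCCAATTGGCCAA"))  # dissimilar sequences: larger value
```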

  8. Optimizing top precision performance measure of content-based image retrieval by learning similarity function

    KAUST Repository

    Liang, Ru-Ze

    2017-04-24

    In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, up to now, there is no existing similarity learning method proposed to optimize the top precision measure. To fill this gap, in this paper, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem with an objective function as the combination of the losses of the relevant images ranked behind the top-ranked irrelevant image, and the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. The experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when the top precision is used as the performance measure.

  9. Optimizing top precision performance measure of content-based image retrieval by learning similarity function

    KAUST Repository

    Liang, Ru-Ze; Shi, Lihui; Wang, Haoxiang; Meng, Jiandong; Wang, Jim Jing-Yan; Sun, Qingquan; Gu, Yi

    2017-01-01

    In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, up to now, there is no existing similarity learning method proposed to optimize the top precision measure. To fill this gap, in this paper, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem with an objective function as the combination of the losses of the relevant images ranked behind the top-ranked irrelevant image, and the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. The experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when the top precision is used as the performance measure.
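
    The objective described in these two records can be pictured with a small numeric sketch: a bilinear similarity s(q, x) = qᵀWx, hinge losses for relevant images that score below the top-ranked irrelevant image, and a squared Frobenius norm penalty on W. The margin of 1.0, the regularization weight and all names are illustrative assumptions, not the authors' exact formulation or solver:

```python
import numpy as np

def top_precision_objective(W, query, relevant, irrelevant, lam=0.1, margin=1.0):
    """Hinge losses of relevant images ranked behind the top irrelevant image,
    plus a squared Frobenius norm penalty on the similarity parameter W."""
    s_rel = relevant @ W @ query      # similarity scores of relevant database images
    s_irr = irrelevant @ W @ query    # similarity scores of irrelevant database images
    top_irr = s_irr.max()             # score of the top-ranked irrelevant image
    hinge = np.maximum(0.0, top_irr - s_rel + margin)
    return hinge.sum() + lam * np.sum(W ** 2)

rng = np.random.default_rng(0)
d = 5
print(top_precision_objective(np.eye(d), rng.normal(size=d),
                              rng.normal(size=(3, d)), rng.normal(size=(4, d))))
```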

  10. Metabolomics coupled with similarity analysis advances the elucidation of the cold/hot properties of traditional Chinese medicines.

    Science.gov (United States)

    Jia, Yan; Zhang, Zheng-Zheng; Wei, Yu-Hai; Xue-Mei, Qin; Li, Zhen-Yu

    2017-08-01

    Identifying and explaining the theory of traditional Chinese medicine (TCM), which has been utilized in China for more than four millennia, has recently become an important and urgent mission for modern scientific research. Since few studies have contributed to understanding TCM theory, the mechanism of action of drugs with cold/hot properties remains unclear. In the present study, six kinds of typical herbs with cold or hot properties were orally administered to mice, and serum and liver samples were analyzed using an untargeted nuclear magnetic resonance (NMR) based metabolomics approach coupled with similarity analysis. This approach was performed to identify and quantify changes in metabolic pathways to elucidate drug actions on the treated mice. Our results showed that drugs with the same property exerted similar effects on the metabolic alterations in mouse serum and liver samples, while drugs with different properties showed different effects. The effects of herbal medicines with cold/hot properties were exerted by regulating pathways linked to glycometabolism, lipid metabolism, amino acid metabolism and other metabolic pathways. The results elucidated the differences and similarities of drugs with cold/hot properties, providing useful information for the explanation of the medicinal properties of these TCMs. Copyright © 2017 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.

  11. Analysis of metal forming processes by using physical modeling and new plastic similarity condition

    International Nuclear Information System (INIS)

    Gronostajski, Z.; Hawryluk, M.

    2007-01-01

    In recent years many advances have been made in numerical methods for linear and non-linear problems. However, their success depends very much on the correctness of the problem formulation and the availability of the input data. The validity of theoretical results can be verified by an experiment using real or soft materials. An essential reduction of the time and cost of the experiment can be obtained by using soft materials, which behave in a way analogous to that of real metal during deformation. The advantages of using soft materials are closely connected with a flow stress 500 to 1000 times lower than that of real materials. The accuracy of physical modeling depends on the similarity conditions between the physical model and the real process. The most important similarity conditions are material similarity in the range of plastic and elastic deformation, and geometrical, frictional and thermal similarities. A new, original plastic similarity condition for physical modeling of metal forming processes is proposed in the paper. It is based on a mathematical description of the similarity of the flow stress curves of soft materials and real ones

  12. Genetic similarity of soybean genotypes revealed by seed protein

    Directory of Open Access Journals (Sweden)

    Nikolić Ana

    2005-01-01

    Full Text Available More accurate and complete descriptions of genotypes could help determine future breeding strategies and facilitate the introgression of new genotypes into the current soybean genetic pool. The objective of this study was to characterize 20 soybean genotypes from the Maize Research Institute "Zemun Polje" collection, which have good agronomic performance, high yield, lodging and drought resistance, and low shattering, by seed proteins as biochemical markers. Seed proteins were isolated and separated by PAA electrophoresis. On the basis of the presence/absence of protein fractions, similarity coefficients were calculated as Dice and Rogers-Tanimoto coefficients between pairs of genotypes. The similarity matrix was submitted to hierarchical cluster analysis using the unweighted pair group method with arithmetic mean (UPGMA), and the necessary computations were performed using the NTSYS-pc program. Protein seed analysis confirmed a low level of genetic diversity in soybean. The highest genetic similarity was between genotypes P9272 and Kador. According to the obtained results, soybean genotypes were assigned to two larger groups, and the similarity coefficients showed similar results. Because of the lack of pedigree data for the analyzed genotypes, correspondence with marker data could not be determined. In plants with a narrow genetic base in their gene pool, such as soybean, protein markers may not be sufficient for characterization and study of genetic diversity.
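
    For readers unfamiliar with the Dice coefficient used here on presence/absence band data, a minimal sketch follows (the band labels and genotypes are invented, not taken from the study):

```python
def dice(bands_a: set, bands_b: set) -> float:
    """Dice similarity between two genotypes scored as sets of observed protein bands."""
    shared = len(bands_a & bands_b)
    return 2 * shared / (len(bands_a) + len(bands_b))

genotype_1 = {"band1", "band2", "band3", "band5"}
genotype_2 = {"band1", "band2", "band4", "band5"}
print(dice(genotype_1, genotype_2))  # 0.75: three shared bands out of four each
```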

  13. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach.

    Science.gov (United States)

    Pasupa, Kitsuchart; Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is the Extreme Learning Machine (ELM), which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single layer feed-forward neural network in conjunction with 16 different similarity coefficients as activation functions in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms, i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation Dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6.

  14. Visualization-based analysis of multiple response survey data

    Science.gov (United States)

    Timofeeva, Anastasiia

    2017-11-01

    During a survey, respondents are often allowed to tick more than one answer option for a question. Analysis and visualization of such data are difficult because of the need to process multiple response variables. With standard representations such as pie and bar charts, information about the association between different answer options is lost. The author proposes a visualization approach for multiple response variables based on Venn diagrams. For a more informative representation with a large number of overlapping groups, it is suggested to use similarity and association matrices. Some aggregate indicators of dissimilarity (similarity) are proposed based on the determinant of the similarity matrix and the maximum eigenvalue of the association matrix. The application of the proposed approaches is well illustrated by the example of the analysis of advertising sources. Intersection of sets indicates that the same consumer audience is covered by several advertising sources. This information is very important for the allocation of the advertising budget. The differences between target groups in advertising sources are of interest. To identify such differences, hypotheses of homogeneity and independence are tested. Recent approaches to the problem are briefly reviewed and compared. An alternative procedure is suggested. It is based on partition of a consumer audience into pairwise disjoint subsets and includes hypothesis testing of the difference between population proportions. It turned out to be more suitable for the real problem being solved.
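
    The two aggregate indicators mentioned, the determinant of the similarity matrix and the maximum eigenvalue of the association matrix, can be computed directly once the matrices are built. A small illustrative sketch with an invented similarity matrix (used here for both indicators, whereas the paper distinguishes the two matrices):

```python
import numpy as np

# Hypothetical similarity matrix between four answer options (1s on the diagonal).
S = np.array([[1.0, 0.6, 0.2, 0.1],
              [0.6, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.5],
              [0.1, 0.2, 0.5, 1.0]])

# Determinant-based dissimilarity indicator: close to 0 when options overlap
# heavily, close to 1 when they are nearly disjoint.
print(np.linalg.det(S))

# Maximum eigenvalue as an aggregate association indicator.
print(np.linalg.eigvalsh(S).max())
```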

  15. IMPLEMENTATION OF GIS-BASED MULTICRITERIA DECISION ANALYSIS WITH VB IN ArcGIS

    OpenAIRE

    DERYA OZTURK; FATMAGUL BATUK

    2011-01-01

    This article focuses on the integration of multicriteria decision analysis (MCDA) and geographical information systems (GIS) and introduces a tool, GIS–MCDA, written in visual basic in ArcGIS for GIS-based MCDA. The GIS–MCDA deals with raster-based data sets and includes standardization, weighting and decision analysis methods, and sensitivity analysis. Simple additive weighting, weighted product method, technique for order preference by similarity to ideal solution, compromise programming, a...

  16. Attention-based image similarity measure with application to content-based information retrieval

    Science.gov (United States)

    Stentiford, Fred W. M.

    2003-01-01

    Whilst storage and capture technologies are able to cope with huge numbers of images, image retrieval is in danger of rendering many repositories valueless because of the difficulty of access. This paper proposes a similarity measure that imposes only very weak assumptions on the nature of the features used in the recognition process. This approach does not make use of a pre-defined set of feature measurements which are extracted from a query image and used to match those from database images, but instead generates features on a trial and error basis during the calculation of the similarity measure. This has the significant advantage that features that determine similarity can match whatever image property is important in a particular region whether it be a shape, a texture, a colour or a combination of all three. It means that effort is expended searching for the best feature for the region rather than expecting that a fixed feature set will perform optimally over the whole area of an image and over every image in a database. The similarity measure is evaluated on a problem of distinguishing similar shapes in sets of black and white symbols.

  17. GOMA: functional enrichment analysis tool based on GO modules

    Institute of Scientific and Technical Information of China (English)

    Qiang Huang; Ling-Yun Wu; Yong Wang; Xiang-Sun Zhang

    2013-01-01

    Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results.

  18. Similarity-Based Unification: A Multi-Adjoint Approach

    Czech Academy of Sciences Publication Activity Database

    Medina, J.; Ojeda-Aciego, M.; Vojtáš, Peter

    2004-01-01

    Roč. 146, č. 1 (2004), s. 43-62 ISSN 0165-0114 Source of funding: V - iné verejné zdroje Keywords : similarity * fuzzy unification Subject RIV: BA - General Mathematics Impact factor: 0.734, year: 2004

  19. Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms

    International Nuclear Information System (INIS)

    Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee; Lo, Joseph Y.; Floyd, Carey E.

    2007-01-01

    The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses
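
    Mutual information is one of the classic entropy-based similarity measures of the kind evaluated in this record; a generic sketch for two grey-level regions of interest (not the IT-CAD implementation) is shown below:

```python
import numpy as np

def mutual_information(roi_a, roi_b, bins=32):
    """Mutual information between two grey-level regions of interest."""
    joint, _, _ = np.histogram2d(roi_a.ravel(), roi_b.ravel(), bins=bins)
    pxy = joint / joint.sum()              # joint intensity distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0                           # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))

rng = np.random.default_rng(1)
a = rng.random((64, 64))
b = a + 0.05 * rng.random((64, 64))        # a noisy copy of a: high mutual information
print(mutual_information(a, b), mutual_information(a, rng.random((64, 64))))
```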

  20. A Novel Approach to Semantic Similarity Measurement Based on a Weighted Concept Lattice: Exemplifying Geo-Information

    Directory of Open Access Journals (Sweden)

    Jia Xiao

    2017-11-01

    Full Text Available The measurement of semantic similarity has been widely recognized as having a fundamental and key role in information science and information systems. Although various models have been proposed to measure semantic similarity, these models are not able to effectively quantify the weights of relevant factors that impact the judgement of semantic similarity, such as the attributes of concepts, application context, and concept hierarchy. In this paper, we propose a novel approach that comprehensively considers the effects of various factors on semantic similarity judgment, which we name semantic similarity measurement based on a weighted concept lattice (SSMWCL). A feature model and a network model are integrated in SSMWCL. Based on the feature model, the combined weight of each attribute of the concepts is calculated by merging its information entropy and inclusion-degree importance in a specific application context. By establishing the weighted concept lattice, the relative hierarchical depths of concepts for comparison are computed according to the principle of the network model. The integration of the feature model and the network model enables SSMWCL to take account of differences between concepts more comprehensively in semantic similarity measurement. Additionally, a workflow of SSMWCL is designed to demonstrate these procedures, and a case study of geo-information is conducted to assess the approach.

  1. QSAR models based on quantum topological molecular similarity.

    Science.gov (United States)

    Popelier, P L A; Smith, P J

    2006-07-01

    A new method called quantum topological molecular similarity (QTMS) was fairly recently proposed [J. Chem. Inf. Comp. Sc., 41, 2001, 764] to construct a variety of medicinal, ecological and physical organic QSAR/QSPRs. The QTMS method uses quantum chemical topology (QCT) to define electronic descriptors drawn from modern ab initio wave functions of geometry-optimised molecules. It was shown that the current abundance of computing power can be utilised to inject realistic descriptors into QSAR/QSPRs. In this article we study seven datasets of medicinal interest: the dissociation constants (pK(a)) for a set of substituted imidazolines, the pK(a) of imidazoles, the ability of a set of indole derivatives to displace [(3)H] flunitrazepam from binding to bovine cortical membranes, the influenza inhibition constants for a set of benzimidazoles, the interaction constants for a set of amides and the enzyme liver alcohol dehydrogenase, the natriuretic activity of sulphonamide carbonic anhydrase inhibitors, and the toxicity of a series of benzyl alcohols. A partial least squares analysis in conjunction with a genetic algorithm delivered excellent models. They are also able to highlight the active site of the ligand or the molecule whose structure determines the activity. The advantages and limitations of QTMS are discussed.

  2. A hybrid algorithm for selecting head-related transfer function based on similarity of anthropometric structures

    Science.gov (United States)

    Zeng, Xiang-Yang; Wang, Shu-Guang; Gao, Li-Ping

    2010-09-01

    As the basic data for virtual auditory technology, the head-related transfer function (HRTF) has many applications in the areas of room acoustic modeling, spatial hearing and multimedia. How to individualize HRTFs quickly and effectively is currently an open problem. Based on the similarity and relativity of anthropometric structures, a hybrid HRTF customization algorithm, which combines the methods of principal component analysis (PCA), multiple linear regression (MLR) and database matching (DM), is presented in this paper. The HRTFs selected by both the best match and the worst match have been applied to obtain binaurally auralized sounds, which are then used for subjective listening experiments, and the results are compared. For the horizontal plane, the localization results show that the selection of HRTFs can enhance the localization accuracy and can also abate the problem of front-back confusion.

  3. Cloud based automated framework for semantic rich ontology construction and similarity computation for E-health applications

    Directory of Open Access Journals (Sweden)

    T. MuthamilSelvan

    Full Text Available Ontology structure, a core of the semantic web, is an excellent tool for knowledge representation and semantic visualization. Moreover, knowledge reuse is made possible through similarity measure estimation between two ontologies, threshold estimation and the use of simple if-then rules for checking relevancy and irrelevancy measures. Reduced semantic representations of the ontology provide reduced knowledge visualization, which is critical especially for e-health data processing and analysis. This usually occurs due to the presence of implicit knowledge and polymorphic objects, and the ontology can be made semantically rich during construction by resolving this implicit knowledge, which occurs in the form of non-dominant words and conditional dependence actions. This paper presents the working of an automated framework for the construction of semantically rich ontology structures and their storage in a repository. The construction uses dyadic deontic logic based Graph Derivation Representation in order to build semantically rich ontologies. Moreover, in order to retrieve a set of relevant documents in response to the cloud user document, the degree of similarity between two ontologies is estimated using the traditional cosine similarity measure, and simple if-then rules are used to determine the number of relevant documents and obtain such documents' metadata for further processing. These working modules will be extremely beneficial to authenticated cloud users for document retrieval, information extraction and domain dictionary construction, which are especially used for e-health applications. The proposed framework is implemented using a diabetes dataset, and the effectiveness of the experimental results is high when compared to other Graph Derivation Representation methods. The graphical results shown in the paper are an added visualization for viewing the performance of the proposed framework. Keywords: Ontology, Implicit knowledge, Conditional dependence, Graph
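
    The relevancy check described, cosine similarity between concept vectors followed by a simple if-then threshold rule, can be sketched in a few lines (the vectors, names and threshold below are invented placeholders, not taken from the framework):

```python
import numpy as np

def cosine_similarity(u, v) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical concept-frequency vectors for a user document and two stored ontologies.
query_doc = np.array([3, 0, 1, 2, 0])
ontology_a = np.array([2, 0, 1, 3, 0])
ontology_b = np.array([0, 4, 0, 0, 1])

THRESHOLD = 0.5  # illustrative relevancy threshold
for name, onto in [("ontology_a", ontology_a), ("ontology_b", ontology_b)]:
    sim = cosine_similarity(query_doc, onto)
    # simple if-then rule: a document is relevant when similarity exceeds the threshold
    print(name, round(sim, 3), "relevant" if sim >= THRESHOLD else "irrelevant")
```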

  4. On Prolonging Network Lifetime through Load-Similar Node Deployment in Wireless Sensor Networks

    Science.gov (United States)

    Li, Qiao-Qin; Gong, Haigang; Liu, Ming; Yang, Mei; Zheng, Jun

    2011-01-01

    This paper is focused on the study of the energy hole problem in the Progressive Multi-hop Rotational Clustered (PMRC)-structure, a highly scalable wireless sensor network (WSN) architecture. Based on an analysis on the traffic load distribution in PMRC-based WSNs, we propose a novel load-similar node distribution strategy combined with the Minimum Overlapping Layers (MOL) scheme to address the energy hole problem in PMRC-based WSNs. In this strategy, sensor nodes are deployed in the network area according to the load distribution. That is, more nodes shall be deployed in the range where the average load is higher, and then the loads among different areas in the sensor network tend to be balanced. Simulation results demonstrate that the load-similar node distribution strategy prolongs network lifetime and reduces the average packet latency in comparison with existing nonuniform node distribution and uniform node distribution strategies. Note that, besides the PMRC structure, the analysis model and the proposed load-similar node distribution strategy are also applicable to other multi-hop WSN structures. PMID:22163809

  5. Similarity, trust in institutions, affect, and populism

    DEFF Research Database (Denmark)

    Scholderer, Joachim; Finucane, Melissa L.

    -based evaluations are fundamental to human information processing, they can contribute significantly to other judgments (such as the risk, cost-effectiveness, trustworthiness) of the same stimulus object. Although deliberation and analysis are certainly important in some decision-making circumstances, reliance… on affect is a quicker, easier, and a more efficient way of navigating in a complex and uncertain world. Hence, many theorists give affect a direct and primary role in motivating behavior. Taken together, the results provide uncannily strong support for the value-similarity hypothesis, strengthening… types of information about gene technology. The materials were attributed to different institutions. The results indicated that participants' trust in an institution was a function of the similarity between the position advocated in the materials and participants' own attitudes towards gene technology...

  6. Multi-Attribute Decision Making Based on Several Trigonometric Hamming Similarity Measures under Interval Rough Neutrosophic Environment

    Directory of Open Access Journals (Sweden)

    Surapati Pramanik

    2018-03-01

    Full Text Available In this paper, the sine, cosine and cotangent similarity measures of interval rough neutrosophic sets are proposed. Some properties of the proposed measures are discussed. We propose multi-attribute decision making approaches based on the proposed similarity measures. To demonstrate the applicability, a numerical example is solved.

  7. Simplification and Shift in Cognition of Political Difference: Applying the Geometric Modeling to the Analysis of Semantic Similarity Judgment

    Science.gov (United States)

    Kato, Junko; Okada, Kensuke

    2011-01-01

    Perceiving differences by means of spatial analogies is intrinsic to human cognition. Multi-dimensional scaling (MDS) analysis based on Minkowski geometry has been used primarily on data on sensory similarity judgments, leaving judgments on abstractive differences unanalyzed. Indeed, analysts have failed to find appropriate experimental or real-life data in this regard. Our MDS analysis used survey data on political scientists' judgments of the similarities and differences between political positions expressed in terms of distance. Both distance smoothing and majorization techniques were applied to a three-way dataset of similarity judgments provided by at least seven experts on at least five parties' positions on at least seven policies (i.e., originally yielding 245 dimensions) to substantially reduce the risk of local minima. The analysis found two dimensions, which were sufficient for mapping differences, and fit the city-block dimensions better than the Euclidean metric in all datasets obtained from 13 countries. Most city-block dimensions were highly correlated with the simplified criterion (i.e., the left–right ideology) for differences that are actually used in real politics. The isometry of the city-block and dominance metrics in two-dimensional space carries further implications. More specifically, individuals may pay attention to two dimensions (if represented in the city-block metric) or focus on a single dimension (if represented in the dominance metric) when judging differences between the same objects. Switching between metrics may be expected to occur during cognitive processing as frequently as the apparent discontinuities and shifts in human attention that may underlie changing judgments in real situations occur. Consequently, the result has extended strong support for the validity of the geometric models to represent an important social cognition, i.e., the one of political differences, which is deeply rooted in human nature. PMID:21673959

  8. A Segment-Based Trajectory Similarity Measure in the Urban Transportation Systems.

    Science.gov (United States)

    Mao, Yingchi; Zhong, Haishi; Xiao, Xianjian; Li, Xiaofang

    2017-03-06

    With the rapid spread of built-in GPS handheld smart devices, the trajectory data from GPS sensors has grown explosively. Trajectory data has spatio-temporal characteristics and rich information. Trajectory data processing techniques can mine the patterns of human activities and the moving patterns of vehicles in intelligent transportation systems. A trajectory similarity measure is one of the most important issues in trajectory data mining (clustering, classification, frequent pattern mining, etc.). Unfortunately, the main similarity measure algorithms for trajectory data have been found to be inaccurate, highly sensitive to sampling methods, and of low robustness to noisy data. To solve the above problems, three distances and their corresponding computation methods are proposed in this paper. The point-segment distance decreases the sensitivity to the point sampling methods. The prediction distance optimizes the temporal distance using the features of trajectory data. The segment-segment distance introduces the trajectory shape factor into the similarity measurement to improve the accuracy. The three kinds of distance are integrated with the traditional dynamic time warping (DTW) algorithm to propose a new segment-based dynamic time warping algorithm (SDTW). The experimental results show that the SDTW algorithm can exhibit about 57%, 86%, and 31% better accuracy than the longest common subsequence algorithm (LCSS), the edit distance on real sequence algorithm (EDR), and DTW, respectively, and that its sensitivity to noisy data is lower than that of those algorithms.
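
    SDTW builds on dynamic time warping; the plain DTW core over two 2-D trajectories is sketched below (the paper's point-segment, prediction and segment-segment distances are not reproduced here):

```python
import numpy as np

def dtw(traj_a, traj_b) -> float:
    """Classic dynamic time warping distance between two 2-D point trajectories."""
    a, b = np.asarray(traj_a, float), np.asarray(traj_b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])     # point-to-point distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

print(dtw([(0, 0), (1, 1), (2, 2), (3, 3)],
          [(0, 0), (1, 1.2), (2, 1.9), (3, 3.1)]))
```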

  9. Similarity analysis between chromosomes of Homo sapiens and monkeys with correlation coefficient, rank correlation coefficient and cosine similarity measures.

    Science.gov (United States)

    Someswara Rao, Chinta; Viswanadha Raju, S

    2016-03-01

    In this paper, we consider correlation coefficient, rank correlation coefficient and cosine similarity measures for evaluating similarity between Homo sapiens and monkeys. We used DNA chromosomes of genome-wide genes to determine the correlation between the chromosomal content and evolutionary relationship. The similarity between H. sapiens and monkeys is measured for a total of 210 chromosomes related to 10 species. The similarity measures of these different species show the relationship between H. sapiens and monkeys. This similarity will be helpful in theft identification, maternity identification, disease identification, etc.
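
    The three measures named here are standard and can be reproduced on any pair of equal-length numeric encodings of sequences; a small sketch with invented data (the encoding A=1, C=2, G=3, T=4 is an assumption for illustration only):

```python
import numpy as np
from scipy import stats

# Hypothetical numeric encodings of two sequences, truncated to equal length.
x = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 3])
y = np.array([1, 2, 3, 4, 1, 2, 4, 4, 2, 3])

pearson = stats.pearsonr(x, y)[0]        # correlation coefficient
spearman = stats.spearmanr(x, y)[0]      # rank correlation coefficient
cosine = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))  # cosine similarity

print(pearson, spearman, cosine)
```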

  10. Traditional and robust vector selection methods for use with similarity based models

    International Nuclear Information System (INIS)

    Hines, J. W.; Garvey, D. R.

    2006-01-01

    Vector selection, or instance selection as it is often called in the data mining literature, performs a critical task in the development of nonparametric, similarity based models. Nonparametric, similarity based modeling (SBM) is a form of 'lazy learning' which constructs a local model 'on the fly' by comparing a query vector to historical, training vectors. For large training sets the creation of local models may become cumbersome, since each training vector must be compared to the query vector. To alleviate this computational burden, varying forms of training vector sampling may be employed with the goal of selecting a subset of the training data such that the samples are representative of the underlying process. This paper describes one such SBM, namely auto-associative kernel regression (AAKR), and presents five traditional vector selection methods and one robust vector selection method that may be used to select prototype vectors from a larger data set in model training. The five traditional vector selection methods considered are min-max, vector ordering, combination min-max and vector ordering, fuzzy c-means clustering, and Adeli-Hung clustering. Each method is described in detail and compared using artificially generated data and data collected from the steam system of an operating nuclear power plant. (authors)
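
    Of the traditional vector selection methods listed, min-max is the simplest to illustrate: keep every training vector that holds the minimum or maximum observed value of at least one signal. A hypothetical sketch (not the authors' code), assuming training vectors are rows of a signal matrix:

```python
import numpy as np

def min_max_selection(training):
    """Keep each training vector containing the minimum or maximum of some signal."""
    training = np.asarray(training, float)
    keep = set()
    for j in range(training.shape[1]):          # one column per signal
        keep.add(int(training[:, j].argmin()))
        keep.add(int(training[:, j].argmax()))
    return training[sorted(keep)]

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))                # 200 observations of 4 plant signals
prototypes = min_max_selection(data)
print(prototypes.shape)                         # at most 8 prototype vectors
```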

  11. Semantic similarity from natural language and ontology analysis

    CERN Document Server

    Harispe, Sébastien; Janaqi, Stefan

    2015-01-01

    Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments---most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli.In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language, e.g. words, sentences, or concepts and instances def

  12. High-power Yb-fiber comb based on pre-chirped-management self-similar amplification

    Science.gov (United States)

    Luo, Daping; Liu, Yang; Gu, Chenglin; Wang, Chao; Zhu, Zhiwei; Zhang, Wenchao; Deng, Zejiang; Zhou, Lian; Li, Wenxue; Zeng, Heping

    2018-02-01

    We report a fiber self-similar-amplification (SSA) comb system that delivers a 250-MHz, 109-W, 42-fs pulse train with a 10-dB spectral width of 85 nm at 1056 nm. A pair of grisms is employed to compensate the group velocity dispersion and third-order dispersion of pre-amplified pulses for facilitating a self-similar evolution and a self-phase modulation (SPM). Moreover, we analyze the stabilities and noise characteristics of both the locked carrier envelope phase and the repetition rate, verifying the stability of the generated high-power comb. The demonstration of the SSA comb at such high power proves the feasibility of the SPM-based low-noise ultrashort comb.

  13. Similar-Case-Based Optimization of Beam Arrangements in Stereotactic Body Radiotherapy for Assisting Treatment Planners

    Directory of Open Access Journals (Sweden)

    Taiki Magome

    2013-01-01

    Full Text Available Objective. To develop a similar-case-based optimization method for beam arrangements in lung stereotactic body radiotherapy (SBRT to assist treatment planners. Methods. First, cases that are similar to an objective case were automatically selected based on geometrical features related to a planning target volume (PTV location, PTV shape, lung size, and spinal cord position. Second, initial beam arrangements were determined by registration of similar cases with the objective case using a linear registration technique. Finally, beam directions of the objective case were locally optimized based on the cost function, which takes into account the radiation absorption in normal tissues and organs at risk. The proposed method was evaluated with 10 test cases and a treatment planning database including 81 cases, by using 11 planning evaluation indices such as tumor control probability and normal tissue complication probability (NTCP. Results. The procedure for the local optimization of beam arrangements improved the quality of treatment plans with significant differences (P<0.05 in the homogeneity index and conformity index for the PTV, V10, V20, mean dose, and NTCP for the lung. Conclusion. The proposed method could be usable as a computer-aided treatment planning tool for the determination of beam arrangements in SBRT.

  14. A measure of association between vectors based on "similarity covariance"

    OpenAIRE

    Pascual-Marqui, Roberto D.; Lehmann, Dietrich; Kochi, Kieko; Kinoshita, Toshihiko; Yamada, Naoto

    2013-01-01

    The "maximum similarity correlation" definition introduced in this study is motivated by the seminal work of Szekely et al on "distance covariance" (Ann. Statist. 2007, 35: 2769-2794; Ann. Appl. Stat. 2009, 3: 1236-1265). Instead of using Euclidean distances "d" as in Szekely et al, we use "similarity", which can be defined as "exp(-d/s)", where the scaling parameter s>0 controls how rapidly the similarity falls off with distance. Scale parameters are chosen by maximizing the similarity corre...
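
    The similarity kernel quoted in this record, exp(-d/s) with Euclidean distance d and scale s > 0, is straightforward to compute for a set of vectors; a minimal numpy sketch:

```python
import numpy as np

def similarity_matrix(X, s=1.0):
    """Pairwise similarity exp(-d/s), where d is the Euclidean distance."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.exp(-d / s)

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
print(similarity_matrix(X, s=2.0))   # close points give values near 1, distant ones near 0
```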

  15. Revisiting a dogma: similar survival of patients with small bowel and gastric GIST. A population-based propensity score SEER analysis.

    Science.gov (United States)

    Guller, Ulrich; Tarantino, Ignazio; Cerny, Thomas; Ulrich, Alexis; Schmied, Bruno M; Warschkow, Rene

    2017-01-01

    The objective of the present analysis was to assess whether small bowel gastrointestinal stromal tumor (GIST) is associated with worse cancer-specific survival (CSS) and overall survival (OS) compared with gastric GIST on a population-based level. Data on patients aged 18 years or older with histologically proven GIST was extracted from the SEER database from 1998 to 2011. OS and CSS for small bowel GIST were compared with OS and CSS for gastric GIST by application of adjusted and unadjusted Cox regression analyses and propensity score analyses. GIST were located in the stomach (n = 3011, 59 %), duodenum (n = 313, 6 %), jejunum/ileum (n = 1288, 25 %), colon (n = 139, 3 %), rectum (n = 172, 3 %), and extraviscerally (n = 173, 3 %). OS and CSS of patients with GIST in the duodenum [OS, HR 0.95, 95 % confidence interval (CI) 0.76-1.19; CSS, HR 0.99, 95 % CI 0.76-1.29] and in the jejunum/ileum (OS, HR 0.97, 95 % CI 0.85-1.10; CSS, HR = 0.95, 95 % CI 0.81-1.10) were similar to those of patients with gastric GIST in multivariate analyses. Conversely, OS and CSS of patients with GIST in the colon (OS, HR 1.40; 95 % CI 1.07-1.83; CSS, HR 1.89, 95 % CI 1.41-2.54) and in an extravisceral location (OS, HR 1.42, 95 % CI 1.14-1.77; CSS, HR = 1.43, 95 % CI 1.11-1.84) were significantly worse than those of patients with gastric GIST. Contrary to common belief, OS and CSS of patients with small bowel GIST are not statistically different from those of patients with gastric GIST when adjustment is made for confounding variables on a population-based level. The prognosis of patients with nongastric GIST is worse because of a colonic and extravisceral GIST location. These findings have implications regarding adjuvant treatment of GIST patients. Hence, the dogma that small bowel GIST patients have worse prognosis than gastric GIST patients and therefore should receive adjuvant treatment to a greater extent must be revisited.

  16. Computer-aided beam arrangement based on similar cases in radiation treatment-planning databases for stereotactic lung radiation therapy

    International Nuclear Information System (INIS)

    Magome, Taiki; Shioyama, Yoshiyuki; Arimura, Hidetaka

    2013-01-01

    The purpose of this study was to develop a computer-aided method for determination of beam arrangements based on similar cases in a radiotherapy treatment-planning database for stereotactic lung radiation therapy. Similar-case-based beam arrangements were automatically determined based on the following two steps. First, the five most similar cases were searched, based on geometrical features related to the location, size and shape of the planning target volume, lung and spinal cord. Second, five beam arrangements of an objective case were automatically determined by registering five similar cases with the objective case, with respect to lung regions, by means of a linear registration technique. For evaluation of the beam arrangements five treatment plans were manually created by applying the beam arrangements determined in the second step to the objective case. The most usable beam arrangement was selected by sorting the five treatment plans based on eight plan evaluation indices, including the D95, mean lung dose and spinal cord maximum dose. We applied the proposed method to 10 test cases, by using an RTP database of 81 cases with lung cancer, and compared the eight plan evaluation indices between the original treatment plan and the corresponding most usable similar-case-based treatment plan. As a result, the proposed method may provide usable beam arrangements, which have no statistically significant differences from the original beam arrangements (P>0.05) in terms of the eight plan evaluation indices. Therefore, the proposed method could be employed as an educational tool for less experienced treatment planners. (author)

  17. The Arabidopsis co-expression tool (act): a WWW-based tool and database for microarray-based gene expression analysis

    DEFF Research Database (Denmark)

    Jen, C. H.; Manfield, I. W.; Michalopoulos, D. W.

    2006-01-01

    We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (act), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression… be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots…

  18. Areal Feature Matching Based on Similarity Using Critic Method

    Science.gov (United States)

    Kim, J.; Yu, K.

    2015-10-01

    In this paper, we propose an areal feature matching method that can be applied for many-to-many matching, which involves matching a simple entity with an aggregate of several polygons or two aggregates of several polygons with fewer user intervention. To this end, an affine transformation is applied to two datasets by using polygon pairs for which the building name is the same. Then, two datasets are overlaid with intersected polygon pairs that are selected as candidate matching pairs. If many polygons intersect at this time, we calculate the inclusion function between such polygons. When the value is more than 0.4, many of the polygons are aggregated as single polygons by using a convex hull. Finally, the shape similarity is calculated between the candidate pairs according to the linear sum of the weights computed in CRITIC method and the position similarity, shape ratio similarity, and overlap similarity. The candidate pairs for which the value of the shape similarity is more than 0.7 are determined as matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map in South Korea. As a result, the visual evaluation showed two polygons that had been well detected by using the proposed method. The statistical evaluation indicates that the proposed method is accurate when using our test dataset with a high F-measure of 0.91.
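
    The final matching score in this record is a linear combination of position, shape-ratio and overlap similarities, compared against a 0.7 threshold. The sketch below uses invented placeholder weights rather than CRITIC-derived ones:

```python
# Illustrative weights (in practice computed by the CRITIC method) and similarities
# for one candidate polygon pair.
weights = {"position": 0.4, "shape_ratio": 0.3, "overlap": 0.3}
candidate = {"position": 0.85, "shape_ratio": 0.72, "overlap": 0.65}

shape_similarity = sum(weights[k] * candidate[k] for k in weights)
print(shape_similarity, "matched" if shape_similarity > 0.7 else "not matched")
```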

  19. Looking Similar Promotes Group Stability in a Game-Based Virtual Community.

    Science.gov (United States)

    Lortie, Catherine L; Guitton, Matthieu J

    2012-08-01

    Online support groups are popular Web-based resources that provide tailored information and peer support through virtual communities and fulfill the users' needs for empowerment and belonging. However, the therapeutic potential of online support groups is at present limited by the lack of systematic research on the cognitive mechanisms underlying social group cohesion in virtual communities. We might increase the benefits of participation in online support groups if we gain more insight into the factors that promote long-term commitment to peer support. One approach to foster the therapeutic potential of online support groups could be to increase social selection based on visual similarity. We performed a case study using the popular virtual setting of "World of Warcraft" (Blizzard Entertainment, Irvine, CA). We monitored the social dynamics of a virtual community composed of avatars whose appearance was identical during a period of 3 months, biweekly, for a total of 24 measures. We observed that this homogeneous community displayed a very high level of group stability over time in terms of the total number of members, the number of members that stayed the same, and the number of arrivals and departures, despite the fact that belonging to a heterogeneous group typically favors the success of the group with respect to game progression. Our results confirm that appearance can trigger social selection in online virtual communities. Displaying a similar appearance could be one way to strengthen social bonds among peers who share various health and well-being issues. Thus, the therapeutic potential of online support groups could be promoted through visual cohesion.

  20. Similarity analysis between chromosomes of Homo sapiens and monkeys with correlation coefficient, rank correlation coefficient and cosine similarity measures

    OpenAIRE

    Someswara Rao, Chinta; Viswanadha Raju, S.

    2016-01-01

    In this paper, we consider correlation coefficient, rank correlation coefficient and cosine similarity measures for evaluating similarity between Homo sapiens and monkeys. We used DNA chromosomes of genome wide genes to determine the correlation between the chromosomal content and evolutionary relationship. The similarity among the H. sapiens and monkeys is measured for a total of 210 chromosomes related to 10 species. The similarity measures of these different species show the relationship b...

  1. Approach to analysis of inter-regional similarity of investment activity support measures in legislation of regions (on the example of Krasnoyarsk region)

    Directory of Open Access Journals (Sweden)

    Valentina F. Lapo

    2017-01-01

    Full Text Available Most of the stimulation methods in Russia are concentrated in the legal documents of the regions of the Russian Federation. They are directed at intensifying investment activity in the regions. How similar are these investment stimulation conceptions? The literature does not address the methodical questions of quantitative analysis and inter-regional comparison, there are no results of statistical research on inter-regional correlations of stimulation methods or analysis of the dynamics of this process, and there are no special measuring instruments. The presented work aims to fill these gaps. The research offers an approach to the spatial correlation analysis of legislative norms, develops a classification of investment stimulation methods, and proposes a way of preparing and coding the data. An approach and a system of coefficients for the analysis of inter-regional interrelations of legislative systems of investment stimulation are offered: a proximity coefficient of regional legislation, a structure similarity factor and a dynamic coincidence index. A space-time base of investment stimulation methods across the regions of the Russian Federation covering 12 years was collected and statistically processed, comprising 758 documents in total. The source of the texts is the website of the Ministry of Justice of the Russian Federation. The study of the documents revealed a number of regularities in the formation of regional investment stimulation systems. The regions that are most similar in terms of the structure of stimulation methods are identified. We found a group of regions for which the similarity of the legislation increases and a group for which it decreases. Overall, there is a general trend towards reduced similarity between the legislation of Krasnoyarsk territory and that of the other regions of the Russian Federation. Calculations have

  2. Odour-based discrimination of similarity at the major histocompatibility complex in birds.

    Science.gov (United States)

    Leclaire, Sarah; Strandh, Maria; Mardon, Jérôme; Westerdahl, Helena; Bonadonna, Francesco

    2017-01-11

    Many animals are known to preferentially mate with partners that are dissimilar at the major histocompatibility complex (MHC) in order to maximize the antigen binding repertoire (or disease resistance) in their offspring. Although several mammals, fish or lizards use odour cues to assess MHC similarity with potential partners, the ability of birds to assess MHC similarity using olfactory cues has not yet been explored. Here we used a behavioural binary choice test and high-throughput-sequencing of MHC class IIB to determine whether blue petrels can discriminate MHC similarity based on odour cues alone. Blue petrels are seabirds with particularly good sense of smell, they have a reciprocal mate choice and are known to preferentially mate with MHC-dissimilar partners. Incubating males preferentially approached the odour of the more MHC-dissimilar female, whereas incubating females showed opposite preferences. Given their mating pattern, females were, however, expected to show preference for the odour of the more MHC-dissimilar male. Further studies are needed to determine whether, as in women and female mice, the preference varies with the reproductive cycle in blue petrel females. Our results provide the first evidence that birds can use odour cues only to assess MHC dissimilarity. © 2017 The Author(s).

  3. Patch Similarity Modulus and Difference Curvature Based Fourth-Order Partial Differential Equation for Image Denoising

    Directory of Open Access Journals (Sweden)

    Yunjiao Bai

    2015-01-01

    Full Text Available The traditional fourth-order nonlinear diffusion denoising model suffers from isolated speckles and the loss of fine details in the processed image. For this reason, a new fourth-order partial differential equation based on the patch similarity modulus and the difference curvature is proposed for image denoising. First, based on the intensity similarity of neighboring pixels, this paper presents a new edge indicator called the patch similarity modulus, which is strongly robust to noise. Furthermore, the difference curvature, which can effectively distinguish between edges and noise, is incorporated into the denoising algorithm to determine the diffusion process by adaptively adjusting the size of the diffusion coefficient. The experimental results show that the proposed algorithm can not only preserve edges and texture details but also avoid isolated speckles and the staircase effect while filtering out noise. The proposed algorithm also performs better for images with abundant details. Additionally, the subjective visual quality and objective evaluation index of the denoised image obtained by the proposed algorithm are higher than those from the related methods.

  4. OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain

    Science.gov (United States)

    Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem

    2017-01-01

    Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of the public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data through search queries. Search over encrypted data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and the search criteria. A few schemes that extend the notion of exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries, which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations. PMID:28692697
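
    OS2 relies on an encrypted Bloom filter index; the underlying (unencrypted) Bloom filter mechanics can be sketched as follows, as a generic illustration rather than the OS2 construction:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: supports insertion and probabilistic membership tests."""
    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, bytearray(size)

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

index = BloomFilter()
index.add("diabetes")
print(index.might_contain("diabetes"), index.might_contain("asthma"))  # True, (almost surely) False
```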

  5. AREAL FEATURE MATCHING BASED ON SIMILARITY USING CRITIC METHOD

    Directory of Open Access Journals (Sweden)

    J. Kim

    2015-10-01

    Full Text Available In this paper, we propose an areal feature matching method that can be applied for many-to-many matching, which involves matching a simple entity with an aggregate of several polygons or two aggregates of several polygons with fewer user intervention. To this end, an affine transformation is applied to two datasets by using polygon pairs for which the building name is the same. Then, two datasets are overlaid with intersected polygon pairs that are selected as candidate matching pairs. If many polygons intersect at this time, we calculate the inclusion function between such polygons. When the value is more than 0.4, many of the polygons are aggregated as single polygons by using a convex hull. Finally, the shape similarity is calculated between the candidate pairs according to the linear sum of the weights computed in CRITIC method and the position similarity, shape ratio similarity, and overlap similarity. The candidate pairs for which the value of the shape similarity is more than 0.7 are determined as matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map in South Korea. As a result, the visual evaluation showed two polygons that had been well detected by using the proposed method. The statistical evaluation indicates that the proposed method is accurate when using our test dataset with a high F-measure of 0.91.

  6. Analysis and Modeling of Time-Correlated Characteristics of Rainfall-Runoff Similarity in the Upstream Red River Basin

    Directory of Open Access Journals (Sweden)

    Xiuli Sang

    2012-01-01

    Full Text Available We constructed a similarity model (based on Euclidean distance) between rainfall and runoff to study the time-correlated characteristics of rainfall-runoff similar patterns in the upstream Red River Basin and presented a detailed evaluation of the time correlation of rainfall-runoff similarity. The rainfall-runoff similarity was used to determine the optimum similarity. The results showed that a time-correlated model is capable of predicting the rainfall-runoff similarity in the upstream Red River Basin in a satisfactory way. Both noised time series and time series denoised by thresholding the wavelet coefficients were applied to verify the accuracy of the model. The corresponding optimum similar sets, obtained as the equation solution conditions, showed an interesting and stable trend. On the whole, the annual mean similarity presented a gradually rising trend, quantitatively estimating the comprehensive influence of climate change and of human activities on rainfall-runoff similarity.

  7. In the pursuit of a semantic similarity metric based on UMLS annotations for articles in PubMed Central Open Access.

    Science.gov (United States)

    Garcia Castro, Leyla Jael; Berlanga, Rafael; Garcia, Alexander

    2015-10-01

    Although full-text articles are provided by the publishers in electronic formats, it remains a challenge to find related work beyond the title and abstract context. Identifying related articles based on their abstracts is indeed a good starting point; this process is straightforward and does not consume as many resources as full-text-based similarity would require. However, further analyses may require in-depth understanding of the full content, and two articles with highly related abstracts can be substantially different regarding the full content. How similarity differs when considering title-and-abstract versus full text, and which semantic similarity metric provides better results when dealing with full-text articles, are the main issues addressed in this manuscript. We have benchmarked three similarity metrics, BM25, PMRA, and Cosine, in order to determine which one performs best when using concept-based annotations on full-text documents. We also evaluated variations in similarity values based on title-and-abstract against those relying on full text. Our test dataset comprises the Genomics track article collection from the 2005 Text Retrieval Conference. Initially, we used entity recognition software to semantically annotate titles and abstracts as well as full text with concepts defined in the Unified Medical Language System (UMLS®). For each article, we created a document profile, i.e., a set of identified concepts, term frequency, and inverse document frequency; we then applied the various similarity metrics to those document profiles. We considered correlation, precision, recall, and F1 in order to determine which similarity metric performs best with concept-based annotations. For those full-text articles available in PubMed Central Open Access (PMC-OA), we also performed dispersion analyses in order to understand how similarity varies when considering full-text articles. We have found that the PubMed Related Articles similarity metric is the most suitable for
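
    To make the comparison of concept-based document profiles concrete, here is a minimal sketch of cosine similarity over TF-IDF-weighted concept profiles; the concept identifiers and weights are invented for illustration, and the other benchmarked metrics (BM25, PMRA) are not reproduced.

```python
import math

def cosine(profile_a, profile_b):
    """Cosine similarity between two {concept_id: tf-idf weight} profiles."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[c] * profile_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy UMLS-style concept profiles (identifiers and weights are illustrative only).
doc1 = {"C0017337": 2.1, "C0596611": 0.8, "C0012634": 1.3}
doc2 = {"C0017337": 1.7, "C0012634": 0.9, "C0033684": 0.4}
print(round(cosine(doc1, doc2), 3))
```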

  8. Graphical user interface based computer simulation of self-similar modes of a paraxial slow self-focusing laser beam for saturating plasma nonlinearities

    International Nuclear Information System (INIS)

    Batra, Karuna; Mitra, Sugata; Subbarao, D.; Sharma, R.P.; Uma, R.

    2005-01-01

    The present study investigates self-similarity in a self-focusing laser beam, both theoretically and numerically, using a graphical-user-interface-based interactive computer simulation model in MATLAB (matrix laboratory) software, in the presence of saturating ponderomotive-force-based and relativistic electron-quiver-based plasma nonlinearities. The corresponding eigenvalue problem is solved analytically using the standard eikonal formalism, and the underlying dynamics of self-focusing is dictated by the corrected paraxial theory for slow self-focusing. The results are also compared with a computer simulation of self-focusing by direct fast-Fourier-transform-based spectral methods. It is found that the self-similar solution obtained analytically oscillates around the true numerical solution, equating it at regular intervals. The simulation results are the main ones, although a feasible semianalytical theory under many assumptions is given to understand the process. The self-similar profiles are called self-organized profiles (not in a strict sense); they are found to be close to Laguerre-Gaussian curves for all the modes, with the shape being conserved. This terminology is chosen because it has already been shown from a phase-space analysis that the width of an initially Gaussian beam undergoes periodic oscillations that are damped when any absorption is added to the model, i.e., the beam width converges to a constant value. The paper also tabulates the specific values of the normalized phase shift for solutions decaying to zero at large transverse distances for the first three modes, which can, however, be extended to higher-order modes.

  9. An efficient similarity measure for content based image retrieval using memetic algorithm

    Directory of Open Access Journals (Sweden)

    Mutasem K. Alsmadi

    2017-06-01

    Full Text Available Content-based image retrieval (CBIR) systems work by retrieving images that are related to the query image (QI) from huge databases. The available CBIR systems extract limited feature sets, which confines retrieval efficacy. In this work, extensive, robust and important features were extracted from the image database and then stored in a feature repository. This feature set is composed of a color signature together with shape and color-texture features; features are extracted from the given QI in the same fashion. A novel similarity evaluation using a meta-heuristic algorithm called a memetic algorithm (a genetic algorithm with great deluge) is then performed between the features of the QI and the features of the database images. Our proposed CBIR system is assessed by querying a number of images (from the test dataset), and the efficiency of the system is evaluated by calculating precision-recall values for the results. The results were superior to other state-of-the-art CBIR systems with regard to precision.

  10. Evidence for similar patterns of neural activity elicited by picture- and word-based representations of natural scenes.

    Science.gov (United States)

    Kumar, Manoj; Federmeier, Kara D; Fei-Fei, Li; Beck, Diane M

    2017-07-15

    A long-standing core question in cognitive science is whether different modalities and representation types (pictures, words, sounds, etc.) access a common store of semantic information. Although different input types have been shown to activate a shared network of brain regions, this does not necessitate that there is a common representation, as the neurons in these regions could still differentially process the different modalities. However, multi-voxel pattern analysis can be used to assess whether, e.g., pictures and words evoke a similar pattern of activity, such that the patterns that separate categories in one modality transfer to the other. Prior work using this method has found support for a common code, but has two limitations: prior studies have either examined only disparate categories (e.g. animals vs. tools) that are known to activate different brain regions, raising the possibility that the pattern separation and inferred similarity reflect only large-scale differences between the categories, or they have been limited to individual object representations. By using natural scene categories, we not only extend the current literature on cross-modal representations beyond objects, but also, because natural scene categories activate a common set of brain regions, we identify a more fine-grained (i.e. higher spatial resolution) common representation. Specifically, we studied picture- and word-based representations of natural scene stimuli from four different categories: beaches, cities, highways, and mountains. Participants passively viewed blocks of either phrases (e.g. "sandy beach") describing scenes or photographs from those same scene categories. To determine whether the phrases and pictures evoke a common code, we asked whether a classifier trained on one stimulus type (e.g. phrase stimuli) would transfer (i.e. cross-decode) to the other stimulus type (e.g. picture stimuli). The analysis revealed cross-decoding in the occipitotemporal, posterior parietal and
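
    A hedged sketch of the cross-decoding logic: train a linear classifier on patterns evoked by one stimulus type and test it on the other. The arrays below are random stand-ins for voxel patterns, not the authors' data or analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_voxels, n_categories = 80, 200, 4   # beaches, cities, highways, mountains

# Simulated voxel patterns: a shared category signal plus modality-specific noise.
signal = rng.normal(size=(n_categories, n_voxels))
labels = rng.integers(0, n_categories, size=n_trials)
phrases = signal[labels] + rng.normal(scale=2.0, size=(n_trials, n_voxels))
pictures = signal[labels] + rng.normal(scale=2.0, size=(n_trials, n_voxels))

# Cross-decoding: fit on phrase-evoked patterns, evaluate on picture-evoked patterns.
clf = LogisticRegression(max_iter=1000).fit(phrases, labels)
print("cross-decoding accuracy:", clf.score(pictures, labels))
```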

  11. Similarity, Clustering, and Scaling Analyses for the Foreign Exchange Market ---Comprehensive Analysis on States of Market Participants with High-Frequency Financial Data---

    Science.gov (United States)

    Sato, A.; Sakai, H.; Nishimura, M.; Holyst, J. A.

    This article proposes mathematical methods to quantify the states of market participants in the foreign exchange market (FX market) and to conduct a comprehensive analysis of the behavior of market participants by means of high-frequency financial data. Based on econophysics tools and perspectives, we study similarity measures for both rate movements and quotation activities among various currency pairs. We also perform a clustering analysis of market states for observation days, and we find a scaling relationship between the mean values of quotation activities and their standard deviations. Using these mathematical methods we can visualize the states of the FX market comprehensively. Finally, we conclude that the states of market participants vary over time due to both external and internal factors.
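
    The similarity-and-clustering workflow sketched in the abstract can be illustrated roughly as follows: a correlation-based distance between return series of currency pairs, followed by hierarchical clustering. The return series below are synthetic placeholders, not the high-frequency data used in the article, and the particular distance is one common choice rather than the authors' exact measure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
pairs = ["EUR/USD", "GBP/USD", "USD/JPY", "EUR/JPY"]
returns = rng.normal(size=(len(pairs), 500))   # placeholder log-return series
returns[1] += 0.7 * returns[0]                 # make two pairs co-move

corr = np.corrcoef(returns)
distance = np.sqrt(2.0 * (1.0 - corr))         # a common correlation-based distance

# Condensed upper-triangle distance vector, then average-linkage clustering.
iu = np.triu_indices(len(pairs), k=1)
clusters = fcluster(linkage(distance[iu], method="average"), t=2, criterion="maxclust")
print(dict(zip(pairs, clusters)))
```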

  12. Similarity analyses of chromatographic herbal fingerprints: A review

    International Nuclear Information System (INIS)

    Goodarzi, Mohammad; Russell, Paul J.; Vander Heyden, Yvan

    2013-01-01

    Graphical abstract: -- Highlights: •Similarity analyses of herbal fingerprints are reviewed. •Different (dis)similarity approaches are discussed. •(Dis)similarity-metrics and exploratory-analysis approaches are illustrated. •Correlation and distance-based measures are overviewed. •Similarity analyses illustrated by several case studies. -- Abstract: Herbal medicines are again becoming more popular in developed countries because they are “natural”, and people thus often assume that they are inherently safe. Herbs have also been used worldwide for many centuries in traditional medicine. Concern about their safety and efficacy has grown with increasing Western interest. Herbal materials and their extracts are very complex, often including hundreds of compounds. A thorough understanding of their chemical composition is essential for conducting a safety risk assessment. However, herbal material can show considerable variability: the chemical constituents and their amounts in a herb can differ due to growing conditions, such as climate and soil, the drying process, the harvest season, etc. Among the analytical methods, chromatographic fingerprinting has been recommended as a potential and reliable methodology for the identification and quality control of herbal medicines. Identification is needed to avoid fraud and adulteration. Currently, analyzing chromatographic herbal fingerprint data sets has become one of the most applied tools in the quality assessment of herbal materials. Mostly, the entire chromatographic profiles are used to identify or to evaluate the quality of the herbs investigated; occasionally only a limited number of compounds are considered. One approach to the safety risk assessment is to determine whether the herbal material is substantially equivalent to material that is either readily consumed in the diet, has a history of application, or has earlier been commercialized, i.e., to what is considered the reference material. In order

  13. Similarity analyses of chromatographic herbal fingerprints: A review

    Energy Technology Data Exchange (ETDEWEB)

    Goodarzi, Mohammad [Department of Analytical Chemistry and Pharmaceutical Technology, Center for Pharmaceutical Research, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels (Belgium); Russell, Paul J. [Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ (United Kingdom); Vander Heyden, Yvan, E-mail: yvanvdh@vub.ac.be [Department of Analytical Chemistry and Pharmaceutical Technology, Center for Pharmaceutical Research, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels (Belgium)

    2013-12-04

    Graphical abstract: -- Highlights: •Similarity analyses of herbal fingerprints are reviewed. •Different (dis)similarity approaches are discussed. •(Dis)similarity-metrics and exploratory-analysis approaches are illustrated. •Correlation and distance-based measures are overviewed. •Similarity analyses illustrated by several case studies. -- Abstract: Herbal medicines are again becoming more popular in developed countries because they are “natural”, and people thus often assume that they are inherently safe. Herbs have also been used worldwide for many centuries in traditional medicine. Concern about their safety and efficacy has grown with increasing Western interest. Herbal materials and their extracts are very complex, often including hundreds of compounds. A thorough understanding of their chemical composition is essential for conducting a safety risk assessment. However, herbal material can show considerable variability: the chemical constituents and their amounts in a herb can differ due to growing conditions, such as climate and soil, the drying process, the harvest season, etc. Among the analytical methods, chromatographic fingerprinting has been recommended as a potential and reliable methodology for the identification and quality control of herbal medicines. Identification is needed to avoid fraud and adulteration. Currently, analyzing chromatographic herbal fingerprint data sets has become one of the most applied tools in the quality assessment of herbal materials. Mostly, the entire chromatographic profiles are used to identify or to evaluate the quality of the herbs investigated; occasionally only a limited number of compounds are considered. One approach to the safety risk assessment is to determine whether the herbal material is substantially equivalent to material that is either readily consumed in the diet, has a history of application, or has earlier been commercialized, i.e., to what is considered the reference material. In order
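
    Since the review covers both correlation- and distance-based (dis)similarity measures for chromatographic profiles, a minimal sketch of the two is given here; the fingerprints are invented vectors standing in for aligned chromatograms.

```python
import numpy as np

def fingerprint_similarity(profile_a, profile_b):
    """Correlation similarity and Euclidean dissimilarity of aligned chromatograms."""
    a = np.asarray(profile_a, dtype=float)
    b = np.asarray(profile_b, dtype=float)
    correlation = np.corrcoef(a, b)[0, 1]
    distance = np.linalg.norm(a - b)
    return correlation, distance

reference = [0.1, 0.8, 2.5, 0.4, 1.9, 0.2]   # toy reference herbal fingerprint
sample = [0.2, 0.7, 2.3, 0.5, 2.1, 0.1]      # toy sample under quality assessment
corr, dist = fingerprint_similarity(reference, sample)
print(f"correlation = {corr:.3f}, Euclidean distance = {dist:.3f}")
```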

  14. Geographical markers for Saccharomyces cerevisiae strains with similar technological origins domesticated for rice-based ethnic fermented beverages production in North East India.

    Science.gov (United States)

    Jeyaram, Kumaraswamy; Tamang, Jyoti Prakash; Capece, Angela; Romano, Patrizia

    2011-11-01

    Autochthonous strains of Saccharomyces cerevisiae from traditional starters used for the production of rice-based ethnic fermented beverages in North East India were examined for their genetic polymorphism using mitochondrial DNA-RFLP and electrophoretic karyotyping. Mitochondrial DNA-RFLP analysis of S. cerevisiae strains with similar technological origins, from the hamei starter of Manipur and the marcha starter of Sikkim, revealed widely separated clusters based on their geographical origin. Electrophoretic karyotyping showed high polymorphism amongst the hamei strains within a similar mitochondrial DNA-RFLP cluster, and one unique karyotype of the marcha strains was widely distributed in the Sikkim-Himalayan region. We conceptualized the possibility of separate domestication events for hamei strains in Manipur (located in the Indo-Burma biodiversity hotspot) and marcha strains in Sikkim (located in the Himalayan biodiversity hotspot), based on the low homogeneity in genomic structure between these two groups, their clear separation by geographical rather than technological origin, and the low strain-level diversity within each group. The molecular markers developed based on the HinfI mtDNA-RFLP profile and the chromosomal doublets at the chromosome VIII position of the Sikkim-Himalayan strains could be effectively used as geographical markers for authenticating the above starter strains and differentiating them from other commercial strains.

  15. Self-similar continued root approximants

    International Nuclear Information System (INIS)

    Gluzman, S.; Yukalov, V.I.

    2012-01-01

    A novel method of summing asymptotic series is advanced. Such series repeatedly arise when employing perturbation theory in powers of a small parameter for complicated problems of condensed matter physics, statistical physics, and various applied problems. The method is based on the self-similar approximation theory involving self-similar root approximants. The constructed self-similar continued roots extrapolate asymptotic series to finite values of the expansion parameter. The self-similar continued roots contain, as a particular case, continued fractions and Padé approximants. A theorem on the convergence of the self-similar continued roots is proved. The method is illustrated by several examples from condensed-matter physics.

  16. Visualizing acoustic similarities between emotions in speech: an acoustic map of emotions

    NARCIS (Netherlands)

    Truong, K.P.; Leeuwen, D.A. van

    2007-01-01

    In this paper, we introduce a visual analysis method to assess the discriminability and confusability between emotions according to automatic emotion classifiers. The degree of acoustic similarity between emotions can be defined in terms of distances that are based on pair-wise emotion

  17. A self-similar hierarchy of the Korean stock market

    Science.gov (United States)

    Lim, Gyuchang; Min, Seungsik; Yoo, Kun-Woo

    2013-01-01

    A scaling analysis is performed on the market values of stocks listed on the Korean stock exchanges, the KOSPI and the KOSDAQ. In contrast to previous studies on price fluctuations, market capitalizations are dealt with in this work. First, we show that the combined set of the two stock exchanges exhibits a clear rank-size distribution, i.e., Zipf's law, just as each separate exchange does. Second, by abstracting Zipf's law as a γ-sequence, we define a self-similar hierarchy consisting of many levels, with the numbers of firms at each level forming a geometric sequence. We also use two exponential functions to describe the hierarchy and derive a scaling law from them. Lastly, we propose a self-similar hierarchical process and perform an empirical analysis on our data set. Based on our findings, we argue that all money invested in the stock market is distributed in a hierarchical way and that a slight difference exists between the two exchanges.
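
    A rank-size (Zipf) analysis of the kind described can be sketched as follows: sort market capitalizations in descending order and fit a power law to the log-log rank-size relation. The capitalization values below are synthetic stand-ins for the KOSPI/KOSDAQ data.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic market capitalizations drawn from a heavy-tailed (Pareto) distribution.
market_caps = rng.pareto(a=1.0, size=1000) + 1.0

sizes = np.sort(market_caps)[::-1]            # descending: largest firm first
ranks = np.arange(1, sizes.size + 1)

# Zipf's law predicts size ~ rank**(-gamma); estimate gamma on the log-log scale.
slope, _ = np.polyfit(np.log(ranks), np.log(sizes), deg=1)
print(f"estimated Zipf exponent: {-slope:.2f}")
```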

  18. The shareholding similarity of the shareholders of the worldwide listed energy companies based on a two-mode primitive network and a one-mode derivative holding-based network

    Science.gov (United States)

    Li, Huajiao; Fang, Wei; An, Haizhong; Yan, LiLi

    2014-12-01

    Two-mode and multi-mode networks represent new directions in simulating complex networks, allowing the relationships among entities to be modeled more precisely. In this paper, we constructed networks at two different levels: one is the two-mode primitive network of the listed energy companies and their shareholders, built on the two-mode method of complex network theory, and the other is the derivative one-mode holding-based network, built on equivalence network theory. We calculated a topological characteristic for each of the two networks, namely the out-degree of the actor nodes of the two-mode network (9003 nodes) and the weights of the edges of the one-mode network (619,766 edges), and we analyzed the distribution features of both characteristics. We define both the weighted and un-weighted Shareholding Similarity Coefficient and, using the data of the worldwide listed energy companies and their shareholders as empirical study subjects, we calculated and compared both coefficients for the worldwide listed energy companies. The results of the analysis indicate that (1) both the out-degree of the actor nodes of the two-mode network and the weights of the edges of the one-mode network follow a power-law distribution; (2) there are significant differences between the weighted and un-weighted shareholding similarity coefficients of the worldwide listed energy companies, and the weighted shareholding similarity coefficient shows greater regularity than the un-weighted one; (3) the vast majority of shareholders hold stock in only one or a few of the listed energy companies; and (4) shareholders hold stock in the same listed energy companies when the value of the un-weighted shareholding similarity coefficient is between 0.4 and 0.8. The study will be a helpful tool to analyze the relationships of the nodes of the one-mode network, which is constructed based
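
    The abstract does not give the exact formula for the Shareholding Similarity Coefficient, so the sketch below uses assumed forms: an un-weighted coefficient based on the overlap of shareholder sets (Jaccard-style) and a weighted variant based on holding sizes (cosine over holding vectors). Both formulas and the toy holdings are illustrative assumptions.

```python
import math

# Toy holdings: company -> {shareholder: fraction of shares held}.
holdings = {
    "EnergyCoA": {"FundX": 0.12, "FundY": 0.05, "BankZ": 0.02},
    "EnergyCoB": {"FundX": 0.08, "BankZ": 0.03},
}

def unweighted_similarity(a, b):
    """Jaccard-style overlap of the two companies' shareholder sets (assumed form)."""
    sa, sb = set(holdings[a]), set(holdings[b])
    return len(sa & sb) / len(sa | sb)

def weighted_similarity(a, b):
    """Cosine similarity over holding-size vectors (assumed weighted form)."""
    ha, hb = holdings[a], holdings[b]
    dot = sum(ha[s] * hb[s] for s in set(ha) & set(hb))
    norm_a = math.sqrt(sum(v * v for v in ha.values()))
    norm_b = math.sqrt(sum(v * v for v in hb.values()))
    return dot / (norm_a * norm_b)

print(round(unweighted_similarity("EnergyCoA", "EnergyCoB"), 3),
      round(weighted_similarity("EnergyCoA", "EnergyCoB"), 3))
```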

  19. Female choice for male cuticular hydrocarbon profile in decorated crickets is not based on similarity to their own profile.

    Science.gov (United States)

    Steiger, S; Capodeanu-Nägler, A; Gershman, S N; Weddle, C B; Rapkin, J; Sakaluk, S K; Hunt, J

    2015-12-01

    Indirect genetic benefits derived from female mate choice comprise additive (good genes) and nonadditive genetic benefits (genetic compatibility). Although good genes can be revealed by condition-dependent display traits, the mechanism by which compatibility alleles are detected is unclear because evaluation of the genetic similarity of a prospective mate requires the female to assess the genotype of the male and compare it to her own. Cuticular hydrocarbons (CHCs), lipids coating the exoskeleton of most insects, influence female mate choice in a number of species and offer a way for females to assess genetic similarity of prospective mates. Here, we determine whether female mate choice in decorated crickets is based on male CHCs and whether it is influenced by females' own CHC profiles. We used multivariate selection analysis to estimate the strength and form of selection acting on male CHCs through female mate choice, and employed different measures of multivariate dissimilarity to determine whether a female's preference for male CHCs is based on similarity to her own CHC profile. Female mating preferences were significantly influenced by CHC profiles of males. Male CHC attractiveness was not, however, contingent on the CHC profile of the choosing female, as certain male CHC phenotypes were equally attractive to most females, evidenced by significant linear and stabilizing selection gradients. These results suggest that additive genetic benefits, rather than nonadditive genetic benefits, accrue to female mate choice, in support of earlier work showing that CHC expression of males, but not females, is condition dependent. © 2015 European Society For Evolutionary Biology.

  20. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-06-01

    The World Wide Web is a source of information available in the form of interlinked web pages. However, the procedure of extracting significant information with the assistance of a search engine is extremely critical. This is because web information is written mainly in natural language and is primarily intended for individual human readers. Several efforts have been made in semantic similarity computation between documents using words, concepts and concept relationships, but the available outcomes are still not up to user requirements. This paper proposes a novel technique for the computation of semantic similarity between documents that takes into account not only the concepts present in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records; each such record is made up of the probable words that represent a given concept. Finally, the document ontologies are compared to find their semantic similarity, taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  1. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-03-01

    The World Wide Web is a source of information available in the form of interlinked web pages. However, the procedure of extracting significant information with the assistance of a search engine is extremely critical. This is because web information is written mainly in natural language and is primarily intended for individual human readers. Several efforts have been made in semantic similarity computation between documents using words, concepts and concept relationships, but the available outcomes are still not up to user requirements. This paper proposes a novel technique for the computation of semantic similarity between documents that takes into account not only the concepts present in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records; each such record is made up of the probable words that represent a given concept. Finally, the document ontologies are compared to find their semantic similarity, taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  2. Research on electric and thermal characteristics of plasma torch based on similarity theory

    International Nuclear Information System (INIS)

    Cheng Changming; Tang Deli; Lan Wei

    2007-01-01

    The configuration and working principle of a DC non-transferred plasma torch are introduced. Based on similarity theory, the connections between the electric and thermal characteristics and operational parameters such as gas flow rate and arc power are investigated. Calculations and experiments are compared, and the results indicate that the calculated results agree with the experimental ones. The formulas can be used for plasma torch improvement and optimization. (authors)

  3. Fast protein tertiary structure retrieval based on global surface shape similarity.

    Science.gov (United States)

    Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke

    2008-09-01

    Characterization and identification of similar tertiary structures of proteins provide rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain-based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation, as defined by combinatorial extension, within the top five closest structures in 89.6% of the cases. Real-time protein structure search by the 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.

  4. Structure and sensitivity analysis of individual-based predator–prey models

    International Nuclear Information System (INIS)

    Imron, Muhammad Ali; Gergs, Andre; Berger, Uta

    2012-01-01

    The expensive computational cost of sensitivity analyses has hampered the use of these techniques for analysing individual-based models in ecology. A screening technique with relatively cheap computational cost, referred to as the Morris method, was chosen to assess the relative effects of all parameters on the models' outputs and to gain insights into predator–prey systems. The structure and results of the sensitivity analyses of the Sumatran tiger model (the Panthera Population Persistence, PPP) and the Notonecta foraging model (NFM) were compared. Both models are based on a general predation cycle and designed to understand the mechanisms behind the predator–prey interaction being considered. However, the models differ significantly in their complexity and the details of the processes involved. In the sensitivity analysis, parameters that directly contribute to the number of prey items killed were found to be most influential. These were the growth rate of prey and the hunting radius of tigers in the PPP model, as well as the attack rate parameters and encounter distance of backswimmers in the NFM model. Analysis of distances in both models revealed further similarities in the sensitivity of the two individual-based models. The findings highlight the applicability and importance of sensitivity analyses in general, and screening design methods in particular, during the early development of ecological individual-based models. Comparison of model structures and sensitivity analyses provides a first step for the derivation of general rules in the design of predator–prey models for both practical conservation and conceptual understanding. - Highlights: ► Structure of predation processes is similar in the tiger and backswimmer models. ► The two individual-based models (IBM) differ in their space formulations. ► In both models, foraging distance is among the sensitive parameters. ► The Morris method is applicable for the sensitivity analysis even of complex IBMs.
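
    As a rough, self-contained sketch of the Morris screening idea (not the authors' implementation and not tied to either model), the code below computes elementary effects for a toy two-parameter model: perturb one parameter at a time along random trajectories and summarize the mean and standard deviation of the resulting effects.

```python
import numpy as np

def toy_model(params):
    """Stand-in for an individual-based model output, e.g. number of prey items killed."""
    growth_rate, hunting_radius = params
    return growth_rate * 10.0 + hunting_radius ** 2

def morris_screening(model, bounds, n_trajectories=50, delta=0.1, seed=0):
    """Simplified Morris elementary-effects screening (one-at-a-time perturbations)."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    k = len(bounds)

    def scale(u):
        # Map a point in the unit hypercube onto the parameter ranges.
        return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

    effects = [[] for _ in range(k)]
    for _ in range(n_trajectories):
        base = rng.uniform(0.0, 1.0 - delta, size=k)
        for i in range(k):
            step = base.copy()
            step[i] += delta
            effects[i].append((model(scale(step)) - model(scale(base))) / delta)
    effects = np.asarray(effects)
    return effects.mean(axis=1), effects.std(axis=1)   # mu and sigma per parameter

mu, sigma = morris_screening(toy_model, bounds=[(0.1, 2.0), (1.0, 5.0)])
print("mean elementary effects:", np.round(mu, 2))
print("std of elementary effects:", np.round(sigma, 2))
```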

  5. AR(p) -based detrended fluctuation analysis

    Science.gov (United States)

    Alvarez-Ramirez, J.; Rodriguez, E.

    2018-07-01

    Autoregressive models are commonly used for modeling time series from nature, economics and finance. This work explored simple autoregressive AR(p) models for removing long-term trends in detrended fluctuation analysis (DFA). Crude oil prices and the bitcoin exchange rate were considered, the former corresponding to a mature market and the latter to an emergent market. Results showed that AR(p)-based DFA performs similarly to traditional DFA. However, the former provides information on the stability of long-term trends, which is valuable for understanding and quantifying the dynamics of complex time series from financial systems.
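
    One plausible reading of the procedure is that the local trend in each DFA window is estimated with an AR(p) fit instead of the usual polynomial; the sketch below implements that reading with a least-squares AR fit and is an illustrative simplification, not the authors' exact method.

```python
import numpy as np

def ar_detrended_fluctuation(signal, window_sizes, p=2):
    """DFA-style fluctuation scaling where each window is detrended with an AR(p) fit."""
    x = np.asarray(signal, dtype=float)
    profile = np.cumsum(x - x.mean())          # integrated (profile) series, as in DFA
    fluctuations = []
    for n in window_sizes:
        rms = []
        for start in range(0, profile.size - n + 1, n):
            w = profile[start:start + n]
            # Fit an AR(p) model to the window by least squares on lagged values.
            lags = np.column_stack([w[p - i - 1:n - i - 1] for i in range(p)])
            X = np.column_stack([np.ones(len(lags)), lags])
            y = w[p:]
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            residuals = y - X @ coef
            rms.append(np.sqrt(np.mean(residuals ** 2)))
        fluctuations.append(np.mean(rms))
    # Scaling exponent: slope of log F(n) versus log n.
    return np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)[0]

rng = np.random.default_rng(3)
series = np.cumsum(rng.normal(size=4000))      # toy random-walk "price" series
print("scaling exponent:", round(ar_detrended_fluctuation(series, [16, 32, 64, 128, 256]), 2))
```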

  6. LDA-Based Unified Topic Modeling for Similar TV User Grouping and TV Program Recommendation.

    Science.gov (United States)

    Pyo, Shinjee; Kim, Eunhui; Kim, Munchurl

    2015-08-01

    Social TV is a social media service via TV and social networks through which TV users exchange their experiences about TV programs that they are viewing. For social TV service, two technical aspects are envisioned: grouping of similar TV users to create social TV communities and recommending TV programs based on group and personal interests for personalizing TV. In this paper, we propose a unified topic model based on grouping of similar TV users and recommending TV programs as a social TV service. The proposed unified topic model employs two latent Dirichlet allocation (LDA) models. One is a topic model of TV users, and the other is a topic model of the description words for viewed TV programs. The two LDA models are then integrated via a topic proportion parameter for TV programs, which enforces the grouping of similar TV users and associated description words for watched TV programs at the same time in a unified topic modeling framework. The unified model identifies the semantic relation between TV user groups and TV program description word groups so that more meaningful TV program recommendations can be made. The unified topic model also overcomes an item ramp-up problem such that new TV programs can be reliably recommended to TV users. Furthermore, from the topic model of TV users, TV users with similar tastes can be grouped as topics, which can then be recommended as social TV communities. To verify our proposed method of unified topic-modeling-based TV user grouping and TV program recommendation for social TV services, in our experiments, we used real TV viewing history data and electronic program guide data from a seven-month period collected by a TV poll agency. The experimental results show that the proposed unified topic model yields an average 81.4% precision for 50 topics in TV program recommendation and its performance is an average of 6.5% higher than that of the topic model of TV users only. For TV user prediction with new TV programs, the average

  7. Similarity score computation for minutiae-based fingerprint recognition

    CSIR Research Space (South Africa)

    Khanyile, NP

    2014-09-01

    Full Text Available This paper identifies and analyses the factors that contribute to the similarity between two sets of minutiae points as well as the probability that two sets of minutiae points were extracted from fingerprints of the same finger. Minutiae...

  8. Neutrosophic Cubic MCGDM Method Based on Similarity Measure

    Directory of Open Access Journals (Sweden)

    Surapati Pramanik

    2017-06-01

    Full Text Available The notion of a neutrosophic cubic set originates from the hybridization of the concepts of a neutrosophic set and an interval-valued neutrosophic set. We define a similarity measure for neutrosophic cubic sets and prove some of its basic properties.

  9. Similarity Analysis About The Training Of Family Health Strategy Professionals For The Psychosocial Care Of The Elderly

    Directory of Open Access Journals (Sweden)

    Verônica Lourdes Lima Batista Maia

    2017-08-01

    Full Text Available Background: Elderly mental health is an important topic of discussion in Brazilian public health because it involves factors related to the training of health professionals focused on these demands in the Family Health Strategy. Objectives: To perform a similarity analysis of the training of Family Health Strategy professionals for psychosocial care of the elderly. Methodology: Qualitative research carried out with 31 professionals from the Family Health Strategy in the city of Picos, Piauí, Brazil. Data were collected through a semi-structured interview script. The interviews were performed in a reserved room and recorded with the aid of an MP4 player. The data were processed with the IRAMUTEQ software and analyzed through similarity analysis, which is based on graph theory. Results: The study participants were 13 doctors and 18 nurses; 27 (87.09%) were female. The professionals' time since training ranged from 2 to 32 years, and their time in the Family Health Strategy from 1 to 16 years. According to the co-occurrence tree, the data indicate that the word "elderly" is at the heart of the ramifications and expresses how family and professionals can contribute to treatment; another representation demonstrated is that it is difficult for professionals to carry out their activities with the elderly due to a lack of training in the specific area of mental health. Conclusion: the family plays a fundamental role in the care of the elderly with psychosocial needs, and Family Health Strategy professionals have difficulties providing comprehensive care due to deficiencies in their training. Keywords: Mental health. Family Health. Elderly.

  10. SU-F-P-51: Similarity Analysis of the Linear Accelerator Machines Based On Clinical Simulation

    International Nuclear Information System (INIS)

    Li, K

    2016-01-01

    Purpose: To evaluate the clinical rationale for using the Truebeam and Trilogy Linac machines from Varian Medical Systems as exchangeable treatment modalities in the same radiation oncology department. Methods: Intensity Modulated Radiotherapy (IMRT) plans for different diseases were selected for this study. The disease sites included brain, head and neck, breast, lung, and prostate. The parameters selected for this study were the energy amount, Monitor Units (MU); the dose coverage of the target, reflected by the prescription isodose volume (PIV); the dose spillage, described by the volume of the 50% isodose line of the prescription; and the dose homogeneities, represented by the maximum dose (MaxD) and the minimum dose (MinD) of the target volume (TV) and critical structure (CS). Each percentage difference between the values of these parameters formed an element of a matrix, called the Similarity Comparison Matrix (SCM). The elements of the SCM were then reduced by a dimensional conversion algorithm, which was used to express the clinical similarity between two machines as a single value. Results: For the selected clinical cases in this study, the average percentage difference between Trilogy and Truebeam in MU was 0.28% with a standard deviation (SD) of 0.66%; in PIV it was 0.23% with SD 0.20%; in the volume at 50% of the prescription dose it was 0.31% with SD 0.78%; in MaxD of the TV it was 0.26% with SD 0.35%; in MinD of the TV it was −0.04% with SD 0.51%; in MaxD of the CS it was −0.53% with SD 0.92%; and in MinD of the CS it was 3.31% with SD 2.89%. The sum, product, geometric mean and harmonic mean of the matrix elements were 19.0%, 0.00%, 0.19%, and 0.00%. Conclusion: A method to compare two machines at the clinical level was developed, and reference values were calculated for decision-making in clinical practice; this strategy could be extended to different clinical applications.

  11. SU-F-P-51: Similarity Analysis of the Linear Accelerator Machines Based On Clinical Simulation

    Energy Technology Data Exchange (ETDEWEB)

    Li, K [Associates In Medical Physics, Lanham, MD (United States); John R Marsh Cancer Center (United States)

    2016-06-15

    Purpose: To evaluate the clinical rationale for using the Truebeam and Trilogy Linac machines from Varian Medical Systems as exchangeable treatment modalities in the same radiation oncology department. Methods: Intensity Modulated Radiotherapy (IMRT) plans for different diseases were selected for this study. The disease sites included brain, head and neck, breast, lung, and prostate. The parameters selected for this study were the energy amount, Monitor Units (MU); the dose coverage of the target, reflected by the prescription isodose volume (PIV); the dose spillage, described by the volume of the 50% isodose line of the prescription; and the dose homogeneities, represented by the maximum dose (MaxD) and the minimum dose (MinD) of the target volume (TV) and critical structure (CS). Each percentage difference between the values of these parameters formed an element of a matrix, called the Similarity Comparison Matrix (SCM). The elements of the SCM were then reduced by a dimensional conversion algorithm, which was used to express the clinical similarity between two machines as a single value. Results: For the selected clinical cases in this study, the average percentage difference between Trilogy and Truebeam in MU was 0.28% with a standard deviation (SD) of 0.66%; in PIV it was 0.23% with SD 0.20%; in the volume at 50% of the prescription dose it was 0.31% with SD 0.78%; in MaxD of the TV it was 0.26% with SD 0.35%; in MinD of the TV it was −0.04% with SD 0.51%; in MaxD of the CS it was −0.53% with SD 0.92%; and in MinD of the CS it was 3.31% with SD 2.89%. The sum, product, geometric mean and harmonic mean of the matrix elements were 19.0%, 0.00%, 0.19%, and 0.00%. Conclusion: A method to compare two machines at the clinical level was developed, and reference values were calculated for decision-making in clinical practice; this strategy could be extended to different clinical applications.
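
    The Similarity Comparison Matrix idea can be sketched roughly as below: compute the percentage difference of each plan parameter between the two machines and summarize the resulting elements. The parameter values are invented, and the "dimensional conversion" is reduced here to simple summary statistics, which is an assumption rather than the authors' algorithm.

```python
import numpy as np

# Invented plan parameters for one case on each machine (MU, PIV, MaxD, MinD).
trilogy = np.array([512.0, 101.3, 62.1, 57.9])
truebeam = np.array([510.5, 101.1, 62.3, 57.6])

# Elements of the Similarity Comparison Matrix: percentage differences per parameter.
scm = 100.0 * (trilogy - truebeam) / truebeam

summary = {"sum": round(scm.sum(), 3), "mean": round(scm.mean(), 3), "std": round(scm.std(), 3)}
print(np.round(scm, 3), summary)
```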

  12. Self-similar factor approximants

    International Nuclear Information System (INIS)

    Gluzman, S.; Yukalov, V.I.; Sornette, D.

    2003-01-01

    The problem of reconstructing functions from their asymptotic expansions in powers of a small variable is addressed by deriving an improved type of approximants. The derivation is based on the self-similar approximation theory, which presents the passage from one approximant to another as the motion realized by a dynamical system with the property of group self-similarity. The derived approximants, because of their form, are called self-similar factor approximants. These complement the earlier obtained self-similar exponential approximants and self-similar root approximants. The specific feature of self-similar factor approximants is that their control functions, providing convergence of the computational algorithm, are completely defined from the accuracy-through-order conditions. These approximants contain the Padé approximants as a particular case, and in some limit they can be reduced to the self-similar exponential approximants previously introduced by two of us. It is proved that the self-similar factor approximants are able to reproduce exactly a wide class of functions, which includes a variety of nonalgebraic functions. For other functions, not pertaining to this exactly reproducible class, the factor approximants provide very accurate approximations, whose accuracy surpasses significantly that of the most accurate Padé approximants. This is illustrated by a number of examples showing the generality and accuracy of the factor approximants even when conventional techniques meet serious difficulties.

  13. A touch-probe path generation method through similarity analysis between the feature vectors in new and old models

    Energy Technology Data Exchange (ETDEWEB)

    Jeon, Hye Sung; Lee, Jin Won; Yang, Jeong Sam [Dept. of Industrial Engineering, Ajou University, Suwon (Korea, Republic of)

    2016-10-15

    On-machine measurement (OMM), which measures a workpiece during or after the machining process in the machining center, has the advantage of measuring the workpiece directly within the work space without moving it. However, the path-generation procedure used to determine the measuring sequence and variables for the complex features of a target workpiece has the limitation of requiring time-consuming tasks to generate the measuring points, and it relies mostly on the proficiency of the on-site engineer. In this study, we propose a touch-probe path generation method for OMM that uses similarity analysis between the feature vectors of three-dimensional (3-D) shapes. For the similarity analysis between a new 3-D model and existing 3-D models, we extract feature vectors that describe the characteristics of a geometric shape model; these feature vectors are then applied to a geometric histogram that displays a probability distribution obtained by the similarity analysis algorithm. In addition, we developed a computer-aided inspection planning system that corrects non-applied measuring points caused by minute geometry differences between the two models and generates the final touch-probe path.

  14. Gas load forecasting based on optimized fuzzy c-mean clustering analysis of selecting similar days

    Directory of Open Access Journals (Sweden)

    Qiu Jing

    2017-08-01

    Full Text Available Traditional fuzzy c-means (FCM) clustering in short-term load forecasting easily falls into local optima and is sensitive to the initial cluster centers. In this paper, we propose to use the global search feature of the particle swarm optimization (PSO) algorithm to avoid these shortcomings, and to use the optimized FCM to select days similar to the forecast day as training samples for support vector machines. This not only strengthens the regularity of the training samples, but also ensures the consistency of the data characteristics. Experimental results show that the prediction accuracy of this model is better than that of the BP neural network and support vector machine (SVM) algorithms.

  15. Creating Usage Context-Based Object Similarities to Boost Recommender Systems in Technology Enhanced Learning

    Science.gov (United States)

    Niemann, Katja; Wolpers, Martin

    2015-01-01

    In this paper, we introduce a new way of detecting semantic similarities between learning objects by analysing their usage in web portals. Our approach relies on the usage-based relations between the objects themselves rather than on the content of the learning objects or on the relations between users and learning objects. We then take this new…

  16. Classification of peacock feather reflectance using principal component analysis similarity factors from multispectral imaging data.

    Science.gov (United States)

    Medina, José M; Díaz, José A; Vukusic, Pete

    2015-04-20

    Iridescent structural colors in biology exhibit sophisticated spatially-varying reflectance properties that depend on both the illumination and viewing angles. The classification of such spectral and spatial information in iridescent structurally colored surfaces is important to elucidate the functional role of irregularity and to improve understanding of color pattern formation at different length scales. In this study, we propose a non-invasive method for the spectral classification of spatial reflectance patterns at the micron scale based on the multispectral imaging technique and the principal component analysis similarity factor (PCASF). We demonstrate the effectiveness of this approach and its component methods by detailing its use in the study of the angle-dependent reflectance properties of Pavo cristatus (the common peacock) feathers, a species of peafowl very well known to exhibit bright and saturated iridescent colors. We show that multispectral reflectance imaging and PCASF approaches can be used as effective tools for spectral recognition of iridescent patterns in the visible spectrum and provide meaningful information for spectral classification of the irregularity of the microstructure in iridescent plumage.
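
    The PCA similarity factor at the core of the classification step can be sketched as follows, using the common Krzanowski formulation (an assumption; the paper may use a variant): the mean squared cosine between the first k principal directions of two data sets. The multispectral "pixels" here are random stand-ins for the measured reflectance data.

```python
import numpy as np

def pca_similarity_factor(data_a, data_b, k=3):
    """PCA similarity factor between two (samples x bands) data sets."""
    def principal_axes(data, k):
        centered = data - data.mean(axis=0)
        # Rows of vt are the principal directions (loadings).
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return vt[:k]
    la, lb = principal_axes(data_a, k), principal_axes(data_b, k)
    overlap = la @ lb.T                       # cosines between principal directions
    return np.sum(overlap ** 2) / k           # 1.0 means identical k-dim subspaces

rng = np.random.default_rng(5)
spectra_region1 = rng.normal(size=(100, 8))                       # toy pixels, 8 spectral bands
spectra_region2 = spectra_region1 + rng.normal(scale=0.1, size=(100, 8))
print(round(pca_similarity_factor(spectra_region1, spectra_region2), 3))
```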

  17. Efficient data retrieval method for similar plasma waveforms in EAST

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Ying, E-mail: liuying-ipp@szu.edu.cn [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Huang, Jianjun; Zhou, Huasheng; Wang, Fan [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Wang, Feng [Institute of Plasma Physics Chinese Academy of Sciences, Hefei 230031 (China)

    2016-11-15

    Highlights: • The proposed method is carried out by means of a bounding envelope and the angle distance. • It allows retrieval of whole similar waveforms of any time length. • In addition, the proposed method can also retrieve subsequences. - Abstract: Fusion research relies heavily on data analysis due to its massive databases. In the present work, we propose an efficient method for searching and retrieving similar plasma waveforms in the Experimental Advanced Superconducting Tokamak (EAST). Based on Piecewise Linear Aggregate Approximation (PLAA) for extracting feature values, the searching process is accomplished in two steps. The first is a coarse search to narrow down the search space, which is carried out by means of a bounding envelope. The second step is a fine search to retrieve similar waveforms, which is implemented with the angle distance. The proposed method was tested on EAST databases and turns out to have good performance in retrieving similar waveforms.

  18. OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain.

    Science.gov (United States)

    Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem; Khan, Wajahat Ali

    2017-01-01

    Public cloud storage services are becoming prevalent, and myriad data-sharing, archiving and collaborative services have emerged that harness the pay-as-you-go business model of the public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data through search queries. Search-over-encrypted-data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and the search criteria. The few schemes that extend exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries, which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted-bloom-filter-based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations.
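
    To illustrate how a Bloom filter can prune the search space (shown here in plaintext; the encrypted filters and homomorphic evaluation of OS2 are not reproduced), a minimal sketch:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: membership tests may report false positives, never false negatives."""
    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, word):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word):
        return all(self.bits >> pos & 1 for pos in self._positions(word))

# Index one document's keywords; a query term prunes documents whose filters cannot match.
doc_filter = BloomFilter()
for keyword in ["cloud", "encryption", "search"]:
    doc_filter.add(keyword)
print(doc_filter.might_contain("search"), doc_filter.might_contain("genome"))
```

    A document whose filter test fails cannot contain the query term and is skipped; positive answers may still be false positives and must be verified downstream.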

  19. Barcode server: a visualization-based genome analysis system.

    Directory of Open Access Journals (Sweden)

    Fenglou Mao

    Full Text Available We have previously developed a computational method for representing a genome as a barcode image, which makes various genomic features visually apparent. We have demonstrated that this visual capability has made some challenging genome analysis problems relatively easy to solve. We have applied this capability to a number of challenging problems, including (a) identification of horizontally transferred genes, (b) identification of genomic islands with special properties and (c) binning of metagenomic sequences, and achieved highly encouraging results. These application results inspired us to develop this barcode-based genome analysis server for public service, which supports the following capabilities: (a) calculation of the k-mer based barcode image for a provided DNA sequence; (b) detection of sequence fragments in a given genome with distinct barcodes from those of the majority of the genome; (c) clustering of provided DNA sequences into groups having similar barcodes; and (d) homology-based search using Blast against a genome database for any selected genomic regions deemed to have interesting barcodes. The barcode server provides a job management capability, allowing processing of a large number of analysis jobs for barcode-based comparative genome analyses. The barcode server is accessible at http://csbl1.bmb.uga.edu/Barcode.
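
    A barcode in this sense is essentially a matrix of k-mer frequencies computed window by window along the sequence. The sketch below computes such a frequency matrix for a toy sequence; the window size, the value of k, and the omission of reverse-complement merging are assumptions, not the server's exact settings.

```python
from itertools import product

def kmer_barcode(sequence, k=2, window=50):
    """Frequency of each k-mer in consecutive windows: one row per window (a 'barcode')."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    rows = []
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window]
        counts = {kmer: 0 for kmer in kmers}
        for i in range(len(chunk) - k + 1):
            counts[chunk[i:i + k]] = counts.get(chunk[i:i + k], 0) + 1
        total = sum(counts.values())
        rows.append([counts[kmer] / total for kmer in kmers])
    return rows

toy_genome = ("ACGTGCATTAGC" * 20)[:200]
barcode = kmer_barcode(toy_genome)
print(len(barcode), "windows x", len(barcode[0]), "k-mer frequencies")
```

    Regions whose rows differ markedly from the genome-wide average are the "distinct barcode" fragments the server flags, e.g. candidate horizontally transferred segments.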

  20. Marriage Matters: Spousal Similarity in Life Satisfaction

    OpenAIRE

    Ulrich Schimmack; Richard Lucas

    2006-01-01

    Examined the concurrent and cross-lagged spousal similarity in life satisfaction over a 21-year period. Analyses were based on married couples (N = 847) in the German Socio-Economic Panel (SOEP). Concurrent spousal similarity was considerably higher than one-year retest similarity, revealing spousal similarity in the variable component of life satisfaction. Spousal similarity systematically decreased with length of retest interval, revealing similarity in the changing component of life sati...

  1. Tuning glass formation and brittle behaviors by similar solvent element substitution in (Mn,Fe)-based bulk metallic glasses

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Tao [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China); Li, Ran, E-mail: liran@buaa.edu.cn [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China); Xiao, Ruijuan [Institute of Physics, Chinese Academy of Sciences, Beijing 100190 (China); Liu, Gang [State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049 (China); Wang, Jianfeng [School of Materials Science and Engineering, Zhengzhou University, Zhengzhou 450001 (China); Zhang, Tao, E-mail: zhangtao@buaa.edu.cn [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China)

    2015-02-25

    A family of Mn-rich bulk metallic glasses (BMGs) was developed through the similar-solvent-element (SSE) substitution of Mn for Fe in (Mn{sub x}Fe{sub 80−x})P{sub 10}B{sub 7}C{sub 3} alloys. The effect of the SSE substitution on glass formation, thermal stability, elastic constants, mechanical properties, fracture morphologies, Weibull modulus and indentation fracture toughness is discussed. A thermodynamic analysis provided by Battezzati et al. (L. Battezzati, E. Garrone, Z. Metallkd. 75 (1984) 305–310) was adopted to explain the compositional dependence of the glass-forming ability (GFA). The elastic moduli follow roughly linear correlations with the substitution concentration of Mn in (Mn{sub x}Fe{sub 80−x})P{sub 10}B{sub 7}C{sub 3} BMGs. The introduction of Mn to replace Fe significantly decreases the plasticity of the resulting BMGs and the Weibull modulus of the fracture strength. A super-brittle Mn-based BMG, (Mn{sub 55}Fe{sub 25})P{sub 10}B{sub 7}C{sub 3}, was found, with an indentation fracture toughness (K{sub c}) of 1.91±0.04 MPa m{sup 1/2}, the lowest value among all kinds of BMGs so far. The atomic and electronic structures of the selected BMGs were simulated by first-principles molecular dynamics calculations based on density functional theory, which provided a possible understanding of the brittleness caused by the replacement of Fe with the chemically similar element Mn.

  2. An effective framework for finding similar cases of dengue from audio and text data using domain thesaurus and case base reasoning

    Science.gov (United States)

    Sandhu, Rajinder; Kaur, Jaspreet; Thapar, Vivek

    2018-02-01

    Dengue, also known as break-bone fever, is a tropical disease transmitted by mosquitoes. If the similarity between dengue-infected users can be identified, it can help government health agencies to manage an outbreak more effectively. To find the similarity between cases affected by dengue, a user's personal and health information are the two fundamental requirements. Identification of similar symptoms, causes, effects, predictions and treatment procedures is important. In this paper, an effective framework is proposed which finds similar patients suffering from dengue using a keyword-aware domain thesaurus and the case-based reasoning method. This paper focuses on the use of an ontology-dependent domain thesaurus technique to extract relevant keywords and then build cases with the help of the case-based reasoning method. Similar cases can be shared with users, nearby hospitals and health organizations to manage the problem more adequately. Two million case bases were generated to test the proposed similarity method. Experimental evaluation of the proposed framework resulted in high accuracy and a low error rate for finding similar cases of dengue compared to the UPCC and IPCC algorithms. The framework developed in this paper is for dengue but can easily be extended to other domains.

  3. A Novel Relevance Feedback Approach Based on Similarity Measure Modification in an X-Ray Image Retrieval System Based on Fuzzy Representation Using Fuzzy Attributed Relational Graph

    Directory of Open Access Journals (Sweden)

    Hossien Pourghassem

    2011-04-01

    Full Text Available Relevance feedback approaches are used to improve the performance of content-based image retrieval systems. In this paper, a novel relevance feedback approach based on similarity measure modification is presented for an X-ray image retrieval system based on fuzzy representation using a fuzzy attributed relational graph (FARG). In this approach, the optimum weight of each feature in the feature vector is calculated using the similarity rate between the query image and the relevant and irrelevant images in the user feedback. The calculated weight is used to tune the fuzzy graph matching algorithm as a modifier parameter in the similarity measure; the standard deviation of the retrieved image features is applied to calculate the optimum weight. The proposed image retrieval system uses a FARG for the representation of images, a fuzzy graph matching algorithm as the similarity measure, and a semantic classifier based on a merging scheme to determine the search space in the image database. To evaluate the relevance feedback approach in the proposed system, a standard X-ray image database consisting of 10,000 images in 57 classes is used. The improvement in the evaluation parameters shows the proficiency and efficiency of the proposed system.

  4. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles

    Science.gov (United States)

    Liu, Rey-Long

    2015-01-01

    Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794

  5. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles.

    Directory of Open Access Journals (Sweden)

    Rey-Long Liu

    Full Text Available Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.

  6. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates.

    Science.gov (United States)

    Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu

    2011-01-01

    The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.
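
    As a rough illustration of the local similarity idea this record describes (not the authors' eLSA implementation, which additionally handles replicates and significance testing), the sketch below scores the strongest aligned subinterval association between two rank-normalized series, allowing a small time delay; the series, delay limit, and normalization choices are illustrative assumptions.

```python
import numpy as np
from scipy.stats import rankdata, zscore

def local_similarity(x, y, max_delay=3):
    """Naive local similarity: the largest cumulative sum of element-wise
    products over any aligned subinterval, allowing a shift of up to
    max_delay steps between the series (a simplified reading of LSA)."""
    n = len(x)
    best = 0.0
    for d in range(-max_delay, max_delay + 1):
        xs = x[max(0, d):n + min(0, d)]      # series x shifted by delay d
        ys = y[max(0, -d):n - max(0, d)]
        pos = neg = 0.0
        for xi, yi in zip(xs, ys):
            p = xi * yi
            pos = max(0.0, pos + p)          # strongest positive local run so far
            neg = max(0.0, neg - p)          # strongest negative local run so far
            best = max(best, pos, neg)
    return best / n                          # normalize by series length

# toy example: two series that co-vary only in their second half
rng = np.random.default_rng(0)
a = rng.standard_normal(30)
b = rng.standard_normal(30)
b[15:] = a[15:] + 0.1 * rng.standard_normal(15)
a_n, b_n = zscore(rankdata(a)), zscore(rankdata(b))   # rank-normalize first
print(local_similarity(a_n, b_n))
```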

  7. ANALYSIS DATA SETS USING HYBRID TECHNIQUES APPLIED ARTIFICIAL INTELLIGENCE BASED PRODUCTION SYSTEMS INTEGRATED DESIGN

    Directory of Open Access Journals (Sweden)

    Daniel-Petru GHENCEA

    2017-06-01

    Full Text Available The paper proposes a model for predicting spindle behavior from the point of view of thermal deformations and vibration levels by highlighting and processing the characteristic equations. Such an analysis for spindles with similar electro-mechanical characteristics can be achieved using a hybrid approach based on artificial intelligence (genetic algorithms - artificial neural networks - fuzzy logic). The paper presents a prediction method for obtaining a valid range of values for spindles with similar characteristics, based on data sets measured on a few test spindles, without additional measurements being required. Extracting polynomial functions from the graphs resulting from simultaneous measurements and predicting the dynamics of the two features with a multi-objective criterion is the main advantage of this method.

  8. Analysis of newly established EST databases reveals similarities between heart regeneration in newt and fish

    Directory of Open Access Journals (Sweden)

    Weis Patrick

    2010-01-01

    Full Text Available Abstract Background The newt Notophthalmus viridescens possesses the remarkable ability to respond to cardiac damage by formation of new myocardial tissue. Surprisingly little is known about changes in gene activities that occur during the course of regeneration. To begin to decipher the molecular processes that underlie restoration of functional cardiac tissue, we generated an EST database from regenerating newt hearts and compared the transcriptional profile of selected candidates with genes deregulated during zebrafish heart regeneration. Results A cDNA library of 100,000 cDNA clones was generated from newt hearts 14 days after ventricular injury. Sequencing of 11520 cDNA clones resulted in 2894 assembled contigs. BLAST searches revealed 1695 sequences with potential homology to sequences from the NCBI database. BLAST searches to TrEMBL and Swiss-Prot databases assigned 1116 proteins to Gene Ontology terms. We also identified a relatively large set of 174 ORFs, which are likely to be unique for urodele amphibians. Expression analysis of newt-zebrafish homologues confirmed the deregulation of selected genes during heart regeneration. Sequences, BLAST results and GO annotations were visualized in a relational web-based database, followed by grouping of identified proteins into clusters of GO terms. Comparison of data from regenerating zebrafish hearts identified biological processes, which were uniformly overrepresented during cardiac regeneration in newt and zebrafish. Conclusion We concluded that heart regeneration in newts and zebrafish led to the activation of similar sets of genes, which suggests that heart regeneration in both species might follow similar principles. The design of the newly established newt EST database allows identification of molecular pathways important for heart regeneration.

  9. On distributional assumptions and whitened cosine similarities

    DEFF Research Database (Denmark)

    Loog, Marco

    2008-01-01

    Recently, an interpretation of the whitened cosine similarity measure as a Bayes decision rule was proposed (C. Liu, "The Bayes Decision Rule Induced Similarity Measures," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1086-1090, June 2007). This communication makes th...

  10. Using Response Surface Analysis to Interpret the Impact of Parent-Offspring Personality Similarity on Adolescent Externalizing Problems

    OpenAIRE

    Franken, Aart; Laceulle, Odillia M.; Van Aken, Marcel A.G.; Ormel, Johan

    2017-01-01

    Abstract Personality similarity between parent and offspring has been suggested to play an important role in offspring's development of externalizing problems. Nonetheless, much remains unknown regarding the nature of this association. This study aimed to investigate the effects of parent-offspring similarity at different levels of personality traits, comparing expectations based on evolutionary and goodness-of-fit perspectives. Two waves of data from the TRAILS study (N = 1587, 53% girls) we...

  11. Similarity and Dissimilarity between Superiors and Subordinates and Their Implications for Dyadic Relationship Quality

    Directory of Open Access Journals (Sweden)

    Nereida Salette Paulo da Silveira

    2009-01-01

    Full Text Available Although literature advocates the advantages of work force diversification, studies based on the Similarity-Attraction Paradigm indicate that people are more disposed to feel attraction to those who are similar to them. A field study with the comparative data of 89 dyads investigated the effect of the actual and perceived similarity in the quality of the relationship between superiors and subordinates within the Leader-Member Exchange [LMX] perspective. The investigated characteristics were: gender, age and work-family conflict. The data indicate the influence only of perceived similarity in the quality of the relationship between superiors and subordinates. This effect broadens when the subordinate feels satisfied with the quality and frequency of contact with his/her superior. The methodological procedures included factorial analysis and validation of two scales (EIFT and LMX-7), the correlation analysis and hierarchical regressions. Finally, the implications of some results and directions for future research in diversity are discussed.

  12. Analysis of equivalent antenna based on FDTD method

    Directory of Open Access Journals (Sweden)

    Yun-xing Yang

    2014-09-01

    Full Text Available An equivalent microstrip antenna used in a radio proximity fuse is presented. The design of this antenna is based on a multilayer multi-permittivity dielectric substrate which is analyzed by the finite difference time domain (FDTD) method. The equivalent iterative formula is modified for the cylindrical coordinate system. The mixed substrate, which contains two kinds of media (one of them is air), takes the place of the original single substrate. The results of the equivalent antenna simulation show that the resonant frequency of the equivalent antenna is similar to that of the original antenna. The validity of the analysis can be verified by means of the antenna resonant frequency formula. The two antennas have the same radiation pattern and similar gain. This method can be used to reduce the weight of the antenna, which is significant for the design of missile-borne antennas.

  13. Detecting groups of similar components in complex networks

    International Nuclear Information System (INIS)

    Wang Jiao; Lai, C-H

    2008-01-01

    We study how to detect groups in a complex network each of which consists of component nodes sharing a similar connection pattern. Based on the mixture models and the exploratory analysis set up by Newman and Leicht (2007 Proc. Natl. Acad. Sci. USA 104 9564), we develop an algorithm that is applicable to a network with any degree distribution. The partition of a network suggested by this algorithm also applies to its complementary network. In general, groups of similar components are not necessarily identical with the communities in a community network; thus partitioning a network into groups of similar components provides additional information of the network structure. The proposed algorithm can also be used for community detection when the groups and the communities overlap. By introducing a tunable parameter that controls the involved effects of the heterogeneity, we can also investigate conveniently how the group structure can be coupled with the heterogeneity characteristics. In particular, an interesting example shows a group partition can evolve into a community partition in some situations when the involved heterogeneity effects are tuned. The extension of this algorithm to weighted networks is discussed as well.

  14. Analysis And Augmentation Of Timing Advance Based Geolocation In Lte Cellular Networks

    Science.gov (United States)

    2016-12-01

    measurements to validate TA-based positioning approaches in LTE. Their approach did not, however, focus on characterizing the TA. Rather, similar to ... UE will measure the time difference of arrival of the LTE Positioning Reference Signal (PRS) from multiple eNBs. This information is then sent to a ...

  15. Noise suppression for dual-energy CT via penalized weighted least-square optimization with similarity-based regularization

    Energy Technology Data Exchange (ETDEWEB)

    Harms, Joseph; Wang, Tonghe; Petrongolo, Michael; Zhu, Lei, E-mail: leizhu@gatech.edu [Nuclear and Radiological Engineering and Medical Physics Programs, The George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332 (United States); Niu, Tianye [Sir Run Run Shaw Hospital, Zhejiang University School of Medicine (China); Institute of Translational Medicine, Zhejiang University, Hangzhou, Zhejiang, 310016 (China)

    2016-05-15

    Purpose: Dual-energy CT (DECT) expands applications of CT imaging in its capability to decompose CT images into material images. However, decomposition via direct matrix inversion leads to large noise amplification and limits quantitative use of DECT. Their group has previously developed a noise suppression algorithm via penalized weighted least-square optimization with edge-preservation regularization (PWLS-EPR). In this paper, the authors improve method performance using the same framework of penalized weighted least-square optimization but with similarity-based regularization (PWLS-SBR), which substantially enhances the quality of decomposed images by retaining a more uniform noise power spectrum (NPS). Methods: The design of PWLS-SBR is based on the fact that averaging pixels of similar materials gives a low-noise image. For each pixel, the authors calculate the similarity to other pixels in its neighborhood by comparing CT values. Using an empirical Gaussian model, the authors assign high/low similarity value to one neighboring pixel if its CT value is close/far to the CT value of the pixel of interest. These similarity values are organized in matrix form, such that multiplication of the similarity matrix to the image vector reduces image noise. The similarity matrices are calculated on both high- and low-energy CT images and averaged. In PWLS-SBR, the authors include a regularization term to minimize the L-2 norm of the difference between the images without and with noise suppression via similarity matrix multiplication. By using all pixel information of the initial CT images rather than just those lying on or near edges, PWLS-SBR is superior to the previously developed PWLS-EPR, as supported by comparison studies on phantoms and a head-and-neck patient. Results: On the line-pair slice of the Catphan{sup ©}600 phantom, PWLS-SBR outperforms PWLS-EPR and retains spatial resolution of 8 lp/cm, comparable to the original CT images, even at 90% reduction in noise
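
    The similarity-weighting idea described above can be illustrated with a minimal sketch: neighboring pixels whose CT values are close to the center pixel receive high Gaussian similarity weights, and averaging with those weights suppresses noise while preserving material boundaries. This is only a schematic stand-in for the published PWLS-SBR regularization; the window radius, Gaussian width, and phantom are illustrative assumptions.

```python
import numpy as np

def similarity_smooth(img, radius=3, sigma=20.0):
    """Average each pixel with its neighbors, weighted by a Gaussian similarity
    of CT values (a sketch of the similarity-matrix multiplication idea)."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            patch = img[i0:i1, j0:j1].astype(float)
            # high weight for neighbors whose CT value is close to the center pixel
            wgt = np.exp(-((patch - img[i, j]) ** 2) / (2.0 * sigma ** 2))
            out[i, j] = np.sum(wgt * patch) / np.sum(wgt)
    return out

# toy usage: a noisy two-material phantom
rng = np.random.default_rng(1)
phantom = np.zeros((64, 64))
phantom[16:48, 16:48] = 100.0
noisy = phantom + 10.0 * rng.standard_normal(phantom.shape)
denoised = similarity_smooth(noisy)
print(noisy.std(), denoised.std())   # noise level drops after similarity smoothing
```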

  16. Understanding similarity of groundwater systems with empirical copulas

    Science.gov (United States)

    Haaf, Ezra; Kumar, Rohini; Samaniego, Luis; Barthel, Roland

    2016-04-01

    Within the classification framework for groundwater systems that aims for identifying similarity of hydrogeological systems and transferring information from a well-observed to an ungauged system (Haaf and Barthel, 2015; Haaf and Barthel, 2016), we propose a copula-based method for describing groundwater-systems similarity. Copulas are an emerging method in hydrological sciences that make it possible to model the dependence structure of two groundwater level time series, independently of the effects of their marginal distributions. This study is based on Samaniego et al. (2010), which described an approach calculating dissimilarity measures from bivariate empirical copula densities of streamflow time series. Subsequently, streamflow is predicted in ungauged basins by transferring properties from similar catchments. The proposed approach is innovative because copula-based similarity has not yet been applied to groundwater systems. Here we estimate the pairwise dependence structure of 600 wells in Southern Germany using 10 years of weekly groundwater level observations. Based on these empirical copulas, dissimilarity measures are estimated, such as the copula's lower- and upper corner cumulated probability, copula-based Spearman's rank correlation - as proposed by Samaniego et al. (2010). For the characterization of groundwater systems, copula-based metrics are compared with dissimilarities obtained from precipitation signals corresponding to the presumed area of influence of each groundwater well. This promising approach provides a new tool for advancing similarity-based classification of groundwater system dynamics. Haaf, E., Barthel, R., 2015. Methods for assessing hydrogeological similarity and for classification of groundwater systems on the regional scale, EGU General Assembly 2015, Vienna, Austria. Haaf, E., Barthel, R., 2016. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs EGU General Assembly
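
    A minimal sketch of the copula-based comparison described in this record: two groundwater-level series are rank-transformed to pseudo-observations, and their dependence structure is summarized by Spearman's rank correlation and by the cumulated probabilities in the lower and upper corners of the empirical copula. The corner threshold and the toy series are illustrative assumptions, not the published procedure.

```python
import numpy as np

def empirical_copula_dissimilarity(u_series, v_series, corner=0.2):
    """Rank-transform two series to pseudo-observations and compare their
    dependence structure via Spearman's rho and the cumulated probability in
    the lower/upper corners of the empirical copula (a sketch only)."""
    n = len(u_series)
    # pseudo-observations: ranks scaled to (0, 1)
    u = (np.argsort(np.argsort(u_series)) + 1) / (n + 1.0)
    v = (np.argsort(np.argsort(v_series)) + 1) / (n + 1.0)
    spearman = np.corrcoef(u, v)[0, 1]                 # rank correlation
    lower = np.mean((u <= corner) & (v <= corner))     # joint low-level probability
    upper = np.mean((u >= 1 - corner) & (v >= 1 - corner))
    return {"spearman": spearman, "lower_corner": lower, "upper_corner": upper}

# toy usage with two correlated groundwater-level series (~10 years of weekly data)
rng = np.random.default_rng(2)
gw1 = np.cumsum(rng.standard_normal(520))
gw2 = gw1 + 5.0 * rng.standard_normal(520)
print(empirical_copula_dissimilarity(gw1, gw2))
```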

  17. Development of similarity theory for control systems

    Science.gov (United States)

    Myshlyaev, L. P.; Evtushenko, V. F.; Ivushkin, K. A.; Makarov, G. V.

    2018-05-01

    The area of effective application of the traditional similarity theory and the necessity of its development for control systems are discussed. The main statements underlying the similarity theory of control systems are given. The conditions for the similarity of control systems and the need for similarity control are formulated. Methods and algorithms for estimating and controlling the similarity of control systems, and the results of research on control systems based on their similarity, are presented. The similarity control of systems includes the current evaluation of the degree of similarity of control systems, the development of actions controlling similarity, and the corresponding targeted change in the state of any element of the control systems.

  18. On different forms of self similarity

    International Nuclear Information System (INIS)

    Aswathy, R.K.; Mathew, Sunil

    2016-01-01

    Fractal geometry is mainly based on the idea of self-similar forms. To be self-similar, a shape must be able to be divided into parts that are smaller copies, which are more or less similar to the whole. There are different forms of self similarity in nature and mathematics. In this paper, some of the topological properties of super self similar sets are discussed. It is proved that, in a complete metric space with two or more elements, the set of all non super self similar sets is dense in the set of all non-empty compact subsets. It is also proved that the product of self similar sets is super self similar in product metric spaces and that super self similarity is preserved under isometry. A characterization of super self similar sets using contracting sub self similarity is also presented. Some relevant counterexamples are provided. The concepts of exact super and sub self similarity are introduced and a necessary and sufficient condition for a set to be exact super self similar in terms of condensation iterated function systems (Condensation IFS's) is obtained. A method to generate exact sub self similar sets using condensation IFS's and the denseness of exact super self similar sets are also discussed.

  19. Application of 3D Zernike descriptors to shape-based ligand similarity searching.

    Science.gov (United States)

    Venkatraman, Vishwesh; Chakravarthy, Padmasini Ramji; Kihara, Daisuke

    2009-12-17

    The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.
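
    Of the methods compared in this record, the moments-based USR scheme is simple enough to sketch: each conformer is reduced to twelve distance-distribution moments taken from four reference points, and similarity is an inverse Manhattan distance between descriptors. The sketch below follows that general recipe with illustrative toy coordinates; it is not the 3DZD itself nor the authors' benchmark code.

```python
import numpy as np

def usr_descriptor(coords):
    """USR-style shape descriptor: for four reference points (centroid, atom
    closest to the centroid, atom farthest from the centroid, atom farthest
    from the latter), take the first three moments of all atom distances."""
    ctd = coords.mean(axis=0)
    d_ctd = np.linalg.norm(coords - ctd, axis=1)
    cst = coords[np.argmin(d_ctd)]                     # closest to centroid
    fct = coords[np.argmax(d_ctd)]                     # farthest from centroid
    d_fct = np.linalg.norm(coords - fct, axis=1)
    ftf = coords[np.argmax(d_fct)]                     # farthest from fct
    desc = []
    for ref in (ctd, cst, fct, ftf):
        d = np.linalg.norm(coords - ref, axis=1)
        mu = d.mean()
        desc += [mu, d.std(), np.cbrt(((d - mu) ** 3).mean())]  # mean, spread, skew
    return np.array(desc)

def usr_similarity(a, b):
    """Map the mean absolute descriptor difference to a (0, 1] similarity score."""
    return 1.0 / (1.0 + np.abs(a - b).mean())

# toy usage with two randomly generated 'conformers'
rng = np.random.default_rng(3)
mol1 = rng.standard_normal((20, 3))
mol2 = mol1 + 0.05 * rng.standard_normal((20, 3))
print(usr_similarity(usr_descriptor(mol1), usr_descriptor(mol2)))
```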

  20. Soldier motivation – different or similar?

    DEFF Research Database (Denmark)

    Brænder, Morten; Andersen, Lotte Bøgh

    Recent research in military sociology has shown that, in addition to their strong peer motivation, modern soldiers are oriented toward contributing to society. It has not, however, been tested how soldier motivation differs from the motivation of other citizens in this respect. In this paper......, by means of public service motivation, a concept developed within the public administration literature, we compare soldier and civilian motivation. The contribution of this paper is an analysis of whether and how Danish combat soldiers differ from other Danes in regard to public service motivation. Using...... surveys with similar questions, we find that soldiers are more normatively motivated to contribute to society than other citizens (higher commitment to the public interest), while their affectively based motivation is lower (lower compassion). This points towards a potential problem in regard...

  1. GIS: a comprehensive source for protein structure similarities.

    Science.gov (United States)

    Guerler, Aysam; Knapp, Ernst-Walter

    2010-07-01

    A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains, yielding about 55 x 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow browsing the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure, using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75), or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.

  2. Similarity ratio analysis for early stage fault detection with optical emission spectrometer in plasma etching process.

    Directory of Open Access Journals (Sweden)

    Jie Yang

    Full Text Available A Similarity Ratio Analysis (SRA) method is proposed for early-stage Fault Detection (FD) in plasma etching processes using real-time Optical Emission Spectrometer (OES) data as input. The SRA method can help to realise a highly precise control system by detecting abnormal etch-rate faults in real-time during an etching process. The method processes spectrum scans at successive time points and uses a windowing mechanism over the time series to alleviate problems with timing uncertainties due to process shift from one process run to another. An SRA library is first built to capture features of a healthy etching process. By comparing with the SRA library, a Similarity Ratio (SR) statistic is then calculated for each spectrum scan as the monitored process progresses. A fault detection mechanism, named 3-Warning-1-Alarm (3W1A), takes the SR values as inputs and triggers a system alarm when certain conditions are satisfied. This design reduces the chance of false alarm, and provides a reliable fault reporting service. The SRA method is demonstrated on a real semiconductor manufacturing dataset. The effectiveness of SRA-based fault detection is evaluated using a time-series SR test and also using a post-process SR test. The time-series SR provides an early-stage fault detection service, so less energy and materials will be wasted by faulty processing. The post-process SR provides a fault detection service with higher reliability than the time-series SR, but with fault testing conducted only after each process run completes.
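
    A minimal sketch of the similarity-ratio monitoring idea described above, assuming a cosine-similarity SR computed against a library of healthy reference scans and an alarm raised only after several consecutive low-similarity scans (a simplified reading of the 3W1A rule); the threshold, warning count, and synthetic spectra are illustrative assumptions.

```python
import numpy as np

def similarity_ratio(scan, library):
    """SR of one OES scan against a library of healthy reference scans:
    the best cosine similarity over the library (a sketch, not the published SR)."""
    sims = [np.dot(scan, ref) / (np.linalg.norm(scan) * np.linalg.norm(ref))
            for ref in library]
    return max(sims)

def monitor(scans, library, threshold=0.99, warnings_needed=3):
    """3-Warning-1-Alarm style rule: alarm only after several consecutive
    scans fall below the similarity threshold."""
    warnings = 0
    for t, scan in enumerate(scans):
        if similarity_ratio(scan, library) < threshold:
            warnings += 1
            if warnings >= warnings_needed:
                return t                     # index of the scan that triggers the alarm
        else:
            warnings = 0                     # a healthy scan resets the warning count
    return None

# toy usage: healthy spectra followed by a fault that drifts in after scan 30
rng = np.random.default_rng(4)
base = np.abs(rng.standard_normal(128)) + 1.0
library = [base + 0.01 * rng.standard_normal(128) for _ in range(10)]
scans = [base + 0.01 * rng.standard_normal(128) for _ in range(30)]
scans += [base + t * np.linspace(0, 1, 128) for t in range(1, 11)]
print(monitor(scans, library))
```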

  3. System based practice: a concept analysis

    Directory of Open Access Journals (Sweden)

    SHAHRAM YAZDANI

    2016-04-01

    Full Text Available Introduction: Systems-Based Practice (SBP) is one of the six competencies introduced by the ACGME for physicians to provide high quality of care and also the most challenging of them in performance, training, and evaluation of medical students. This concept analysis clarifies the concept of SBP by identifying its components to make it possible to differentiate it from other similar concepts. For proper training of SBP and to ensure these competencies in physicians, it is necessary to have an operational definition, and SBP's components must be precisely defined in order to provide valid and reliable assessment tools. Methods: Walker & Avant's approach to concept analysis was performed in eight stages: choosing a concept, determining the purpose of analysis, identifying all uses of the concept, defining attributes, identifying a model case, identifying borderline, related, and contrary cases, identifying antecedents and consequences, and defining empirical referents. Results: Based on the analysis undertaken, the attributes of SBP include knowledge of the system, balanced decision between patients' needs and system goals, effective role playing in the interprofessional health care team, system level of health advocacy, and acting for system improvement. System thinking and a functional system are antecedents and system goals are consequences. A case model, as well as border, and contrary cases of SBP, has been introduced. Conclusion: The identification of SBP attributes in this study contributes to the body of knowledge in SBP and reduces the ambiguity of this concept to make it possible for applying it in training of different medical specialties. Also, it would be possible to develop and use more precise tools to evaluate SBP competency by using empirical referents of the analysis.

  4. Similarity principles for equipment qualification by experience

    International Nuclear Information System (INIS)

    Kana, D.D.; Pomerening, D.J.

    1988-07-01

    A methodology is developed for seismic qualification of nuclear plant equipment by applying similarity principles to existing experience data. Experience data are available from previous qualifications by analysis or testing, or from actual earthquake events. Similarity principles are defined in terms of excitation, equipment physical characteristics, and equipment response. Physical similarity is further defined in terms of a critical transfer function for response at a location on a primary structure, whose response can be assumed directly related to ultimate fragility of the item under elevated levels of excitation. Procedures are developed for combining experience data into composite specifications for qualification of equipment that can be shown to be physically similar to the reference equipment. Other procedures are developed for extending qualifications beyond the original specifications under certain conditions. Some examples for application of the procedures and verification of them are given for certain cases that can be approximated by a two degree of freedom simple primary/secondary system. Other examples are based on use of actual test data available from previous qualifications. Relationships of the developments with other previously-published methods are discussed. The developments are intended to elaborate on the rather broad revised guidelines developed by the IEEE 344 Standards Committee for equipment qualification in new nuclear plants. However, the results also contribute to filling a gap that exists between the IEEE 344 methodology and that previously developed by the Seismic Qualification Utilities Group. The relationship of the results to safety margin methodology is also discussed. (author)

  5. Incidental Learning: A Brief, Valid Measure of Memory Based on the WAIS-IV Vocabulary and Similarities Subtests.

    Science.gov (United States)

    Spencer, Robert J; Reckow, Jaclyn; Drag, Lauren L; Bieliauskas, Linas A

    2016-12-01

    We assessed the validity of a brief incidental learning measure based on the Similarities and Vocabulary subtests of the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV). Most neuropsychological assessments for memory require intentional learning, but incidental learning occurs without explicit instruction. Incidental memory tests such as the WAIS-III Symbol Digit Coding subtest have existed for many years, but few memory studies have used a semantically processed incidental learning model. We conducted a retrospective analysis of 37 veterans with traumatic brain injury, referred for outpatient neuropsychological testing at a Veterans Affairs hospital. As part of their evaluation, the participants completed the incidental learning tasks. We compared their incidental learning performance to their performance on traditional memory measures. Incidental learning scores correlated strongly with scores on the California Verbal Learning Test-Second Edition (CVLT-II) and Brief Visuospatial Memory Test-Revised (BVMT-R). After we conducted a partial correlation that controlled for the effects of age, incidental learning correlated significantly with the CVLT-II Immediate Free Recall, CVLT-II Short-Delay Recall, CVLT-II Long-Delay Recall, and CVLT-II Yes/No Recognition Hits, and with the BVMT-R Delayed Recall and BVMT-R Recognition Discrimination Index. Our incidental learning procedures derived from subtests of the WAIS-IV Edition are an efficient and valid way of measuring memory. These tasks add minimally to testing time and capitalize on the semantic encoding that is inherent in completing the Similarities and Vocabulary subtests.

  6. The effective thermal conductivity of porous media based on statistical self-similarity

    International Nuclear Information System (INIS)

    Kou Jianlong; Wu Fengmin; Lu Hangjun; Xu Yousheng; Song Fuquan

    2009-01-01

    A fractal model is presented based on the thermal-electrical analogy technique and statistical self-similarity of fractal saturated porous media. A dimensionless effective thermal conductivity of saturated fractal porous media is studied by the relationship between the dimensionless effective thermal conductivity and the geometrical parameters of porous media with no empirical constant. Through this study, it is shown that the dimensionless effective thermal conductivity decreases with the increase of porosity (φ) and pore area fractal dimension (D{sub f}) when k{sub s}/k{sub g} > 1; the opposite trend is observed when k{sub s}/k{sub g} < 1. The model predictions are compared with existing experimental data, and the results show that they are in good agreement.

  7. Developing a Clustering-Based Empirical Bayes Analysis Method for Hotspot Identification

    Directory of Open Access Journals (Sweden)

    Yajie Zou

    2017-01-01

    Full Text Available Hotspot identification (HSID) is a critical part of network-wide safety evaluations. Typical methods for ranking sites are often rooted in using the Empirical Bayes (EB) method to estimate safety from both observed crash records and predicted crash frequency based on similar sites. The performance of the EB method is highly related to the selection of a reference group of sites (i.e., roadway segments or intersections) similar to the target site, from which safety performance functions (SPFs) used to predict crash frequency will be developed. As crash data often contain underlying heterogeneity that, in essence, can make them appear to be generated from distinct subpopulations, methods are needed to select similar sites in a principled manner. To overcome this possible heterogeneity problem, EB-based HSID methods that use common clustering methodologies (e.g., mixture models, K-means, and hierarchical clustering) to select "similar" sites for building SPFs are developed. Performance of the clustering-based EB methods is then compared using real crash data. Here, HSID results, when computed on Texas undivided rural highway crash data, suggest that all three clustering-based EB analysis methods are preferred over the conventional statistical methods. Thus, properly classifying the road segments for heterogeneous crash data can further improve HSID accuracy.
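
    To make the clustering-plus-EB workflow concrete, the sketch below clusters sites with K-means, uses each cluster's mean crash count as a crude stand-in for an SPF prediction, and blends prediction and observation with the standard EB weight w = 1/(1 + mu/k); the dispersion parameter, site features, and data are illustrative assumptions rather than the paper's calibrated models.

```python
import numpy as np
from sklearn.cluster import KMeans

def eb_estimates(features, crashes, n_clusters=3, dispersion=1.0):
    """Cluster sites, use each cluster's mean crash count as a crude SPF
    prediction, and combine it with the observed count via the EB weight
    w = 1 / (1 + mu / k), with k an illustrative over-dispersion parameter."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)
    mu = np.array([crashes[labels == c].mean() for c in labels])   # per-site SPF prediction
    w = 1.0 / (1.0 + mu / dispersion)
    return w * mu + (1.0 - w) * crashes                            # EB-adjusted frequency

# toy usage: 200 road segments described by length and AADT, with observed crashes
rng = np.random.default_rng(5)
features = np.column_stack([rng.uniform(0.5, 5, 200), rng.uniform(1e3, 2e4, 200)])
crashes = rng.poisson(2.0, 200).astype(float)
ranked = np.argsort(eb_estimates(features, crashes))[::-1]         # hotspot ranking
print(ranked[:10])
```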

  8. Detecting and classifying method based on similarity matching of Android malware behavior with profile.

    Science.gov (United States)

    Jang, Jae-Wook; Yun, Jaesung; Mohaisen, Aziz; Woo, Jiyoung; Kim, Huy Kang

    2016-01-01

    Mass-market mobile security threats have increased recently due to the growth of mobile technologies and the popularity of mobile devices. Accordingly, techniques have been introduced for identifying, classifying, and defending against mobile threats utilizing static, dynamic, on-device, and off-device techniques. Static techniques are easy to evade, while dynamic techniques are expensive. On-device techniques are prone to evasion, while off-device techniques need to be always online. To address some of those shortcomings, we introduce Andro-profiler, a hybrid behavior-based analysis and classification system for mobile malware. Andro-profiler's main goals are efficiency, scalability, and accuracy. For that, Andro-profiler classifies malware by exploiting the behavior profiling extracted from the integrated system logs including system calls. Andro-profiler executes a malicious application on an emulator in order to generate the integrated system logs, and creates human-readable behavior profiles by analyzing the integrated system logs. By comparing the behavior profile of a malicious application with the representative behavior profile for each malware family using a weighted similarity matching technique, Andro-profiler detects and classifies it into malware families. The experiment results demonstrate that Andro-profiler is scalable, performs well in detecting and classifying malware with accuracy greater than 98%, outperforms the existing state-of-the-art work, and is capable of identifying 0-day mobile malware samples.
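
    The weighted similarity matching step can be illustrated with a small sketch in which behavior profiles are feature-count dictionaries, family profiles are representative dictionaries, and a weighted Jaccard-style overlap picks the best-matching family; the feature names, weights, and threshold are hypothetical and do not reproduce Andro-profiler's actual metric.

```python
def weighted_similarity(profile, family_profile, weights):
    """Weighted overlap between two behavior profiles given as feature->count
    dicts (a sketch of similarity matching, not the published metric)."""
    keys = set(profile) | set(family_profile)
    num = sum(weights.get(k, 1.0) * min(profile.get(k, 0), family_profile.get(k, 0))
              for k in keys)
    den = sum(weights.get(k, 1.0) * max(profile.get(k, 0), family_profile.get(k, 0))
              for k in keys)
    return num / den if den else 0.0

def classify(profile, families, weights, threshold=0.6):
    """Assign the sample to the most similar family, or 'unknown' if no
    family exceeds the similarity threshold."""
    scored = {name: weighted_similarity(profile, rep, weights)
              for name, rep in families.items()}
    best = max(scored, key=scored.get)
    return (best if scored[best] >= threshold else "unknown"), scored

# hypothetical behavior profiles built from system-call/log features
families = {
    "sms_trojan": {"send_sms": 10, "read_contacts": 5, "net_post": 3},
    "adware":     {"net_get": 20, "show_ad": 15},
}
weights = {"send_sms": 2.0, "read_contacts": 1.5}   # security-critical features up-weighted
sample = {"send_sms": 8, "read_contacts": 4, "net_get": 1}
print(classify(sample, families, weights))
```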

  9. Improved collaborative filtering recommendation algorithm of similarity measure

    Science.gov (United States)

    Zhang, Baofu; Yuan, Baoping

    2017-05-01

    The collaborative filtering recommendation algorithm is one of the most widely used recommendation algorithms in personalized recommender systems. The key is to find the nearest-neighbor set of the active user by using a similarity measure. However, traditional similarity measures mainly focus on the similarity over the items commonly rated by two users, but ignore the relationship between these commonly rated items and all items each user rates. Moreover, because the rating matrix is very sparse, the traditional collaborative filtering recommendation algorithm is not highly efficient. In order to obtain better accuracy, and based on the consideration of the common preference between users, the difference in rating scale, and the scores of common items, this paper presents an improved similarity measure method; based on this method, a collaborative filtering recommendation algorithm based on similarity improvement is proposed. Experimental results show that the algorithm can effectively improve the quality of recommendation, thus alleviating the impact of data sparseness.
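
    A hedged sketch of the kind of modification the record describes: a common-item Pearson similarity is damped by the fraction of co-rated items among all items either user has rated, so that users who share only a tiny portion of their rating histories are not treated as highly similar. The formula below is an approximation of the idea, not the paper's exact measure.

```python
import numpy as np

def improved_user_similarity(ratings_u, ratings_v):
    """Similarity between two users' rating dicts (item -> score): Pearson
    correlation over co-rated items, damped by the fraction of co-rated items
    among all items either user rated (a sketch of the 'improved' idea)."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    ru = np.array([ratings_u[i] for i in common], dtype=float)
    rv = np.array([ratings_v[i] for i in common], dtype=float)
    if ru.std() == 0 or rv.std() == 0:
        pearson = 0.0                                   # flat ratings carry no signal
    else:
        pearson = np.corrcoef(ru, rv)[0, 1]
    coverage = len(common) / len(set(ratings_u) | set(ratings_v))
    return pearson * coverage

# toy usage with two users' rating dictionaries
u = {"m1": 5, "m2": 3, "m3": 4, "m4": 1}
v = {"m1": 4, "m2": 2, "m3": 5, "m9": 3}
print(improved_user_similarity(u, v))
```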

  10. Magnetic Reconnection in Different Environments: Similarities and Differences

    Science.gov (United States)

    Hesse, Michael; Aunai, Nicolas; Kuznetsova, Masha; Zenitani, Seiji; Birn, Joachim

    2014-01-01

    Depending on the specific situation, magnetic reconnection may involve symmetric or asymmetric inflow regions. Asymmetric reconnection applies, for example, to reconnection at the Earth's magnetopause, whereas reconnection in the nightside magnetotail tends to involve more symmetric geometries. A combination of review and new results pertaining to magnetic reconnection is being presented. The focus is on three aspects: A basic, MHD-based, analysis of the role magnetic reconnection plays in the transport of energy, followed by an analysis of a kinetic model of time dependent reconnection in a symmetric current sheet, similar to what is typically being encountered in the magnetotail of the Earth. The third element is a review of recent results pertaining to the orientation of the reconnection line in asymmetric geometries, which are typical for the magnetopause of the Earth, as well as likely to occur at other planets.

  11. Similarity-based interference in a working memory numerical updating task: age-related differences between younger and older adults.

    Science.gov (United States)

    Pelegrina, Santiago; Borella, Erika; Carretti, Barbara; Lechuga, M Teresa

    2012-01-01

    Similarity among representations held simultaneously in working memory (WM) is a factor which increases interference and hinders performance. The aim of the current study was to investigate age-related differences between younger and older adults in a working memory numerical updating task, in which the similarity between information held in WM was manipulated. Results showed a higher susceptibility of older adults to similarity-based interference when accuracy, and not response times, was considered. It was concluded that older adults' WM difficulties appear to be due to the availability of stored information, which, in turn, might be related to the ability to generate distinctive representations and to the process of binding such representations to their context when similar information has to be processed in WM.

  12. Effect of similar elements on improving glass-forming ability of La-Ce-based alloys

    International Nuclear Information System (INIS)

    Zhang Tao; Li Ran; Pang Shujie

    2009-01-01

    To date, the effect of unlike component elements on the glass-forming ability (GFA) of alloys has been studied extensively, and it is generally recognized that the main constituent elements of alloys with high GFA usually have large differences in atomic size and atomic interaction (large negative heat of mixing) among them. In our recent work, a series of rare earth metal-based alloy compositions with superior GFA were found through the approach of coexistence of similar constituent elements. The quinary (La{sub 0.5}Ce{sub 0.5}){sub 65}Al{sub 10}(Co{sub 0.6}Cu{sub 0.4}){sub 25} bulk metallic glass (BMG) in rod form with a diameter up to 32 mm was synthesized by tilt-pour casting, for which the glass-forming ability is significantly higher than that for ternary Ln-Al-TM alloys (Ln = La or Ce; TM = Co or Cu) with critical diameters for glass formation of several millimeters. We suggest that the strong frustration of crystallization, achieved by utilizing the coexistence of La-Ce and Co-Cu to complicate the competing crystalline phases, is helpful for constructing BMG compositions with superior GFA. The results of our present work indicate that similar elements (elements with similar atomic size and chemical properties) have a significant effect on the GFA of alloys.

  13. Artistic image analysis using graph-based learning approaches.

    Science.gov (United States)

    Carneiro, Gustavo

    2013-08-01

    We introduce a new methodology for the problem of artistic image analysis, which among other tasks, involves the automatic identification of visual classes present in an art work. In this paper, we advocate the idea that artistic image analysis must explore a graph that captures the network of artistic influences by computing the similarities in terms of appearance and manual annotation. One of the novelties of our methodology is the proposed formulation that is a principled way of combining these two similarities in a single graph. Using this graph, we show that an efficient random walk algorithm based on an inverted label propagation formulation produces more accurate annotation and retrieval results compared with the following baseline algorithms: bag of visual words, label propagation, matrix completion, and structural learning. We also show that the proposed approach leads to a more efficient inference and training procedures. This experiment is run on a database containing 988 artistic images (with 49 visual classification problems divided into a multiclass problem with 27 classes and 48 binary problems), where we show the inference and training running times, and quantitative comparisons with respect to several retrieval and annotation performance measures.

  14. Mining author relationship in scholarly networks based on tripartite citation analysis

    Science.gov (United States)

    Wang, Xiaohan; Yang, Siluo

    2017-01-01

    Following scholars in Scientometrics as examples, we develop five author relationship networks, namely, co-authorship, author co-citation (AC), author bibliographic coupling (ABC), author direct citation (ADC), and author keyword coupling (AKC). The time frame of data sets is divided into two periods: before 2011 (i.e., T1) and after 2011 (i.e., T2). Through quadratic assignment procedure analysis, we found that some authors have ABC or AC relationships (i.e., potential communication relationship, PCR) but do not have actual collaborations or direct citations (i.e., actual communication relationship, ACR) among them. In addition, we noticed that PCR and AKC are highly correlated and that the old PCR and the new ACR are correlated and consistent. Such facts indicate that PCR tends to produce academic exchanges based on similar themes, and ABC bears more advantages in predicting potential relations. Based on tripartite citation analysis, including AC, ABC, and ADC, we also present an author-relation mining process. Such process can be used to detect deep and potential author relationships. We analyze the prediction capacity by comparing between the T1 and T2 periods, which demonstrate that relation mining can be complementary in identifying authors based on similar themes and discovering more potential collaborations and academic communities. PMID:29117198

  15. Mining author relationship in scholarly networks based on tripartite citation analysis.

    Directory of Open Access Journals (Sweden)

    Feifei Wang

    Full Text Available Following scholars in Scientometrics as examples, we develop five author relationship networks, namely, co-authorship, author co-citation (AC), author bibliographic coupling (ABC), author direct citation (ADC), and author keyword coupling (AKC). The time frame of data sets is divided into two periods: before 2011 (i.e., T1) and after 2011 (i.e., T2). Through quadratic assignment procedure analysis, we found that some authors have ABC or AC relationships (i.e., potential communication relationship, PCR) but do not have actual collaborations or direct citations (i.e., actual communication relationship, ACR) among them. In addition, we noticed that PCR and AKC are highly correlated and that the old PCR and the new ACR are correlated and consistent. Such facts indicate that PCR tends to produce academic exchanges based on similar themes, and ABC bears more advantages in predicting potential relations. Based on tripartite citation analysis, including AC, ABC, and ADC, we also present an author-relation mining process. Such process can be used to detect deep and potential author relationships. We analyze the prediction capacity by comparing between the T1 and T2 periods, which demonstrate that relation mining can be complementary in identifying authors based on similar themes and discovering more potential collaborations and academic communities.

  16. Application of 3D Zernike descriptors to shape-based ligand similarity searching

    Directory of Open Access Journals (Sweden)

    Venkatraman Vishwesh

    2009-12-01

    Full Text Available Abstract Background The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. Conclusion The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.

  17. Image quality assessment based on inter-patch and intra-patch similarity.

    Directory of Open Access Journals (Sweden)

    Fei Zhou

    Full Text Available In this paper, we propose a full-reference (FR) image quality assessment (IQA) scheme, which evaluates image fidelity from two aspects: the inter-patch similarity and the intra-patch similarity. The scheme is performed in a patch-wise fashion so that a quality map can be obtained. On one hand, we investigate the disparity between one image patch and its adjacent ones. This disparity is visually described by an inter-patch feature, where the hybrid effect of luminance masking and contrast masking is taken into account. The inter-patch similarity is further measured by modifying the normalized correlation coefficient (NCC). On the other hand, we also attach importance to the impact of image contents within one patch on the IQA problem. For the intra-patch feature, we consider image curvature as an important complement of image gradient. According to local image contents, the intra-patch similarity is measured by adaptively comparing image curvature and gradient. Besides, a nonlinear integration of the inter-patch and intra-patch similarity is presented to obtain an overall score of image quality. The experiments conducted on six publicly available image databases show that our scheme achieves better performance in comparison with several state-of-the-art schemes.
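
    As an illustration of the inter-patch similarity ingredient, the sketch below computes a stabilized normalized correlation coefficient between corresponding reference and distorted patches and assembles a patch-wise quality map; the patch size, stabilizing constant, and raw-intensity features are illustrative assumptions, not the authors' feature design.

```python
import numpy as np

def normalized_correlation(f_ref, f_dist, eps=1e-8):
    """Stabilized NCC between a reference patch feature vector and the
    corresponding distorted patch feature vector; eps keeps the measure
    well-defined for flat patches (a sketch of the inter-patch similarity)."""
    f_ref = f_ref - f_ref.mean()
    f_dist = f_dist - f_dist.mean()
    return (np.dot(f_ref, f_dist) + eps) / (np.linalg.norm(f_ref) * np.linalg.norm(f_dist) + eps)

def patch_quality_map(ref, dist, patch=8):
    """Patch-wise quality map: one NCC score per non-overlapping patch."""
    h, w = ref.shape
    scores = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            scores.append(normalized_correlation(ref[i:i+patch, j:j+patch].ravel(),
                                                 dist[i:i+patch, j:j+patch].ravel()))
    return np.array(scores).reshape(h // patch, w // patch)

# toy usage: reference image vs a noisy copy
rng = np.random.default_rng(7)
ref = rng.uniform(0, 255, (64, 64))
dist = ref + 10 * rng.standard_normal(ref.shape)
print(patch_quality_map(ref, dist).mean())
```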

  18. SIMILARITY COMPARISON AND CLASSIFICATION OF SUCKING LOUSE COMMUNITIES ON SOME SMALL MAMMALS IN YUNNAN, CHINA

    Institute of Scientific and Technical Information of China (English)

    Xian-guo Guo; Ti-jun Qian; Li-jun Guo; Wen-ge Dong

    2004-01-01

    The similarity and classification of sucking louse communities on 24 species of small mammals were studied in Yunnan Province, China, through a hierarchical cluster analysis. All the louse species on the body surface of a certain species of small mammals are regarded as a louse community unit. The results reveal that the community structure of sucking lice on small mammals is simple with low species diversity. Most small mammals usually have certain louse species on their body surface; there exists a high degree of host specificity. Most louse communities on the same genus of small mammals show a high similarity and are classified into the same group based on hierarchical cluster analysis. When the hosts have a close affinity in taxonomy, the louse communities on their body surface would tend to be similar with the same or similar dominant louse species (as observed in genus Rattus, Niviventer, Apodemus and Eothenomys). The similarity of sucking louse communities is highly consistent with the affinity of small mammal hosts in taxonomy. The results suggest a close relationship of co-evolution between sucking lice and their hosts.

  19. Filling Predictable and Unpredictable Gaps, with and without Similarity-Based Interference: Evidence for LIFG Effects of Dependency Processing.

    Science.gov (United States)

    Leiken, Kimberly; McElree, Brian; Pylkkänen, Liina

    2015-01-01

    One of the most replicated findings in neurolinguistic literature on syntax is the increase of hemodynamic activity in the left inferior frontal gyrus (LIFG) in response to object relative (OR) clauses compared to subject relative clauses. However, behavioral studies have shown that ORs are primarily only costly when similarity-based interference is involved and recently, Leiken and Pylkkänen (2014) showed with magnetoencephalography (MEG) that an LIFG increase at an OR gap is also dependent on such interference. However, since ORs always involve a cue indicating an upcoming dependency formation, OR dependencies could be processed already prior to the gap-site and thus show no sheer dependency effects at the gap itself. To investigate the role of gap predictability in LIFG dependency effects, this MEG study compared ORs to verb phrase ellipsis (VPE), which was used as an example of a non-predictable dependency. Additionally, we explored LIFG sensitivity to filler-gap order by including right node raising structures, in which the order of filler and gap is reverse to that of ORs and VPE. Half of the stimuli invoked similarity-based interference and half did not. Our results demonstrate that LIFG effects of dependency can be elicited regardless of whether the dependency is predictable, the stimulus materials evoke similarity-based interference, or the filler precedes the gap. Thus, contrary to our own prior data, the current findings suggest a highly general role for the LIFG in dependency interpretation that is not limited to environments involving similarity-based interference. Additionally, the millisecond time-resolution of MEG allowed for a detailed characterization of the temporal profiles of LIFG dependency effects across our three constructions, revealing that the timing of these effects is somewhat construction-specific.

  20. Quantum mechanical analysis of fractal conductance fluctuations: a picture using self-similar periodic orbits

    International Nuclear Information System (INIS)

    Ogura, Tatsuo; Miyamoto, Masanori; Budiyono, Agung; Nakamura, Katsuhiro

    2007-01-01

    Fractal magnetoconductance fluctuations are often observed in experiments on ballistic quantum dots. Although the analysis of the exact self-affine fractal has been given by the semiclassical theory using self-similar periodic orbits in systems with a soft-walled potential with a saddle, there has been no corresponding quantum mechanical investigation. We numerically calculate the quantum conductance with use of the recursive Green's function method applied to open cavities characterized by a Henon-Heiles type potential. The conductance fluctuations show exact self-affinity just as in some of the experimental observations. The enlargement factor for the horizontal axis can be explained by the scaling factor of the area of self-similar periodic orbits, and therefore be attributed to the curvature of the saddle in the cavity potential. The fractal dimension obtained through the box counting method agrees with those evaluated with use of the Hurst exponent, and coincides with the semiclassical prediction. We further investigate the variation of the fractal dimension by changing the control parameters between the classical and quantum domains. (fast track communication)

  1. Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (Zea mays L.)

    Directory of Open Access Journals (Sweden)

    Meyer Andréia da Silva

    2004-01-01

    Full Text Available The objective of this study was to evaluate whether different similarity coefficients used with dominant markers can influence the results of cluster analysis, using eighteen inbred lines of maize from two different populations, BR-105 and BR-106. These were analyzed by AFLP and RAPD markers and eight similarity coefficients were calculated: Jaccard, Sorensen-Dice, Anderberg, Ochiai, Simple-matching, Rogers and Tanimoto, Ochiai II and Russel and Rao. The similarity matrices obtained were compared by the Spearman correlation, cluster analysis with dendrograms (UPGMA, WPGMA, Single Linkage, Complete Linkage and Neighbour-Joining methods), the consensus fork index between all pairs of dendrograms, groups obtained through the Tocher optimization procedure and projection efficiency in a two-dimensional space. The results showed that for almost all methodologies and marker systems, the Jaccard, Sorensen-Dice, Anderberg and Ochiai coefficients showed close results, due to the fact that all of them exclude negative co-occurrences. Significant alterations in the results for the Simple Matching, Rogers and Tanimoto, and Ochiai II coefficients were not observed either, probably due to the fact that they all include negative co-occurrences. The Russel and Rao coefficient presented very different results from the others in almost all the cases studied and should not be used, because it excludes the negative co-occurrences in the numerator and includes them in the denominator of its expression. Due to the fact that negative co-occurrences do not necessarily mean that the regions of the DNA are identical, the use of coefficients that do not include negative co-occurrences was suggested.
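
    The practical difference between the coefficients discussed above comes down to how joint band absences (negative co-occurrences, d) are counted; the short sketch below makes this explicit for two 0/1 marker profiles. The toy band profiles are illustrative.

```python
import numpy as np

def band_coefficients(x, y):
    """Similarity coefficients for two 0/1 marker profiles.
    a = shared presences, b/c = mismatches, d = shared absences."""
    x, y = np.asarray(x, bool), np.asarray(y, bool)
    a = np.sum(x & y)
    b = np.sum(x & ~y)
    c = np.sum(~x & y)
    d = np.sum(~x & ~y)
    return {
        "jaccard": a / (a + b + c),                      # ignores joint absences
        "sorensen_dice": 2 * a / (2 * a + b + c),        # ignores joint absences
        "simple_matching": (a + d) / (a + b + c + d),    # counts joint absences as matches
        "russel_rao": a / (a + b + c + d),               # absences only in the denominator
    }

# toy AFLP/RAPD-like band profiles for two inbred lines
line1 = [1, 1, 0, 1, 0, 0, 1, 0]
line2 = [1, 0, 0, 1, 0, 0, 1, 1]
print(band_coefficients(line1, line2))
```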

  2. Perceptions of Ideal and Former Partners’ Personality and Similarity

    Directory of Open Access Journals (Sweden)

    Pieternel Dijkstra

    2010-12-01

    Full Text Available The present study aimed to test predictions based on both the 'similarity-attraction' hypothesis and the 'attraction-similarity' hypothesis, by studying perceptions of ideal and former partners. Based on the 'similarity-attraction' hypothesis, we expected individuals to desire ideal partners who are similar to the self in personality. In addition, based on the 'attraction-similarity' hypothesis, we expected individuals to perceive former partners as dissimilar to them in terms of personality. Findings showed that, whereas the ideal partner was seen as similar to and more positive than the self, the former partner was seen as dissimilar to and more negative than the self. In addition, our study showed that individuals did not rate similarity in personality as very important when seeking a mate. Our findings may help understand why so many relationships end in divorce due to mismatches in personality.

  3. Similarity analysis of spectra obtained via reflectance spectrometry in legal medicine.

    Science.gov (United States)

    Belenki, Liudmila; Sterzik, Vera; Bohnert, Michael

    2014-02-01

    In the present study, a series of reflectance spectra of postmortem lividity, pallor, and putrefaction-affected skin for 195 investigated cases in the course of cooling down the corpse has been collected. The reflectance spectrometric measurements were stored together with their respective metadata in a MySQL database. The latter has been managed via a scientific information repository. We propose similarity measures and a criterion of similarity that capture similar spectra recorded at corpse skin. We systematically clustered reflectance spectra from the database as well as their metadata, such as case number, age, sex, skin temperature, duration of cooling, and postmortem time, with respect to the given criterion of similarity. Altogether, more than 500 reflectance spectra have been compared pairwise. The measures that have been used to compare a pair of reflectance curve samples include the Euclidean distance between curves and the Euclidean distance between derivatives of the functions represented by the reflectance curves at the same wavelengths in the spectral range of visible light between 380 and 750 nm. For each case, using the recorded reflectance curves and the similarity criterion, the postmortem time interval during which a characteristic change in the shape of the reflectance spectrum takes place is estimated. The latter is carried out via a software package composed of Java, Python, and MatLab scripts that query the MySQL database. We show that in legal medicine, matching and clustering of reflectance curves obtained by means of reflectance spectrometry with respect to a given criterion of similarity can be used to estimate the postmortem interval.
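
    The two distance measures described, the Euclidean distance between reflectance curves and between their first derivatives over the 380-750 nm range, can be sketched as follows. This is a minimal illustration assuming the spectra are resampled on a common wavelength grid; the tolerance values used as a combined similarity criterion are placeholders, not the thresholds from the study:

    import numpy as np

    def curve_distance(wavelengths, r1, r2):
        """Euclidean distance between two reflectance curves on a shared grid."""
        mask = (wavelengths >= 380) & (wavelengths <= 750)   # visible range only
        return np.linalg.norm(r1[mask] - r2[mask])

    def derivative_distance(wavelengths, r1, r2):
        """Euclidean distance between the first derivatives of the two curves."""
        mask = (wavelengths >= 380) & (wavelengths <= 750)
        d1 = np.gradient(r1[mask], wavelengths[mask])
        d2 = np.gradient(r2[mask], wavelengths[mask])
        return np.linalg.norm(d1 - d2)

    def are_similar(wavelengths, r1, r2, curve_tol=0.5, deriv_tol=0.05):
        # Illustrative combined criterion: both distances must stay below a tolerance.
        return (curve_distance(wavelengths, r1, r2) < curve_tol and
                derivative_distance(wavelengths, r1, r2) < deriv_tol)

    # Two synthetic reflectance curves for illustration only.
    wl = np.linspace(380, 750, 75)
    r_a = 0.3 + 0.20 * np.exp(-((wl - 560) / 40) ** 2)
    r_b = 0.3 + 0.25 * np.exp(-((wl - 575) / 40) ** 2)
    print(curve_distance(wl, r_a, r_b), are_similar(wl, r_a, r_b))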

  4. Spectral analysis of four surprisingly similar hot hydrogen-rich subdwarf O stars

    Science.gov (United States)

    Latour, M.; Chayer, P.; Green, E. M.; Irrgang, A.; Fontaine, G.

    2018-01-01

    Context. Post-extreme horizontal branch stars (post-EHB) are helium-shell burning objects evolving away from the EHB and contracting directly towards the white dwarf regime. While the stars forming the EHB have been extensively studied in the past, their hotter and more evolved progeny are not so well characterized. Aims: We perform a comprehensive spectroscopic analysis of four such bright sdO stars, namely Feige 34, Feige 67, AGK+81°266, and LS II+18°9, among which the first three are used as standard stars for flux calibration. Our goal is to determine their atmospheric parameters, chemical properties, and evolutionary status to better understand this class of stars that are en route to become white dwarfs. Methods: We used non-local thermodynamic equilibrium model atmospheres in combination with high quality optical and UV spectra. Photometric data were also used to compute the spectroscopic distances of our stars and to characterize the companion responsible for the infrared excess of Feige 34. Results: The four bright sdO stars have very similar atmospheric parameters with Teff between 60 000 and 63 000 K and log g (cm s-2) in the range 5.9 to 6.1. This places these objects right on the theoretical post-EHB evolutionary tracks. The UV spectra are dominated by strong iron and nickel lines and suggest abundances that are enriched with respect to those of the Sun by factors of 25 and 60. On the other hand, the lighter elements, C, N, O, Mg, Si, P, and S are depleted. The stars have very similar abundances, although AGK+81°266 shows differences in its light element abundances. For instance, the helium abundance of this object is 10 times lower than that observed in the other three stars. All our stars show UV spectral lines that require additional line broadening that is consistent with a rotational velocity of about 25 km s-1. The infrared excess of Feige 34 is well reproduced by a M0 main-sequence companion and the surface area ratio of the two stars

  5. A methodology for strain-based fatigue reliability analysis

    International Nuclear Information System (INIS)

    Zhao, Y.X.

    2000-01-01

    A significant scatter of the cyclic stress-strain (CSS) responses should be noted for a nuclear reactor material, 1Cr18Ni9Ti pipe-weld metal. The existence of this scatter implies that a random applied cyclic strain history will be introduced under any loading mode, even a deterministic loading history. A non-conservative evaluation might be given in practice if the scatter is not considered. A methodology for strain-based fatigue reliability analysis, which takes the scatter into account, is developed. The responses are approximately modeled by probability-based CSS curves of the Ramberg-Osgood relation. The strain-life data are modeled, similarly, by probability-based strain-life curves of the Coffin-Manson law. The reliability assessment is constructed by considering the interference of the random applied fatigue strain and capacity histories. Probability density functions of the applied and capacity histories are given analytically. The methodology can be conveniently extrapolated to the case of a deterministic CSS relation, as existing methods do. The non-conservatism of the deterministic CSS relation and the availability of the present methodology are indicated by an analysis of the material test results.
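
    For reference, the deterministic forms of the two relations used above are the Ramberg-Osgood cyclic stress-strain curve and the Coffin-Manson (plus Basquin) strain-life law; the methodology in the paper replaces their constants with probability-based curves. A minimal sketch with placeholder material constants (not the 1Cr18Ni9Ti values):

    def ramberg_osgood_strain(stress_amp, E, K_prime, n_prime):
        """Cyclic strain amplitude = elastic part + plastic part (Ramberg-Osgood)."""
        return stress_amp / E + (stress_amp / K_prime) ** (1.0 / n_prime)

    def coffin_manson_strain(N_f, E, sigma_f, b, eps_f, c):
        """Strain amplitude vs. reversals to failure 2Nf (Basquin + Coffin-Manson)."""
        return (sigma_f / E) * (2 * N_f) ** b + eps_f * (2 * N_f) ** c

    # Placeholder constants for illustration only (MPa where applicable).
    E, K_prime, n_prime = 195e3, 1200.0, 0.15
    sigma_f, b, eps_f, c = 900.0, -0.09, 0.35, -0.55
    print(ramberg_osgood_strain(300.0, E, K_prime, n_prime))
    print(coffin_manson_strain(1e4, E, sigma_f, b, eps_f, c))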

  6. Identification of polycystic ovary syndrome potential drug targets based on pathobiological similarity in the protein-protein interaction network

    OpenAIRE

    Huang, Hao; He, Yuehan; Li, Wan; Wei, Wenqing; Li, Yiran; Xie, Ruiqiang; Guo, Shanshan; Wang, Yahui; Jiang, Jing; Chen, Binbin; Lv, Junjie; Zhang, Nana; Chen, Lina; He, Weiming

    2016-01-01

    Polycystic ovary syndrome (PCOS) is one of the most common endocrinological disorders in reproductive-aged women. PCOS and Type 2 Diabetes (T2D) are closely linked at multiple levels and possess high pathobiological similarity. Here, we put forward a new computational approach based on the pathobiological similarity to identify PCOS potential drug target modules (PPDT-Modules) and PCOS potential drug targets in the protein-protein interaction network (PPIN). From the systems level and biologi...

  7. Hierarchical Organization of Auditory and Motor Representations in Speech Perception: Evidence from Searchlight Similarity Analysis.

    Science.gov (United States)

    Evans, Samuel; Davis, Matthew H

    2015-12-01

    How humans extract the identity of speech sounds from highly variable acoustic signals remains unclear. Here, we use searchlight representational similarity analysis (RSA) to localize and characterize neural representations of syllables at different levels of the hierarchically organized temporo-frontal pathways for speech perception. We asked participants to listen to spoken syllables that differed considerably in their surface acoustic form by changing speaker and degrading surface acoustics using noise-vocoding and sine wave synthesis while we recorded neural responses with functional magnetic resonance imaging. We found evidence for a graded hierarchy of abstraction across the brain. At the peak of the hierarchy, neural representations in somatomotor cortex encoded syllable identity but not surface acoustic form; at the base of the hierarchy, primary auditory cortex showed the reverse. In contrast, bilateral temporal cortex exhibited an intermediate response, encoding both syllable identity and the surface acoustic form of speech. Regions of somatomotor cortex associated with encoding syllable identity in perception were also engaged when producing the same syllables in a separate session. These findings are consistent with a hierarchical account of how variable acoustic signals are transformed into abstract representations of the identity of speech sounds. © The Author 2015. Published by Oxford University Press.
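
    Searchlight RSA of this kind compares a neural representational dissimilarity matrix (RDM), computed from the voxel patterns inside each searchlight sphere, against model RDMs such as "same syllable identity" or "same surface acoustics". A minimal sketch of that core comparison for a single searchlight (the pattern array and the identity model below are synthetic placeholders):

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def neural_rdm(patterns):
        """Condition-by-condition dissimilarities (correlation distance), vectorized."""
        return pdist(patterns, metric="correlation")

    def rsa_score(patterns, model_rdm_vec):
        """Spearman correlation between the neural RDM and a model RDM."""
        rho, _ = spearmanr(neural_rdm(patterns), model_rdm_vec)
        return rho

    # Illustrative data: 12 syllable conditions x 50 voxels in one searchlight sphere.
    rng = np.random.default_rng(1)
    patterns = rng.normal(size=(12, 50))
    # Four syllable identities across 12 conditions; model distance 0 = same syllable.
    identity_model = pdist(rng.integers(0, 4, size=(12, 1)), metric="hamming")
    print(rsa_score(patterns, identity_model))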

  8. New similarity of triangular fuzzy number and its application.

    Science.gov (United States)

    Zhang, Xixiang; Ma, Weimin; Chen, Liping

    2014-01-01

    The similarity of triangular fuzzy numbers is an important metric for their application. There exist several approaches to measure the similarity of triangular fuzzy numbers. However, some of them tend to yield large values. To make the similarity well distributed, a new method, SIAM (Shape's Indifferent Area and Midpoint), to measure triangular fuzzy number similarity is put forward, which takes the shape's indifferent area and midpoint of two triangular fuzzy numbers into consideration. Comparison with other similarity measurements shows the effectiveness of the proposed method. Then, it is applied to collaborative filtering recommendation to measure users' similarity. A collaborative filtering case is used to illustrate users' similarity based on the cloud model and the triangular fuzzy number; the result indicates that users' similarity based on the triangular fuzzy number can obtain better discrimination. Finally, a simulated collaborative filtering recommendation system is developed which uses the cloud model and the triangular fuzzy number to express users' comprehensive evaluation of items, and the result shows that the accuracy of collaborative filtering recommendation based on the triangular fuzzy number is higher.
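
    The abstract does not spell out the SIAM formula itself; as a point of reference, a common baseline similarity for triangular fuzzy numbers A = (a1, a2, a3) and B = (b1, b2, b3) uses the mean vertex distance, and midpoint-style variants refine it. A minimal sketch of that baseline (not the proposed SIAM measure):

    def tfn_similarity(a, b):
        """Baseline similarity of triangular fuzzy numbers via mean vertex distance.

        a and b are (left, mode, right) triples assumed to lie in [0, 1].
        """
        mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / 3.0
        return 1.0 - mean_abs_diff

    def tfn_midpoint(a):
        """Midpoint (defuzzified centre) of a triangular fuzzy number."""
        return sum(a) / 3.0

    print(tfn_similarity((0.2, 0.3, 0.4), (0.25, 0.35, 0.45)))
    print(tfn_midpoint((0.2, 0.3, 0.4)))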

  9. Branch length similarity entropy-based descriptors for shape representation

    Science.gov (United States)

    Kwon, Ohsung; Lee, Sang-Hee

    2017-11-01

    In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used for shape recognition, such as battle tanks, facial expressions, and butterflies. In the present study, we proposed new descriptors, roundness, symmetry, and surface roughness, for the recognition, which are more accurate and faster to compute than the previous descriptors. The roundness represents how closely a shape resembles a circle, the symmetry characterizes how similar one shape is to another when the shape is flipped, and the surface roughness quantifies the degree of vertical deviations of a shape boundary. To evaluate the performance of the descriptors, we used a database of leaf images with 12 species. Each species consisted of 10 - 20 leaf images and the total number of images was 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.

  10. Similarity measures for face recognition

    CERN Document Server

    Vezzetti, Enrico

    2015-01-01

    Face recognition has several applications, including security (authentication and identification of device users and criminal suspects) and medicine (corrective surgery and diagnosis). Facial recognition programs rely on algorithms that can compare and compute the similarity between two sets of images. This eBook explains some of the similarity measures used in facial recognition systems in a single volume. Readers will learn about various measures including Minkowski distances, Mahalanobis distances, Hausdorff distances, cosine-based distances, among other methods. The book also summarizes errors that may occur in face recognition methods. Computer scientists "facing face" and looking to select and test different methods of computing similarities will benefit from this book. The book is also a useful tool for students undertaking computer vision courses.

  11. Are calanco landforms similar to river basins?

    Science.gov (United States)

    Caraballo-Arias, N A; Ferro, V

    2017-12-15

    In the past, badlands have often been considered ideal field laboratories for studying landscape evolution because of their geometrical similarity to larger fluvial systems. For a given hydrological process, no scientific proof exists that badlands can be considered a model of river basin prototypes. In this paper the measurements carried out on 45 Sicilian calanchi, a type of badlands that appears as a small-scale hydrographic unit, are used to establish their morphological similarity with river systems whose data are available in the literature. At first the geomorphological similarity is studied by identifying the dimensionless groups, which can assume the same value or a scaled one in a fixed ratio, representing drainage basin shape, stream network and relief properties. Then, for each property, the dimensionless groups are calculated for the investigated calanchi and the river basins and their corresponding scale ratio is evaluated. The applicability of Hack's, Horton's and Melton's laws for establishing similarity criteria is also tested. The developed analysis allows us to conclude that a quantitative morphological similarity between calanco landforms and river basins can be established using commonly applied dimensionless groups. In particular, the analysis showed that i) calanchi and river basins have a geometrically similar shape with respect to the parameters Rf and Re with a scale factor close to 1, ii) calanchi and river basins are similar with respect to the bifurcation and length ratios (λ=1), iii) for the investigated calanchi the Melton number assumes values less than that (0.694) corresponding to the river case and a scale ratio ranging from 0.52 to 0.78 can be used, iv) calanchi and river basins have similar mean relief ratio values (λ=1.13) and v) calanchi present active geomorphic processes and therefore fall in a more juvenile stage with respect to river basins. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Applying ligands profiling using multiple extended electron distribution based field templates and feature trees similarity searching in the discovery of new generation of urea-based antineoplastic kinase inhibitors.

    Directory of Open Access Journals (Sweden)

    Eman M Dokla

    Full Text Available This study provides a comprehensive computational procedure for the discovery of novel urea-based antineoplastic kinase inhibitors while focusing on diversification of both chemotype and selectivity pattern. It presents a systematic structural analysis of the different binding motifs of urea-based kinase inhibitors and the corresponding configurations of the kinase enzymes. The computational model depends on simultaneous application of two protocols. The first protocol applies multiple consecutive validated virtual screening filters including SMARTS, a support vector-machine model (ROC = 0.98), a Bayesian model (ROC = 0.86) and structure-based pharmacophore filters based on urea-based kinase inhibitor complexes retrieved from the literature. This is followed by hits profiling against different extended electron distribution (XED) based field templates representing different kinase targets. The second protocol enables cancericidal activity verification by using the algorithm of feature trees (Ftrees) similarity searching against the NCI database. Being a proof-of-concept study, this combined procedure was experimentally validated by its utilization in developing a novel series of urea-based derivatives of strong anticancer activity. This new series is based on a 3-benzylbenzo[d]thiazol-2(3H)-one scaffold which has interesting chemical feasibility and wide diversification capability. Antineoplastic activity of this series was assayed in vitro against the NCI 60 tumor-cell lines, showing very strong inhibition with GI50 values as low as 0.9 uM. Additionally, its mechanism was elucidated using the KINEX™ protein kinase microarray-based small molecule inhibitor profiling platform and cell cycle analysis, showing a peculiar selectivity pattern against Zap70, c-src, Mink1, csk and MeKK2 kinases. Interestingly, it showed activity on syk kinase, confirming the recent finding of the high activity of diphenyl urea containing compounds against this kinase. Overall, the new series

  13. The Impact of Similarity-Based Interference in Processing Wh-Questions in Aphasia

    Directory of Open Access Journals (Sweden)

    Shannon Mackenzie

    2014-04-01

    than subject-extracted questions because the former are in non-canonical word order. Finally, the Intervener hypothesis suggests that only object-extracted Which-questions should be problematic, particularly for those participants with language disorders (e.g., Friedmann & Novogrodsky, 2011). An intervener is an NP that has similar properties to other NPs in the sentence, and thus results in similarity-based interference. Only object-extracted Which-questions contain an intervener (e.g., the fireman in (2b)), which interferes with the chain consisting of the displaced Which-phrase, Which mailman, and its direct object gap occurring after the verb. Briefly here, only the Intervener Hypothesis was supported by our rich data set, and this was observed unambiguously for our participants with Broca’s aphasia. As an example (see Figure 1), we observed a significantly greater proportion of gazes to the incorrect referent (i.e., the intervening NP) in the object-extracted Which- relative to Who-questions beginning in the Verb-gap time window and extending throughout the remainder of the sentence and into the response period following the sentence. These patterns indicate lasting similarity-based interference effects during real-time sentence processing. The implications of our findings for extant accounts of sentence processing disruptions will be discussed, including accounts that attribute sentence comprehension impairments to memory-based interference.

  14. Retrospective group fusion similarity search based on eROCE evaluation metric.

    Science.gov (United States)

    Avram, Sorin I; Crisan, Luminita; Bora, Alina; Pacureanu, Liliana M; Avram, Stefana; Kurunczi, Ludovic

    2013-03-01

    In this study, a simple evaluation metric, denoted as eROCE was proposed to measure the early enrichment of predictive methods. We demonstrated the superior robustness of eROCE compared to other known metrics throughout several active to inactive ratios ranging from 1:10 to 1:1000. Group fusion similarity search was investigated by varying 16 similarity coefficients, five molecular representations (binary and non-binary) and two group fusion rules using two reference structure set sizes. We used a dataset of 3478 actives and 43,938 inactive molecules and the enrichment was analyzed by means of eROCE. This retrospective study provides optimal similarity search parameters in the case of ALDH1A1 inhibitors. Copyright © 2013 Elsevier Ltd. All rights reserved.
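
    Group fusion similarity search scores each database molecule against every member of a reference set of known actives and then fuses the per-reference scores with a group fusion rule such as MAX or SUM; ranking the database by the fused score yields the enrichment curve from which early-recognition metrics such as eROCE are computed. A minimal sketch using binary fingerprints and the Tanimoto coefficient (the random fingerprints and the two fusion rules shown are generic illustrations; the study varied 16 coefficients and several molecular representations):

    import numpy as np

    def tanimoto(fp1, fp2):
        """Tanimoto coefficient for binary fingerprints (0/1 numpy arrays)."""
        a = np.sum(fp1 & fp2)
        denom = np.sum(fp1) + np.sum(fp2) - a
        return a / denom if denom else 0.0

    def group_fusion_scores(database_fps, reference_fps, rule="max"):
        """Score every database molecule against a group of reference actives."""
        scores = []
        for fp in database_fps:
            sims = [tanimoto(fp, ref) for ref in reference_fps]
            scores.append(max(sims) if rule == "max" else sum(sims))
        return np.array(scores)

    # Ranking the database by fused score gives the retrospective enrichment curve.
    rng = np.random.default_rng(0)
    db = rng.integers(0, 2, size=(100, 64))     # 100 molecules, 64-bit fingerprints
    refs = rng.integers(0, 2, size=(5, 64))     # 5 reference actives
    ranking = np.argsort(-group_fusion_scores(db, refs, rule="max"))
    print(ranking[:10])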

  15. Concept similarity and related categories in information retrieval using formal concept analysis

    Science.gov (United States)

    Eklund, P.; Ducrou, J.; Dau, F.

    2012-11-01

    The application of formal concept analysis to the problem of information retrieval has been shown useful but has lacked any real analysis of the idea of relevance ranking of search results. SearchSleuth is a program developed to experiment with the automated local analysis of Web search using formal concept analysis. SearchSleuth extends a standard search interface to include a conceptual neighbourhood centred on a formal concept derived from the initial query. This neighbourhood of the concept derived from the search terms is decorated with its upper and lower neighbours representing more general and special concepts, respectively. SearchSleuth is in many ways an archetype of search engines based on formal concept analysis with some novel features. In SearchSleuth, the notion of related categories - which are themselves formal concepts - is also introduced. This allows the retrieval focus to shift to a new formal concept called a sibling. This movement across the concept lattice needs to relate one formal concept to another in a principled way. This paper presents the issues concerning exploring, searching, and ordering the space of related categories. The focus is on understanding the use and meaning of proximity and semantic distance in the context of information retrieval using formal concept analysis.

  16. Multidimensional Scaling Visualization Using Parametric Similarity Indices

    Directory of Open Access Journals (Sweden)

    J. A. Tenreiro Machado

    2015-03-01

    Full Text Available In this paper, we apply multidimensional scaling (MDS) and parametric similarity indices (PSI) in the analysis of complex systems (CS). Each CS is viewed as a dynamical system, exhibiting an output time-series to be interpreted as a manifestation of its behavior. We start by adopting a sliding window to sample the original data into several consecutive time periods. Second, we define a given PSI for tracking pieces of data. We then compare the windows for different values of the parameter, and we generate the corresponding MDS maps of ‘points’. Third, we use Procrustes analysis to linearly transform the MDS charts for maximum superposition and to build a global MDS map of “shapes”. This final plot captures the time evolution of the phenomena and is sensitive to the PSI adopted. The generalized correlation, the Minkowski distance and four entropy-based indices are tested. The proposed approach is applied to the Dow Jones Industrial Average stock market index and the Europe Brent Spot Price FOB time-series.
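
    The three-step procedure, sliding-window sampling, pairwise dissimilarities from a parametric index, MDS embedding, and Procrustes alignment of the resulting charts, can be sketched as follows, using the Minkowski distance as the parametric index (the window length, parameter values and input series below are illustrative):

    import numpy as np
    from sklearn.manifold import MDS
    from scipy.spatial import procrustes
    from scipy.spatial.distance import pdist, squareform

    def window_mds(series, win=50, p=2.0, seed=0):
        """Slide a window over the series, build a Minkowski dissimilarity matrix
        between windows, and embed the windows as 2-D MDS 'points'."""
        windows = np.array([series[i:i + win] for i in range(0, len(series) - win, win)])
        dmat = squareform(pdist(windows, metric="minkowski", p=p))
        return MDS(n_components=2, dissimilarity="precomputed",
                   random_state=seed).fit_transform(dmat)

    # Maps obtained with different parameter values p are aligned with Procrustes
    # analysis so that the resulting 'shapes' can be superposed and compared.
    rng = np.random.default_rng(2)
    series = np.cumsum(rng.normal(size=2000))
    map_p1, map_p2 = window_mds(series, p=1.0), window_mds(series, p=3.0)
    _, aligned_p2, disparity = procrustes(map_p1, map_p2)
    print(disparity)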

  17. Effect of acoustic similarity on short-term auditory memory in the monkey.

    Science.gov (United States)

    Scott, Brian H; Mishkin, Mortimer; Yin, Pingbo

    2013-04-01

    Recent evidence suggests that the monkey's short-term memory in audition depends on a passively retained sensory trace as opposed to a trace reactivated from long-term memory for use in working memory. Reliance on a passive sensory trace could render memory particularly susceptible to confusion between sounds that are similar in some acoustic dimension. If so, then in delayed matching-to-sample, the monkey's performance should be predicted by the similarity in the salient acoustic dimension between the sample and subsequent test stimulus, even at very short delays. To test this prediction and isolate the acoustic features relevant to short-term memory, we examined the pattern of errors made by two rhesus monkeys performing a serial, auditory delayed match-to-sample task with interstimulus intervals of 1 s. The analysis revealed that false-alarm errors did indeed result from similarity-based confusion between the sample and the subsequent nonmatch stimuli. Manipulation of the stimuli showed that removal of spectral cues was more disruptive to matching behavior than removal of temporal cues. In addition, the effect of acoustic similarity on false-alarm response was stronger at the first nonmatch stimulus than at the second one. This pattern of errors would be expected if the first nonmatch stimulus overwrote the sample's trace, and suggests that the passively retained trace is not only vulnerable to similarity-based confusion but is also highly susceptible to overwriting. Copyright © 2013 Elsevier B.V. All rights reserved.

  18. SU-F-BRA-13: Knowledge-Based Treatment Planning for Prostate LDR Brachytherapy Based On Principal Component Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Roper, J; Bradshaw, B; Godette, K; Schreibmann, E [Winship Cancer Institute of Emory University, Atlanta, GA (United States); Chanyavanich, V [Rocky Mountain Cancer Centers, Denver, CO (United States)

    2015-06-15

    Purpose: To create a knowledge-based algorithm for prostate LDR brachytherapy treatment planning that standardizes plan quality using seed arrangements tailored to individual physician preferences while being fast enough for real-time planning. Methods: A dataset of 130 prior cases was compiled for a physician with an active prostate seed implant practice. Ten cases were randomly selected to test the algorithm. Contours from the 120 library cases were registered to a common reference frame. Contour variations were characterized on a point-by-point basis using principal component analysis (PCA). A test case was converted to PCA vectors using the same process and then compared with each library case using a Mahalanobis distance to evaluate similarity. Rank order PCA scores were used to select the best-matched library case. The seed arrangement was extracted from the best-matched case and used as a starting point for planning the test case. Computational time was recorded. Any subsequent modifications were recorded that required input from a treatment planner to achieve an acceptable plan. Results: The computational time required to register contours from a test case and evaluate PCA similarity across the library was approximately 10s. Five of the ten test cases did not require any seed additions, deletions, or moves to obtain an acceptable plan. The remaining five test cases required on average 4.2 seed modifications. The time to complete manual plan modifications was less than 30s in all cases. Conclusion: A knowledge-based treatment planning algorithm was developed for prostate LDR brachytherapy based on principal component analysis. Initial results suggest that this approach can be used to quickly create treatment plans that require few if any modifications by the treatment planner. In general, test case plans have seed arrangements which are very similar to prior cases, and thus are inherently tailored to physician preferences.
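
    The matching step described, registering contours, projecting them into a PCA space built from the library and ranking library cases by Mahalanobis distance to the test case, can be sketched as follows, assuming each case has already been reduced to a fixed-length contour feature vector (all names, sizes and the random data below are illustrative):

    import numpy as np
    from sklearn.decomposition import PCA

    def best_matched_case(library_vectors, test_vector, n_components=10):
        """Rank library cases by Mahalanobis distance to the test case in PCA space."""
        pca = PCA(n_components=n_components).fit(library_vectors)
        lib_scores = pca.transform(library_vectors)
        test_score = pca.transform(test_vector.reshape(1, -1))[0]
        # In the PCA basis the components are uncorrelated over the library, so the
        # Mahalanobis distance reduces to scaling each axis by its standard deviation.
        std = lib_scores.std(axis=0)
        d = np.sqrt(np.sum(((lib_scores - test_score) / std) ** 2, axis=1))
        return np.argsort(d)          # index 0 is the best-matched library case

    # The seed arrangement of the best-matched prior case would then be copied as
    # the starting point for planning the test case.
    rng = np.random.default_rng(0)
    library = rng.normal(size=(120, 40))   # 120 prior cases, 40 contour features each
    test = rng.normal(size=40)
    print(best_matched_case(library, test)[:3])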

  19. Perceptions of ideal and former partner's personality and similarity

    NARCIS (Netherlands)

    Dijkstra, Pieternel; Barelds, Dick P.H.

    2010-01-01

    The present study aimed to test predictions based on both the 'similarity-attraction' hypothesis and the 'attraction-similarity' hypothesis, by studying perceptions of ideal and former partners. Based on the 'similarity-attraction' hypothesis, we expected individuals to desire ideal partners who are

  20. An improved DPSO with mutation based on similarity algorithm for optimization of transmission lines loading

    International Nuclear Information System (INIS)

    Shayeghi, H.; Mahdavi, M.; Bagheri, A.

    2010-01-01

    The static transmission network expansion planning (STNEP) problem acquires a principal role in power system planning and should be evaluated carefully. Up till now, various methods have been presented to solve the STNEP problem, but only in one of them has the lines adequacy rate been considered at the end of the planning horizon, with the problem optimized by discrete particle swarm optimization (DPSO). DPSO is a new population-based intelligence algorithm and exhibits good performance on the solution of large-scale, discrete and non-linear optimization problems like STNEP. However, during the running of the algorithm, the particles become more and more similar and cluster around the best particle in the swarm, which makes the swarm converge prematurely around a local solution. In order to overcome these drawbacks, and considering the lines adequacy rate, in this paper expansion planning has been implemented by merging the lines loading parameter into the STNEP and inserting the investment cost into the fitness function constraints using an improved DPSO algorithm. The proposed improved DPSO introduces a new concept, collectivity, based on the similarity between each particle and the current global best particle in the swarm, which can prevent the premature convergence of DPSO around a local solution. The proposed method has been tested on Garver's network and a real transmission network in Iran, and compared with the DPSO based method for solution of the TNEP problem. The results show that, by preventing premature convergence, the proposed improved DPSO based method increases the network adequacy considerably at almost the same expansion cost. Also, regarding the convergence curves of both methods, it can be seen that the precision of the proposed algorithm for the solution of the STNEP problem is greater than that of the DPSO approach.

  1. Single and two-phase similarity analysis of a reduced-scale natural convection loop relative to a full-scale prototype

    International Nuclear Information System (INIS)

    Botelho, David A.; Faccini, Jose L.H.

    2002-01-01

    The main topic in this paper is a new device being considered to improve nuclear reactor safety employing natural circulation. A scaled experiment used to demonstrate the performance of the device is also described. We also applied a similarity analysis method for single and two-phase natural convection loop flow to the IEN CCN experiment and to an APEX-like experiment to verify the degree of similarity relative to a full-scale prototype like the AP600. Most of the CCN similarity numbers that represent important single and two-phase similarity conditions are comparable to the APEX-like loop non-dimensional numbers calculated employing the same methodology. Despite the much smaller geometric, pressure, and power scales, we conclude that the IEN CCN has single and two-phase natural circulation similarity numbers that represent the full-scale prototype fairly well. Even though it lacks most complementary primary and safety systems, this IEN circuit provided valuable experience for developing human, experimental, and analytical resources, besides its utilization as a training tool. (author)

  2. Measuring transferring similarity via local information

    Science.gov (United States)

    Yin, Likang; Deng, Yong

    2018-05-01

    Recommender systems have developed along with web science, and how to measure the similarity between users is crucial for processing collaborative filtering recommendation. Many efficient models have been proposed (e.g., the Pearson coefficient) to measure the direct correlation. However, the direct correlation measures are greatly affected by the sparsity of the dataset. In other words, the direct correlation measures would present an inauthentic similarity if two users have very few commonly selected objects. Transferring similarity overcomes this drawback by considering their common neighbors (i.e., the intermediates). Yet, the transferring similarity also has its drawback since it can only provide an interval of similarity. To break these limitations, we propose the Belief Transferring Similarity (BTS) model. The contributions of the BTS model are: (1) the BTS model addresses the issue of dataset sparsity by considering high-order similarity. (2) The BTS model transforms an uncertain interval to a certain state based on fuzzy systems theory. (3) The BTS model is able to combine the transferring similarity of different intermediates using an information fusion method. Finally, we compare the BTS model with nine different link prediction methods in nine different networks, and we also illustrate the convergence property and efficiency of the BTS model.
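
    The motivation for transferring similarity is that a direct correlation computed from very few commonly selected objects is unreliable, so similarity is propagated through intermediate users instead. A minimal sketch of direct Pearson similarity plus a simple transfer through intermediates (the transfer rule shown, taking the maximum product over intermediates, is a generic illustration, not the BTS fusion itself):

    import numpy as np

    def pearson_sim(ratings, u, v, min_common=3):
        """Pearson correlation over items rated by both users; None if too sparse."""
        common = ~np.isnan(ratings[u]) & ~np.isnan(ratings[v])
        if common.sum() < min_common:
            return None
        x, y = ratings[u, common], ratings[v, common]
        if x.std() == 0 or y.std() == 0:
            return 0.0
        return float(np.corrcoef(x, y)[0, 1])

    def transferred_sim(ratings, u, v, **kw):
        """Propagate similarity through intermediate users when direct data is sparse."""
        direct = pearson_sim(ratings, u, v, **kw)
        if direct is not None:
            return direct
        candidates = []
        for w in range(ratings.shape[0]):
            if w in (u, v):
                continue
            s1, s2 = pearson_sim(ratings, u, w, **kw), pearson_sim(ratings, w, v, **kw)
            if s1 is not None and s2 is not None:
                candidates.append(s1 * s2)
        return max(candidates) if candidates else 0.0

    # Toy rating matrix (np.nan = unrated item).
    R = np.array([[5, 4, np.nan, 1],
                  [np.nan, 4, 5, 1],
                  [5, np.nan, 4, 2]], dtype=float)
    print(transferred_sim(R, 0, 2, min_common=2))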

  3. Multiscale sample entropy and cross-sample entropy based on symbolic representation and similarity of stock markets

    Science.gov (United States)

    Wu, Yue; Shang, Pengjian; Li, Yilong

    2018-03-01

    A modified multiscale sample entropy measure based on symbolic representation and similarity (MSEBSS) is proposed in this paper to research the complexity of stock markets. The modified algorithm reduces the probability of inducing undefined entropies and is confirmed to be robust to strong noise. Considering validity and accuracy, MSEBSS is more reliable than multiscale entropy (MSE) for time series mingled with much noise, such as financial time series. We apply MSEBSS to financial markets and the results show that American stock markets have the lowest complexity compared with European and Asian markets. There are exceptions to the regularity that stock markets show a decreasing complexity over the time scale, indicating a periodicity at certain scales. Based on MSEBSS, we introduce the modified multiscale cross-sample entropy measure based on symbolic representation and similarity (MCSEBSS) to consider the degree of asynchrony between distinct time series. Stock markets from the same area have higher synchrony than those from different areas. For stock markets having relatively high synchrony, the entropy values will decrease with increasing scale factor, while for stock markets having high asynchrony, the entropy values will not decrease with increasing scale factor; sometimes they tend to increase. So both MSEBSS and MCSEBSS are able to distinguish stock markets of different areas, and they are more helpful if used together for studying other features of financial time series.
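
    Multiscale sample entropy coarse-grains the series at each scale factor and computes sample entropy on the coarse-grained series; the symbolic-representation-and-similarity modification of the paper is not reproduced here. A minimal sketch of the standard MSE backbone (embedding dimension and tolerance are the usual defaults, chosen for illustration):

    import numpy as np

    def sample_entropy(x, m=2, r=0.15):
        """Standard sample entropy with tolerance r given as a fraction of std."""
        x = np.asarray(x, float)
        tol = r * x.std()
        def count_matches(length):
            templates = np.array([x[i:i + length] for i in range(len(x) - length)])
            dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
            # Count pairs (i != j) whose Chebyshev distance is within the tolerance.
            return (np.sum(dist <= tol) - len(templates)) / 2
        b, a = count_matches(m), count_matches(m + 1)
        return -np.log(a / b) if a > 0 and b > 0 else np.inf

    def multiscale_sample_entropy(x, scales=range(1, 11), m=2, r=0.15):
        """Coarse-grain by non-overlapping averaging, then take sample entropy."""
        x = np.asarray(x, float)
        out = []
        for s in scales:
            n = len(x) // s
            coarse = x[:n * s].reshape(n, s).mean(axis=1)
            out.append(sample_entropy(coarse, m, r))
        return out

    # Illustrative use on synthetic "returns" of a random-walk price series.
    rng = np.random.default_rng(3)
    prices = np.cumsum(rng.normal(size=600))
    print(multiscale_sample_entropy(np.diff(prices), scales=[1, 2, 5]))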

  4. A Survey of Binary Similarity and Distance Measures

    Directory of Open Access Journals (Sweden)

    Seung-Seok Choi

    2010-02-01

    Full Text Available The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering, classification, etc. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Notwithstanding, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and reveal their correlations through the hierarchical clustering technique.

  5. Universal self-similarity of propagating populations.

    Science.gov (United States)

    Eliazar, Iddo; Klafter, Joseph

    2010-07-01

    This paper explores the universal self-similarity of propagating populations. The following general propagation model is considered: particles are randomly emitted from the origin of a d-dimensional Euclidean space and propagate randomly and independently of each other in space; all particles share a statistically common--yet arbitrary--motion pattern; each particle has its own random propagation parameters--emission epoch, motion frequency, and motion amplitude. The universally self-similar statistics of the particles' displacements and first passage times (FPTs) are analyzed: statistics which are invariant with respect to the details of the displacement and FPT measurements and with respect to the particles' underlying motion pattern. Analysis concludes that the universally self-similar statistics are governed by Poisson processes with power-law intensities and by the Fréchet and Weibull extreme-value laws.

  6. Universal self-similarity of propagating populations

    Science.gov (United States)

    Eliazar, Iddo; Klafter, Joseph

    2010-07-01

    This paper explores the universal self-similarity of propagating populations. The following general propagation model is considered: particles are randomly emitted from the origin of a d -dimensional Euclidean space and propagate randomly and independently of each other in space; all particles share a statistically common—yet arbitrary—motion pattern; each particle has its own random propagation parameters—emission epoch, motion frequency, and motion amplitude. The universally self-similar statistics of the particles’ displacements and first passage times (FPTs) are analyzed: statistics which are invariant with respect to the details of the displacement and FPT measurements and with respect to the particles’ underlying motion pattern. Analysis concludes that the universally self-similar statistics are governed by Poisson processes with power-law intensities and by the Fréchet and Weibull extreme-value laws.

  7. Constructing an integrated gene similarity network for the identification of disease genes.

    Science.gov (United States)

    Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

    2017-09-20

    Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and only cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, IGSN as well as the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from IGSN, which has a wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
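
    The final prioritization step, a random walk with restart seeded at the known disease genes, can be sketched on a single similarity network as follows; the SNF fusion and the phenotype-gene bilayer construction are not reproduced, and the toy adjacency matrix and restart probability are illustrative:

    import numpy as np

    def random_walk_with_restart(W, seed_indices, restart=0.7, tol=1e-8, max_iter=1000):
        """Steady-state visiting probabilities of a restart walk on similarity matrix W."""
        W = np.asarray(W, float)
        P = W / W.sum(axis=0, keepdims=True)               # column-normalized transitions
        p0 = np.zeros(W.shape[0])
        p0[list(seed_indices)] = 1.0 / len(seed_indices)   # restart distribution = seeds
        p = p0.copy()
        for _ in range(max_iter):
            p_next = (1 - restart) * P @ p + restart * p0
            if np.abs(p_next - p).sum() < tol:
                break
            p = p_next
        return p                                           # high values = candidate genes

    # Illustrative 5-gene similarity network with one known disease gene (index 0).
    W = np.array([[0, .9, .1, 0, 0],
                  [.9, 0, .4, .1, 0],
                  [.1, .4, 0, .8, .2],
                  [0, .1, .8, 0, .7],
                  [0, 0, .2, .7, 0]])
    print(random_walk_with_restart(W, seed_indices=[0]))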

  8. Efficient similarity-based data clustering by optimal object to cluster reallocation.

    Science.gov (United States)

    Rossignol, Mathias; Lagrange, Mathieu; Cont, Arshia

    2018-01-01

    We present an iterative flat hard clustering algorithm designed to operate on arbitrary similarity matrices, with the only constraint that these matrices be symmetrical. Although functionally very close to kernel k-means, our proposal performs a maximization of average intra-class similarity, instead of a squared distance minimization, in order to remain closer to the semantics of similarities. We show that this approach permits the relaxing of some conditions on usable affinity matrices like semi-positiveness, as well as opening possibilities for computational optimization required for large datasets. Systematic evaluation on a variety of data sets shows that compared with kernel k-means and the spectral clustering methods, the proposed approach gives equivalent or better performance, while running much faster. Most notably, it significantly reduces memory access, which makes it a good choice for large data collections. Material enabling the reproducibility of the results is made available online.
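
    The core loop described, reallocating each object to the cluster that maximizes its average similarity to that cluster's members, needs nothing more than a symmetric similarity matrix. A minimal sketch (the random initialization, stopping rule and toy matrix are illustrative choices, not the exact scheme of the paper):

    import numpy as np

    def similarity_clustering(S, k, n_iter=100, seed=0):
        """Flat hard clustering that maximizes average intra-class similarity."""
        S = np.asarray(S, float)
        n = S.shape[0]
        rng = np.random.default_rng(seed)
        labels = rng.integers(0, k, size=n)            # random initial assignment
        for _ in range(n_iter):
            changed = False
            for i in range(n):
                # Average similarity of object i to the members of each cluster.
                avg = np.array([S[i, labels == c].mean() if np.any(labels == c) else -np.inf
                                for c in range(k)])
                best = int(np.argmax(avg))
                if best != labels[i]:
                    labels[i] = best
                    changed = True
            if not changed:
                break
        return labels

    # Example with a block-structured toy similarity matrix.
    S = np.array([[1, .9, .8, .1, .2],
                  [.9, 1, .85, .15, .1],
                  [.8, .85, 1, .2, .1],
                  [.1, .15, .2, 1, .9],
                  [.2, .1, .1, .9, 1]])
    print(similarity_clustering(S, k=2))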

  9. IsoCleft Finder – a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities [v2; ref status: indexed, http://f1000r.es/13y]

    Directory of Open Access Journals (Sweden)

    Natalja Kurbatova

    2013-05-01

    Full Text Available IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder.

  10. GEPSI: A Gene Expression Profile Similarity-Based Identification Method of Bioactive Components in Traditional Chinese Medicine Formula.

    Science.gov (United States)

    Zhang, Baixia; He, Shuaibing; Lv, Chenyang; Zhang, Yanling; Wang, Yun

    2018-01-01

    The identification of bioactive components in traditional Chinese medicine (TCM) is an important part of the TCM material foundation research. Recently, molecular docking technology has been extensively used for the identification of TCM bioactive components. However, target proteins that are used in molecular docking may not be the actual TCM target. For this reason, the bioactive components would likely be omitted or incorrect. To address this problem, this study proposed the GEPSI method that identified the target proteins of TCM based on the similarity of gene expression profiles. The similarity of the gene expression profiles affected by TCM and small molecular drugs was calculated. The pharmacological action of TCM may be similar to that of small molecule drugs that have a high similarity score. Indeed, the target proteins of the small molecule drugs could be considered TCM targets. Thus, we identified the bioactive components of a TCM by molecular docking and verified the reliability of this method by a literature investigation. Using the target proteins that TCM actually affected as targets, the identification of the bioactive components was more accurate. This study provides a fast and effective method for the identification of TCM bioactive components.

  11. Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

    Science.gov (United States)

    Ma, Yan; Mazumdar, Madhu

    2011-10-30

    Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing the significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

  12. Predicting microRNA-disease associations using label propagation based on linear neighborhood similarity.

    Science.gov (United States)

    Li, Guanghui; Luo, Jiawei; Xiao, Qiu; Liang, Cheng; Ding, Pingjian

    2018-05-12

    Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127. This was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases. Copyright © 2018. Published by Elsevier Inc.
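
    The label propagation step alternates between spreading the known association labels over the (linear-neighborhood) similarity graph and re-injecting the initial labels; the neighborhood-weight learning and the preprocessing of interaction likelihood profiles are not reproduced here. A minimal sketch of the propagation given a row-normalized similarity matrix W and an initial association matrix Y (the damping parameter and toy data are illustrative):

    import numpy as np

    def label_propagation(W, Y, alpha=0.5, n_iter=200, tol=1e-9):
        """Iterate F <- alpha * W @ F + (1 - alpha) * Y until convergence."""
        F = Y.astype(float).copy()
        for _ in range(n_iter):
            F_next = alpha * W @ F + (1 - alpha) * Y
            if np.abs(F_next - F).max() < tol:
                break
            F = F_next
        return F   # scores; unobserved miRNA-disease pairs with high scores are candidates

    # Illustrative: 4 miRNAs, a row-normalized similarity matrix W, 2 diseases in Y.
    W = np.array([[0, .7, .3, 0],
                  [.7, 0, .2, .1],
                  [.3, .2, 0, .5],
                  [0, .1, .5, .4]], float)
    W = W / W.sum(axis=1, keepdims=True)
    Y = np.array([[1, 0], [0, 0], [0, 1], [0, 0]], float)
    print(label_propagation(W, Y))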

  13. Improved cosine similarity measures of simplified neutrosophic sets for medical diagnoses.

    Science.gov (United States)

    Ye, Jun

    2015-03-01

    In pattern recognition and medical diagnosis, similarity measure is an important mathematical tool. To overcome some disadvantages of existing cosine similarity measures of simplified neutrosophic sets (SNSs) in vector space, this paper proposed improved cosine similarity measures of SNSs based on cosine function, including single valued neutrosophic cosine similarity measures and interval neutrosophic cosine similarity measures. Then, weighted cosine similarity measures of SNSs were introduced by taking into account the importance of each element. Further, a medical diagnosis method using the improved cosine similarity measures was proposed to solve medical diagnosis problems with simplified neutrosophic information. The improved cosine similarity measures between SNSs were introduced based on cosine function. Then, we compared the improved cosine similarity measures of SNSs with existing cosine similarity measures of SNSs by numerical examples to demonstrate their effectiveness and rationality for overcoming some shortcomings of existing cosine similarity measures of SNSs in some cases. In the medical diagnosis method, we can find a proper diagnosis by the cosine similarity measures between the symptoms and considered diseases which are represented by SNSs. Then, the medical diagnosis method based on the improved cosine similarity measures was applied to two medical diagnosis problems to show the applications and effectiveness of the proposed method. Two numerical examples all demonstrated that the improved cosine similarity measures of SNSs based on the cosine function can overcome the shortcomings of the existing cosine similarity measures between two vectors in some cases. By two medical diagnoses problems, the medical diagnoses using various similarity measures of SNSs indicated the identical diagnosis results and demonstrated the effectiveness and rationality of the diagnosis method proposed in this paper. The improved cosine measures of SNSs based on cosine
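
    The earlier vector-space cosine measure that the paper improves upon treats each element of a single-valued neutrosophic set as the vector (T, I, F) and averages the cosine of the angle between corresponding vectors; the sketch below shows that baseline measure, not the improved cosine-function measure proposed here, and the symptom/disease triples are illustrative:

    import math

    def svns_cosine_similarity(A, B):
        """Baseline cosine similarity between two single-valued neutrosophic sets.

        A and B are lists of (T, I, F) triples of equal length.
        """
        total = 0.0
        for (ta, ia, fa), (tb, ib, fb) in zip(A, B):
            dot = ta * tb + ia * ib + fa * fb
            na = math.sqrt(ta**2 + ia**2 + fa**2)
            nb = math.sqrt(tb**2 + ib**2 + fb**2)
            total += dot / (na * nb) if na and nb else 0.0
        return total / len(A)

    symptoms = [(0.8, 0.2, 0.1), (0.6, 0.3, 0.3)]
    disease = [(0.7, 0.2, 0.2), (0.6, 0.2, 0.4)]
    print(svns_cosine_similarity(symptoms, disease))   # closer to 1 = better match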

  14. On self-similarity of crack layer

    Science.gov (United States)

    Botsis, J.; Kunin, B.

    1987-01-01

    The crack layer (CL) theory of Chudnovsky (1986), based on principles of thermodynamics of irreversible processes, employs a crucial hypothesis of self-similarity. The self-similarity hypothesis states that the value of the damage density at a point x of the active zone at a time t coincides with that at the corresponding point in the initial (t = 0) configuration of the active zone, the correspondence being given by a time-dependent affine transformation of the space variables. In this paper, the implications of the self-similarity hypothesis for quasi-static CL propagation are investigated using polystyrene as a model material and examining the evolution of damage distribution along the trailing edge, which is approximated by a straight segment perpendicular to the crack path. The results support the self-similarity hypothesis adopted by the CL theory.

  15. Voxel-based analysis of the diffusion tensor

    International Nuclear Information System (INIS)

    Abe, Osamu; Takao, Hidemasa; Gonoi, Wataru; Sasaki, Hiroki; Murakami, Mizuho; Ohtomo, Kuni; Kabasawa, Hiroyuki; Kawaguchi, Hiroshi; Goto, Masami; Yamada, Haruyasu; Yamasue, Hidenori; Kasai, Kiyoto; Aoki, Shigeki

    2010-01-01

    Diffusion tensor imaging (DTI) has provided important insights into the neurobiological basis for normal development and aging and various disease processes in the central nervous system. The aim of this article is to review the current protocols for DTI acquisition and preprocessing and statistical testing for a voxelwise analysis of DTI, focused on statistical parametric mapping (SPM) and tract-based spatial statistics (TBSS). We tested the effects of distortion correction induced by gradient nonlinearity on fractional anisotropy (FA) maps or FA skeletons processed via two SPM-based methods (coregistration and FA template methods), or TBSS-based method, respectively. With two SPM-based methods, we found similar results in some points (e.g., significant FA elevation for uncorrected images in anterior-dominant white matter and for corrected images in bilateral middle cerebellar peduncles) and different results in other points (e.g., significantly larger FA for corrected images with coregistration method, but significantly smaller with FA template method in bilateral internal capsules, extending to corona radiata, and semioval centers). In contrast, there was no area with significant difference between uncorrected and corrected FA skeletons with TBSS-based method. The discrepancy among these results was not explained in full, but possible explanations were misregistration and smoothing for the SPM-based methods and insensitivity to FA changes outside the local centers of white matter bundles for TBSS-based method. (orig.)

  16. Effects of soil, altitude, rainfall, and distance on the floristic similarity of Atlantic Forest fragments in the east-Northeast

    Directory of Open Access Journals (Sweden)

    Flávia de Barros Prado Moura

    2013-09-01

    Full Text Available This paper presents the results of a floristic survey conducted on an Atlantic Forest fragment in the state of Alagoas, Brazil. In addition, the results of a similarity analysis between ten rainforest fragments from the Brazilian east-Northeast are presented. The floristic comparison was based on binary data with regard to the presence/absence criterion for tree species identified in the ten fragments by means of Sørensen's similarity index. A dendrogram was prepared using cluster analysis (Jaccard's index) and canonical correspondence analysis (CCA) to test the abiotic factors, which can differently influence the similarity of fragments. The fragments showed low similarity indices. The variations were due to the fact that each fragment is a patch of what once was a continuous and heterogeneous region. However, the diversity loss, including the disappearance of more demanding species, can lead, on a large scale, to homogeneity and simplification of the northeastern Atlantic Forest.

  17. Fludarabine Melphalan reduced-intensity conditioning allotransplanation provides similar disease control in lymphoid and myeloid malignancies: analysis of 344 patients.

    Science.gov (United States)

    Bryant, A; Nivison-Smith, I; Pillai, E S; Kennedy, G; Kalff, A; Ritchie, D; George, B; Hertzberg, M; Patil, S; Spencer, A; Fay, K; Cannell, P; Berkahn, L; Doocey, R; Spearing, R; Moore, J

    2014-01-01

    This was an Australasian Bone Marrow Transplant Recipient Registry (ABMTRR)-based retrospective study assessing the outcome of Fludarabine Melphalan (FluMel) reduced-intensity conditioning between 1998 and 2008. Median follow-up was 3.4 years. There were 344 patients with a median age of 54 years (18-68). In all, 234 patients had myeloid malignancies, with AML (n=166) being the commonest indication. There were 110 lymphoid patients, with non-Hodgkin lymphoma (NHL) (n=64) the main indication. TRM at day 100 was 14% with no significant difference between the groups. OS and disease-free survival (DFS) were similar between myeloid and lymphoid patients (57 and 50% at 3 years, respectively). There was no difference in cumulative incidence of relapse or GVHD between groups. Multivariate analysis revealed four significant adverse risk factors for DFS: donor other than HLA-identical sibling donor, not in remission at transplant, previous autologous transplant and recipient CMV positive. Chronic GVHD was associated with improved DFS in multivariate analysis, predominantly due to a marked reduction in relapse (HR:0.44, P=0.003). This study confirms that FluMel provides durable and equivalent remissions in both myeloid and lymphoid malignancies. Disease stage and chronic GVHD remain important determinants of outcome for FluMel allografting.

  18. Mixed quantization dimensions of self-similar measures

    International Nuclear Information System (INIS)

    Dai Meifeng; Wang Xiaoli; Chen Dandan

    2012-01-01

    Highlights: ► We define the mixed quantization dimension of finitely many measures. ► Formula of mixed quantization dimensions of self-similar measures is given. ► Illustrate the behavior of mixed quantization dimension as a function of order. - Abstract: Classical multifractal analysis studies the local scaling behaviors of a single measure. However recently mixed multifractal has generated interest. The purpose of this paper is some results about the mixed quantization dimensions of self-similar measures.

  19. ANALYSIS DATA SETS USING HYBRID TECHNIQUES APPLIED ARTIFICIAL INTELLIGENCE BASED PRODUCTION SYSTEMS INTEGRATED DESIGN

    OpenAIRE

    Daniel-Petru GHENCEA; Miron ZAPCIU; Claudiu-Florinel BISU; Elena-Iuliana BOTEANU; Elena-Luminiţa OLTEANU

    2017-01-01

    The paper proposes a prediction model of spindle behavior from the point of view of thermal deformations and vibration levels by highlighting and processing the characteristic equations. This model analysis for shafts with similar electro-mechanical characteristics can be achieved using a hybrid analysis based on artificial intelligence (genetic algorithms - artificial neural networks - fuzzy logic). The paper presents a prediction mode for obtaining a valid range of values f...

  20. Notions of similarity for systems biology models.

    Science.gov (United States)

    Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knüpfer, Christian; Liebermeister, Wolfram; Waltemath, Dagmar

    2018-01-01

    Systems biology models are rapidly increasing in complexity, size and numbers. When building large models, researchers rely on software tools for the retrieval, comparison, combination and merging of models, as well as for version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of 'similarity' may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here we survey existing methods for the comparison of models, introduce quantitative measures for model similarity, and discuss potential applications of combined similarity measures. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on a combination of different model aspects. The six aspects that we define as potentially relevant for similarity are underlying encoding, references to biological entities, quantitative behaviour, qualitative behaviour, mathematical equations and parameters and network structure. We argue that future similarity measures will benefit from combining these model aspects in flexible, problem-specific ways to mimic users' intuition about model similarity, and to support complex model searches in databases. © The Author 2016. Published by Oxford University Press.

  1. 'How Do I Choose Thee? Let me Count the Ways': A Textual Analysis of Similarities and Differences in Modes of Decision-making in China and the United States

    OpenAIRE

    Elke U. Weber; Daniel R. Ames; Ann-Renee Blais

    2004-01-01

    This paper investigates the effect of decision-makers' culture on their implicit choice of how to make decisions. In a content analysis of major decisions described in American and Chinese twentieth-century novels, we test a series of hypotheses based on prior theoretical and empirical investigations of cross-cultural variation in human motivation and decision processes. The data show a striking degree of cultural similarity in the relationships between decision content, situational characteri...

  2. Characterizing Chemical Similarity with Vibrational Spectroscopy: New Insights into the Substituent Effects in Monosubstituted Benzenes.

    Science.gov (United States)

    Tao, Yunwen; Zou, Wenli; Cremer, Dieter; Kraka, Elfi

    2017-10-26

    A novel approach is presented to assess chemical similarity based on the local vibrational mode analysis developed by Konkoli and Cremer. The local mode frequency shifts are introduced as similarity descriptors that are sensitive to any electronic structure change. In this work, 59 different monosubstituted benzenes are compared. For a subset of 43 compounds, for which experimental data were available, the ortho-/para- and meta-directing effect in electrophilic aromatic substitution reactions could be correctly reproduced, proving the robustness of the new similarity index. For the remaining 16 compounds, the directing effect was predicted. The new approach is broadly applicable to all compounds for which either experimental or calculated vibrational frequency information is available.

  3. A method for identifying hierarchical sub-networks / modules and weighting network links based on their similarity in sub-network / module affiliation

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2016-06-01

    Full Text Available Some networks, including biological networks, consist of hierarchical sub-networks / modules. Building on my previous study, the present study proposes a method for both identifying hierarchical sub-networks / modules and weighting network links. It is based on cluster analysis in which the between-node similarity of the sets of adjacency nodes is used. Two matrices, linkWeightMat and linkClusterIDs, are produced by the algorithm. Two links with both the same weight in linkWeightMat and the same cluster ID in linkClusterIDs belong to the same sub-network / module. Two links with the same weight in linkWeightMat but different cluster IDs in linkClusterIDs belong to two sub-networks / modules at the same hierarchical level. However, a link with a unique cluster ID in linkClusterIDs does not belong to any sub-network / module. A sub-network / module with a greater weight is a more strongly connected sub-network / module. Matlab codes of the algorithm are presented.
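
    The published algorithm and its Matlab implementation are not reproduced here; the sketch below only illustrates the underlying idea of weighting each link by the similarity of its endpoints' adjacency sets, on a small hypothetical network.

```python
# Weight each link by the similarity of its endpoints' adjacency (neighbour) sets.
# This is only a sketch of the general idea; the published method and its
# Matlab implementation differ in detail (e.g. cluster-ID assignment).

network = {                      # hypothetical undirected network
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}

def link_weight(u, v, adj):
    """Jaccard similarity of the two endpoints' neighbour sets (including themselves)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

link_weights = {
    tuple(sorted((u, v))): link_weight(u, v, network)
    for u in network for v in network[u]
}

for (u, v), w in sorted(link_weights.items()):
    print(f"{u}-{v}: {w:.2f}")
```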

  4. [Baking method of Platycladi Cacumen Carbonisatum based on similarity of UPLC fingerprints].

    Science.gov (United States)

    Shan, Mingqiu; Chen, Chao; Yao, Xiaodong; Ding, Anwei

    2010-09-01

    To establish a baking method for Platycladi Cacumen Carbonisatum and provide a new approach for research on carbonized herbs. Samples were prepared in an oven for different times at different temperatures. The fingerprints of the samples were then determined by UPLC. According to the standard fingerprint, the similarities of the samples' fingerprints were compared. The similarities of 3 samples, which were baked at 230 degrees C for 20 min, 30 min and at 240 degrees C for 20 min, were above 0.96. According to the similarities of the fingerprints and in view of the appearances, Platycladi Cacumen Carbonisatum should be baked at 230 degrees C for 20 min.

  5. Improved cosine similarity measures of simplified neutrosophic sets for medical diagnoses

    OpenAIRE

    Jun Ye

    2014-01-01

    In pattern recognition and medical diagnosis, similarity measure is an important mathematical tool. To overcome some disadvantages of existing cosine similarity measures of simplified neutrosophic sets (SNSs) in vector space, this paper proposed improved cosine similarity measures of SNSs based on cosine function, including single valued neutrosophic cosine similarity measures and interval neutrosophic cosine similarity measures. Then, weighted cosine similarity measures of SNSs were introduced...
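
    The improved measures themselves are not reproduced in the abstract; for context, the sketch below implements the classical cosine similarity of single-valued neutrosophic sets, whose known weaknesses (for example, an undefined value for an all-zero triple) are what such improved measures address. The (T, I, F) triples are hypothetical.

```python
import math

# Classical cosine similarity between two single-valued neutrosophic sets
# A and B, each given as a list of (T, I, F) membership triples per element.
# This is the baseline measure whose shortcomings (e.g. division by zero for
# an all-zero triple) motivate the improved measures discussed in the paper.

def svns_cosine(A, B):
    total = 0.0
    for (ta, ia, fa), (tb, ib, fb) in zip(A, B):
        num = ta * tb + ia * ib + fa * fb
        den = math.sqrt(ta**2 + ia**2 + fa**2) * math.sqrt(tb**2 + ib**2 + fb**2)
        total += num / den if den else 0.0
    return total / len(A)

patient = [(0.8, 0.2, 0.1), (0.6, 0.3, 0.3)]   # hypothetical symptom profile
disease = [(0.7, 0.1, 0.2), (0.7, 0.2, 0.2)]   # hypothetical disease prototype
print(f"{svns_cosine(patient, disease):.3f}")
```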

  6. Improving performance of content-based image retrieval schemes in searching for similar breast mass regions: an assessment

    International Nuclear Information System (INIS)

    Wang Xiaohui; Park, Sang Cheol; Zheng Bin

    2009-01-01

    This study aims to assess three methods commonly used in content-based image retrieval (CBIR) schemes and investigate approaches to improve scheme performance. A reference database involving 3000 regions of interest (ROIs) was established. Among them, 400 ROIs were randomly selected to form a testing dataset. Three methods, namely mutual information, Pearson's correlation and a multi-feature-based k-nearest neighbor (KNN) algorithm, were applied to search for the 15 most similar reference ROIs to each testing ROI. The clinical relevance and visual similarity of the search results were evaluated using the areas under receiver operating characteristic (ROC) curves (AZ) and the average mean square difference (MSD) of the mass boundary spiculation level ratings between testing and selected ROIs, respectively. The results showed that the AZ values were 0.893 ± 0.009, 0.606 ± 0.021 and 0.699 ± 0.026 for the use of KNN, mutual information and Pearson's correlation, respectively. The AZ values increased to 0.724 ± 0.017 and 0.787 ± 0.016 for mutual information and Pearson's correlation when using ROIs with the size adaptively adjusted based on actual mass size. The corresponding MSD values were 2.107 ± 0.718, 2.301 ± 0.733 and 2.298 ± 0.743. The study demonstrates that, due to the diversity of medical images, CBIR schemes using multiple image features and mass size-based ROIs can achieve significantly improved performance.
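
    As an illustration of the KNN retrieval step, the sketch below ranks reference ROIs by Euclidean distance in a multi-feature space and returns the 15 most similar ones; the feature vectors and the distance choice are placeholders, not the study's actual implementation.

```python
import numpy as np

# Retrieve the k most similar reference ROIs to a query ROI by Euclidean
# distance in a multi-feature space (illustrative random feature vectors only).

rng = np.random.default_rng(0)
reference_features = rng.random((3000, 14))   # 3000 reference ROIs, 14 features
query_features = rng.random(14)               # one testing ROI

def knn_retrieve(query, references, k=15):
    distances = np.linalg.norm(references - query, axis=1)
    order = np.argsort(distances)[:k]
    return order, distances[order]

indices, dists = knn_retrieve(query_features, reference_features)
print("most similar reference ROIs:", indices)
```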

  7. EVALUATION OF SEMANTIC SIMILARITY FOR SENTENCES IN NATURAL LANGUAGE BY MATHEMATICAL STATISTICS METHODS

    Directory of Open Access Journals (Sweden)

    A. E. Pismak

    2016-03-01

    Full Text Available Subject of Research. The paper is focused on the structural organization of Wiktionary articles in the context of their use as the basis for a semantic network. Wiktionary community references, article templates and article markup features are analyzed. The problem of numerical estimation of the semantic similarity of structural elements in Wiktionary articles is considered. An analysis of existing software for semantic similarity estimation of such elements is carried out; the algorithms behind them are studied and their advantages and disadvantages are shown. Methods. Mathematical statistics methods were used to analyze Wiktionary article markup features. A method for computing semantic similarity based on statistics of the compared structural elements is proposed. Main Results. We have concluded that Wiktionary articles cannot be used directly as the source for a semantic network. We propose instead to find hidden similarity between article elements, and for that purpose we have developed an algorithm that calculates confidence coefficients indicating that a pair of sentences is semantically close. The study of the quantitative and qualitative characteristics of the developed algorithm has shown a major performance advantage over other existing solutions, at the cost of an insignificantly higher error rate. Practical Relevance. The resulting algorithm may be useful in developing tools for automatic parsing of Wiktionary articles. The developed method could be used for computing the semantic similarity of short text fragments in natural language when performance requirements outweigh accuracy requirements.

  8. Similarity-based distortion of visual short-term memory is due to perceptual averaging.

    Science.gov (United States)

    Dubé, Chad; Zhou, Feng; Kahana, Michael J; Sekuler, Robert

    2014-03-01

    A task-irrelevant stimulus can distort recall from visual short-term memory (VSTM). Specifically, reproduction of a task-relevant memory item is biased in the direction of the irrelevant memory item (Huang & Sekuler, 2010a). The present study addresses the hypothesis that such effects reflect the influence of neural averaging under conditions of uncertainty about the contents of VSTM (Alvarez, 2011; Ball & Sekuler, 1980). We manipulated subjects' attention to relevant and irrelevant study items whose similarity relationships were held constant, while varying how similar the study items were to a subsequent recognition probe. On each trial, subjects were shown one or two Gabor patches, followed by the probe; their task was to indicate whether the probe matched one of the study items. A brief cue told subjects which Gabor, first or second, would serve as that trial's target item. Critically, this cue appeared either before, between, or after the study items. A distributional analysis of the resulting mnemometric functions showed an inflation in probability density in the region spanning the spatial frequency of the average of the two memory items. This effect, due to an elevation in false alarms to probes matching the perceptual average, was diminished when cues were presented before both study items. These results suggest that (a) perceptual averages are computed obligatorily and (b) perceptual averages are relied upon to a greater extent when item representations are weakened. Implications of these results for theories of VSTM are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. A Self-Organizing Map-Based Approach to Generating Reduced-Size, Statistically Similar Climate Datasets

    Science.gov (United States)

    Cabell, R.; Delle Monache, L.; Alessandrini, S.; Rodriguez, L.

    2015-12-01

    Climate-based studies require large amounts of data in order to produce accurate and reliable results. Many of these studies have used 30-plus year data sets in order to produce stable and high-quality results, and as a result, many such data sets are available, generally in the form of global reanalyses. While the analysis of these data leads to high-fidelity results, their processing can be very computationally expensive. This computational burden prevents the utilization of these data sets for certain applications, e.g., when rapid response is needed in crisis management and disaster planning scenarios resulting from the release of toxic material into the atmosphere. We have developed a methodology to reduce large climate datasets to more manageable sizes while retaining statistically similar results when used to produce ensembles of possible outcomes. We do this by employing a Self-Organizing Map (SOM) algorithm to analyze general patterns of meteorological fields over a regional domain of interest to produce a small set of "typical days" with which to generate the model ensemble. The SOM algorithm takes as input a set of vectors and generates a 2D map of representative vectors deemed most similar to the input set and to each other. Input predictors are selected that are correlated with the model output, which in our case is an Atmospheric Transport and Dispersion (T&D) model that is highly dependent on surface winds and boundary layer depth. To choose a subset of "typical days," each input day is assigned to its closest SOM map node vector and then ranked by distance. Each node vector is treated as a distribution and days are sampled from them by percentile. Using a 30-node SOM, with sampling every 20th percentile, we have been able to reduce 30 years of the Climate Forecast System Reanalysis (CFSR) data for the month of October to 150 "typical days." To estimate the skill of this approach, the "Measure of Effectiveness" (MOE) metric is used to compare area and overlap
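
    A compact, self-contained sketch of the SOM-plus-percentile-sampling idea; the map size, predictors and sampling below are illustrative and much simpler than the CFSR setup described above.

```python
import numpy as np

# Minimal self-organizing map (SOM) sketch: cluster daily predictor vectors
# into a small set of map nodes, then pick "typical days" per node by
# distance percentile. Data, map size and sampling are illustrative only.

rng = np.random.default_rng(1)
days = rng.random((900, 6))          # e.g. 900 days x 6 meteorological predictors

n_nodes, n_iter = 10, 2000
nodes = days[rng.choice(len(days), n_nodes, replace=False)].copy()
positions = np.arange(n_nodes)       # 1-D map topology for simplicity

for t in range(n_iter):
    x = days[rng.integers(len(days))]
    bmu = np.argmin(np.linalg.norm(nodes - x, axis=1))        # best-matching unit
    lr = 0.5 * (1 - t / n_iter)                               # decaying learning rate
    sigma = max(1.0, n_nodes / 2 * (1 - t / n_iter))          # shrinking neighbourhood
    h = np.exp(-((positions - bmu) ** 2) / (2 * sigma ** 2))  # neighbourhood weights
    nodes += lr * h[:, None] * (x - nodes)

# Assign each day to its closest node and keep days at every 20th distance percentile.
assignments = np.argmin(np.linalg.norm(days[:, None] - nodes[None], axis=2), axis=1)
typical_days = []
for k in range(n_nodes):
    members = np.where(assignments == k)[0]
    if members.size == 0:
        continue
    d = np.linalg.norm(days[members] - nodes[k], axis=1)
    for q in (0, 20, 40, 60, 80):
        typical_days.append(members[np.argsort(d)][int(q / 100 * (len(members) - 1))])

print(f"reduced {len(days)} days to {len(set(typical_days))} typical days")
```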

  10. A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations.

    Science.gov (United States)

    Spanier, A B; Caplan, N; Sosna, J; Acar, B; Joskowicz, L

    2018-01-01

    The goal of medical content-based image retrieval (M-CBIR) is to assist radiologists in the decision-making process by retrieving medical cases similar to a given image. One of the key interests of radiologists is lesions and their annotations, since the patient treatment depends on the lesion diagnosis. Therefore, a key feature of M-CBIR systems is the retrieval of scans with the most similar lesion annotations. To be of value, M-CBIR systems should be fully automatic to handle large case databases. We present a fully automatic end-to-end method for the retrieval of CT scans with similar liver lesion annotations. The input is a database of abdominal CT scans labeled with liver lesions, a query CT scan, and optionally one radiologist-specified lesion annotation of interest. The output is an ordered list of the database CT scans with the most similar liver lesion annotations. The method starts by automatically segmenting the liver in the scan. It then extracts a histogram-based feature vector from the segmented region, learns the features' relative importance, and ranks the database scans according to the relative importance measure. The main advantages of our method are that it fully automates the end-to-end querying process, that it uses simple and efficient techniques that are scalable to large datasets, and that it produces quality retrieval results using an unannotated CT scan. Our experimental results on 9 CT queries on a dataset of 41 volumetric CT scans from the 2014 Image CLEF Liver Annotation Task yield an average retrieval accuracy (Normalized Discounted Cumulative Gain index) of 0.77 and 0.84 without/with annotation, respectively. Fully automatic end-to-end retrieval of similar cases based on image information alone, rather than on disease diagnosis, may help radiologists to better diagnose liver lesions.

  11. Notions of similarity for computational biology models

    KAUST Repository

    Waltemath, Dagmar

    2016-03-21

    Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.

  12. Notions of similarity for computational biology models

    KAUST Repository

    Waltemath, Dagmar; Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knuepfer, Christian; Liebermeister, Wolfram

    2016-01-01

    Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.

  13. A novel water quality data analysis framework based on time-series data mining.

    Science.gov (United States)

    Deng, Weihui; Wang, Guoyin

    2017-07-01

    The rapid development of time-series data mining provides an emerging method for water resource management research. In this paper, based on the time-series data mining methodology, we propose a novel and general analysis framework for water quality time-series data. It consists of two parts: implementation components and common tasks of time-series data mining in water quality data. In the first part, we propose to granulate the time series into several two-dimensional normal clouds and calculate the similarities at the granulated level. On the basis of the similarity matrix, the similarity search, anomaly detection, and pattern discovery tasks in the water quality time-series instance dataset can be easily implemented in the second part. We present a case study of this analysis framework on weekly Dissolved Oxygen (DO) time-series data collected from five monitoring stations on the upper reaches of the Yangtze River, China. It discovered the relationship between water quality in the mainstream and its tributaries as well as the main changing patterns of DO. The experimental results show that the proposed analysis framework is a feasible and efficient method to mine hidden and valuable knowledge from historical water quality time-series data. Copyright © 2017 Elsevier Ltd. All rights reserved.
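
    The paper granulates the series into two-dimensional normal clouds; the sketch below simplifies each granule to a (mean, standard deviation) summary over fixed windows of a synthetic weekly DO series and builds the pairwise similarity matrix on which similarity search and anomaly detection can run.

```python
import numpy as np

# Simplified sketch of the framework's first part: granulate a water-quality
# time series into windows, summarize each window (here just mean and std,
# a simplification of the paper's normal-cloud granules), and build a
# pairwise similarity matrix. The weekly DO series below is synthetic.

rng = np.random.default_rng(2)
do_series = 8 + np.sin(np.linspace(0, 12 * np.pi, 520)) + 0.3 * rng.standard_normal(520)

window = 26                                   # half a year of weekly samples
granules = np.array([
    (chunk.mean(), chunk.std())
    for chunk in np.split(do_series, len(do_series) // window)
])

# Similarity = 1 / (1 + Euclidean distance) between granule summaries.
diff = granules[:, None, :] - granules[None, :, :]
similarity = 1.0 / (1.0 + np.linalg.norm(diff, axis=2))

# The matrix can then drive similarity search or anomaly detection, e.g.
# flag the granule least similar to all others on average.
print("most anomalous granule:", int(np.argmin(similarity.mean(axis=1))))
```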

  14. SVM-based glioma grading. Optimization by feature reduction analysis

    International Nuclear Information System (INIS)

    Zoellner, Frank G.; Schad, Lothar R.; Emblem, Kyrre E.; Harvard Medical School, Boston, MA; Oslo Univ. Hospital

    2012-01-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (≈87%) while reducing the number of features by up to 98%. (orig.)
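
    A minimal sketch of the PCA-plus-SVM pipeline using scikit-learn; the synthetic feature matrix stands in for the 100-bin rCBV histograms plus age, and the labels are random placeholders rather than real glioma grades.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Sketch of PCA feature reduction followed by SVM grading, with synthetic
# stand-ins for the 100-bin rCBV histogram + age features and glioma grades.

rng = np.random.default_rng(3)
X = rng.random((101, 101))            # 101 patients x (100 histogram bins + age)
y = rng.integers(0, 2, size=101)      # synthetic binary grade labels

clf = make_pipeline(StandardScaler(), PCA(n_components=3), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```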

  15. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (≈87%) while reducing the number of features by up to 98%. (orig.)

  16. World Wide Web-based system for the calculation of substituent parameters and substituent similarity searches.

    Science.gov (United States)

    Ertl, P

    1998-02-01

    Easy to use, interactive, and platform-independent WWW-based tools are ideal for development of chemical applications. By using the newly emerging Web technologies such as Java applets and sophisticated scripting, it is possible to deliver powerful molecular processing capabilities directly to the desk of synthetic organic chemists. In Novartis Crop Protection in Basel, a Web-based molecular modelling system has been in use since 1995. In this article two new modules of this system are presented: a program for interactive calculation of important hydrophobic, electronic, and steric properties of organic substituents, and a module for substituent similarity searches enabling the identification of bioisosteric functional groups. Various possible applications of calculated substituent parameters are also discussed, including automatic design of molecules with the desired properties and creation of targeted virtual combinatorial libraries.

  17. Cost Risk Analysis Based on Perception of the Engineering Process

    Science.gov (United States)

    Dean, Edwin B.; Wood, Darrell A.; Moore, Arlene A.; Bogart, Edward H.

    1986-01-01

    In most cost estimating applications at the NASA Langley Research Center (LaRC), it is desirable to present predicted cost as a range of possible costs rather than a single predicted cost. A cost risk analysis generates a range of cost for a project and assigns a probability level to each cost value in the range. Constructing a cost risk curve requires a good estimate of the expected cost of a project. It must also include a good estimate of expected variance of the cost. Many cost risk analyses are based upon an expert's knowledge of the cost of similar projects in the past. In a common scenario, a manager or engineer, asked to estimate the cost of a project in his area of expertise, will gather historical cost data from a similar completed project. The cost of the completed project is adjusted using the perceived technical and economic differences between the two projects. This allows errors from at least three sources. The historical cost data may be in error by some unknown amount. The managers' evaluation of the new project and its similarity to the old project may be in error. The factors used to adjust the cost of the old project may not correctly reflect the differences. Some risk analyses are based on untested hypotheses about the form of the statistical distribution that underlies the distribution of possible cost. The usual problem is not just to come up with an estimate of the cost of a project, but to predict the range of values into which the cost may fall and with what level of confidence the prediction is made. Risk analysis techniques that assume the shape of the underlying cost distribution and derive the risk curve from a single estimate plus and minus some amount usually fail to take into account the actual magnitude of the uncertainty in cost due to technical factors in the project itself. This paper addresses a cost risk method that is based on parametric estimates of the technical factors involved in the project being costed. The engineering

  18. Privacy Preserving Similarity Based Text Retrieval through Blind Storage

    Directory of Open Access Journals (Sweden)

    Pinki Kumari

    2016-09-01

    Full Text Available Cloud computing is developing rapidly due to its many advantages, and more and more data owners are interested in outsourcing their data to cloud storage to centralize it. As huge files are stored in the cloud, a keyword-based search process must be provided to data users. At the same time, to protect data privacy, sensitive data are encrypted before being outsourced to the cloud server, but this makes searching over the encrypted data difficult. In this system we propose similarity-based text retrieval from blind storage blocks holding encrypted data. The system provides stronger security because of the blind storage scheme, in which data are stored at random locations in cloud storage. In the existing approach the data owner cannot encrypt the document data, as encryption is done only at the server end, and anyone can access the data because no private-key mechanism is applied to maintain its privacy. In our proposed system, the data owner encrypts the data himself using the RSA algorithm. RSA is a public-key cryptosystem and is widely used for sensitive data storage over the Internet. We use a text mining process to build the index files for user documents, and before encryption we also apply NLP (Natural Language Processing) techniques to identify keyword synonyms in the data owner's documents. The text mining process examines the text word by word and extracts the literal meaning beyond the group of words that composes each sentence; these words are looked up in the WordNet API so that only equivalent words are included in the index file. Our proposed system provides a more secure and authorized way of retrieving text from cloud storage with access control. Finally, our experimental results show that the proposed system outperforms the existing approach.

  19. Modeling the angular motion dynamics of spacecraft with a magnetic attitude control system based on experimental studies and dynamic similarity

    Science.gov (United States)

    Kulkov, V. M.; Medvedskii, A. L.; Terentyev, V. V.; Firsyuk, S. O.; Shemyakov, A. O.

    2017-12-01

    The problem of spacecraft attitude control using electromagnetic systems interacting with the Earth's magnetic field is considered. A set of dimensionless parameters has been formed to investigate the spacecraft orientation regimes based on dynamically similar models. The results of experimental studies of small spacecraft with a magnetic attitude control system can be extrapolated to the in-orbit spacecraft motion control regimes by using the methods of the dimensional and similarity theory.

  20. Relationship between genetic similarity and some productive traits ...

    African Journals Online (AJOL)

    Admin

    The random amplified polymorphic DNA (RAPD) technique was applied to detect genetic similarity between five local chicken strains that have been selected for egg and meat production in Egypt. Based on six oligonucleotide primers, the genetic similarity between the egg-producing strains (Anshas, Silver Montazah and ...

  1. Similarity joins in relational database systems

    CERN Document Server

    Augsten, Nikolaus

    2013-01-01

    State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems. We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance comput
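
    As a minimal illustration of the edit distance mentioned above as the de facto standard for comparing complex objects, the dynamic-programming sketch below computes the Levenshtein distance between two illustrative strings.

```python
# Classic dynamic-programming edit (Levenshtein) distance between two strings.

def edit_distance(s: str, t: str) -> int:
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (cs != ct)))   # substitution (or match)
        prev = curr
    return prev[-1]

print(edit_distance("similarity", "simulate"))  # -> 4
```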

  2. ClusTrack: feature extraction and similarity measures for clustering of genome-wide data sets.

    Directory of Open Access Journals (Sweden)

    Halfdan Rydbeck

    Full Text Available Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.

  3. Spectral analysis of multi-dimensional self-similar Markov processes

    International Nuclear Information System (INIS)

    Modarresi, N; Rezakhah, S

    2010-01-01

    In this paper we consider a discrete scale invariant (DSI) process {X(t), t in R+} with scale l > 1. We consider a fixed number of observations in every scale, say T, and acquire our samples at discrete points α^k, k in W, where α is obtained by the equality l = α^T and W = {0, 1, ...}. We thus provide a discrete time scale invariant (DT-SI) process X(.) with the parameter space {α^k, k in W}. We find the spectral representation of the covariance function of such a DT-SI process. By providing the harmonic-like representation of multi-dimensional self-similar processes, their spectral density functions are presented. We assume that the process {X(t), t in R+} is also Markov in the wide sense and provide a discrete time scale invariant Markov (DT-SIM) process with the above scheme of sampling. We present an example of the DT-SIM process, simple Brownian motion, by the above sampling scheme and verify our results. Finally, we find the spectral density matrix of such a DT-SIM process and show that its associated T-dimensional self-similar Markov process is fully specified by {R_j^H(1), R_j^H(0), j = 0, 1, ..., T - 1}, where R_j^H(τ) is the covariance function of the jth and (j + τ)th observations of the process.

  4. Quantitative Outline-based Shape Analysis and Classification of Planetary Craterforms using Supervised Learning Models

    Science.gov (United States)

    Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric

    2017-10-01

    The shapes of craterform morphology on planetary surfaces provide rich information about their origins and evolution. While morphologic information provides rich visual clues to geologic processes and properties, the ability to quantitatively communicate this information is less easily accomplished. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach, to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphology, are classified with a 100% rate of success in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.

  5. Gym-based exercise and home-based exercise with telephone support have similar outcomes when used as maintenance programs in adults with chronic health conditions: a randomised trial

    Directory of Open Access Journals (Sweden)

    Paul Jansons

    2017-07-01

    Trial registration: ACTRN12610001035011. [Jansons P, Robins L, O’Brien L, Haines T (2017) Gym-based exercise and home-based exercise with telephone support have similar outcomes when used as maintenance programs in adults with chronic health conditions: a randomised trial. Journal of Physiotherapy 63: 154–160]

  6. The Similarity Hypothesis and New Analytical Support on the Estimation of Horizontal Infiltration into Sand

    International Nuclear Information System (INIS)

    Prevedello, C.L.; Loyola, J.M.T.

    2010-01-01

    A method based on a specific power-law relationship between the hydraulic head and the Boltzmann variable, presented using a similarity hypothesis, was recently generalized to a range of powers to satisfy the Bruce and Klute equation exactly. Here, considerations are presented on the proposed similarity assumption, and new analytical support is given to estimate the water density flux into and inside the soil, based on the concept of sorptivity and on Buckingham-Darcy's law. Results show that the new analytical solution satisfies both theories in the calculation of water density fluxes and is in agreement with experimental results of water infiltrating horizontally into sand. However, the utility of this analysis still needs to be verified for a variety of different textured soils having a diverse range of initial soil water contents.

  7. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

    Science.gov (United States)

    Mazandu, Gaston K; Mulder, Nicola J

    2013-09-25

    The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
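
    DaGO-Fun's full set of measures is not reproduced in the abstract; as background, the sketch below implements one classical IC-based measure (Resnik-style similarity via the most informative common ancestor) on a hypothetical toy ontology with made-up annotation counts.

```python
import math

# Minimal Resnik-style information-content similarity on a toy GO-like DAG.
# IC(t) = -log p(t), where p(t) is the annotation frequency of t (counts
# propagated to ancestors); sim(t1, t2) = IC of the most informative common
# ancestor. The ontology and annotation counts below are hypothetical.

parents = {"t1": {"t0"}, "t2": {"t0"}, "t3": {"t1"}, "t4": {"t1", "t2"}, "t0": set()}
annotations = {"t0": 100, "t1": 40, "t2": 30, "t3": 10, "t4": 5}  # already propagated

def ancestors(term):
    result = {term}
    stack = list(parents[term])
    while stack:
        p = stack.pop()
        if p not in result:
            result.add(p)
            stack.extend(parents[p])
    return result

total = annotations["t0"]
ic = {t: -math.log(annotations[t] / total) for t in annotations}

def resnik(t1, t2):
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common)

print(f"sim(t3, t4) = {resnik('t3', 't4'):.3f}")
```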

  8. Retrieval evaluation and distance learning from perceived similarity between endomicroscopy videos.

    Science.gov (United States)

    André, Barbara; Vercauteren, Tom; Buchner, Anna M; Wallace, Michael B; Ayache, Nicholas

    2011-01-01

    Evaluating content-based retrieval (CBR) is challenging because it requires an adequate ground-truth. When the available ground-truth is limited to textual metadata such as pathological classes, retrieval results can only be evaluated indirectly, for example in terms of classification performance. In this study we first present a tool to generate perceived similarity ground-truth that enables direct evaluation of endomicroscopic video retrieval. This tool uses a four-point Likert scale and collects subjective pairwise similarities perceived by multiple expert observers. We then evaluate against the generated ground-truth a previously developed dense bag-of-visual-words method for endomicroscopic video retrieval. Confirming the results of previous indirect evaluation based on classification, our direct evaluation shows that this method significantly outperforms several other state-of-the-art CBR methods. In a second step, we propose to improve the CBR method by learning an adjusted similarity metric from the perceived similarity ground-truth. By minimizing a margin-based cost function that differentiates similar and dissimilar video pairs, we learn a weight vector applied to the visual word signatures of videos. Using cross-validation, we demonstrate that the learned similarity distance is significantly better correlated with the perceived similarity than the original visual-word-based distance.
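
    The exact margin-based cost function is not given in the abstract; the sketch below illustrates the general idea of learning a non-negative weight vector for a weighted distance between visual-word signatures so that similar pairs become closer than dissimilar pairs, using synthetic signatures and labels.

```python
import numpy as np

# Sketch of margin-based learning of a weight vector w for a weighted L1
# distance d_w(x, y) = sum_k w_k |x_k - y_k| between visual-word signatures,
# so that pairs judged similar end up closer than dissimilar pairs by a
# margin. Signatures, labels and loss details are illustrative only.

rng = np.random.default_rng(4)
n_words = 50
signatures = rng.random((40, n_words))                     # 40 video signatures
pairs = [(i, j) for i in range(40) for j in range(i + 1, 40)]
labels = rng.integers(0, 2, size=len(pairs))               # 1 = similar, 0 = dissimilar

w = np.ones(n_words)
margin, lr = 1.0, 0.01

for _ in range(200):
    grad = np.zeros(n_words)
    for (i, j), sim in zip(pairs, labels):
        diff = np.abs(signatures[i] - signatures[j])
        d = w @ diff
        # hinge-style loss: similar pairs should satisfy d <= margin, dissimilar d >= margin
        if sim and d > margin:
            grad += diff
        elif not sim and d < margin:
            grad -= diff
    w = np.clip(w - lr * grad / len(pairs), 0.0, None)      # keep weights non-negative

print("learned weights (first five):", np.round(w[:5], 3))
```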

  9. Similar Task Features Shape Judgment and Categorization Processes

    Science.gov (United States)

    Hoffmann, Janina A.; von Helversen, Bettina; Rieskamp, Jörg

    2016-01-01

    The distinction between similarity-based and rule-based strategies has instigated a large body of research in categorization and judgment. Within both domains, the task characteristics guiding strategy shifts are increasingly well documented. Across domains, past research has observed shifts from rule-based strategies in judgment to…

  10. Pathway-based Analysis of the Hidden Genetic Heterogeneities in Cancers

    Directory of Open Access Journals (Sweden)

    Xiaolei Zhao

    2014-02-01

    Full Text Available Many cancers apparently showing similar phenotypes are actually distinct at the molecular level, leading to very different responses to the same treatment. It has been recently demonstrated that pathway-based approaches are robust and reliable for genetic analysis of cancers. Nevertheless, it remains unclear whether such function-based approaches are useful in deciphering molecular heterogeneities in cancers. Therefore, we aimed to test this possibility in the present study. First, we used a NCI60 dataset to validate the ability of pathways to correctly partition samples. Next, we applied the proposed method to identify the hidden subtypes in diffuse large B-cell lymphoma (DLBCL). Finally, the clinical significance of the identified subtypes was verified using survival analysis. For the NCI60 dataset, we achieved highly accurate partitions that best fit the clinical cancer phenotypes. Subsequently, for a DLBCL dataset, we identified three hidden subtypes that showed very different 10-year overall survival rates (90%, 46% and 20%) and were highly significantly (P = 0.008) correlated with the clinical survival rate. This study demonstrated that the pathway-based approach is promising for unveiling genetic heterogeneities in complex human diseases.

  11. Quantitative mass spectrometry analysis reveals similar substrate consensus motif for human Mps1 kinase and Plk1.

    Directory of Open Access Journals (Sweden)

    Zhen Dou

    Full Text Available BACKGROUND: Members of the Mps1 kinase family play an essential and evolutionarily conserved role in the spindle assembly checkpoint (SAC), a surveillance mechanism that ensures accurate chromosome segregation during mitosis. Human Mps1 (hMps1) is highly phosphorylated during mitosis and many phosphorylation sites have been identified. However, the upstream kinases responsible for these phosphorylations are not presently known. METHODOLOGY/PRINCIPAL FINDINGS: Here, we identify 29 in vivo phosphorylation sites in hMps1. While in vivo analyses indicate that Aurora B and hMps1 activity are required for mitotic hyper-phosphorylation of hMps1, in vitro kinase assays show that Cdk1, MAPK, Plk1 and hMps1 itself can directly phosphorylate hMps1. Although Aurora B poorly phosphorylates hMps1 in vitro, it positively regulates the localization of Mps1 to kinetochores in vivo. Most importantly, quantitative mass spectrometry analysis demonstrates that at least 12 sites within hMps1 can be attributed to autophosphorylation. Remarkably, these hMps1 autophosphorylation sites closely resemble the consensus motif of Plk1, demonstrating that these two mitotic kinases share a similar substrate consensus. CONCLUSIONS/SIGNIFICANCE: hMps1 kinase is regulated by Aurora B kinase and its autophosphorylation. Analysis on hMps1 autophosphorylation sites demonstrates that hMps1 has a substrate preference similar to Plk1 kinase.

  12. Efficient Similarity Retrieval in Music Databases

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Jensen, Christian Søndergaard

    2006-01-01

    Audio music is increasingly becoming available in digital form, and the digital music collections of individuals continue to grow. Addressing the need for effective means of retrieving music from such collections, this paper proposes new techniques for content-based similarity search. Each music...

  13. DTI analysis methods : Voxel-based analysis

    NARCIS (Netherlands)

    Van Hecke, Wim; Leemans, Alexander; Emsell, Louise

    2016-01-01

    Voxel-based analysis (VBA) of diffusion tensor imaging (DTI) data permits the investigation of voxel-wise differences or changes in DTI metrics in every voxel of a brain dataset. It is applied primarily in the exploratory analysis of hypothesized group-level alterations in DTI parameters, as it does

  14. Efficacy of computer technology-based HIV prevention interventions: a meta-analysis.

    Science.gov (United States)

    Noar, Seth M; Black, Hulda G; Pierce, Larson B

    2009-01-02

    To conduct a meta-analysis of computer technology-based HIV prevention behavioral interventions aimed at increasing condom use among a variety of at-risk populations. Systematic review and meta-analysis of existing published and unpublished studies testing computer-based interventions. Meta-analytic techniques were used to compute and aggregate effect sizes for 12 randomized controlled trials that met inclusion criteria. Variables that had the potential to moderate intervention efficacy were also tested. The overall mean weighted effect size for condom use was d = 0.259 (95% confidence interval = 0.201, 0.317; Z = 8.74, P partners, and incident sexually transmitted diseases. In addition, interventions were significantly more efficacious when they were directed at men or women (versus mixed sex groups), utilized individualized tailoring, used a Stages of Change model, and had more intervention sessions. Computer technology-based HIV prevention interventions have similar efficacy to more traditional human-delivered interventions. Given their low cost to deliver, ability to customize intervention content, and flexible dissemination channels, they hold much promise for the future of HIV prevention.

  15. TH-E-BRF-08: Subpopulations of Similarly-Responding Lesions in Metastatic Prostate Cancer

    International Nuclear Information System (INIS)

    Lin, C; Harmon, S; Perk, T; Jeraj, R

    2014-01-01

    Purpose: In patients with multiple lesions, resistance to cancer treatments and subsequent disease recurrence may be due to heterogeneity of response across lesions. This study aims to identify subpopulations of similarly-responding metastatic prostate cancer lesions in bone using quantitative PET metrics. Methods: Seven metastatic prostate cancer patients treated with AR-directed therapy received pre-treatment and mid-treatment [F-18]NaF PET/CT scans. Images were registered using an articulated CT registration algorithm and transformations were applied to PET segmentations. Mid-treatment response was calculated from PET-based texture features. Hierarchical agglomerative clustering was used to form groups of similarly-responding lesions, with the number of natural clusters (K) determined by the inconsistency coefficient. Lesion clustering was performed within each patient, and for the pooled population. The cophenetic coefficient (C) quantified how well the data was clustered. The Jaccard Index (JI) assessed similarity of cluster assignments from patient clustering and from population clustering. Results: 188 lesions in seven patients were identified for analysis (between 6 and 53 lesions per patient). Lesion response was defined as percent change relative to pre-treatment for 23 uncorrelated PET-based feature identifiers. High response heterogeneity was found across all lesions (i.e. range ΔSUVmax = −95.98% to 775.00%). For intra-patient clustering, K ranged from 1–20. Population-based clustering resulted in 75 clusters, of 1-6 lesions each. Intra-patient clustering resulted in higher quality clusters than population clustering (mean C=0.95, range=0.89 to 1.00). For all patients, cluster assignments from population clustering showed good agreement with intra-patient clustering (mean JI=0.87, range=0.68 to 1.00). Conclusion: Subpopulations of similarly-responding lesions were identified in patients with multiple metastatic lesions. Good agreement was found between

  16. SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

    Science.gov (United States)

    Couvin, David; Zozio, Thierry; Rastogi, Nalin

    2017-07-01

    Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98049 clustered strains plus 5807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides with the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Toxmatch-a new software tool to aid in the development and evaluation of chemically similar groups.

    Science.gov (United States)

    Patlewicz, G; Jeliazkova, N; Gallegos Saliner, A; Worth, A P

    2008-01-01

    Chemical similarity is a widely used concept in toxicology, and is based on the hypothesis that similar compounds should have similar biological activities. This forms the underlying basis for performing read-across, forming chemical groups and developing (Quantitative) Structure-Activity Relationships ((Q)SARs). Chemical similarity is often perceived as structural similarity but in fact there are a number of other approaches that can be used to assess similarity. A systematic similarity analysis usually comprises two main steps. Firstly the chemical structures to be compared need to be characterised in terms of relevant descriptors which encode their physicochemical, topological, geometrical and/or surface properties. A second step involves a quantitative comparison of those descriptors using similarity (or dissimilarity) indices. This work outlines the use of chemical similarity principles in the formation of endpoint specific chemical groupings. Examples are provided to illustrate the development and evaluation of chemical groupings using a new software application called Toxmatch that was recently commissioned by the European Chemicals Bureau (ECB), of the European Commission's Joint Research Centre. Insights from using this software are highlighted with specific focus on the prospective application of chemical groupings under the new chemicals legislation, REACH.
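
    The two-step workflow described above (descriptor characterisation, then a similarity index) can be illustrated with a minimal sketch; the binary fingerprints and the Tanimoto coefficient below are generic choices, not Toxmatch's actual descriptors or indices.

```python
import numpy as np

# Two-step similarity sketch: (1) characterise each compound by a descriptor
# vector (here a hypothetical binary structural fingerprint), (2) compare the
# vectors with a similarity index (here the Tanimoto coefficient).

def tanimoto(a: np.ndarray, b: np.ndarray) -> float:
    both = np.sum(a & b)
    either = np.sum(a | b)
    return both / either if either else 1.0

compound_a = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
compound_b = np.array([1, 0, 1, 0, 0, 1, 1, 0], dtype=bool)
print(f"Tanimoto similarity: {tanimoto(compound_a, compound_b):.2f}")
```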

  18. MAPPING THE SIMILARITIES OF SPECTRA: GLOBAL AND LOCALLY-BIASED APPROACHES TO SDSS GALAXIES

    Energy Technology Data Exchange (ETDEWEB)

    Lawlor, David [Statistical and Applied Mathematical Sciences Institute (United States); Budavári, Tamás [Dept. of Applied Mathematics and Statistics, The Johns Hopkins University (United States); Mahoney, Michael W. [International Computer Science Institute (United States)

    2016-12-10

    We present a novel approach to studying the diversity of galaxies. It is based on a novel spectral graph technique, that of locally-biased semi-supervised eigenvectors . Our method introduces new coordinates that summarize an entire spectrum, similar to but going well beyond the widely used Principal Component Analysis (PCA). Unlike PCA, however, this technique does not assume that the Euclidean distance between galaxy spectra is a good global measure of similarity. Instead, we relax that condition to only the most similar spectra, and we show that doing so yields more reliable results for many astronomical questions of interest. The global variant of our approach can identify very finely numerous astronomical phenomena of interest. The locally-biased variants of our basic approach enable us to explore subtle trends around a set of chosen objects. The power of the method is demonstrated in the Sloan Digital Sky Survey Main Galaxy Sample, by illustrating that the derived spectral coordinates carry an unprecedented amount of information.
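
    The locally-biased semi-supervised eigenvector construction itself is more involved than can be shown here; as a rough sketch of the global spectral-graph starting point, the code below builds a nearest-neighbour similarity graph over synthetic spectra and uses low-order eigenvectors of its normalized Laplacian as new coordinates.

```python
import numpy as np

# Sketch of the global spectral-graph idea: build a k-nearest-neighbour
# similarity graph over (synthetic) spectra and use the eigenvectors of the
# normalized graph Laplacian with smallest non-trivial eigenvalues as new
# low-dimensional coordinates. The locally-biased semi-supervised
# construction in the paper goes well beyond this.

rng = np.random.default_rng(5)
spectra = rng.random((200, 300))                    # 200 spectra x 300 wavelength bins
k = 10

# Pairwise squared Euclidean distances and a Gaussian kernel on the k nearest neighbours.
sq = np.sum(spectra ** 2, axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * spectra @ spectra.T, 0.0)
sigma2 = np.median(d2[d2 > 0])
W = np.zeros_like(d2)
for i in range(len(spectra)):
    nn = np.argsort(d2[i])[1:k + 1]                 # skip self
    W[i, nn] = np.exp(-d2[i, nn] / sigma2)
W = np.maximum(W, W.T)                              # symmetrize

deg = W.sum(axis=1)
L = np.eye(len(W)) - W / np.sqrt(deg)[:, None] / np.sqrt(deg)[None, :]
eigvals, eigvecs = np.linalg.eigh(L)                # ascending eigenvalues
coords = eigvecs[:, 1:4]                            # skip the trivial eigenvector
print("embedding shape:", coords.shape)
```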

  19. Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxies

    Science.gov (United States)

    Lawlor, David; Budavári, Tamás; Mahoney, Michael W.

    2016-12-01

    We present a novel approach to studying the diversity of galaxies. It is based on a novel spectral graph technique, that of locally-biased semi-supervised eigenvectors. Our method introduces new coordinates that summarize an entire spectrum, similar to but going well beyond the widely used Principal Component Analysis (PCA). Unlike PCA, however, this technique does not assume that the Euclidean distance between galaxy spectra is a good global measure of similarity. Instead, we relax that condition to only the most similar spectra, and we show that doing so yields more reliable results for many astronomical questions of interest. The global variant of our approach can identify very finely numerous astronomical phenomena of interest. The locally-biased variants of our basic approach enable us to explore subtle trends around a set of chosen objects. The power of the method is demonstrated in the Sloan Digital Sky Survey Main Galaxy Sample, by illustrating that the derived spectral coordinates carry an unprecedented amount of information.

  20. MAPPING THE SIMILARITIES OF SPECTRA: GLOBAL AND LOCALLY-BIASED APPROACHES TO SDSS GALAXIES

    International Nuclear Information System (INIS)

    Lawlor, David; Budavári, Tamás; Mahoney, Michael W.

    2016-01-01

    We present a novel approach to studying the diversity of galaxies. It is based on a novel spectral graph technique, that of locally-biased semi-supervised eigenvectors . Our method introduces new coordinates that summarize an entire spectrum, similar to but going well beyond the widely used Principal Component Analysis (PCA). Unlike PCA, however, this technique does not assume that the Euclidean distance between galaxy spectra is a good global measure of similarity. Instead, we relax that condition to only the most similar spectra, and we show that doing so yields more reliable results for many astronomical questions of interest. The global variant of our approach can identify very finely numerous astronomical phenomena of interest. The locally-biased variants of our basic approach enable us to explore subtle trends around a set of chosen objects. The power of the method is demonstrated in the Sloan Digital Sky Survey Main Galaxy Sample, by illustrating that the derived spectral coordinates carry an unprecedented amount of information.

  1. Learning by similarity in coordination problems

    Czech Academy of Sciences Publication Activity Database

    Steiner, Jakub; Stewart, C.

    -, č. 324 (2007), s. 1-40 ISSN 1211-3298 R&D Projects: GA MŠk LC542 Institutional research plan: CEZ:AV0Z70850503 Keywords : similarity * learning * case-based reasoning Subject RIV: AH - Economics http://www.cerge-ei.cz/pdf/wp/Wp324.pdf

  2. A COMPARISON OF SEMANTIC SIMILARITY MODELS IN EVALUATING CONCEPT SIMILARITY

    Directory of Open Access Journals (Sweden)

    Q. X. Xu

    2012-08-01

    Full Text Available Semantic similarities are important in concept definition, recognition, categorization, interpretation, and integration. Many semantic similarity models have been established to evaluate the semantic similarities of objects and/or concepts. To assess the suitability and performance of different models in evaluating concept similarities, we compare four main types of models in this paper: the geometric model, the feature model, the network model, and the transformational model. The fundamental principles and main characteristics of these models are first introduced and compared. Land use and land cover concepts of NLCD92 are employed as examples in the case study. The results demonstrate that the correlations between these models are very high, possibly because all of these models are designed to simulate the similarity judgement of the human mind.

  3. Size-based estimation of the status of fish stocks: simulation analysis and comparison with age-based estimations

    DEFF Research Database (Denmark)

    Kokkalis, Alexandros; Thygesen, Uffe Høgsbro; Nielsen, Anders

    , were investigated and our estimations were compared to the ICES advice. Only size-specific catch data were used, in order to emulate data limited situations. The simulation analysis reveals that the status of the stock, i.e. F/Fmsy, is estimated more accurately than the fishing mortality F itself....... Specific knowledge of the natural mortality improves the estimation more than having information about all other life history parameters. Our approach gives, at least qualitatively, an estimated stock status which is similar to the results of an age-based assessment. Since our approach only uses size...

  4. Syntactic computations in the language network: Characterising dynamic network properties using representational similarity analysis

    Directory of Open Access Journals (Sweden)

    Lorraine Komisarjevsky Tyler

    2013-05-01

    Full Text Available The core human capacity of syntactic analysis involves a left hemisphere network involving left inferior frontal gyrus (LIFG) and posterior middle temporal gyrus (LpMTG) and the anatomical connections between them. Here we use MEG to determine the spatio-temporal properties of syntactic computations in this network. Listeners heard spoken sentences containing a local syntactic ambiguity (e.g. …landing planes…), at the offset of which they heard a disambiguating verb and decided whether it was an acceptable/unacceptable continuation of the sentence. We charted the time-course of processing and resolving syntactic ambiguity by measuring MEG responses from the onset of each word in the ambiguous phrase and the disambiguating word. We used representational similarity analysis (RSA) to characterize syntactic information represented in the LIFG and LpMTG over time and to investigate their relationship to each other. Testing a variety of lexico-syntactic and ambiguity models against the MEG data, our results suggest early lexico-syntactic responses in the LpMTG and later effects of ambiguity in the LIFG, pointing to a clear differentiation in the functional roles of these two regions. Our results suggest the LpMTG represents and transmits lexical information to the LIFG, which responds to and resolves the ambiguity.

  5. Multi-scale structural similarity index for motion detection

    Directory of Open Access Journals (Sweden)

    M. Abdel-Salam Nasr

    2017-07-01

    Full Text Available One of the most recent approaches for measuring image quality is the structural similarity index (SSI). This paper presents a novel algorithm based on the multi-scale structural similarity index (MS-SSIM) for motion detection in videos. The MS-SSIM approach is based on modeling image luminance, contrast and structure at multiple scales. MS-SSIM yields much better performance than the single-scale SSI approach, but at the cost of a relatively lower processing speed. The major advantages of the presented algorithm are its higher detection accuracy and its quasi real-time processing speed.
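
    As a rough illustration of the idea, the sketch below computes a simplified multi-scale SSIM by averaging single-scale SSIM values over successively downsampled frames; the published MS-SSIM combines per-scale luminance, contrast and structure terms with fixed weights, so this is only an approximation, and the scale count and motion threshold are assumptions.

        # Simplified multi-scale SSIM: average single-scale SSIM over successively
        # downsampled versions of two frames (an illustrative approximation, not the
        # exact weighted-product MS-SSIM formulation).
        import numpy as np
        from skimage.metrics import structural_similarity
        from skimage.transform import rescale

        def ms_ssim(frame_a, frame_b, n_scales=3):
            a, b = frame_a.astype(float), frame_b.astype(float)
            scores = []
            for _ in range(n_scales):
                scores.append(structural_similarity(a, b, data_range=a.max() - a.min()))
                a = rescale(a, 0.5, anti_aliasing=True)
                b = rescale(b, 0.5, anti_aliasing=True)
            return float(np.mean(scores))

        # Motion would be flagged where similarity to the previous frame drops below a threshold:
        # motion_detected = ms_ssim(prev_frame, curr_frame) < 0.8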

  6. Structural similarity-based predictions of protein interactions between HIV-1 and Homo sapiens

    Directory of Open Access Journals (Sweden)

    Gomez Shawn M

    2010-04-01

    Full Text Available Abstract Background In the course of infection, viruses such as HIV-1 must enter a cell, travel to sites where they can hijack host machinery to transcribe their genes and translate their proteins, assemble, and then leave the cell again, all while evading the host immune system. Thus, successful infection depends on the pathogen's ability to manipulate the biological pathways and processes of the organism it infects. Interactions between HIV-encoded and human proteins provide one means by which HIV-1 can connect into cellular pathways to carry out these survival processes. Results We developed and applied a computational approach to predict interactions between HIV and human proteins based on structural similarity of 9 HIV-1 proteins to human proteins having known interactions. Using functional data from RNAi studies as a filter, we generated over 2000 interaction predictions between HIV proteins and 406 unique human proteins. Additional filtering based on Gene Ontology cellular component annotation reduced the number of predictions to 502 interactions involving 137 human proteins. We find numerous known interactions as well as novel interactions showing significant functional relevance based on supporting Gene Ontology and literature evidence. Conclusions Understanding the interplay between HIV-1 and its human host will help in understanding the viral lifecycle and the ways in which this virus is able to manipulate its host. The results shown here provide a potential set of interactions that are amenable to further experimental manipulation as well as potential targets for therapeutic intervention.

  7. Similarity and Modeling in Science and Engineering

    CERN Document Server

    Kuneš, Josef

    2012-01-01

    The present text sets itself apart from other titles on the subject in that it addresses means and methodologies rather than a narrow, task-specific approach. Concepts and the developments that evolved to meet the changing needs of applications are addressed. This approach provides the reader with a general tool-box to apply to their specific needs. Two important tools are presented: dimensional analysis and the similarity analysis methods. The fundamental point of view, enabling one to sort all models, is that of the information flux between a model and an original, expressed by similarity and abstraction. Each chapter includes original examples and applications. In this respect, the models can be divided into several groups. The following models are dealt with separately by chapter: mathematical and physical models, physical analogues, and deterministic, stochastic, and cybernetic computer models. The mathematical models are divided into asymptotic and phenomenological models. The phenomenological m...

  8. Local-global alignment for finding 3D similarities in protein structures

    Science.gov (United States)

    Zemla, Adam T [Brentwood, CA

    2011-09-20

    A method of finding 3D similarities in the protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule; comparing the two molecules using Longest Continuous Segments (LCS) analysis; comparing them using Global Distance Test (GDT) analysis; comparing them using the Local Global Alignment scoring function (LGA_S) analysis; and verifying the constructed alignment and repeating the steps to find the regions of 3D similarity in the protein structures.
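
    A minimal sketch of the GDT-style scoring mentioned above: the fraction of corresponding residues whose superposed coordinates fall within several distance cutoffs, averaged over the cutoffs. The cutoff values follow the common GDT_TS convention; treating the coordinates as already superposed is a simplifying assumption, and this is not the patented LGA implementation.

        # Illustrative GDT-style score for two already-superposed structures.
        import numpy as np

        def gdt_score(coords_a, coords_b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
            # coords_a, coords_b: (n_residues, 3) arrays of corresponding C-alpha positions
            dist = np.linalg.norm(coords_a - coords_b, axis=1)
            # Fraction of residues within each cutoff, averaged over all cutoffs.
            return float(np.mean([(dist <= c).mean() for c in cutoffs]))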

  9. Learning material recommendation based on case-based reasoning similarity scores

    Science.gov (United States)

    Masood, Mona; Mokmin, Nur Azlina Mohamed

    2017-10-01

    A personalized learning material recommendation is important in any Intelligent Tutoring System (ITS). Case-based Reasoning (CBR) is an Artificial Intelligent Algorithm that has been widely used in the development of ITS applications. This study has developed an ITS application that applied the CBR algorithm in the development process. The application has the ability to recommend the most suitable learning material to the specific student based on information in the student profile. In order to test the ability of the application in recommending learning material, two versions of the application were created. The first version displayed the most suitable learning material and the second version displayed the least preferable learning material. The results show the application has successfully assigned the students to the most suitable learning material.
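
    A minimal sketch of the kind of case-based retrieval described here: the student profile is matched against stored cases with a weighted feature similarity, and the material attached to the most similar case is recommended. The feature names, weights and cases are invented for illustration and are not taken from the study.

        # Case-based reasoning retrieval sketch with made-up features and weights.
        def profile_similarity(profile, case, weights):
            # Weighted match score over categorical profile features.
            return sum(w * (1.0 if profile[f] == case[f] else 0.0) for f, w in weights.items())

        def recommend(profile, case_base, weights):
            best = max(case_base, key=lambda case: profile_similarity(profile, case, weights))
            return best["material"]

        cases = [
            {"learning_style": "visual", "prior_knowledge": "low",  "material": "video tutorial"},
            {"learning_style": "verbal", "prior_knowledge": "high", "material": "reference text"},
        ]
        weights = {"learning_style": 0.6, "prior_knowledge": 0.4}
        print(recommend({"learning_style": "visual", "prior_knowledge": "high"}, cases, weights))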

  10. Multi-Scale Scattering Transform in Music Similarity Measuring

    Science.gov (United States)

    Wang, Ruobai

    The scattering transform is a Mel-frequency-spectrum-based, time-deformation-stable method that can be used to evaluate music similarity. Compared with dynamic time warping, it performs better at detecting similar audio signals under local time-frequency deformation. Multi-scale scattering means combining scattering transforms of different window lengths. This paper argues that the multi-scale scattering transform is a good alternative to dynamic time warping in music similarity measurement. We tested the performance of the multi-scale scattering transform against other popular methods, with data designed to represent different conditions.

  11. Similar judgment method of brain neural pathway using DT-MRI

    International Nuclear Information System (INIS)

    Watashiba, Yasuhiro; Sakamoto, Naohisa; Sakai, Koji; Koyamada, Koji; Kanazawa, Masanori; Doi, Akio

    2008-01-01

    Nowadays, visualization of brain neural pathways extracted by tractography is regarded as a useful tool for detecting involved areas and analyzing the cause of disease by comparing differences between the nerve fiber configurations of normal subjects and patients, and for supporting surgical planning and forecasting post-operative progress. So far, observation of brain neural pathways has relied on the user subjectively judging their 3D shape as displayed in an image. With this kind of subjective observation, however, it is difficult to verify the validity of the diagnostic result, and sufficient reliability cannot be obtained. We therefore consider that a system for comparing shapes based on a quantitative evaluation is necessary. To resolve this problem, we propose a system that enables the shapes of brain neural pathways extracted by tractography to be compared quantitatively. The proposed system calculates the similarity between two neural pathways and displays the differing areas according to that similarity. (author)

  12. An Experimental Comparison of Similarity Assessment Measures for 3D Models on Constrained Surface Deformation

    Science.gov (United States)

    Quan, Lulin; Yang, Zhixin

    2010-05-01

    To address issues in the area of design customization, this paper describes the specification and application of constrained surface deformation and reports an experimental performance comparison of three prevailing, effective similarity assessment algorithms in the constrained surface deformation domain. Constrained surface deformation is becoming a promising method that supports various downstream applications of customized design. Similarity assessment is regarded as the key technology for inspecting the success of a new design: it measures the level of difference between the deformed new design and the initial sample model and indicates whether that difference is within the allowed limit. According to our theoretical analysis and pre-experiments, three similarity assessment algorithms are suitable for this domain: the shape-histogram-based method, the skeleton-based method, and the U-system-moment-based method. We analyze their basic functions and implementation methodologies in detail and run a series of experiments in various situations to test their accuracy and efficiency using precision-recall diagrams. A shoe model is chosen as the industrial example for the experiments. The results show that the shape-histogram-based method achieved the best performance in the comparison. Based on this result, we propose a novel approach that integrates surface constraints and the shape histogram description with an adaptive weighting method, which emphasizes the role of the constraints during the assessment. Limited initial experimental results demonstrate that this approach outperforms the other three algorithms. A clear direction for future development is also drawn at the end of the paper.
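
    As an illustration of the best-performing family of methods, the sketch below implements one simple shape-histogram variant (a shell-model histogram of point distances to the centroid) and compares two models by histogram intersection; the sampling, bin count and similarity choice are assumptions rather than the paper's exact formulation.

        # Shell-model shape histogram and histogram-intersection similarity (illustrative only).
        import numpy as np

        def shape_histogram(points, bins=64):
            # points: (n_points, 3) surface samples of one model
            d = np.linalg.norm(points - points.mean(axis=0), axis=1)
            hist, _ = np.histogram(d / d.max(), bins=bins, range=(0.0, 1.0))
            return hist / hist.sum()

        def histogram_similarity(points_a, points_b):
            # Histogram intersection: 1.0 for identical shape distributions.
            return float(np.minimum(shape_histogram(points_a), shape_histogram(points_b)).sum())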

  13. Normalization of similarity-based individual brain networks from gray matter MRI and its association with neurodevelopment in infants with intrauterine growth restriction.

    Science.gov (United States)

    Batalle, Dafnis; Muñoz-Moreno, Emma; Figueras, Francesc; Bargallo, Nuria; Eixarch, Elisenda; Gratacos, Eduard

    2013-12-01

    Obtaining individual biomarkers for the prediction of altered neurological outcome is a challenge of modern medicine and neuroscience. Connectomics based on magnetic resonance imaging (MRI) stands as a good candidate to exhaustively extract information from MRI by integrating the information obtained in a few network features that can be used as individual biomarkers of neurological outcome. However, this approach typically requires the use of diffusion and/or functional MRI to extract individual brain networks, which require high acquisition times and present an extreme sensitivity to motion artifacts, critical problems when scanning fetuses and infants. Extraction of individual networks based on morphological similarity from gray matter is a new approach that benefits from the power of graph theory analysis to describe gray matter morphology as a large-scale morphological network from a typical clinical anatomic acquisition such as T1-weighted MRI. In the present paper we propose a methodology to normalize these large-scale morphological networks to a brain network with standardized size based on a parcellation scheme. The proposed methodology was applied to reconstruct individual brain networks of 63 one-year-old infants, 41 infants with intrauterine growth restriction (IUGR) and 22 controls, showing altered network features in the IUGR group, and their association with neurodevelopmental outcome at two years of age by means of ordinal regression analysis of the network features obtained with Bayley Scale for Infant and Toddler Development, third edition. Although it must be more widely assessed, this methodology stands as a good candidate for the development of biomarkers for altered neurodevelopment in the pediatric population. © 2013 Elsevier Inc. All rights reserved.

  14. SIMILARITY OR DISSIMILARITY BETWEEN PUBLIC AND PRIVATE SECTOR

    Directory of Open Access Journals (Sweden)

    ANDREEA CÎRSTEA

    2015-08-01

    Full Text Available Consolidated financial statements represent one of the main benefits that public sector reforms have brought. The novelty of the subject sparked our interest in detailed research that can add value to the development of this issue in the public sector. The paper aims to analyze the degree of similarity and dissimilarity between the initial regulations regarding consolidated reporting in the public and private sectors. In order to obtain information about the similarity or dissimilarity between IPSAS and IAS with regard to consolidation, we used correlation and/or association coefficients. We conclude that there is a high degree of similarity between the two sets of standards, which is not surprising, because it is known that IPSAS are based on IAS. Even so, there are still differences that arise from the specificity of each sector.

  15. Bilateral Trade Flows and Income Distribution Similarity

    Science.gov (United States)

    2016-01-01

    Current models of bilateral trade neglect the effects of income distribution. This paper addresses the issue by accounting for non-homothetic consumer preferences and hence investigating the role of income distribution in the context of the gravity model of trade. A theoretically justified gravity model is estimated for disaggregated trade data (Dollar volume is used as dependent variable) using a sample of 104 exporters and 108 importers for 1980–2003 to achieve two main goals. We define and calculate new measures of income distribution similarity and empirically confirm that greater similarity of income distribution between countries implies more trade. Using distribution-based measures as a proxy for demand similarities in gravity models, we find consistent and robust support for the hypothesis that countries with more similar income-distributions trade more with each other. The hypothesis is also confirmed at disaggregated level for differentiated product categories. PMID:27137462

  16. Self-Similarity Based Corresponding-Point Extraction from Weakly Textured Stereo Pairs

    Directory of Open Access Journals (Sweden)

    Min Mao

    2014-01-01

    Full Text Available In low-textured areas of image pairs, there are almost no points that can be detected by traditional methods, so the information in these areas is not extracted by classical interest-point detectors. In this paper, a novel weakly textured point detection method is presented. Points with weakly textured characteristics are detected using the symmetry concept. The proposed approach considers the gray-level variability of the weakly textured local regions. The detection mechanism can be separated into three steps: region-similarity computation, candidate point searching, and refinement of the weakly textured point set. A radius scale selection mechanism and the concept of texture strength are used in the second and third steps, respectively. A matching algorithm based on sparse representation (SRM) is used for matching the detected points in different images. The results obtained on image sets with different objects show high robustness of the method to background and intraclass variations as well as to different photometric and geometric transformations; the points detected by this method also complement the points detected by classical detectors from the literature. We also verify the efficacy of SRM by comparing it with classical algorithms under occlusion and corruption for matching the weakly textured points. Experiments demonstrate the effectiveness of the proposed weakly textured point detection algorithm.

  17. Molecular structure based property modeling: Development/ improvement of property models through a systematic property-data-model analysis

    DEFF Research Database (Denmark)

    Hukkerikar, Amol Shivajirao; Sarup, Bent; Sin, Gürkan

    2013-01-01

    models. To make the property-data-model analysis fast and efficient, an approach based on the “molecular structure similarity criteria” to identify molecules (mono-functional, bi-functional, etc.) containing specified set of structural parameters (that is, groups) is employed. The method has been applied...

  18. Clustering and visualizing similarity networks of membrane proteins.

    Science.gov (United States)

    Hu, Geng-Ming; Mai, Te-Lun; Chen, Chi-Ming

    2015-08-01

    We proposed a fast and unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence-structure-function relationship of biological networks, and demonstrated its validity in clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the largest seven clusters predicted by MSC, the consistency in chain function within the same cluster is found to be 100%. From analyzing the edge distribution of SSN for MPs, we found a characteristic threshold distance for the boundary between clusters, over which SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from both MSC and the unsupervised sparsification methods are consistent with each other, and have high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence-structure-function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information. © 2015 Wiley Periodicals, Inc.
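
    The unsupervised sparsification step described above can be illustrated with a few lines: edges longer than a threshold distance are removed from the pairwise distance matrix, and clusters are read off as connected components. The threshold here is a free parameter, not the characteristic value reported in the paper.

        # Cluster a similarity/distance network by thresholding and taking connected components.
        import numpy as np
        from scipy.sparse import csr_matrix
        from scipy.sparse.csgraph import connected_components

        def cluster_by_threshold(distance_matrix, threshold):
            # Keep only edges strictly shorter than the threshold (self-distances excluded).
            adjacency = csr_matrix((distance_matrix < threshold) & (distance_matrix > 0))
            n_clusters, labels = connected_components(adjacency, directed=False)
            return n_clusters, labels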

  19. Optimization of interactive visual-similarity-based search

    NARCIS (Netherlands)

    Nguyen, G.P.; Worring, M.

    2008-01-01

    At one end of the spectrum, research in interactive content-based retrieval concentrates on machine learning methods for effective use of relevance feedback. On the other end, the information visualization community focuses on effective methods for conveying information to the user. What is lacking

  20. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.

    Science.gov (United States)

    Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak

    2006-06-06

    To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence

  1. Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks

    Directory of Open Access Journals (Sweden)

    Chang Jeong-Ho

    2006-06-01

    Full Text Available Abstract Background To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. Results To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. Conclusion By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway
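
    A heavily simplified sketch of this pipeline, assuming (unlike the paper's labeled-graph formulation) that all metabolic networks are indexed over one common set of compounds: an exponential kernel expm(beta*A) summarizes each network, networks are compared by a normalized kernel inner product, and the resulting distances are clustered with UPGMA (average linkage).

        # Exponential-kernel network comparison followed by UPGMA clustering (illustrative only).
        import numpy as np
        from scipy.linalg import expm
        from scipy.spatial.distance import squareform
        from scipy.cluster.hierarchy import linkage

        def network_feature(adjacency, beta=0.5):
            # Exponential kernel of one network; the label matrix of the paper is omitted here.
            return expm(beta * adjacency)

        def network_kernel(adj_a, adj_b, beta=0.5):
            fa, fb = network_feature(adj_a, beta), network_feature(adj_b, beta)
            return np.sum(fa * fb) / np.sqrt(np.sum(fa * fa) * np.sum(fb * fb))

        def upgma_tree(adjacencies):
            n = len(adjacencies)
            dist = np.zeros((n, n))
            for i in range(n):
                for j in range(i + 1, n):
                    dist[i, j] = dist[j, i] = 1.0 - network_kernel(adjacencies[i], adjacencies[j])
            return linkage(squareform(dist), method="average")   # UPGMA = average linkage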

  2. In Silico target fishing: addressing a "Big Data" problem by ligand-based similarity rankings with data fusion.

    Science.gov (United States)

    Liu, Xian; Xu, Yuan; Li, Shanshan; Wang, Yulan; Peng, Jianlong; Luo, Cheng; Luo, Xiaomin; Zheng, Mingyue; Chen, Kaixian; Jiang, Hualiang

    2014-01-01

    Ligand-based in silico target fishing can be used to identify the potential interacting targets of bioactive ligands, which is useful for understanding the polypharmacology and safety profile of existing drugs. The underlying principle of the approach is that known bioactive ligands can be used as references to predict the targets for a new compound. We tested a pipeline enabling large-scale target fishing and drug repositioning, based on simple fingerprint similarity rankings with data fusion. A large library containing 533 drug-relevant targets with 179,807 active ligands was compiled, where each target was defined by its ligand set. For a given query molecule, its target profile is generated by similarity searching against the ligand sets assigned to each target, for which individual searches utilizing multiple reference structures are then fused into a single ranking list representing the potential target interaction profile of the query compound. The proposed approach was validated by 10-fold cross validation and two external tests using data from DrugBank and Therapeutic Target Database (TTD). The use of the approach was further demonstrated with some examples concerning drug repositioning and drug side-effect prediction. The promising results suggest that the proposed method is useful not only for finding new uses for promiscuous drugs, but also for predicting some important toxic liabilities.
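
    A minimal sketch of ligand-based target fishing by fingerprint similarity with maximum-similarity fusion, using RDKit Morgan fingerprints and Tanimoto similarity; the fingerprint radius, bit length and example data structures are assumptions, and the paper's full pipeline (multiple reference structures, rank fusion, cross validation) is not reproduced.

        # Rank candidate targets by the best Tanimoto similarity between the query and each
        # target's known ligands (a simple max-fusion scheme).
        from rdkit import Chem, DataStructs
        from rdkit.Chem import AllChem

        def fingerprint(smiles):
            return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=2048)

        def rank_targets(query_smiles, target_ligands):
            # target_ligands: dict mapping target name -> list of ligand SMILES strings
            query_fp = fingerprint(query_smiles)
            scores = {
                target: max(DataStructs.TanimotoSimilarity(query_fp, fingerprint(s))
                            for s in smiles_list)
                for target, smiles_list in target_ligands.items()
            }
            return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)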

  3. Investigation of psychophysical similarity measures for selection of similar images in the diagnosis of clustered microcalcifications on mammograms

    International Nuclear Information System (INIS)

    Muramatsu, Chisako; Li Qiang; Schmidt, Robert; Shiraishi, Junji; Doi, Kunio

    2008-01-01

    coefficient between the gold standard and the psychophysical similarity measure through the use of seven features was relatively high (r=0.71) and was comparable to the correlation coefficients between the ratings by one radiologist and the average ratings by nine radiologists (r=0.69±0.07). The correlation coefficient was improved compared to that of a distance-based method (r=0.58). The result indicated that similar images selected by the psychophysical similarity measure may be useful to radiologists in the diagnosis of clustered microcalcifications on mammograms.

  4. Investigation of psychophysical similarity measures for selection of similar images in the diagnosis of clustered microcalcifications on mammograms

    Energy Technology Data Exchange (ETDEWEB)

    Muramatsu, Chisako; Li Qiang; Schmidt, Robert; Shiraishi, Junji; Doi, Kunio [Department of Radiology, University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637 (United States) and Department of Intelligent Image Information, Gifu University, 1-1 Yanagido, Gifu (Japan); Department of Radiology, Duke Advanced Imaging Labs, Duke University, 2424 Erwin Road, Suite 302, Durham, North Carolina 27705 (United States); Department of Radiology, University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637 (United States)

    2008-12-15

    was selected. The correlation coefficient between the gold standard and the psychophysical similarity measure through the use of seven features was relatively high (r=0.71) and was comparable to the correlation coefficients between the ratings by one radiologist and the average ratings by nine radiologists (r=0.69{+-}0.07). The correlation coefficient was improved compared to that of a distance-based method (r=0.58). The result indicated that similar images selected by the psychophysical similarity measure may be useful to radiologists in the diagnosis of clustered microcalcifications on mammograms.

  5. Centrifugal fans: Similarity, scaling laws, and fan performance

    Science.gov (United States)

    Sardar, Asad Mohammad

    Centrifugal fans are rotodynamic machines used for moving air continuously against moderate pressures through ventilation and air conditioning systems. There are five major topics presented in this thesis: (1) analysis of the fan scaling laws and consequences of dynamic similarity on modelling; (2) detailed flow visualization studies (in water) covering the flow path starting at the fan blade exit to the evaporator core of an actual HVAC fan scroll-diffuser module; (3) mean velocity and turbulence intensity measurements (flow field studies) at the inlet and outlet of a large scale blower; (4) fan installation effects on overall fan performance and evaluation of fan testing methods; (5) two point coherence and spectral measurements conducted on an actual HVAC fan module for flow structure identification of possible aeroacoustic noise sources. A major objective of the study was to identify flow structures within the HVAC module that are responsible for noise and in particular "rumble noise" generation. Possible mechanisms for the generation of flow induced noise in the automotive HVAC fan module are also investigated. It is demonstrated that different modes of HVAC operation represent very different internal flow characteristics. This has implications on both fan HVAC airflow performance and noise characteristics. It is demonstrated from principles of complete dynamic similarity that the fan scaling laws require Reynolds number matching as a necessary condition for developing scale model fans or fan test facilities. The physical basis for the derived fan scaling laws was established both from pure dimensional analysis and from the fundamental equations of fluid motion. Fan performance was measured in air in a three times scale model (large scale blower) of an actual forward curved automotive HVAC blower. Different fan testing methods (based on AMCA fan test codes) were compared on the basis of static pressure measurements. Also, the flow through an actual HVAC
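
    The fan scaling (affinity) laws referred to above can be stated compactly; the sketch below applies them to scale flow rate, pressure rise and shaft power between a model and a prototype, and includes the Reynolds number whose matching the thesis identifies as a necessary condition for full dynamic similarity. The numeric values are illustrative only.

        # Classical fan affinity laws: N = rotational speed, D = impeller diameter,
        # rho = air density, mu = dynamic viscosity. Values below are placeholders.
        def scaled_performance(Q1, dp1, W1, N1, D1, N2, D2, rho1=1.2, rho2=1.2):
            Q2 = Q1 * (N2 / N1) * (D2 / D1) ** 3                              # volume flow rate
            dp2 = dp1 * (rho2 / rho1) * (N2 / N1) ** 2 * (D2 / D1) ** 2       # pressure rise
            W2 = W1 * (rho2 / rho1) * (N2 / N1) ** 3 * (D2 / D1) ** 5         # shaft power
            return Q2, dp2, W2

        def reynolds_number(N, D, rho=1.2, mu=1.8e-5):
            # Must match between model and prototype for complete dynamic similarity.
            return rho * N * D ** 2 / mu

        # Example: a three-times scale model of a small blower at the same speed.
        print(scaled_performance(Q1=0.1, dp1=250.0, W1=40.0, N1=50.0, D1=0.15, N2=50.0, D2=0.45))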

  6. Extracting intrinsic functional networks with feature-based group independent component analysis.

    Science.gov (United States)

    Calhoun, Vince D; Allen, Elena

    2013-04-01

    There is increasing use of functional imaging data to understand the macro-connectome of the human brain. Of particular interest is the structure and function of intrinsic networks (regions exhibiting temporally coherent activity both at rest and while a task is being performed), which account for a significant portion of the variance in functional MRI data. While networks are typically estimated based on the temporal similarity between regions (based on temporal correlation, clustering methods, or independent component analysis [ICA]), some recent work has suggested that these intrinsic networks can be extracted from the inter-subject covariation among highly distilled features, such as amplitude maps reflecting regions modulated by a task or even coordinates extracted from large meta analytic studies. In this paper our goal was to explicitly compare the networks obtained from a first-level ICA (ICA on the spatio-temporal functional magnetic resonance imaging (fMRI) data) to those from a second-level ICA (i.e., ICA on computed features rather than on the first-level fMRI data). Convergent results from simulations, task-fMRI data, and rest-fMRI data show that the second-level analysis is slightly noisier than the first-level analysis but yields strikingly similar patterns of intrinsic networks (spatial correlations as high as 0.85 for task data and 0.65 for rest data, well above the empirical null) and also preserves the relationship of these networks with other variables such as age (for example, default mode network regions tended to show decreased low frequency power for first-level analyses and decreased loading parameters for second-level analyses). In addition, the best-estimated second-level results are those which are the most strongly reflected in the input feature. In summary, the use of feature-based ICA appears to be a valid tool for extracting intrinsic networks. We believe it will become a useful and important approach in the study of the macro
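
    The first-level versus second-level contrast can be illustrated with a toy sketch: first-level ICA runs on a subject's full time-by-voxel fMRI matrix, while second-level ICA runs on a much smaller subjects-by-voxel matrix of distilled feature maps. The random arrays, component count and shapes below are placeholders, not the study's data or preprocessing.

        # First-level ICA on spatio-temporal data vs. second-level ICA on distilled features.
        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(0)
        fmri_data = rng.standard_normal((200, 5000))      # time points x voxels (one subject)
        feature_maps = rng.standard_normal((40, 5000))    # subjects x voxels (one feature map each)

        first_level = FastICA(n_components=10, random_state=0).fit(fmri_data)
        second_level = FastICA(n_components=10, random_state=0).fit(feature_maps)

        # Spatial maps of the intrinsic networks estimated at each level.
        first_level_maps = first_level.components_
        second_level_maps = second_level.components_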

  7. Structural similarities between brain and linguistic data provide evidence of semantic relations in the brain.

    Directory of Open Access Journals (Sweden)

    Colleen E Crangle

    Full Text Available This paper presents a new method of analysis by which structural similarities between brain data and linguistic data can be assessed at the semantic level. It shows how to measure the strength of these structural similarities and so determine the relatively better fit of the brain data with one semantic model over another. The first model is derived from WordNet, a lexical database of English compiled by language experts. The second is given by the corpus-based statistical technique of latent semantic analysis (LSA, which detects relations between words that are latent or hidden in text. The brain data are drawn from experiments in which statements about the geography of Europe were presented auditorily to participants who were asked to determine their truth or falsity while electroencephalographic (EEG recordings were made. The theoretical framework for the analysis of the brain and semantic data derives from axiomatizations of theories such as the theory of differences in utility preference. Using brain-data samples from individual trials time-locked to the presentation of each word, ordinal relations of similarity differences are computed for the brain data and for the linguistic data. In each case those relations that are invariant with respect to the brain and linguistic data, and are correlated with sufficient statistical strength, amount to structural similarities between the brain and linguistic data. Results show that many more statistically significant structural similarities can be found between the brain data and the WordNet-derived data than the LSA-derived data. The work reported here is placed within the context of other recent studies of semantics and the brain. The main contribution of this paper is the new method it presents for the study of semantics and the brain and the focus it permits on networks of relations detected in brain data and represented by a semantic model.

  8. Isentropic and non-isentropic self-similar implosions

    International Nuclear Information System (INIS)

    Rodriguez, Manuel; Linan, Amable.

    1978-01-01

    The self-similar motion of an implosive shock at instants close to the reflection time at the center of the sphere (or cylinder), before and after that reflection occurs, is described. The material is considered to be a perfect gas. A detailed analysis is given of the ordinary differential equations that describe the velocity, density and pressure distributions, obtaining the numerical solution for several values of sigma. Asymptotic solutions are given for small values of 1/sigma and (sigma - 1). Also, the self-similar process of the isentropic compression of a sphere (or cylinder), with initial conditions of uniform density and zero velocity, is given. An asymptotic solution, valid for large values of the maximum density ratio, is obtained. As part of the solution, the pressure-time dependence needed at the outer surface to obtain the self-similar solution is derived. (author)

  9. Matrix approach to the Shapley value and dual similar associated consistency

    NARCIS (Netherlands)

    Xu, G.; Driessen, Theo

    Replacing associated consistency in Hamiache's axiom system by dual similar associated consistency, we axiomatize the Shapley value as the unique value verifying the inessential game property, continuity and dual similar associated consistency. Continuing the matrix analysis for Hamiache's
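
    For concreteness, the object being axiomatized can be computed directly for small games; the sketch below evaluates the Shapley value by enumerating coalitions. This is the standard combinatorial formula, not the matrix approach of the record, and the three-player example game is invented.

        # Shapley value by direct enumeration over coalitions (feasible only for small n).
        from itertools import combinations
        from math import factorial

        def shapley_values(n_players, v):
            # v maps a frozenset of players to the worth of that coalition.
            players = range(n_players)
            phi = [0.0] * n_players
            for i in players:
                others = [j for j in players if j != i]
                for r in range(len(others) + 1):
                    for coalition in combinations(others, r):
                        s = frozenset(coalition)
                        weight = (factorial(len(s)) * factorial(n_players - len(s) - 1)
                                  / factorial(n_players))
                        phi[i] += weight * (v(s | {i}) - v(s))
            return phi

        # Example: a three-player game where any coalition of two or more players is worth 1.
        print(shapley_values(3, lambda s: 1.0 if len(s) >= 2 else 0.0))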

  10. A new measure for functional similarity of gene products based on Gene Ontology

    Directory of Open Access Journals (Sweden)

    Lengauer Thomas

    2006-06-01

    Full Text Available Abstract Background Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

  11. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Directory of Open Access Journals (Sweden)

    Holly J Atkinson

    Full Text Available The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  12. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Science.gov (United States)

    Atkinson, Holly J; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C

    2009-01-01

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
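
    A minimal sketch of constructing such a sequence similarity network with networkx: proteins become nodes, an edge is added whenever a pairwise similarity score exceeds a chosen threshold, and orthogonal information (here a functional annotation) is overlaid as a node attribute. The scores, threshold and annotations are toy values.

        # Build a sequence similarity network and overlay functional annotations.
        import networkx as nx

        def build_ssn(names, pairwise_scores, annotations, threshold=0.4):
            # pairwise_scores: dict mapping (name_a, name_b) -> similarity in [0, 1]
            graph = nx.Graph()
            for name in names:
                graph.add_node(name, function=annotations.get(name, "unknown"))
            for (a, b), score in pairwise_scores.items():
                if score >= threshold:
                    graph.add_edge(a, b, weight=score)
            return graph

        ssn = build_ssn(["p1", "p2", "p3"],
                        {("p1", "p2"): 0.8, ("p2", "p3"): 0.2},
                        {"p1": "kinase", "p2": "kinase"})
        print(list(nx.connected_components(ssn)))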

  13. α-Cut method based importance measure for criticality analysis in fuzzy probability – Based fault tree analysis

    International Nuclear Information System (INIS)

    Purba, Julwan Hendry; Sony Tjahyani, D.T.; Widodo, Surip; Tjahjono, Hendro

    2017-01-01

    Highlights: •FPFTA deals with epistemic uncertainty using fuzzy probability. •Criticality analysis is important for reliability improvement. •An α-cut method based importance measure is proposed for criticality analysis in FPFTA. •The α-cut method based importance measure utilises α-cut multiplication, α-cut subtraction, and area defuzzification technique. •Benchmarking confirm that the proposed method is feasible for criticality analysis in FPFTA. -- Abstract: Fuzzy probability – based fault tree analysis (FPFTA) has been recently developed and proposed to deal with the limitations of conventional fault tree analysis. In FPFTA, reliabilities of basic events, intermediate events and top event are characterized by fuzzy probabilities. Furthermore, the quantification of the FPFTA is based on fuzzy multiplication rule and fuzzy complementation rule to propagate uncertainties from basic event to the top event. Since the objective of the fault tree analysis is to improve the reliability of the system being evaluated, it is necessary to find the weakest path in the system. For this purpose, criticality analysis can be implemented. Various importance measures, which are based on conventional probabilities, have been developed and proposed for criticality analysis in fault tree analysis. However, not one of those importance measures can be applied for criticality analysis in FPFTA, which is based on fuzzy probability. To be fully applied in nuclear power plant probabilistic safety assessment, FPFTA needs to have its corresponding importance measure. The objective of this study is to develop an α-cut method based importance measure to evaluate and rank the importance of basic events for criticality analysis in FPFTA. To demonstrate the applicability of the proposed measure, a case study is performed and its results are then benchmarked to the results generated by the four well known importance measures in conventional fault tree analysis. The results
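
    To make the α-cut machinery concrete, the sketch below multiplies two triangular fuzzy probabilities level by level (as an AND gate would under the fuzzy multiplication rule) and recovers a crisp value with a simple centroid-style defuzzification. This is an illustration of α-cut arithmetic under stated assumptions, not the paper's importance measure or its exact area defuzzification technique.

        # α-cut arithmetic for triangular fuzzy probabilities (illustrative sketch).
        import numpy as np

        def alpha_cut(tri, alpha):
            a, m, b = tri                       # triangular fuzzy number (left, mode, right)
            return a + alpha * (m - a), b - alpha * (b - m)

        def fuzzy_multiply(tri_1, tri_2, levels=101):
            alphas = np.linspace(0.0, 1.0, levels)
            lows, highs = [], []
            for alpha in alphas:
                lo1, hi1 = alpha_cut(tri_1, alpha)
                lo2, hi2 = alpha_cut(tri_2, alpha)
                lows.append(lo1 * lo2)          # probabilities are non-negative
                highs.append(hi1 * hi2)
            return alphas, np.array(lows), np.array(highs)

        def defuzzify(alphas, lows, highs):
            # Simplified centroid over the α-levels (stands in for area defuzzification).
            mids = (lows + highs) / 2.0
            return np.trapz(mids * alphas, alphas) / np.trapz(alphas, alphas)

        alphas, lows, highs = fuzzy_multiply((0.01, 0.02, 0.03), (0.001, 0.002, 0.003))
        print(defuzzify(alphas, lows, highs))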

  14. Similarity analysis and prediction for data of structural acoustic and vibration

    International Nuclear Information System (INIS)

    Mei Liquan; Ding Xuemei; Zhang Shujuan

    2010-01-01

    Support vector machine (SVM) is a learning machine based on statistical learning theory which can produce models with good generalization. It can avoid over-fitting ('over-learning') when dealing with small sample sizes, and it can avoid the 'curse of dimensionality' ('dimensional disaster') when solving nonlinear problems. This paper addresses parameter optimization for the support vector regression machine (SVRM) and its applications. The solution path algorithm can save much CPU time when it is employed to optimize the regularization parameter of the SVRM. The simulated annealing algorithm has a good ability to find globally optimal solutions. An improved solution path algorithm and the simulated annealing algorithm are combined to optimize the parameters of the SVRM in the regression analysis of acoustic and vibration data for complex practical problems. The numerical results show the model has good predictive capability. (authors)

  15. A similarity score-based two-phase heuristic approach to solve the dynamic cellular facility layout for manufacturing systems

    Science.gov (United States)

    Kumar, Ravi; Singh, Surya Prakash

    2017-11-01

    The dynamic cellular facility layout problem (DCFLP) is a well-known NP-hard problem. It has been estimated that the efficient design of a DCFLP reduces the manufacturing cost of products by maintaining the minimum material flow among all machines in all cells, as the material flow contributes around 10-30% of the total product cost. However, being NP-hard, the DCFLP is very difficult to solve optimally in reasonable time. Therefore, this article proposes a novel similarity score-based two-phase heuristic approach to solve the DCFLP optimally, considering multiple products manufactured over multiple time periods in the manufacturing layout. In the first phase of the proposed heuristic, machine-cell clusters are created based on similarity scores between machines. These are provided as input to the second phase, which minimizes inter/intracell material handling costs and rearrangement costs over the entire planning period. The solution methodology of the proposed approach is demonstrated. To show the efficiency of the two-phase heuristic approach, 21 instances are generated and solved using the optimization software package LINGO. The results show that the proposed approach can optimally solve the DCFLP in reasonable time.

  16. Similarly shaped letters evoke similar colors in grapheme-color synesthesia.

    Science.gov (United States)

    Brang, David; Rouw, Romke; Ramachandran, V S; Coulson, Seana

    2011-04-01

    Grapheme-color synesthesia is a neurological condition in which viewing numbers or letters (graphemes) results in the concurrent sensation of color. While the anatomical substrates underlying this experience are well understood, little research to date has investigated factors influencing the particular colors associated with particular graphemes or how synesthesia occurs developmentally. A recent suggestion of such an interaction has been proposed in the cascaded cross-tuning (CCT) model of synesthesia, which posits that in synesthetes connections between grapheme regions and color area V4 participate in a competitive activation process, with synesthetic colors arising during the component-stage of grapheme processing. This model more directly suggests that graphemes sharing similar component features (lines, curves, etc.) should accordingly activate more similar synesthetic colors. To test this proposal, we created and regressed synesthetic color-similarity matrices for each of 52 synesthetes against a letter-confusability matrix, an unbiased measure of visual similarity among graphemes. Results of synesthetes' grapheme-color correspondences indeed revealed that more similarly shaped graphemes corresponded with more similar synesthetic colors, with stronger effects observed in individuals with more intense synesthetic experiences (projector synesthetes). These results support the CCT model of synesthesia, implicate early perceptual mechanisms as driving factors in the elicitation of synesthetic hues, and further highlight the relationship between conceptual and perceptual factors in this phenomenon. Copyright © 2011 Elsevier Ltd. All rights reserved.
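
    The matrix-regression analysis described here can be sketched in a few lines: the upper triangles of a synesthete's colour-similarity matrix and of the letter-confusability (shape-similarity) matrix are vectorized and regressed against each other. The random matrices below are stand-ins for the behavioural data.

        # Regress per-pair colour similarity on per-pair letter-shape similarity.
        import numpy as np
        from scipy.stats import linregress

        rng = np.random.default_rng(0)
        n_letters = 26
        color_similarity = rng.random((n_letters, n_letters))   # placeholder behavioural matrices
        shape_similarity = rng.random((n_letters, n_letters))

        iu = np.triu_indices(n_letters, k=1)                     # unique letter pairs only
        result = linregress(shape_similarity[iu], color_similarity[iu])
        print(result.slope, result.rvalue, result.pvalue)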

  17. Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

    Science.gov (United States)

    Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

    2018-01-01

    Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods and in protein function prediction. In this research, we investigate the relationship between the two similarity measures. The results suggest the absence of a strong correlation between sequence and semantic similarity: there is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in the proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings are significant for future work on protein comparison and will help to better understand the semantic similarity between proteins.

  18. Comparison of the efficacy and safety of S-1-based and capecitabine-based regimens in gastrointestinal cancer: a meta-analysis.

    Directory of Open Access Journals (Sweden)

    Xunlei Zhang

    Full Text Available Oral fluoropyrimidines (S-1, capecitabine) have been considered an important part of various regimens. We aimed to evaluate the efficacy and safety of S-1-based therapy versus capecitabine-based therapy in gastrointestinal cancers. Eligible studies were identified from PubMed and EMBASE. Additionally, abstracts presented at American Society of Clinical Oncology (ASCO) conferences held between 2000 and 2013 were searched to identify relevant clinical trials. The outcomes included overall survival (OS), progression-free survival (PFS), overall response rate (ORR), disease control rate (DCR) and adverse events. A total of 6 studies (4 RCTs and 2 retrospective analysis studies) containing 790 participants were included in this meta-analysis, comprising 401 patients in the S-1-based group and 389 patients in the capecitabine-based group. The results of our meta-analysis indicated that S-1-based and capecitabine-based regimens showed very similar efficacy in terms of PFS (HR 0.92, 95% CI 0.78-1.09, P = 0.360), OS (HR 1.01, 95% CI 0.84-1.21, P = 0.949), ORR (HR 1.04, 95% CI 0.87-1.25, P = 0.683) and DCR (HR 1.02, 95% CI 0.94-1.10, P = 0.639). There was also no significant difference in toxicity between the regimens other than mildly more hand-foot syndrome with capecitabine-based regimens. Both the S-1-based and capecitabine-based regimens are equally active and well tolerated, and have potential as backbone chemotherapy regimens in further studies of gastrointestinal cancers.

  19. CHOOSING A HEALTH INSTITUTION WITH MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS IN A POPULATION BASED STUDY

    Directory of Open Access Journals (Sweden)

    ASLI SUNER

    2013-06-01

    Full Text Available Multiple correspondence analysis is a method that makes it easy to interpret the categorical variables given in contingency tables, showing the similarities, associations and divergences among these variables via graphics in a lower-dimensional space. Clustering methods help to classify grouped data according to their similarities and to obtain useful summaries from them. In this study, the interpretation of multiple correspondence analysis is supported by cluster analysis; factors affecting the choice of health institution, such as age, disease group and health insurance, are examined, and the aim is to compare the results of the two methods.

  20. Right fusiform response patterns reflect visual object identity rather than semantic similarity.

    Science.gov (United States)

    Bruffaerts, Rose; Dupont, Patrick; De Grauwe, Sophie; Peeters, Ronald; De Deyne, Simon; Storms, Gerrit; Vandenberghe, Rik

    2013-12-01

    We previously reported the neuropsychological consequences of a lesion confined to the middle and posterior part of the right fusiform gyrus (case JA) causing a partial loss of knowledge of visual attributes of concrete entities in the absence of category-selectivity (animate versus inanimate). We interpreted this in the context of a two-step model that distinguishes structural description knowledge from associative-semantic processing and implicated the lesioned area in the former process. To test this hypothesis in the intact brain, multi-voxel pattern analysis was used in a series of event-related fMRI studies in a total of 46 healthy subjects. We predicted that activity patterns in this region would be determined by the identity of rather than the conceptual similarity between concrete entities. In a prior behavioral experiment features were generated for each entity by more than 1000 subjects. Based on a hierarchical clustering analysis the entities were organised into 3 semantic clusters (musical instruments, vehicles, tools). Entities were presented as words or pictures. With foveal presentation of pictures, cosine similarity between fMRI response patterns in right fusiform cortex appeared to reflect both the identity of and the semantic similarity between the entities. No such effects were found for words in this region. The effect of object identity was invariant for location, scaling, orientation axis and color (grayscale versus color). It also persisted for different exemplars referring to a same concrete entity. The apparent semantic similarity effect however was not invariant. This study provides further support for a neurobiological distinction between structural description knowledge and processing of semantic relationships and confirms the role of right mid-posterior fusiform cortex in the former process, in accordance with previous lesion evidence. © 2013.

  1. Information loss method to measure node similarity in networks

    Science.gov (United States)

    Li, Yongli; Luo, Peng; Wu, Chong

    2014-09-01

    Similarity measurement for network nodes has been paid increasing attention in the field of statistical physics. In this paper, we propose an entropy-based information loss method to measure node similarity. The whole model is built on the idea that less information loss is caused by treating two more similar nodes as the same. The proposed method has relatively low algorithmic complexity, making it less time-consuming and more efficient for dealing with large-scale real-world networks. In order to clarify its applicability and accuracy, this new approach was compared with selected existing approaches on two artificial examples and on synthetic networks. Furthermore, the proposed method is also successfully applied to predict network evolution and to predict unknown nodes' attributes in two application examples.
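
    The record does not give the paper's exact entropy formulation, so the sketch below is only an illustrative stand-in for the general idea: each node is represented by its normalized neighbour distribution, and the information lost by treating two nodes as the same is scored with the Jensen-Shannon divergence, so that more similar nodes incur less loss. The function names and the use of the karate club graph are assumptions.

```python
# Illustrative stand-in, not the paper's exact method: the "information loss"
# of merging two nodes is the Jensen-Shannon divergence between their
# normalized neighbour distributions; lower loss means more similar nodes.
import numpy as np
import networkx as nx

def neighbour_distribution(G, node, nodes):
    row = np.array([1.0 if G.has_edge(node, other) else 0.0 for other in nodes])
    return row / row.sum() if row.sum() > 0 else row

def entropy(d):
    d = d[d > 0]
    return -np.sum(d * np.log2(d))

def information_loss(G, u, v):
    nodes = list(G.nodes())
    p = neighbour_distribution(G, u, nodes)
    q = neighbour_distribution(G, v, nodes)
    m = 0.5 * (p + q)                      # profile of the merged node
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

G = nx.karate_club_graph()
print(information_loss(G, 0, 1), information_loss(G, 0, 33))
```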

  2. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Full Text Available Abstract Background Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer-based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC numbers, and GO terms, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70, CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is very efficient and fully automated, generally taking around fifteen minutes for a 400-residue protein. Conclusion With structural genomics initiatives determining structures with little, if any, functional characterization

  3. Similarity of Ferrosilicon Submerged Arc Furnaces With Different Geometrical Parameters

    Directory of Open Access Journals (Sweden)

    Machulec B.

    2017-12-01

    Full Text Available In order to determine the reasons for the unsatisfactory production output of one of the 12 MVA furnaces, a comparative analysis with a furnace of higher power that showed a markedly better production output was performed. For the comparison of ferrosilicon furnaces with different geometrical parameters and transformer powers, the theory of physical similarity was applied. Geometrical, electrical and thermal parameters of the reaction zones are included in the comparative analysis. For furnaces with different geometrical parameters, it is important to ensure the same temperature conditions in the reaction zones. Due to the diverse mechanisms of heat generation, different criteria for determining thermal and electrical similarity were assumed for the upper and lower reaction zones, contrary to other publications. The parameter c3 (Westly) was adopted as the similarity criterion for the upper furnace zones, where heat is generated as a result of resistive heating, while the parameter J1 (Jaccard) was adopted as the similarity criterion for the lower furnace zones, where heat is generated due to arc radiation.

  4. New Similarity Functions

    DEFF Research Database (Denmark)

    Yazdani, Hossein; Ortiz-Arroyo, Daniel; Kwasnicka, Halina

    2016-01-01

    spaces, in addition to their similarity in the vector space. Prioritized Weighted Feature Distance (PWFD) works similarly to WFD, but provides the ability to give priorities to desirable features. The accuracy of the proposed functions is compared with other similarity functions on several data sets.... Our results show that the proposed functions work better than other methods proposed in the literature....

  5. Individual-based versus aggregate meta-analysis in multi-database studies of pregnancy outcomes

    DEFF Research Database (Denmark)

    Selmer, Randi; Haglund, Bengt; Furu, Kari

    2016-01-01

    Purpose: Compare analyses of a pooled data set on the individual level with aggregate meta-analysis in a multi-database study. Methods: We reanalysed data on 2.3 million births in a Nordic register based cohort study. We compared estimated odds ratios (OR) for the effect of selective serotonin...... covariates in the pooled data set, and 1.53 (1.19–1.96) after country-optimized adjustment. Country-specific adjusted analyses at the substance level were not possible for RVOTO. Conclusion: Results of fixed effects meta-analysis and individual-based analyses of a pooled dataset were similar in this study...... reuptake inhibitors (SSRI) and venlafaxine use in pregnancy on any cardiovascular birth defect and the rare outcome right ventricular outflow tract obstructions (RVOTO). Common covariates included maternal age, calendar year, birth order, maternal diabetes, and co-medication. Additional covariates were...
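
    As a point of reference for the aggregate side of such a comparison, the sketch below shows standard inverse-variance fixed-effects pooling of country-specific odds ratios; the numbers are hypothetical and unrelated to the study's results.

```python
# Inverse-variance fixed-effects pooling of country-specific odds ratios.
# The ORs and confidence intervals below are hypothetical.
import numpy as np

ors = np.array([1.4, 1.7, 1.2])
ci_low = np.array([1.0, 1.1, 0.8])
ci_high = np.array([1.9, 2.6, 1.8])

log_or = np.log(ors)
se = (np.log(ci_high) - np.log(ci_low)) / (2 * 1.96)   # SE recovered from CI width
w = 1 / se**2                                           # inverse-variance weights

pooled = np.sum(w * log_or) / np.sum(w)
pooled_se = np.sqrt(1 / np.sum(w))
lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled OR = {np.exp(pooled):.2f} (95% CI {np.exp(lo):.2f}-{np.exp(hi):.2f})")
```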

  6. A REGION-BASED MULTI-SCALE APPROACH FOR OBJECT-BASED IMAGE ANALYSIS

    Directory of Open Access Journals (Sweden)

    T. Kavzoglu

    2016-06-01

    Full Text Available Within the last two decades, object-based image analysis (OBIA), which considers objects (i.e. groups of pixels) instead of pixels, has gained popularity and attracted increasing interest. The most important stage of OBIA is image segmentation, which groups spectrally similar adjacent pixels considering not only spectral features but also spatial and textural features. Although there are several parameters (scale, shape, compactness and band weights) to be set by the analyst, the scale parameter stands out as the most important parameter in the segmentation process. Estimating the optimal scale parameter is crucially important to increase the classification accuracy, which depends on image resolution, image object size and the characteristics of the study area. In this study, two scale-selection strategies were implemented in the image segmentation process using a pan-sharpened QuickBird-2 image. The first strategy estimates optimal scale parameters for eight sub-regions. For this purpose, the local variance/rate of change (LV-RoC) graphs produced by the ESP-2 tool were analysed to determine fine, moderate and coarse scales for each region. In the second strategy, the image was segmented using the three candidate scale values (fine, moderate, coarse) determined from the LV-RoC graph calculated for the whole image. The nearest-neighbour classifier was applied in all segmentation experiments, and an equal number of pixels was randomly selected to calculate the accuracy metrics (overall accuracy and kappa coefficient). A comparison of region-based and image-based segmentation was carried out on the classified images, and it was found that region-based multi-scale OBIA produced significantly more accurate results than image-based single-scale OBIA. The difference in classification accuracy reached 10% in terms of overall accuracy.

  7. Semantic similarity between ontologies at different scales

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Qingpeng; Haglin, David J.

    2016-04-01

    In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. The size of ontologies is often very large in terms of the number of concepts and relationships, which makes the analysis of ontologies and the represented knowledge graph computationally expensive and time-consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scale and the preservation/precision of results when analyzing ontologies. This paper presents a first effort to examine this idea by studying the relationship between scaling biomedical ontologies at different levels and the resulting semantic similarity values. We evaluate the semantic similarity between three Gene Ontology slims (Plant, Yeast, and Candida, of which the latter two belong to the same kingdom, Fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, the performance of Jiang-Conrath and Lin is more reliable and stable than that of the other two measures in this experiment, as shown by (a) consistently indicating that Yeast and Candida are more similar to each other (as compared to Plant) at different scales, and (b) small deviations of the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
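
    For readers unfamiliar with the measures named above, the sketch below illustrates information-content-based similarity (Resnik and Lin) on a tiny hand-made is-a hierarchy; the toy ontology and annotation counts are assumptions, not GO slim data.

```python
# Resnik and Lin similarity on a tiny hand-made is-a hierarchy; the edges
# point from child term to parent term, and the annotation counts are toy data.
import math
import networkx as nx

onto = nx.DiGraph([("C", "B"), ("D", "B"), ("B", "A"), ("E", "A")])
counts = {"A": 10, "B": 6, "C": 3, "D": 2, "E": 4}       # annotation counts per term
total = counts["A"]                                       # root covers everything

def ic(term):                                             # information content
    return -math.log(counts[term] / total)

def ancestors_incl(term):                                 # term plus all its ancestors
    return nx.descendants(onto, term) | {term}            # reachable via child->parent edges

def resnik(t1, t2):                                       # IC of most informative common ancestor
    common = ancestors_incl(t1) & ancestors_incl(t2)
    return max(ic(t) for t in common)

def lin(t1, t2):
    return 2 * resnik(t1, t2) / (ic(t1) + ic(t2))

print(resnik("C", "D"), lin("C", "D"))
```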

  8. Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies

    Science.gov (United States)

    Manitz, Juliane; Burger, Patricia; Amos, Christopher I.; Chang-Claude, Jenny; Wichmann, Heinz-Erich; Kneib, Thomas; Bickeböller, Heike

    2017-01-01

    The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility. PMID:28785300

  9. Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies.

    Science.gov (United States)

    Friedrichs, Stefanie; Manitz, Juliane; Burger, Patricia; Amos, Christopher I; Risch, Angela; Chang-Claude, Jenny; Wichmann, Heinz-Erich; Kneib, Thomas; Bickeböller, Heike; Hofner, Benjamin

    2017-01-01

    The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility.

  10. Coordinate based random effect size meta-analysis of neuroimaging studies.

    Science.gov (United States)

    Tench, C R; Tanasescu, Radu; Constantinescu, C S; Auer, D P; Cottam, W J

    2017-06-01

    Low power in neuroimaging studies can make them difficult to interpret, and coordinate-based meta-analysis (CBMA) may go some way to mitigating this issue. CBMA has been used in many analyses to detect where published functional MRI or voxel-based morphometry studies testing similar hypotheses report significant summary results (coordinates) consistently. Only the reported coordinates and possibly t statistics are analysed, and the statistical significance of clusters is determined by coordinate density. Here a method of performing coordinate-based random effect size meta-analysis and meta-regression is introduced. The algorithm (ClusterZ) analyses both coordinates and the reported t statistic or Z score, standardised by the number of subjects. Statistical significance is determined not by coordinate density, but by random-effects meta-analyses of reported effects performed cluster-wise using standard statistical methods and taking account of the censoring inherent in the published summary results. Type 1 error control is achieved using the false cluster discovery rate (FCDR), which is based on the false discovery rate. This controls both the family-wise error rate under the null hypothesis that coordinates are randomly drawn from a standard stereotaxic space, and the proportion of significant clusters that are expected under the null. Such control is necessary to avoid propagating and even amplifying the very issues motivating the meta-analysis in the first place. ClusterZ is demonstrated on both numerically simulated data and on real data from reports of grey matter loss in multiple sclerosis (MS) and syndromes suggestive of MS, and of painful stimulus in healthy controls. The software implementation is available to download and use freely. Copyright © 2017 Elsevier Inc. All rights reserved.
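
    ClusterZ itself additionally accounts for censoring of unreported sub-threshold effects; the sketch below only shows the standard DerSimonian-Laird random-effects pooling that such a cluster-wise analysis builds on, with hypothetical per-study effects.

```python
# DerSimonian-Laird random-effects pooling of standardized effects reported by
# the studies contributing to one cluster; the values are hypothetical, and the
# censoring handled by ClusterZ is omitted.
import numpy as np

y = np.array([0.45, 0.60, 0.30, 0.55])     # per-study standardized effects
v = np.array([0.02, 0.03, 0.025, 0.04])    # per-study variances

w = 1 / v
y_fixed = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - y_fixed) ** 2)                      # Cochran's Q
tau2 = max(0.0, (Q - (len(y) - 1)) /
           (np.sum(w) - np.sum(w**2) / np.sum(w)))      # between-study variance
w_star = 1 / (v + tau2)
mu = np.sum(w_star * y) / np.sum(w_star)                # random-effects estimate
se = np.sqrt(1 / np.sum(w_star))
print(f"effect = {mu:.3f} +/- {1.96 * se:.3f}")
```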

  11. Interbehavioral psychology and radical behaviorism: Some similarities and differences

    Science.gov (United States)

    Morris, Edward K.

    1984-01-01

    Both J. R. Kantor's interbehavioral psychology and B. F. Skinner's radical behaviorism represent well-articulated approaches to a natural science of behavior. As such, they share a number of similar features, yet they also differ on a number of dimensions. Some of these similarities and differences are examined by describing their emergence in the professional literature and by comparing the respective units of analysis of the two approaches—the interbehavioral field and the three-term contingency. An evaluation of the similarities and differences shows the similarities to be largely fundamental, and the differences largely ones of emphasis. Nonetheless, the two approaches do make unique contributions to a natural science of behavior, the integration of which can facilitate the development of that science and its acceptance among other sciences and within society at large. PMID:22478612

  12. New Methodology for Measuring Semantic Functional Similarity Based on Bidirectional Integration

    Science.gov (United States)

    Jeong, Jong Cheol

    2013-01-01

    1.2 billion users on Facebook, 17 million articles in Wikipedia, and 190 million tweets per day have demanded a significant increase in information processing over the Internet in recent years. Similarly, life sciences and bioinformatics have also faced issues in processing Big Data due to the explosion of publicly available genomic information…

  13. Learning semantic and visual similarity for endomicroscopy video retrieval.

    Science.gov (United States)

    Andre, Barbara; Vercauteren, Tom; Buchner, Anna M; Wallace, Michael B; Ayache, Nicholas

    2012-06-01

    Content-based image retrieval (CBIR) is a valuable computer vision technique which is increasingly being applied in the medical community for diagnosis support. However, traditional CBIR systems only deliver visual outputs, i.e., images having a similar appearance to the query, which is not directly interpretable by the physicians. Our objective is to provide a system for endomicroscopy video retrieval which delivers both visual and semantic outputs that are consistent with each other. In a previous study, we developed an adapted bag-of-visual-words method for endomicroscopy retrieval, called "Dense-Sift," that computes a visual signature for each video. In this paper, we present a novel approach to complement visual similarity learning with semantic knowledge extraction, in the field of in vivo endomicroscopy. We first leverage a semantic ground truth based on eight binary concepts, in order to transform these visual signatures into semantic signatures that reflect how much the presence of each semantic concept is expressed by the visual words describing the videos. Using cross-validation, we demonstrate that, in terms of semantic detection, our intuitive Fisher-based method transforming visual-word histograms into semantic estimations outperforms support vector machine (SVM) methods with statistical significance. In a second step, we propose to improve retrieval relevance by learning an adjusted similarity distance from a perceived similarity ground truth. As a result, our distance learning method allows us to statistically improve the correlation with the perceived similarity. We also demonstrate that, in terms of perceived similarity, the recall performance of the semantic signatures is close to that of visual signatures and significantly better than those of several state-of-the-art CBIR methods. The semantic signatures are thus able to communicate high-level medical knowledge while being consistent with the low-level visual signatures and much shorter than them

  14. Self-similar slip distributions on irregular shaped faults

    Science.gov (United States)

    Herrero, A.; Murphy, S.

    2018-06-01

    We propose a strategy to place a self-similar slip distribution on a complex fault surface that is represented by an unstructured mesh. This is achieved by applying a strategy based on the composite source model, in which a hierarchical set of asperities is placed on the fault, each with its own slip function that depends on the distance from the asperity centre. Central to this technique is the efficient, accurate computation of the distance between two points on the fault surface, known as the geodetic distance problem. We propose a method to compute the distance across complex non-planar surfaces based on a corollary of Huygens' principle. The difference between this method and other sample-based algorithms that precede it is the use of a curved front at the local level to calculate the distance. This technique produces a highly accurate computation of the distance, as the curvature of the front is linked to the distance from the source. Our local scheme is based on a sequence of two trilaterations, producing a robust algorithm that is highly precise. We test the strategy on a planar surface in order to assess its ability to preserve the self-similarity properties of a slip distribution. We also present a synthetic self-similar slip distribution on a real slab topography for a M8.5 event. This method for computing distance may be extended to the estimation of first arrival times in both complex 3D surfaces and 3D volumes.
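
    The curved-front scheme described above is not reproduced here; as a rough and simpler stand-in, the sketch below approximates distances across a triangulated surface by running Dijkstra over the mesh edge graph, which is exactly the kind of edge-based approximation the paper's method is designed to improve on. The vertices and triangles are hypothetical.

```python
# Crude stand-in for the geodesic-distance step: Dijkstra over the edge graph
# of a triangulated surface. Edge-based shortest paths overestimate the true
# surface distance, which is what the curved-front scheme improves on.
import numpy as np
import networkx as nx

# Hypothetical vertices (x, y, z) and triangles of a small non-planar patch.
vertices = np.array([[0, 0, 0], [1, 0, 0.2], [0, 1, 0.1], [1, 1, 0.4]], float)
triangles = [(0, 1, 2), (1, 3, 2)]

G = nx.Graph()
for a, b, c in triangles:
    for i, j in ((a, b), (b, c), (a, c)):
        G.add_edge(i, j, weight=float(np.linalg.norm(vertices[i] - vertices[j])))

dist = nx.single_source_dijkstra_path_length(G, source=0, weight="weight")
print(dist)   # approximate distance from vertex 0 to every other vertex
```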

  15. Voxel-Based Morphometry ALE meta-analysis of Bipolar Disorder

    Science.gov (United States)

    Magana, Omar; Laird, Robert

    2012-03-01

    A meta-analysis was performed independently to view the changes in gray matter (GM) in patients with bipolar disorder (BP). The meta-analysis was conducted in Talairach space using GingerALE to determine the voxels and their permutation. For data acquisition, published experiments and similar research studies were uploaded onto the online voxel-based morphometry (VBM) database. Coordinates of activation locations were then extracted from bipolar-disorder-related journals utilizing Sleuth. Once the coordinates of the given experiments were selected and imported into GingerALE, a Gaussian was applied to all foci points to create the concentration points of GM in BP patients. The results included volume reductions and variations of GM between normal healthy controls and patients with bipolar disorder. A significant number of GM clusters were obtained in normal healthy controls over BP patients in the right precentral gyrus, right anterior cingulate, and the left inferior frontal gyrus. In future research, more published journals could be uploaded onto the database and another VBM meta-analysis could be performed including more activation coordinates or a variation of age groups.

  16. Clustering biomolecular complexes by residue contacts similarity

    NARCIS (Netherlands)

    Garcia Lopes Maia Rodrigues, João; Trellet, Mikaël; Schmitz, Christophe; Kastritis, Panagiotis; Karaca, Ezgi; Melquiond, Adrien S J; Bonvin, Alexandre M J J; Garcia Lopes Maia Rodrigues, João

    Inaccuracies in computational molecular modeling methods are often counterweighed by brute-force generation of a plethora of putative solutions. These are then typically sieved via structural clustering based on similarity measures such as the root mean square deviation (RMSD) of atomic positions.
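
    A minimal sketch of the clustering step mentioned above, using RMSD of atomic positions: it assumes the putative solutions share atom ordering and are already superposed (no optimal alignment is performed), and the coordinates are random placeholders.

```python
# RMSD-based hierarchical clustering of putative docking solutions, assuming a
# shared atom ordering and pre-superposed coordinates (random placeholders).
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
models = rng.normal(size=(20, 50, 3))            # 20 models x 50 atoms x xyz

def rmsd(a, b):
    return np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1)))

n = len(models)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = rmsd(models[i], models[j])

labels = fcluster(linkage(squareform(D), method="average"), t=5.0, criterion="distance")
print(labels)                                    # cluster label per model
```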

  17. Ontology-based content analysis of US patent applications from 2001-2010.

    Science.gov (United States)

    Weber, Lutz; Böhme, Timo; Irmer, Matthias

    2013-01-01

    Ontology-based semantic text analysis methods allow knowledge relationships and data to be extracted automatically from text documents. In this review, we have applied these technologies to the systematic analysis of pharmaceutical patents. Hierarchical concepts from the knowledge domains of chemical compounds, diseases and proteins were used to annotate full-text US patent applications that deal with pharmacological activities of chemical compounds and were filed in the years 2001-2010. Compounds claimed in these applications have been classified into their respective compound classes to review the distribution of scaffold types or general compound classes, such as natural products, in a time-dependent manner. Similarly, the target proteins and claimed utility of the compounds have been classified and the most relevant ones were extracted. The method presented allows the discovery of the main areas of innovation as well as emerging fields of patenting activity, providing a broad statistical basis for competitor analysis and decision-making efforts.

  18. Clinical phenotype-based gene prioritization: an initial study using semantic similarity and the human phenotype ontology.

    Science.gov (United States)

    Masino, Aaron J; Dechene, Elizabeth T; Dulik, Matthew C; Wilkens, Alisha; Spinner, Nancy B; Krantz, Ian D; Pennington, Jeffrey W; Robinson, Peter N; White, Peter S

    2014-07-21

    Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging variants that must be individually assessed to establish clear association with patient phenotype. To assist interpretation, we implemented an algorithm that ranks a given set of genes relative to patient phenotype. The algorithm orders genes by the semantic similarity computed between phenotypic descriptors associated with each gene and those describing the patient. Phenotypic descriptor terms are taken from the Human Phenotype Ontology (HPO) and semantic similarity is derived from each term's information content. Model validation was performed via simulation and with clinical data. We simulated 33 Mendelian diseases with 100 patients per disease. We modeled clinical conditions by adding noise and imprecision, i.e. phenotypic terms unrelated to the disease and terms less specific than the actual disease terms. We ranked the causative gene against all 2488 HPO annotated genes. The median causative gene rank was 1 for the optimal and noise cases, 12 for the imprecision case, and 60 for the imprecision with noise case. Additionally, we examined a clinical cohort of subjects with hearing impairment. The disease gene median rank was 22. However, when also considering the patient's exome data and filtering non-exomic and common variants, the median rank improved to 3. Semantic similarity can rank a causative gene highly within a gene list relative to patient phenotype characteristics, provided that imprecision is mitigated. The clinical case results suggest that phenotype rank combined with variant analysis provides significant improvement over the individual approaches. We expect that this combined prioritization approach may increase accuracy and decrease effort for
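
    A toy sketch of this kind of phenotype-driven ranking: genes are scored by the symmetric best-match average of Resnik similarity between the patient's terms and each gene's annotated terms. The mini-ontology, term frequencies and gene annotations are hypothetical, not HPO data, and the scoring variant is an assumption rather than the paper's exact algorithm.

```python
# Toy phenotype-driven gene ranking: symmetric best-match-average Resnik
# similarity between patient terms and gene-annotated terms. The mini-ontology
# (child -> parent edges), term frequencies and gene annotations are made up.
import math
import networkx as nx

hpo = nx.DiGraph([("t3", "t1"), ("t4", "t1"), ("t5", "t2"), ("t1", "root"), ("t2", "root")])
freq = {"root": 1.0, "t1": 0.5, "t2": 0.5, "t3": 0.2, "t4": 0.3, "t5": 0.1}

def ic(t):                                   # information content from term frequency
    return -math.log(freq[t])

def resnik(a, b):                            # IC of most informative common ancestor
    common = (nx.descendants(hpo, a) | {a}) & (nx.descendants(hpo, b) | {b})
    return max(ic(t) for t in common)

def gene_score(patient_terms, gene_terms):
    fwd = sum(max(resnik(p, g) for g in gene_terms) for p in patient_terms) / len(patient_terms)
    rev = sum(max(resnik(g, p) for p in patient_terms) for g in gene_terms) / len(gene_terms)
    return 0.5 * (fwd + rev)

genes = {"GENE_A": ["t3", "t4"], "GENE_B": ["t5"]}
patient = ["t3", "t5"]
print(sorted(genes, key=lambda g: gene_score(patient, genes[g]), reverse=True))
```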

  19. A framework for intelligent reliability centered maintenance analysis

    International Nuclear Information System (INIS)

    Cheng Zhonghua; Jia Xisheng; Gao Ping; Wu Su; Wang Jianzhao

    2008-01-01

    To improve the efficiency of reliability-centered maintenance (RCM) analysis, case-based reasoning (CBR), a kind of artificial intelligence (AI) technology, was successfully introduced into the RCM analysis process, and a framework for intelligent RCM analysis (IRCMA) was studied. The idea behind IRCMA is that the historical records of RCM analysis on similar items can be referenced and used for the current RCM analysis of a new item. Because many common or similar items may exist in the analyzed equipment, the repeated tasks of RCM analysis can be considerably simplified or avoided by revising similar cases when conducting RCM analysis. Based on previous theoretical studies, an intelligent RCM analysis system (IRCMAS) prototype was developed. This research focuses on the definition, basic principles and framework of IRCMA, and discusses critical techniques in IRCMA. Finally, the IRCMAS prototype is presented based on a case study.

  20. [Similarity system theory to evaluate similarity of chromatographic fingerprints of traditional Chinese medicine].

    Science.gov (United States)

    Liu, Yongsuo; Meng, Qinghua; Jiang, Shumin; Hu, Yuzhu

    2005-03-01

    The similarity evaluation of fingerprints is one of the most important problems in the quality control of traditional Chinese medicine (TCM). Similarity measures used to evaluate the similarity of the common peaks in the chromatogram of TCM are discussed. Comparative studies were carried out among the correlation coefficient, the cosine of the angle and an improved extent similarity method using simulated data and experimental data. The correlation coefficient and the cosine of the angle are not sensitive to the differences of the data sets, and they remain insensitive even after normalization. Based on similarity system theory, an improved extent similarity method was proposed. The improved extent similarity is more sensitive to the differences of the data sets than the correlation coefficient and the cosine of the angle, and, unlike log-transformation, the character of the data sets need not be changed. The improved extent similarity can be used to evaluate the similarity of the chromatographic fingerprints of TCM.
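
    A quick numerical illustration of the point made above: the correlation coefficient and the cosine of the angle do not react to a proportional change in common-peak areas, whereas a simple relative-difference score does. The third measure is only a stand-in for the paper's improved extent similarity, whose exact formula is not given in this record.

```python
# Correlation and cosine similarity ignore a proportional intensity change in
# the common peaks, while a simple relative-difference score (a stand-in for
# the improved extent similarity) does not.
import numpy as np

ref = np.array([100.0, 80.0, 60.0, 40.0, 20.0])    # peak areas, reference fingerprint
test = 0.5 * ref                                     # same pattern, half the intensity

cosine = ref @ test / (np.linalg.norm(ref) * np.linalg.norm(test))
corr = np.corrcoef(ref, test)[0, 1]
extent_like = 1 - np.mean(np.abs(ref - test) / (0.5 * (ref + test)))

print(cosine, corr, extent_like)   # cosine and corr stay at 1.0; the last one drops
```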

  1. CONTEXT BASED FOOD IMAGE ANALYSIS

    OpenAIRE

    He, Ye; Xu, Chang; Khanna, Nitin; Boushey, Carol J.; Delp, Edward J.

    2013-01-01

    We are developing a dietary assessment system that records daily food intake through the use of food images. Recognizing food in an image is difficult due to large visual variance with respect to eating or preparation conditions. This task becomes even more challenging when different foods have similar visual appearance. In this paper we propose to incorporate two types of contextual dietary information, food co-occurrence patterns and personalized learning models, in food image analysis to r...

  2. A comparative analysis of Painleve, Lax pair, and similarity transformation methods in obtaining the integrability conditions of nonlinear Schroedinger equations

    International Nuclear Information System (INIS)

    Al Khawaja, U.

    2010-01-01

    We derive the integrability conditions of nonautonomous nonlinear Schroedinger equations using the Lax pair and similarity transformation methods. We present a comparative analysis of these integrability conditions with those of the Painleve method. We show that while the Painleve integrability conditions restrict the dispersion, nonlinearity, and dissipation/gain coefficients to be space independent and the external potential to be only a quadratic function of position, the Lax pair and the similarity transformation methods allow for space-dependent coefficients and an external potential that is not restricted to the quadratic form. The integrability conditions of the Painleve method are retrieved as a special case of our general integrability conditions. We also derive the integrability conditions of nonautonomous nonlinear Schroedinger equations for two and three spatial dimensions.

  3. Using ontology-based semantic similarity to facilitate the article screening process for systematic reviews.

    Science.gov (United States)

    Ji, Xiaonan; Ritter, Alan; Yen, Po-Yin

    2017-05-01

    Systematic Reviews (SRs) are utilized to summarize evidence from high quality studies and are considered the preferred source of evidence-based practice (EBP). However, conducting SRs can be time and labor intensive due to the high cost of article screening. In previous studies, we demonstrated utilizing established (lexical) article relationships to facilitate the identification of relevant articles in an efficient and effective manner. Here we propose to enhance article relationships with background semantic knowledge derived from Unified Medical Language System (UMLS) concepts and ontologies. We developed a pipelined semantic concepts representation process to represent articles from an SR into an optimized and enriched semantic space of UMLS concepts. Throughout the process, we leveraged concepts and concept relations encoded in biomedical ontologies (SNOMED-CT and MeSH) within the UMLS framework to prompt concept features of each article. Article relationships (similarities) were established and represented as a semantic article network, which was readily applied to assist with the article screening process. We incorporated the concept of active learning to simulate an interactive article recommendation process, and evaluated the performance on 15 completed SRs. We used work saved over sampling at 95% recall (WSS95) as the performance measure. We compared the WSS95 performance of our ontology-based semantic approach to existing lexical feature approaches and corpus-based semantic approaches, and found that we had better WSS95 in most SRs. We also had the highest average WSS95 of 43.81% and the highest total WSS95 of 657.18%. We demonstrated using ontology-based semantics to facilitate the identification of relevant articles for SRs. Effective concepts and concept relations derived from UMLS ontologies can be utilized to establish article semantic relationships. Our approach provided a promising performance and can easily apply to any SR topics in the
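
    For reference, the sketch below computes the WSS@95% recall metric used above: the fraction of articles a reviewer is saved from screening, relative to random reading, when screening stops once 95% of the relevant articles have been found. The ranked labels are hypothetical.

```python
# Work saved over sampling at 95% recall: labels are ordered by a hypothetical
# recommendation ranking, 1 = relevant article, 0 = irrelevant.
import numpy as np

ranked_labels = np.array([1, 1, 1, 0, 1, 0, 0, 0, 1, 0,
                          0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

def wss_at_recall(labels, recall=0.95):
    needed = int(np.ceil(recall * labels.sum()))
    n_read = int(np.argmax(np.cumsum(labels) >= needed)) + 1   # articles screened
    return (len(labels) - n_read) / len(labels) - (1 - recall)

print(wss_at_recall(ranked_labels))   # 0.5 -> half the screening effort saved
```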

  4. The study on the cephalometric similarity between parents and offspring

    Energy Technology Data Exchange (ETDEWEB)

    Kang, Woo Ghon; Ahn, Hyung Kyu [Department of Radiology, College of Dentistry, Seoul National University, Seoul (Korea, Republic of)

    1975-11-15

    The study was performed to investigate the cephalometric similarity between parents and offspring in Korean families by lateral cephalometric analysis. The lateral cephalograms were obtained from 8 families comprising 16 parents, 5 sons and 7 daughters. In order to investigate the similarity, 12 measuring points were set up, and 22 linear measurements of depth and height and 5 angular measurements were made. The author drew up profilograms to compare parents with offspring in each family group. The results obtained were as follows: 1. There was no common similarity in any specific region between parents and offspring in each family group. 2. There was partial similarity between a single parent and offspring. 3. The partial similarity between a single parent and offspring was noted in the upper face in general.

  5. The study on the cephalometric similarity between parents and offspring

    International Nuclear Information System (INIS)

    Kang, Woo Ghon; Ahn, Hyung Kyu

    1975-01-01

    The study was performed to investigate the cephalometric similarity between parents and offspring in Korean families by lateral cephalometric analysis. The lateral cephalograms were obtained from 8 families comprising 16 parents, 5 sons and 7 daughters. In order to investigate the similarity, 12 measuring points were set up, and 22 linear measurements of depth and height and 5 angular measurements were made. The author drew up profilograms to compare parents with offspring in each family group. The results obtained were as follows: 1. There was no common similarity in any specific region between parents and offspring in each family group. 2. There was partial similarity between a single parent and offspring. 3. The partial similarity between a single parent and offspring was noted in the upper face in general.

  6. Pairing symmetries of several iron-based superconductor families and some similarities with cuprates and heavy-fermions

    Directory of Open Access Journals (Sweden)

    Das Tanmoy

    2012-03-01

    Full Text Available We show that, by using the unit-cell transformation between 1 Fe per unit cell and 2 Fe per unit cell, one can qualitatively understand the pairing symmetry of several families of iron-based superconductors. In iron-pnictides and iron-chalcogenides, the nodeless s±-pairing and the resulting magnetic resonance mode transform nicely between the two unit cells, while retaining all physical properties unchanged. However, when the electron-pocket disappears from the Fermi surface with complete doping in KFe2As2, we find that the unit-cell invariance requirement prohibits the occurrence of s±-pairing symmetry (caused by inter-hole-pocket nesting). However, the intra-pocket nesting is compatible here, which leads to a nodal d-wave pairing. The corresponding Fermi surface topology and the pairing symmetry are similar to Ce-based heavy-fermion superconductors. Furthermore, when the Fermi surface hosts only electron-pockets in KyFe2-xSe2, the inter-electron-pocket nesting induces a nodeless and isotropic d-wave pairing. This situation is analogous to the electron-doped cuprates, where the strong antiferromagnetic order creates a similar disconnected electron-pocket Fermi surface, and hence nodeless d-wave pairing appears. The unit-cell transformation in KyFe2-xSe2 shows that the d-wave pairing breaks the translational symmetry of the 2 Fe unit cell, and thus cannot be realized unless a vacancy ordering forms to compensate for it. These results are consistent with the coexistence picture of a competing order and nodeless d-wave superconductivity in both cuprates and KyFe1.6Se2.

  7. In silico pattern-based analysis of the human cytomegalovirus genome.

    Science.gov (United States)

    Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas

    2003-04-01

    More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/).

  8. Distributional Similarity for Chinese: Exploiting Characters and Radicals

    Directory of Open Access Journals (Sweden)

    Peng Jin

    2012-01-01

    Full Text Available Distributional Similarity has attracted considerable attention in the field of natural language processing as an automatic means of countering the ubiquitous problem of sparse data. As a logographic language, Chinese words consist of characters and each of them is composed of one or more radicals. The meanings of characters are usually highly related to the words which contain them. Likewise, radicals often make a predictable contribution to the meaning of a character: characters that have the same components tend to have similar or related meanings. In this paper, we utilize these properties of the Chinese language to improve Chinese word similarity computation. Given a content word, we first extract similar words based on a large corpus and a similarity score for ranking. This rank is then adjusted according to the characters and components shared between the similar word and the target word. Experiments on two gold standard datasets show that the adjusted rank is superior and closer to human judgments than the original rank. In addition to quantitative evaluation, we examine the reasons behind errors drawing on linguistic phenomena for our explanations.

  9. Frame-based safety analysis approach for decision-based errors

    International Nuclear Information System (INIS)

    Fan, Chin-Feng; Yihb, Swu

    1997-01-01

    A frame-based approach is proposed to analyze decision-based errors made by automatic controllers or human operators due to erroneous reference frames. An integrated framework, the Two Frame Model (TFM), is first proposed to model the dynamic interaction between the physical process and the decision-making process. Two important issues, consistency and competing processes, are raised. Consistency between the physical and logic frames makes a TFM-based system work properly. Loss of consistency refers to the failure mode in which the logic frame does not accurately reflect the state of the controlled processes. Once such a failure occurs, hazards may arise. Among potential hazards, the competing effect between the controller and the controlled process is the most severe one, which may jeopardize a defense-in-depth design. When the logic and physical frames are inconsistent, conventional safety analysis techniques are inadequate. We propose Frame-based Fault Tree Analysis (FFTA) and Frame-based Event Tree Analysis (FETA) under TFM to deduce the context for decision errors and to separately generate the evolution of the logical frame as opposed to that of the physical frame. This multi-dimensional analysis approach, different from the conventional correctness-centred approach, provides a panoramic view in scenario generation. Case studies using the proposed techniques are also given to demonstrate their usage and feasibility.

  10. Similarity-based Fisherfaces

    DEFF Research Database (Denmark)

    Delgado-Gomez, David; Fagertun, Jens; Ersbøll, Bjarne Kjær

    2009-01-01

    databases (XM2VTS, AR and Equinox) show consistently good results. The proposed algorithm achieves Equal Error Rate (EER) and Half-Total Error Rate (HTER) values in the ranges of 0.41-1.67% and 0.1-1.95%, respectively. Our approach yields results comparable to the top two winners in recent contests reported...

  11. Probing multi-scale self-similarity of tissue structures using light scattering spectroscopy: prospects in pre-cancer detection

    Science.gov (United States)

    Chatterjee, Subhasri; Das, Nandan K.; Kumar, Satish; Mohapatra, Sonali; Pradhan, Asima; Panigrahi, Prasanta K.; Ghosh, Nirmalya

    2013-02-01

    Multi-resolution analysis of the spatial refractive index inhomogeneities in the connective tissue regions of the human cervix reveals a clear signature of multifractality. We have thus developed an inverse analysis strategy for the extraction and quantification of the multifractality of spatial refractive index fluctuations from the recorded light scattering signal. The method is based on Fourier domain pre-processing of light scattering data using the Born approximation, and its subsequent analysis through the Multifractal Detrended Fluctuation Analysis model. The method has been validated on several mono- and multi-fractal scattering objects whose self-similar properties are user-controlled and known a priori. Following successful validation, this approach has initially been explored for differentiating between different grades of precancerous human cervical tissues.
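
    The full multifractal detrended fluctuation analysis generalizes detrended fluctuation analysis over a range of moments q; the sketch below shows only the ordinary (q = 2) DFA building block on a synthetic fluctuation series, not the authors' inverse analysis.

```python
# Ordinary (q = 2) detrended fluctuation analysis on a synthetic series; MFDFA
# repeats this fluctuation computation over a range of moments q.
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=4096)
profile = np.cumsum(signal - signal.mean())

def dfa_fluctuation(profile, s):
    n_seg = len(profile) // s
    f2 = []
    for k in range(n_seg):
        seg = profile[k * s:(k + 1) * s]
        t = np.arange(s)
        trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear detrending
        f2.append(np.mean((seg - trend) ** 2))
    return np.sqrt(np.mean(f2))

scales = np.array([16, 32, 64, 128, 256, 512])
F = np.array([dfa_fluctuation(profile, s) for s in scales])
alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]    # scaling exponent
print(alpha)   # close to 0.5 for uncorrelated noise
```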

  12. Contrasting HIV phylogenetic relationships and V3 loop protein similarities

    Energy Technology Data Exchange (ETDEWEB)

    Korber, B. (Los Alamos National Lab., NM (United States) Santa Fe Inst., NM (United States)); Myers, G. (Los Alamos National Lab., NM (United States))

    1992-01-01

    At least five distinct sequence subtypes of HIV-1 can be identified from the major centers of the AIDS pandemic. While it is too early to tell whether these subtypes are serologically or phenotypically similar or distinct in terms of properties such as pathogenicity and transmissibility, we can begin to investigate their potential for phenotypic divergence at the protein sequence level. Phylogenetic analysis of HIV DNA sequences is being widely used to examine lineages of different viral strains as they evolve and spread throughout the globe. We have identified five distinct HIV-1 subtypes (designated A-E), or clades, based on phylogenetic clustering patterns generated from genetic information from both the gag and envelope (env) genes from a spectrum of international isolates. Our initial observations concerning both HIV-1 and HIV-2 sequences indicate that conserved patterns in protein chemistry may indeed exist across distant lineages. Such patterns in V3 loop amino acid chemistry may be indicative of stable lineages or convergence within this highly variable, though functionally and immunologically critical, region. We think that there may be parallels between the apparently stable HIV-2 V3 lineage and the previously mentioned HIV-1 V3 loops, which are very similar at the protein level despite being distant by cladistic analysis and which do not possess the distinctive positively charged residues. Highly conserved V3 loop protein sequences are also encountered in SIVAGMs and CIVs (chimpanzee viral strains), which do not appear to be pathogenic in their wild-caught natural hosts.

  13. Contrasting HIV phylogenetic relationships and V3 loop protein similarities

    Energy Technology Data Exchange (ETDEWEB)

    Korber, B. [Los Alamos National Lab., NM (United States); Santa Fe Inst., NM (United States)]; Myers, G. [Los Alamos National Lab., NM (United States)]

    1992-12-31

    At least five distinct sequence subtypes of HIV-1 can be identified from the major centers of the AIDS pandemic. While it is too early to tell whether these subtypes are serologically or phenotypically similar or distinct in terms of properties such as pathogenicity and transmissibility, we can begin to investigate their potential for phenotypic divergence at the protein sequence level. Phylogenetic analysis of HIV DNA sequences is being widely used to examine lineages of different viral strains as they evolve and spread throughout the globe. We have identified five distinct HIV-1 subtypes (designated A-E), or clades, based on phylogenetic clustering patterns generated from genetic information from both the gag and envelope (env) genes from a spectrum of international isolates. Our initial observations concerning both HIV-1 and HIV-2 sequences indicate that conserved patterns in protein chemistry may indeed exist across distant lineages. Such patterns in V3 loop amino acid chemistry may be indicative of stable lineages or convergence within this highly variable, though functionally and immunologically critical, region. We think that there may be parallels between the apparently stable HIV-2 V3 lineage and the previously mentioned HIV-1 V3 loops, which are very similar at the protein level despite being distant by cladistic analysis and which do not possess the distinctive positively charged residues. Highly conserved V3 loop protein sequences are also encountered in SIVAGMs and CIVs (chimpanzee viral strains), which do not appear to be pathogenic in their wild-caught natural hosts.

  14. Word Similarity from Dictionaries: Inferring Fuzzy Measures from Fuzzy Graphs

    Directory of Open Access Journals (Sweden)

    Vicenc Torra

    2008-01-01

    Full Text Available The computation of similarities between words is a basic element of information retrieval systems when retrieval is not solely based on word matching. In this work we consider a measure between words based on dictionaries. This is achieved by assuming that a dictionary is formalized as a fuzzy graph. We show that the approach permits the computation of measures not only for pairs of words but also for sets of them.

  15. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Full Text Available Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for the construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for the construction of phylogenetic trees has not been established. We have developed a method for the construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.
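
    A minimal sketch of the tree-building step such a method relies on: an average whole-genome similarity matrix (hypothetical values) is converted to distances and clustered with UPGMA. The actual study uses 115 genomes and its own similarity definition; only four labelled genomes are shown here.

```python
# Turning an average whole-genome similarity matrix (hypothetical values) into
# a UPGMA tree; leaves A-D stand in for genomes.
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, dendrogram

names = ["A", "B", "C", "D"]
sim = np.array([[1.00, 0.80, 0.40, 0.35],
                [0.80, 1.00, 0.45, 0.30],
                [0.40, 0.45, 1.00, 0.70],
                [0.35, 0.30, 0.70, 1.00]])

dist = 1.0 - sim                      # convert similarity to distance
np.fill_diagonal(dist, 0.0)
tree = linkage(squareform(dist), method="average")           # UPGMA
print(dendrogram(tree, labels=names, no_plot=True)["ivl"])   # leaf order of the tree
```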

  16. IMMAN: free software for information theory-based chemometric analysis.

    Science.gov (United States)

    Urias, Ricardo W Pino; Barigye, Stephen J; Marrero-Ponce, Yovani; García-Jacas, César R; Valdes-Martiní, José R; Perez-Gimenez, Facundo

    2015-05-01

    The features and theoretical background of a new, free computational program for chemometric analysis named IMMAN (an acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a remarkably user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches in each case. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. On the other hand, a generalization scheme for the previously defined differential Shannon's entropy is discussed, as well as the introduction of the Jeffreys information measure for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty, are incorporated into the IMMAN software ( http://mobiosd-hub.com/imman-soft/ ), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing value processing, dataset partitioning, and browsing. Moreover, single-parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, as well as comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA

  17. Determining the semantic similarities among Gene Ontology terms.

    Science.gov (United States)

    Taha, Kamal

    2013-05-01

    We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency, and we present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways, retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.

  18. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo

    2018-04-04

    The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight in the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
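
    A much simplified sketch of a threshold join in this spirit: the similarity measure and the bounding-box upper bound below are illustrative stand-ins for the paper's spatiotemporal similarity and pruning bounds, and the per-trajectory searches, being independent, could be distributed over parallel workers as described above.

```python
# Simplified threshold trajectory join: a mean closest-point similarity and a
# bounding-box upper bound stand in for the paper's measures; every pair whose
# upper bound is below theta is pruned without the full computation.
import numpy as np

def similarity(t1, t2):
    d12 = np.mean([np.min(np.linalg.norm(t2 - p, axis=1)) for p in t1])
    d21 = np.mean([np.min(np.linalg.norm(t1 - p, axis=1)) for p in t2])
    return np.exp(-0.5 * (d12 + d21))          # maps distance into (0, 1]

def upper_bound(t1, t2):
    lo = np.maximum(t1.min(axis=0), t2.min(axis=0))
    hi = np.minimum(t1.max(axis=0), t2.max(axis=0))
    gap = np.linalg.norm(np.maximum(lo - hi, 0.0))   # distance between bounding boxes
    return np.exp(-gap)                              # best similarity still possible

def tb_ts_join(set_p, set_q, theta):
    result = []
    for i, tp in enumerate(set_p):               # each outer search is independent,
        for j, tq in enumerate(set_q):           # so it could run on its own worker
            if upper_bound(tp, tq) < theta:
                continue
            s = similarity(tp, tq)
            if s >= theta:
                result.append((i, j, s))
    return result

rng = np.random.default_rng(2)
P = [rng.random((10, 2)) + off for off in (0.0, 5.0)]
Q = [rng.random((10, 2)) + off for off in (0.1, 9.0)]
print(tb_ts_join(P, Q, theta=0.5))
```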

  19. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo; Chen, Lisi; Wei, Zhewei; Jensen, Christian S.; Zheng, Kai; Kalnis, Panos

    2018-01-01

    The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight in the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.

  20. Algorithmic prediction of inter-song similarity in Western popular music

    NARCIS (Netherlands)

    Novello, A.; Par, van de S.L.J.D.E.; McKinney, M.F.; Kohlrausch, A.G.

    2013-01-01

    We investigate a method for automatic extraction of inter-song similarity for songs selected from several genres of Western popular music. The specific purpose of this approach is to evaluate the predictive power of different feature extraction sets based on human perception of music similarity and

  1. An electrophysiological signature of summed similarity in visual working memory

    NARCIS (Netherlands)

    Van Vugt, Marieke K.; Sekuler, Robert; Wilson, Hugh R.; Kahana, Michael J.

    Summed-similarity models of short-term item recognition posit that participants base their judgments of an item's prior occurrence on that item's summed similarity to the ensemble of items on the remembered list. We examined the neural predictions of these models in 3 short-term recognition memory

  2. A Signal Processing Method to Explore Similarity in Protein Flexibility

    Directory of Open Access Journals (Sweden)

    Simina Vasilache

    2010-01-01

    Understanding mechanisms of protein flexibility is of great importance to structural biology. The ability to detect similarities between proteins and their patterns is vital in discovering new information about unknown protein functions. A Distance Constraint Model (DCM) provides a means to generate a variety of flexibility measures based on a given protein structure. Although information about mechanical properties of flexibility is critical for understanding protein function for a given protein, the question of whether certain characteristics are shared across homologous proteins is difficult to assess. For a proper assessment, a quantified measure of similarity is necessary. This paper begins to explore image processing techniques to quantify similarities in signals and images that characterize protein flexibility. The dataset considered here consists of three different families of proteins, with three proteins in each family. The similarities and differences found within flexibility measures across homologous proteins do not align with sequence-based evolutionary methods.
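
    The paper's image-processing measures are not spelled out in the abstract, so the sketch below is only a minimal stand-in: it scores the similarity of two per-residue flexibility profiles with a normalized (Pearson-style) correlation. The function name flexibility_similarity and the synthetic signals are assumptions for illustration.

        import numpy as np

        def flexibility_similarity(f1, f2):
            # Normalized correlation between two equal-length per-residue
            # flexibility profiles; a simple stand-in for the paper's measures.
            a = (f1 - f1.mean()) / (f1.std() + 1e-12)
            b = (f2 - f2.mean()) / (f2.std() + 1e-12)
            return float(np.mean(a * b))

        # Two hypothetical flexibility signals for homologous proteins.
        rng = np.random.default_rng(0)
        base = np.sin(np.linspace(0, 6 * np.pi, 120))
        protein_a = base + 0.1 * rng.normal(size=120)
        protein_b = base + 0.3 * rng.normal(size=120)

        print(round(flexibility_similarity(protein_a, protein_b), 3))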

  3. Similar Efficacy with Omalizumab in Chronic Idiopathic/Spontaneous Urticaria Despite Different Background Therapy.

    Science.gov (United States)

    Casale, Thomas B; Bernstein, Jonathan A; Maurer, Marcus; Saini, Sarbjit S; Trzaskoma, Benjamin; Chen, Hubert; Grattan, Clive E; Gimenéz-Arnau, Ana; Kaplan, Allen P; Rosén, Karin

    2015-01-01

    Data from the 3 omalizumab pivotal trials in patients with chronic idiopathic urticaria/chronic spontaneous urticaria (CIU/CSU) represent the largest database of patients reported to date with refractory disease (omalizumab, n = 733; placebo, n = 242). The objective of this study was to compare results from ASTERIA I and II, which included only approved doses of H1-antihistamine as background therapy based on regulatory authority requirements, to those from GLACIAL, which permitted higher doses of H1-antihistamines as well as other types of background therapy, in a post hoc analysis. Efficacy data from the placebo, omalizumab 150-mg, and omalizumab 300-mg treatment arms of ASTERIA I and II were pooled and analyzed (n = 162 and n = 160, respectively). The 300-mg treatment arm analyses were compared with the analysis of data from GLACIAL (n = 252) using analysis of covariance models. The key efficacy endpoint was change from baseline to week 12 in mean weekly itch severity score (ISS); other endpoints were also evaluated. Safety data were pooled from all 3 studies. Mean ISS was significantly reduced from baseline at week 12 in the pooled ASTERIA I and II omalizumab 150- and 300-mg treatment arms and in the GLACIAL omalizumab 300-mg arm. The weekly ISS reduction magnitude at week 12 was similar between the omalizumab 300-mg groups in the ASTERIA I and II pooled and GLACIAL studies. Similar treatment effect sizes were observed across multiple endpoints. Omalizumab was well tolerated and the adverse-event profile was similar regardless of background therapy for CIU/CSU. The overall safety profile was generally consistent with omalizumab therapy in allergic asthma. Omalizumab 300 mg was safe and effective in reducing CIU/CSU symptoms regardless of background therapy. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Self-similar pattern formation and continuous mechanics of self-similar systems

    Directory of Open Access Journals (Sweden)

    A. V. Dyskin

    2007-01-01

    In many cases, the critical state of systems that reached the threshold is characterised by self-similar pattern formation. We produce an example of pattern formation of this kind – formation of self-similar distribution of interacting fractures. Their formation starts with the crack growth due to the action of stress fluctuations. It is shown that even when the fluctuations have zero average the cracks generated by them could grow far beyond the scale of stress fluctuations. Further development of the fracture system is controlled by crack interaction leading to the emergence of self-similar crack distributions. As a result, the medium with fractures becomes discontinuous at any scale. We develop a continuum fractal mechanics to model its physical behaviour. We introduce a continuous sequence of continua of increasing scales covering this range of scales. The continuum of each scale is specified by the representative averaging volume elements of the corresponding size. These elements determine the resolution of the continuum. Each continuum hides the cracks of scales smaller than the volume element size while larger fractures are modelled explicitly. Using the developed formalism we investigate the stability of self-similar crack distributions with respect to crack growth and show that while the self-similar distribution of isotropically oriented cracks is stable, the distribution of parallel cracks is not. For the isotropically oriented cracks, scaling of permeability is determined. For permeable materials (rocks) with self-similar crack distributions, permeability scales as the cube of the crack radius. This property could be used for detecting this specific mechanism of formation of self-similar crack distributions.

  5. Support vector machine learning-based fMRI data group analysis.

    Science.gov (United States)

    Wang, Ze; Childress, Anna R; Wang, Jiongjiong; Detre, John A

    2007-07-15

    To explore the multivariate nature of fMRI data and to consider the inter-subject brain response discrepancies, a multivariate and brain response model-free method is fundamentally required. Two such methods are presented in this paper by integrating a machine learning algorithm, the support vector machine (SVM), and the random effect model. Without any brain response modeling, SVM was used to extract a whole brain spatial discriminance map (SDM), representing the brain response difference between the contrasted experimental conditions. Population inference was then obtained through the random effect analysis (RFX) or permutation testing (PMU) on the individual subjects' SDMs. Applied to arterial spin labeling (ASL) perfusion fMRI data, SDM RFX yielded lower false-positive rates in the null hypothesis test and higher detection sensitivity for synthetic activations with varying cluster size and activation strengths, compared to the univariate general linear model (GLM)-based RFX. For a sensory-motor ASL fMRI study, both SDM RFX and SDM PMU yielded similar activation patterns to GLM RFX and GLM PMU, respectively, but with higher t values and cluster extensions at the same significance level. Capitalizing on the absence of temporal noise correlation in ASL data, this study also incorporated PMU in the individual-level GLM and SVM analyses accompanied by group-level analysis through RFX or group-level PMU. Providing inferences on the probability of being activated or deactivated at each voxel, these individual-level PMU-based group analysis methods can be used to threshold the analysis results of GLM RFX, SDM RFX or SDM PMU.
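
    As a rough illustration of the two-step idea described above, the sketch below trains a linear SVM per subject on synthetic data, treats each subject's normalized weight vector as a spatial discriminance map (SDM), and then runs a voxelwise one-sample t-test across subjects, in the spirit of a random-effects analysis. The data, dimensions, and thresholds are all assumptions; this is not the authors' pipeline.

        import numpy as np
        from scipy import stats
        from sklearn.svm import LinearSVC

        rng = np.random.default_rng(1)
        n_subjects, n_trials, n_voxels = 8, 40, 500
        signal = np.zeros(n_voxels)
        signal[:20] = 1.0                       # voxels that truly discriminate the two conditions

        sdms = []
        for _ in range(n_subjects):
            y = np.repeat([0, 1], n_trials // 2)
            X = rng.normal(size=(n_trials, n_voxels)) + np.outer(y, signal)
            clf = LinearSVC(C=1.0, max_iter=5000).fit(X, y)
            w = clf.coef_.ravel()
            sdms.append(w / np.linalg.norm(w))  # subject-level spatial discriminance map

        sdms = np.vstack(sdms)
        t, p = stats.ttest_1samp(sdms, popmean=0.0, axis=0)  # voxelwise group-level test
        print("smallest p among signal voxels:", p[:20].min())
        print("smallest p among noise voxels :", p[20:].min())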

  6. Team-Based Care: A Concept Analysis.

    Science.gov (United States)

    Baik, Dawon

    2017-10-01

    The purpose of this concept analysis is to clarify and analyze the concept of team-based care in clinical practice. Team-based care has garnered attention as a way to enhance healthcare delivery and patient care related to quality and safety. However, there is no consensus on the concept of team-based care; as a result, the lack of common definition impedes further studies on team-based care. This analysis was conducted using Walker and Avant's strategy. Literature searches were conducted using PubMed, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and PsycINFO, with a timeline from January 1985 to December 2015. The analysis demonstrates that the concept of team-based care has three core attributes: (a) interprofessional collaboration, (b) patient-centered approach, and (c) integrated care process. This is accomplished through understanding other team members' roles and responsibilities, a climate of mutual respect, and organizational support. Consequences of team-based care are identified with three aspects: (a) patient, (b) healthcare professional, and (c) healthcare organization. This concept analysis helps better understand the characteristics of team-based care in the clinical practice as well as promote the development of a theoretical definition of team-based care. © 2016 Wiley Periodicals, Inc.

  7. Prioritization of candidate disease genes by combining topological similarity and semantic similarity.

    Science.gov (United States)

    Liu, Bin; Jin, Min; Zeng, Pan

    2015-10-01

    The identification of gene-phenotype relationships is very important for the treatment of human diseases. Studies have shown that genes causing the same or similar phenotypes tend to interact with each other in a protein-protein interaction (PPI) network. Thus, many identification methods based on the PPI network model have achieved good results. However, in the PPI network, some interactions between the proteins encoded by candidate gene and the proteins encoded by known disease genes are very weak. Therefore, some studies have combined the PPI network with other genomic information and reported good predictive performances. However, we believe that the results could be further improved. In this paper, we propose a new method that uses the semantic similarity between the candidate gene and known disease genes to set the initial probability vector of a random walk with a restart algorithm in a human PPI network. The effectiveness of our method was demonstrated by leave-one-out cross-validation, and the experimental results indicated that our method outperformed other methods. Additionally, our method can predict new causative genes of multifactor diseases, including Parkinson's disease, breast cancer and obesity. The top predictions were good and consistent with the findings in the literature, which further illustrates the effectiveness of our method. Copyright © 2015 Elsevier Inc. All rights reserved.
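
    A minimal sketch of the core mechanism, under stated assumptions: the initial probability vector of a random walk with restart is set from semantic similarities between candidate genes and known disease genes, and the walk is run on a toy PPI adjacency matrix. The restart probability, the normalization, and all numbers are illustrative rather than taken from the paper.

        import numpy as np

        def random_walk_with_restart(A, p0, restart=0.7, tol=1e-10):
            # Iterate p = (1 - r) * W p + r * p0 on the column-normalized adjacency
            # matrix W until convergence; p0 is the semantic-similarity seed vector.
            W = A / A.sum(axis=0, keepdims=True)
            p = p0.copy()
            while True:
                p_next = (1 - restart) * W @ p + restart * p0
                if np.abs(p_next - p).sum() < tol:
                    return p_next
                p = p_next

        # Toy symmetric PPI network and hypothetical semantic similarities between
        # each of the five genes and the set of known disease genes.
        A = np.array([[0, 1, 1, 0, 0],
                      [1, 0, 1, 1, 0],
                      [1, 1, 0, 0, 1],
                      [0, 1, 0, 0, 1],
                      [0, 0, 1, 1, 0]], dtype=float)
        semantic_sim = np.array([0.9, 0.1, 0.4, 0.05, 0.2])
        p0 = semantic_sim / semantic_sim.sum()    # initial probability vector

        scores = random_walk_with_restart(A, p0)
        print(np.argsort(-scores))                # genes ranked by steady-state score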

  8. Trajectory Based Traffic Analysis

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Andersen, Ove; Lewis-Kelham, Edwin

    2013-01-01

    We present the INTRA system for interactive path-based traffic analysis. The analyses are developed in collaboration with traffic researchers and provide novel insights into conditions such as congestion, travel-time, choice of route, and traffic-flow. INTRA supports interactive point-and-click analysis, due to a novel and efficient indexing structure. With the web-site daisy.aau.dk/its/spqdemo/ we will demonstrate several analyses, using a very large real-world data set consisting of 1.9 billion GPS records (1.5 million trajectories) recorded from more than 13000 vehicles, and touching most

  9. Hierarchical Matching of Traffic Information Services Using Semantic Similarity

    Directory of Open Access Journals (Sweden)

    Zongtao Duan

    2018-01-01

    Service matching aims to find the information similar to a given query, which has numerous applications in web search. Although existing methods yield promising results, they are not applicable for transportation. In this paper, we propose a multilevel matching method based on semantic technology, towards efficiently searching the traffic information requested. Our approach is divided into two stages: service clustering, which prunes candidate services that are not promising, and functional matching. The similarity at function level between services is computed by grouping the connections between the services into inheritance and noninheritance relationships. We also developed a three-layer framework with a semantic similarity measure that requires less time and space than existing methods, since the scale of candidate services is significantly smaller than the whole transportation network. The OWL_TC4 based service set was used to verify the proposed approach. The accuracy of offline service clustering reached 93.80%, and it reduced the response time to 651 ms when the total number of candidate services was 1000. Moreover, given the different thresholds for the semantic similarity measure, the proposed mixed matching model did better in terms of recall and precision (i.e., up to 72.7% and 80%, respectively, for more than 1000 services) than the compared models based on information theory and taxonomic distance. These experimental results confirmed the effectiveness and validity of service matching for responding quickly and accurately to user queries.
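
    The sketch below only mirrors the two-stage structure described above (prune by clustering, then match within the surviving candidates); the paper's function-level measure based on inheritance and noninheritance relationships is replaced with plain cosine similarity, and all feature vectors are synthetic.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics.pairwise import cosine_similarity

        rng = np.random.default_rng(2)
        services = rng.random((1000, 16))          # hypothetical service feature vectors
        query = rng.random((1, 16))

        # Stage 1: offline clustering; keep only services in the query's cluster.
        km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(services)
        candidates = np.where(km.labels_ == km.predict(query)[0])[0]

        # Stage 2: finer similarity computed only against the surviving candidates.
        sims = cosine_similarity(query, services[candidates]).ravel()
        best = candidates[np.argsort(-sims)[:5]]
        print("top matches:", best, "out of", len(candidates), "candidates")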

  10. Self-similar oscillations of a Z pinch

    International Nuclear Information System (INIS)

    Felber, F.S.

    1982-01-01

    A new analytic, self-similar solution of the equations of ideal magnetohydrodynamics describes cylindrically symmetric plasmas conducting constant current. The solution indicates that an adiabatic Z pinch oscillates radially with a period typically of the order of a few acoustic transit times. A stability analysis, which shows the growth rate of the sausage instability to be a saturating function of wavenumber, suggests that the oscillations are observable

  11. Testing surrogacy assumptions: can threatened and endangered plants be grouped by biological similarity and abundances?

    Directory of Open Access Journals (Sweden)

    Judy P Che-Castaldo

    There is renewed interest in implementing surrogate species approaches in conservation planning due to the large number of species in need of management but limited resources and data. One type of surrogate approach involves selection of one or a few species to represent a larger group of species requiring similar management actions, so that protection and persistence of the selected species would result in conservation of the group of species. However, among the criticisms of surrogate approaches is the need to test underlying assumptions, which remain rarely examined. In this study, we tested one of the fundamental assumptions underlying use of surrogate species in recovery planning: that there exist groups of threatened and endangered species that are sufficiently similar to warrant similar management or recovery criteria. Using a comprehensive database of all plant species listed under the U.S. Endangered Species Act and tree-based random forest analysis, we found no evidence of species groups based on a set of distributional and biological traits or by abundances and patterns of decline. Our results suggested that application of surrogate approaches for endangered species recovery would be unjustified. Thus, conservation planning focused on individual species and their patterns of decline will likely be required to recover listed species.
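
    The study used tree-based random forest analysis; the sketch below swaps in a simpler check of the same question (do the species fall into well-separated groups at all?) by clustering a hypothetical species-by-trait matrix and inspecting silhouette scores. Uniformly low scores across cluster counts would argue against natural groups, echoing the negative result reported above.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics import silhouette_score

        rng = np.random.default_rng(3)
        traits = rng.normal(size=(300, 8))   # hypothetical species-by-trait matrix, no built-in structure

        for k in (2, 3, 4, 5):
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(traits)
            print(k, round(silhouette_score(traits, labels), 3))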

  12. Algorithm of reducing the false positives in IDS based on correlation Analysis

    Science.gov (United States)

    Liu, Jianyi; Li, Sida; Zhang, Ru

    2018-03-01

    This paper proposes an algorithm for reducing false positives in IDS based on correlation analysis. First, the algorithm analyzes the distinguishing characteristics of false positives and real alarms and preliminarily screens out false positives; it then applies attribute-similarity clustering to the alarms to further reduce their number; finally, according to the characteristics of multi-step attacks, it associates alarms through their causal relationships. The paper also proposes a reverse-causation algorithm, built on previously proposed attack-association methods, that turns alarm information into a complete attack path. Experiments show that the algorithm reduces the number of alarms, improves the efficiency of alarm processing, and contributes to identifying attack purposes and improving alarm accuracy.
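
    A minimal sketch of the pipeline described above, with all alarm fields, the window size, and the causal order of attack steps invented for illustration: alarms sharing attributes within a time window are merged, and the surviving clusters are chained into an attack path by a simple causal ordering.

        from collections import defaultdict

        # Hypothetical alarm records: (timestamp, src_ip, dst_ip, signature).
        alarms = [
            (100, "10.0.0.5", "10.0.1.2", "port_scan"),
            (102, "10.0.0.5", "10.0.1.2", "port_scan"),
            (240, "10.0.0.5", "10.0.1.2", "brute_force"),
            (300, "10.0.0.5", "10.0.1.2", "priv_escalation"),
            (310, "10.0.2.9", "10.0.3.3", "port_scan"),
        ]

        def cluster_by_attributes(alarms, window=60):
            # Merge alarms sharing src, dst and signature within a time window,
            # a simple stand-in for attribute-similarity clustering.
            clusters = []
            for alarm in sorted(alarms):
                for c in clusters:
                    last = c[-1]
                    if alarm[1:] == last[1:] and alarm[0] - last[0] <= window:
                        c.append(alarm)
                        break
                else:
                    clusters.append([alarm])
            return clusters

        # Hypothetical causal order of attack steps used to chain clusters.
        causal_order = ["port_scan", "brute_force", "priv_escalation"]

        def attack_paths(clusters):
            # Chain cluster representatives that share (src, dst) along the causal order.
            reps = sorted(c[0] for c in clusters)
            paths = defaultdict(list)
            for t, src, dst, sig in reps:
                if sig in causal_order:
                    paths[(src, dst)].append((causal_order.index(sig), sig))
            return {k: [s for _, s in sorted(v)] for k, v in paths.items()}

        clusters = cluster_by_attributes(alarms)
        print(len(alarms), "alarms ->", len(clusters), "clusters")
        print(attack_paths(clusters))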

  13. Mechanics of ultra-stretchable self-similar serpentine interconnects

    International Nuclear Information System (INIS)

    Zhang, Yihui; Fu, Haoran; Su, Yewang; Xu, Sheng

    2013-01-01

    Graphical abstract: We developed analytical models of flexibility and elastic-stretchability for self-similar interconnect. The analytic solutions agree very well with the finite element analyses, both demonstrating that the elastic-stretchability more than doubles when the order of self-similar structure increases by one. Design optimization yields 90% and 50% elastic stretchability for systems with surface filling ratios of 50% and 70% of active devices, respectively. The analytic models are useful for the development of stretchable electronics that simultaneously demand large coverage of active devices, such as stretchable photovoltaics and electronic eye-ball cameras. -- Abstract: Electrical interconnects that adopt self-similar, serpentine layouts offer exceptional levels of stretchability in systems that consist of collections of small, non-stretchable active devices in the so-called island–bridge design. This paper develops analytical models of flexibility and elastic stretchability for such structures, and establishes recursive formulae at different orders of self-similarity. The analytic solutions agree well with finite element analysis, with both demonstrating that the elastic stretchability more than doubles when the order of the self-similar structure increases by one. Design optimization yields 90% and 50% elastic stretchability for systems with surface filling ratios of 50% and 70% of active devices, respectively

  14. Self-similarity of solitary waves on inertia-dominated falling liquid films.

    Science.gov (United States)

    Denner, Fabian; Pradas, Marc; Charogiannis, Alexandros; Markides, Christos N; van Wachem, Berend G M; Kalliadasis, Serafim

    2016-03-01

    We propose consistent scaling of solitary waves on inertia-dominated falling liquid films, which accurately accounts for the driving physical mechanisms and leads to a self-similar characterization of solitary waves. Direct numerical simulations of the entire two-phase system are conducted using a state-of-the-art finite volume framework for interfacial flows in an open domain that was previously validated against experimental film-flow data with excellent agreement. We present a detailed analysis of the wave shape and the dispersion of solitary waves on 34 different water films with Reynolds numbers Re=20-120 and surface tension coefficients σ=0.0512-0.072 N m⁻¹ on substrates with inclination angles β=19°-90°. Following a detailed analysis of these cases we formulate a consistent characterization of the shape and dispersion of solitary waves, based on a newly proposed scaling derived from the Nusselt flat film solution, that unveils a self-similarity as well as the driving mechanism of solitary waves on gravity-driven liquid films. Our results demonstrate that the shape of solitary waves, i.e., height and asymmetry of the wave, is predominantly influenced by the balance of inertia and surface tension. Furthermore, we find that the dispersion of solitary waves on the inertia-dominated falling liquid films considered in this study is governed by nonlinear effects and only driven by inertia, with surface tension and gravity having a negligible influence.
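
    For background only: the Nusselt flat film solution mentioned above is the classical laminar solution for a film of thickness h_N flowing down an incline of angle β, and any scaling derived from it starts from the relations below. The paper's own scaling is not reproduced here, and Re is taken as q/ν, which is one common convention.

        \[
          u(y) = \frac{g\sin\beta}{\nu}\left(h_N\,y - \frac{y^{2}}{2}\right),
          \qquad
          q = \int_0^{h_N} u\,\mathrm{d}y = \frac{g\sin\beta\,h_N^{3}}{3\nu},
        \]
        \[
          h_N = \left(\frac{3\,\nu\,q}{g\sin\beta}\right)^{1/3}
              = \left(\frac{3\,\nu^{2}\,\mathrm{Re}}{g\sin\beta}\right)^{1/3},
          \qquad
          \mathrm{Re} = \frac{q}{\nu}.
        \]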

  15. Column Selection for Biomedical Analysis Supported by Column Classification Based on Four Test Parameters.

    Science.gov (United States)

    Plenis, Alina; Rekowska, Natalia; Bączek, Tomasz

    2016-01-21

    This article focuses on correlating the column classification obtained from the method created at the Katholieke Universiteit Leuven (KUL), with the chromatographic resolution attained in biomedical separation. In the KUL system, each column is described with four parameters, which enables estimation of the FKUL value characterising similarity of those parameters to the selected reference stationary phase. Thus, a ranking list based on the FKUL value can be calculated for the chosen reference column, then correlated with the results of the column performance test. In this study, the column performance test was based on analysis of moclobemide and its two metabolites in human plasma by liquid chromatography (LC), using 18 columns. The comparative study was performed using traditional correlation of the FKUL values with the retention parameters of the analytes describing the column performance test. In order to deepen the comparative assessment of both data sets, factor analysis (FA) was also used. The obtained results indicated that the stationary phase classes, closely related according to the KUL method, yielded comparable separation for the target substances. Therefore, the column ranking system based on the FKUL-values could be considered supportive in the choice of the appropriate column for biomedical analysis.
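
    The exact FKUL weighting of the four column parameters is not reproduced here. The sketch below only illustrates the comparison performed in the study: columns are ranked by how close their (hypothetical) four test parameters lie to a chosen reference column, and that ranking is correlated with a (hypothetical) measured resolution.

        import numpy as np
        from scipy.stats import spearmanr

        rng = np.random.default_rng(4)
        params = rng.normal(size=(18, 4))      # hypothetical KUL-style parameters for 18 columns
        reference = params[0]                  # column 0 acts as the reference stationary phase

        # Simple stand-in for the FKUL ranking: Euclidean distance to the reference.
        f_values = np.linalg.norm(params - reference, axis=1)

        # Hypothetical chromatographic resolution measured on the same columns.
        resolution = -f_values + 0.3 * rng.normal(size=18)

        rho, p = spearmanr(f_values, resolution)
        print("rank correlation between column ranking and resolution:", round(rho, 2))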

  16. New product forecasting demand by using neural networks and similar product analysis

    Directory of Open Access Journals (Sweden)

    Alfonso T. Sarmiento

    2014-01-01

    This research presents a methodology for forecasting demand for new products that combines the forecasts of similar products. The quantitative part of the method uses an artificial neural network to compute the forecast for each similar product. These individual forecasts are combined using a qualitative technique based on a factor that measures the similarity between the analogous products and the new product. To illustrate the methodology, a case study of two large multinational companies in the food sector is presented. The results of this study showed more accurate forecasts in 86 percent of the analyzed cases when the proposed method was used.
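
    A minimal sketch of the combination step, with all numbers invented: each analogous product's forecast (in the paper, produced by an artificial neural network trained on that product's history) is weighted by a similarity factor between the analog and the new product, and the weighted forecasts are summed period by period. The neural-network step itself is not shown.

        import numpy as np

        def combine_forecasts(analog_forecasts, similarity):
            # Weight each similar product's forecast by its normalized similarity
            # to the new product and sum, period by period.
            similarity = np.asarray(similarity, dtype=float)
            weights = similarity / similarity.sum()
            return weights @ np.asarray(analog_forecasts, dtype=float)

        # Hypothetical monthly forecasts for three similar products and the
        # similarity factor of each analog to the new product.
        forecasts = [[120, 130, 150, 170],
                     [ 90, 100, 115, 130],
                     [200, 210, 220, 240]]
        similarity = [0.6, 0.3, 0.1]

        print(combine_forecasts(forecasts, similarity))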

  17. Automated dating of the world’s language families based on lexical similarity

    NARCIS (Netherlands)

    Holman, E.W.; Brown, C.H.; Wichmann, S.; Müller, A.; Velupillai, V.; Hammarström, H.; Sauppe, S.; Jung, H.; Bakker, D.; Brown, P.; Belyaev, O.; Urban, M.; Mailhammer, R.; List, J.-M.; Egorov, D.

    2011-01-01

    This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects:

  18. Trust-Enhanced Cloud Service Selection Model Based on QoS Analysis.

    Science.gov (United States)

    Pan, Yuchen; Ding, Shuai; Fan, Wenjuan; Li, Jing; Yang, Shanlin

    2015-01-01

    Cloud computing technology plays a very important role in many areas, such as in the construction and development of the smart city. Meanwhile, numerous cloud services appear on the cloud-based platform. Therefore, how to select trustworthy cloud services remains a significant problem in such platforms, and it has been extensively investigated owing to the ever-growing needs of users. However, trust relationships in social networks have not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured with Jaccard's coefficient and the numerical distance computed with the Pearson correlation coefficient. Then, by using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results by adjusting parameters and exhibits high effectiveness. The cloud services ranked by our model also have better QoS properties than those produced by other methods in the comparison experiments.
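
    A minimal sketch of the selection pipeline described above, under stated assumptions: the basic user-to-user similarity blends a Jaccard overlap (which services both users have used) with a Pearson correlation on the co-rated QoS values, a trust degree then scales that similarity, and missing QoS values are predicted as a trust-enhanced-similarity-weighted average over neighbors. The blending weight alpha, the handling of missing values, and all numbers are illustrative only.

        import numpy as np

        def basic_similarity(u, v, alpha=0.5):
            # Blend of Jaccard overlap on used services and Pearson correlation
            # on the commonly rated QoS values.
            used_u, used_v = ~np.isnan(u), ~np.isnan(v)
            jaccard = (used_u & used_v).sum() / max((used_u | used_v).sum(), 1)
            common = used_u & used_v
            if common.sum() >= 2 and u[common].std() > 0 and v[common].std() > 0:
                pearson = np.corrcoef(u[common], v[common])[0, 1]
            else:
                pearson = 0.0
            return alpha * jaccard + (1 - alpha) * pearson

        def trust_enhanced_similarity(u, v, trust):
            # The (direct/indirect/hybrid) trust degree modifies the basic similarity.
            return trust * basic_similarity(u, v)

        def predict_qos(target, neighbors, trusts, service):
            # Trust-enhanced-similarity-weighted average of neighbors' QoS values.
            sims = np.array([trust_enhanced_similarity(target, n, t)
                             for n, t in zip(neighbors, trusts)])
            vals = np.array([n[service] for n in neighbors])
            ok = ~np.isnan(vals)
            return float(np.average(vals[ok], weights=np.maximum(sims[ok], 1e-9)))

        nan = np.nan
        target = np.array([0.9, nan, 0.7, nan])
        neighbors = [np.array([0.8, 0.6, 0.7, 0.9]), np.array([0.4, 0.9, nan, 0.5])]
        trusts = [0.9, 0.3]
        print(round(predict_qos(target, neighbors, trusts, service=1), 3))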

  19. PubMed-supported clinical term weighting approach for improving inter-patient similarity measure in diagnosis prediction.

    Science.gov (United States)

    Chan, Lawrence Wc; Liu, Ying; Chan, Tao; Law, Helen Kw; Wong, S C Cesar; Yeung, Andy Ph; Lo, K F; Yeung, S W; Kwok, K Y; Chan, William Yl; Lau, Thomas Yh; Shyu, Chi-Ren

    2015-06-02

    Similarity-based retrieval of Electronic Health Records (EHRs) from large clinical information systems provides physicians the evidence support in making diagnoses or referring examinations for the suspected cases. Clinical Terms in EHRs represent high-level conceptual information and the similarity measure established based on these terms reflects the chance of inter-patient disease co-occurrence. The assumption that clinical terms are equally relevant to a disease is unrealistic, reducing the prediction accuracy. Here we propose a term weighting approach supported by PubMed search engine to address this issue. We collected and studied 112 abdominal computed tomography imaging examination reports from four hospitals in Hong Kong. Clinical terms, which are the image findings related to hepatocellular carcinoma (HCC), were extracted from the reports. Through two systematic PubMed search methods, the generic and specific term weightings were established by estimating the conditional probabilities of clinical terms given HCC. Each report was characterized by an ontological feature vector and there were totally 6216 vector pairs. We optimized the modified direction cosine (mDC) with respect to a regularization constant embedded into the feature vector. Equal, generic and specific term weighting approaches were applied to measure the similarity of each pair and their performances for predicting inter-patient co-occurrence of HCC diagnoses were compared by using Receiver Operating Characteristics (ROC) analysis. The Areas under the curves (AUROCs) of similarity scores based on equal, generic and specific term weighting approaches were 0.735, 0.728 and 0.743 respectively (p PubMed. Our findings suggest that the optimized similarity measure with specific term weighting to EHRs can improve significantly the accuracy for predicting the inter-patient co-occurrence of diagnosis when compared with equal and generic term weighting approaches.
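
    The modified direction cosine and its regularization constant are not reproduced here; the sketch below only shows the general idea of specific term weighting, where each clinical term carries a weight (for example, an estimate of P(term | HCC) derived from PubMed hit counts) and inter-patient similarity is a weighted cosine between binary term vectors. The terms and weights are invented for illustration.

        import numpy as np

        def weighted_cosine(x, y, w):
            # Cosine similarity between two binary term vectors after scaling each
            # term by its weight (e.g., an estimated P(term | HCC)).
            xw, yw = x * w, y * w
            denom = np.linalg.norm(xw) * np.linalg.norm(yw)
            return float(xw @ yw / denom) if denom > 0 else 0.0

        terms = ["arterial enhancement", "washout", "cirrhosis", "ascites"]
        weights = np.array([0.8, 0.7, 0.5, 0.2])   # hypothetical specific term weights

        patient_a = np.array([1, 1, 1, 0])         # terms present in each patient's report
        patient_b = np.array([1, 1, 0, 1])
        patient_c = np.array([0, 0, 1, 1])

        print(round(weighted_cosine(patient_a, patient_b, weights), 3))
        print(round(weighted_cosine(patient_a, patient_c, weights), 3))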

  20. Agent-based simulation for human-induced hazard analysis.

    Science.gov (United States)

    Bulleit, William M; Drewek, Matthew W

    2011-02-01

    Terrorism could be treated as a hazard for design purposes. For instance, the terrorist hazard could be analyzed in a manner similar to the way that seismic hazard is handled. No matter how terrorism is dealt with in the design of systems, the need for predictions of the frequency and magnitude of the hazard will be required. And, if the human-induced hazard is to be designed for in a manner analogous to natural hazards, then the predictions should be probabilistic in nature. The model described in this article is a prototype model that used agent-based modeling (ABM) to analyze terrorist attacks. The basic approach in this article of using ABM to model human-induced hazards has been preliminarily validated in the sense that the attack magnitudes seem to be power-law distributed and attacks occur mostly in regions where high levels of wealth pass through, such as transit routes and markets. The model developed in this study indicates that ABM is a viable approach to modeling socioeconomic-based infrastructure systems for engineering design to deal with human-induced hazards. © 2010 Society for Risk Analysis.