WorldWideScience

Sample records for similarity searches based

  1. SHOP: scaffold hopping by GRID-based similarity searches

    DEFF Research Database (Denmark)

    Bergmann, Rikke; Linusson, Anna; Zamora, Ismael

    2007-01-01

    A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known...... scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures...

  2. Density-based similarity measures for content based search

    Energy Technology Data Exchange (ETDEWEB)

    Hush, Don R [Los Alamos National Laboratory; Porter, Reid B [Los Alamos National Laboratory; Ruggiero, Christy E [Los Alamos National Laboratory

    2009-01-01

    We consider the query by multiple example problem where the goal is to identify database samples whose content is similar to a collection of query samples. To assess the similarity we use a relative content density which quantifies the relative concentration of the query distribution to the database distribution. If the database distribution is a mixture of the query distribution and a background distribution then it can be shown that database samples whose relative content density is greater than a particular threshold ρ are more likely to have been generated by the query distribution than the background distribution. We describe an algorithm for predicting samples with relative content density greater than ρ that is computationally efficient and possesses strong performance guarantees. We also show empirical results for applications in computer network monitoring and image segmentation.
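
    The thresholding idea is compact enough to sketch. Below, Gaussian kernel density estimates stand in for the query and database distributions; the estimator choice, the synthetic data, and the threshold value are illustrative assumptions, not the authors' algorithm:

      import numpy as np
      from scipy.stats import gaussian_kde

      rng = np.random.default_rng(0)
      query = rng.normal(0.0, 1.0, 200)        # samples drawn from the query distribution
      database = np.concatenate([rng.normal(0.0, 1.0, 300),
                                 rng.normal(4.0, 1.0, 700)])  # mixture: query + background

      f_q = gaussian_kde(query)                # estimate of the query density
      f_db = gaussian_kde(database)            # estimate of the database density

      rho = 1.0                                # decision threshold on the density ratio
      ratio = f_q(database) / f_db(database)   # relative content density at each sample
      hits = database[ratio > rho]             # samples more likely drawn from the query distribution
      print(f"{hits.size} of {database.size} database samples exceed rho={rho}")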

  3. Optimization of interactive visual-similarity-based search

    NARCIS (Netherlands)

    Nguyen, G.P.; Worring, M.

    2008-01-01

    At one end of the spectrum, research in interactive content-based retrieval concentrates on machine learning methods for effective use of relevance feedback. On the other end, the information visualization community focuses on effective methods for conveying information to the user. What is lacking …

  4. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    National Research Council Canada - National Science Library

    Ortega-Binderberger, Michael

    2002-01-01

    ... as a critical area of research. This thesis explores how to enhance database systems with content based search over arbitrary abstract data types in a similarity based framework with query refinement...

  5. OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain

    Science.gov (United States)

    Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem

    2017-01-01

    Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search-over-encrypted-data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and search criteria. The few schemes which extend the notion of exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted Bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted Bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various Bloom filter lengths on different cloud configurations. PMID:28692697
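
    OS2 itself couples Bloom filters with probabilistic homomorphic encryption; the plaintext sketch below illustrates only the typo-tolerant half of the idea, hashing character bigrams of each keyword into a Bloom filter and ranking candidates by Dice overlap. The filter length, hash construction and bigram tokenization are assumptions for illustration, not the paper's protocol:

      import hashlib

      M = 256  # Bloom filter length in bits (illustrative)
      K = 3    # number of hash functions (illustrative)

      def bigrams(word):
          w = f"#{word.lower()}#"              # pad so first/last chars form bigrams
          return {w[i:i+2] for i in range(len(w) - 1)}

      def bloom(word):
          bits = 0
          for g in bigrams(word):
              for k in range(K):
                  h = hashlib.sha256(f"{k}:{g}".encode()).digest()
                  bits |= 1 << (int.from_bytes(h[:4], "big") % M)
          return bits

      def dice(a, b):
          inter = bin(a & b).count("1")
          return 2 * inter / (bin(a).count("1") + bin(b).count("1"))

      index = {w: bloom(w) for w in ["privacy", "private", "cloud", "encryption"]}
      q = bloom("privcay")                     # misspelled query still overlaps in bigrams
      for w, s in sorted(index.items(), key=lambda kv: -dice(q, kv[1])):
          print(f"{w:12s} {s:.2f}")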

  6. Inference-Based Similarity Search in Randomized Montgomery Domains for Privacy-Preserving Biometric Identification.

    Science.gov (United States)

    Wang, Yi; Wan, Jianwu; Guo, Jun; Cheung, Yiu-Ming; Yuen, Pong C

    2017-07-14

    Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, but the associated cost is reduced by orders of magnitude compared to cryptographic primitives.
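
    The paper's interval mechanism is more sophisticated, but the retrieval logic it enables can be mimicked in a toy form: report only a randomized interval guaranteed to contain the true Hamming distance, and accept a candidate only when the interval's upper bound already proves the distance is within the search radius. Everything below (the interval construction, the slack, the 64-bit codes) is an illustrative simplification, not the authors' scheme:

      import random

      random.seed(1)

      def hamming(a, b):
          return bin(a ^ b).count("1")

      def obfuscated_interval(a, b, slack=4):
          # Interval that always contains the true distance d without revealing it:
          # lo is drawn from [d - slack, d], so d <= lo + slack always holds.
          d = hamming(a, b)
          lo = random.randint(max(0, d - slack), d)
          return lo, lo + slack

      templates = {i: random.getrandbits(64) for i in range(5)}  # enrolled binary codes
      probe = templates[3] ^ 0b1011                              # near-copy of template 3

      radius = 8
      # hi <= radius proves d <= radius; near-misses may be rejected (conservative)
      candidates = [i for i, t in templates.items()
                    if obfuscated_interval(probe, t)[1] <= radius]
      print("candidates:", candidates)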

  7. OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain.

    Science.gov (United States)

    Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem; Khan, Wajahat Ali

    2017-01-01

    Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search-over-encrypted-data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and search criteria. The few schemes which extend the notion of exact matching to similarity-based search lack realism, as they rely on trusted third parties or incur increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted Bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing the results of the search query evaluation process to the untrusted cloud service provider. Encrypted Bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of OS2 is evaluated on Google App Engine for various Bloom filter lengths on different cloud configurations.

  8. Similarity-based search of model organism, disease and drug effect phenotypes

    KAUST Repository

    Hoehndorf, Robert

    2015-02-19

    Background: Semantic similarity measures over phenotype ontologies have been demonstrated to provide a powerful approach for the analysis of model organism phenotypes, the discovery of animal models of human disease, novel pathways, gene functions, druggable therapeutic targets, and determination of pathogenicity. Results: We have developed PhenomeNET 2, a system that enables similarity-based searches over a large repository of phenotypes in real-time. It can be used to identify strains of model organisms that are phenotypically similar to human patients, diseases that are phenotypically similar to model organism phenotypes, or drug effect profiles that are similar to the phenotypes observed in a patient or model organism. PhenomeNET 2 is available at http://aber-owl.net/phenomenet. Conclusions: Phenotype-similarity searches can provide a powerful tool for the discovery and investigation of molecular mechanisms underlying an observed phenotypic manifestation. PhenomeNET 2 facilitates user-defined similarity searches and allows researchers to analyze their data within a large repository of human, mouse and rat phenotypes.

  9. Generating "fragment-based virtual library" using pocket similarity search of ligand-receptor complexes.

    Science.gov (United States)

    Khashan, Raed S

    2015-01-01

    As the number of available ligand-receptor complexes increases, researchers are becoming more dedicated to mining these complexes to aid in the drug design and development process. We present free software, developed as a tool for performing similarity searches across ligand-receptor complexes, that identifies binding pockets similar to that of a target receptor. The search is based on 3D-geometric and chemical similarity of the atoms forming the binding pocket. For each match identified, the ligand's fragment(s) corresponding to that binding pocket are extracted, thus forming a virtual library of fragments (FragVLib) that is useful for structure-based drug design. The program provides a very useful tool to explore available databases.

  10. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

    Directory of Open Access Journals (Sweden)

    Jiang Hualiang

    2010-01-01

    Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of the sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparison with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database search, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of the hit list from the database search.
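
    The abstract's recipe — bit-string fingerprints compared with a raw similarity, then standardized into Z-scores for ranking — follows a generic pattern that can be sketched directly. The fingerprint length, Tanimoto as the raw measure, and the plain standardization below are assumptions; BSSF's validated Z-score scheme is its own:

      import numpy as np

      rng = np.random.default_rng(42)
      db = rng.integers(0, 2, size=(1000, 128))      # 1000 binding-site fingerprints, 128 bits
      query = rng.integers(0, 2, size=128)

      def tanimoto(q, m):
          inter = (q & m).sum(axis=1)
          union = (q | m).sum(axis=1)
          return inter / union

      scores = tanimoto(query, db)
      z = (scores - scores.mean()) / scores.std()    # standardize raw scores to Z-scores
      top = np.argsort(z)[::-1][:5]                  # best-ranked database entries
      print(list(zip(top.tolist(), np.round(z[top], 2).tolist())))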

  11. World Wide Web-based system for the calculation of substituent parameters and substituent similarity searches.

    Science.gov (United States)

    Ertl, P

    1998-02-01

    Easy to use, interactive, and platform-independent WWW-based tools are ideal for development of chemical applications. By using the newly emerging Web technologies such as Java applets and sophisticated scripting, it is possible to deliver powerful molecular processing capabilities directly to the desk of synthetic organic chemists. In Novartis Crop Protection in Basel, a Web-based molecular modelling system has been in use since 1995. In this article two new modules of this system are presented: a program for interactive calculation of important hydrophobic, electronic, and steric properties of organic substituents, and a module for substituent similarity searches enabling the identification of bioisosteric functional groups. Various possible applications of calculated substituent parameters are also discussed, including automatic design of molecules with the desired properties and creation of targeted virtual combinatorial libraries.

  12. Application of 3D Zernike descriptors to shape-based ligand similarity searching

    Directory of Open Access Journals (Sweden)

    Venkatraman Vishwesh

    2009-12-01

    Background: The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results: In this study, we have examined the efficacy of the alignment-independent three-dimensional Zernike descriptor (3DZD) for fast shape-based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. Conclusion: The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.

  13. Application of 3D Zernike descriptors to shape-based ligand similarity searching.

    Science.gov (United States)

    Venkatraman, Vishwesh; Chakravarthy, Padmasini Ramji; Kihara, Daisuke

    2009-12-17

    The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.
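
    Of the compared methods, USR is simple enough to restate in full: a conformer is described by the first three moments (mean, spread, cube-rooted skew) of its atomic distance distributions to four reference points — the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one — and two molecules are scored by an inverse mean absolute difference of the resulting 12-dimensional vectors. A sketch following that published recipe on raw coordinate arrays:

      import numpy as np

      def usr_descriptor(coords):
          """12-dim USR shape descriptor from an (N, 3) array of atom coordinates."""
          ctd = coords.mean(axis=0)                          # centroid
          d_ctd = np.linalg.norm(coords - ctd, axis=1)
          cst = coords[d_ctd.argmin()]                       # closest atom to centroid
          fct = coords[d_ctd.argmax()]                       # farthest atom from centroid
          d_fct = np.linalg.norm(coords - fct, axis=1)
          ftf = coords[d_fct.argmax()]                       # farthest atom from fct
          desc = []
          for ref in (ctd, cst, fct, ftf):
              d = np.linalg.norm(coords - ref, axis=1)
              mu3 = ((d - d.mean()) ** 3).mean()
              desc += [d.mean(), d.std(), np.cbrt(mu3)]      # mean, spread, cube-rooted skew
          return np.array(desc)

      def usr_similarity(a, b):
          return 1.0 / (1.0 + np.abs(usr_descriptor(a) - usr_descriptor(b)).mean())

      rng = np.random.default_rng(0)
      mol1 = rng.normal(size=(30, 3))
      mol2 = mol1 + rng.normal(scale=0.05, size=(30, 3))     # slightly perturbed copy
      print(round(usr_similarity(mol1, mol2), 3))            # close to 1 for similar shapes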

  14. Retrospective group fusion similarity search based on eROCE evaluation metric.

    Science.gov (United States)

    Avram, Sorin I; Crisan, Luminita; Bora, Alina; Pacureanu, Liliana M; Avram, Stefana; Kurunczi, Ludovic

    2013-03-01

    In this study, a simple evaluation metric, denoted eROCE, was proposed to measure the early enrichment of predictive methods. We demonstrated the superior robustness of eROCE compared to other known metrics across several active-to-inactive ratios ranging from 1:10 to 1:1000. Group fusion similarity search was investigated by varying 16 similarity coefficients, five molecular representations (binary and non-binary) and two group fusion rules using two reference structure set sizes. We used a dataset of 3478 active and 43,938 inactive molecules, and the enrichment was analyzed by means of eROCE. This retrospective study provides optimal similarity search parameters in the case of ALDH1A1 inhibitors.

  15. Fast business process similarity search

    NARCIS (Netherlands)

    Yan, Z.; Dijkman, R.M.; Grefen, P.W.P.J.

    2012-01-01

    Nowadays, it is common for organizations to maintain collections of hundreds or even thousands of business processes. Techniques exist to search through such a collection, for business process models that are similar to a given query model. However, those techniques compare the query model to each …

  16. Similarity search processing. Parallelization and indexing technologies.

    Directory of Open Access Journals (Sweden)

    Eder Dos Santos

    2015-08-01

    This scientific-technical report addresses similarity search and the implementation of metric structures in parallel environments. It also presents the state of the art on similarity search over metric structures and on parallelism technologies. Comparative analyses are also proposed, seeking to characterize the behavior of a set of metric spaces and metric structures on multicore-based and GPU-based processing platforms.

  17. Object recognition based on Google's reverse image search and image similarity

    Science.gov (United States)

    Horváth, András.

    2015-12-01

    Image classification is one of the most challenging tasks in computer vision, and a general multiclass classifier could solve many different tasks in image processing. Classification is usually done by shallow learning for predefined objects, which is a difficult task and very different from human vision, which is based on continuous learning of object classes; humans require years to learn a large taxonomy of objects, which are neither disjoint nor independent. In this paper I present a system, based on Google's image similarity algorithm and Google's image database, which can classify a large set of different objects in a human-like manner, identifying related classes and taxonomies.

  18. Mapping query terms to data and schema using content based similarity search in clinical information systems.

    Science.gov (United States)

    Safari, Leila; Patrick, Jon D

    2013-01-01

    This paper reports on the issues in mapping the terms of a query to the field names of the schema of an Entity Relationship (ER) model, or to the data part of the Entity Attribute Value (EAV) model, using a similarity-based Top-K algorithm in a clinical information system, together with an extension of EAV mapping for medication names. In addition, the details of the mapping algorithm and the required pre-processing, including NLP (Natural Language Processing) tasks to prepare resources for mapping, are explained. The experimental results on an example clinical information system demonstrate more than 84 percent accuracy in mapping. The results will be integrated into our proposed Clinical Data Analytics Language (CliniDAL) to automate the mapping process in CliniDAL.
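
    At its core, the mapping step is a top-k string-similarity lookup of a query term against schema field names. The sketch below uses difflib as a stand-in scorer; the field names are invented, and CliniDAL's actual algorithm adds NLP pre-processing and EAV data matching:

      import difflib

      schema_fields = ["patient_dob", "admission_date", "discharge_summary",
                       "medication_name", "lab_result_value"]   # hypothetical schema

      def top_k(term, fields, k=3):
          scored = [(difflib.SequenceMatcher(None, term.lower(), f).ratio(), f)
                    for f in fields]
          return sorted(scored, reverse=True)[:k]

      print(top_k("date of admission", schema_fields))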

  19. Similarity-based search of model organism, disease and drug effect phenotypes

    KAUST Repository

    Hoehndorf, Robert; Gruenberger, Michael; Gkoutos, Georgios V; Schofield, Paul N

    2015-01-01

    Background: Semantic similarity measures over phenotype ontologies have been demonstrated to provide a powerful approach for the analysis of model organism phenotypes, the discovery of animal models of human disease, novel pathways, gene functions …

  20. Improving performance of content-based image retrieval schemes in searching for similar breast mass regions: an assessment

    International Nuclear Information System (INIS)

    Wang Xiaohui; Park, Sang Cheol; Zheng Bin

    2009-01-01

    This study aims to assess three methods commonly used in content-based image retrieval (CBIR) schemes and investigate approaches to improve scheme performance. A reference database involving 3000 regions of interest (ROIs) was established. Among them, 400 ROIs were randomly selected to form a testing dataset. Three methods, namely mutual information, Pearson's correlation and a multi-feature-based k-nearest neighbor (KNN) algorithm, were applied to search for the 15 'most similar' reference ROIs to each testing ROI. The clinical relevance and visual similarity of the search results were evaluated using the areas under receiver operating characteristic (ROC) curves (A_Z) and the average mean square difference (MSD) of the mass boundary spiculation level ratings between testing and selected ROIs, respectively. The results showed that the A_Z values were 0.893 ± 0.009, 0.606 ± 0.021 and 0.699 ± 0.026 for the use of KNN, mutual information and Pearson's correlation, respectively. The A_Z values increased to 0.724 ± 0.017 and 0.787 ± 0.016 for mutual information and Pearson's correlation when using ROIs with the size adaptively adjusted based on actual mass size. The corresponding MSD values were 2.107 ± 0.718, 2.301 ± 0.733 and 2.298 ± 0.743. The study demonstrates that, due to the diversity of medical images, CBIR schemes using multiple image features and mass size-based ROIs can achieve significantly improved performance.
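
    The Pearson's-correlation variant is the easiest of the three to restate: flatten each ROI to a pixel vector and rank the reference ROIs by their correlation with the query. A sketch with random arrays standing in for the 3000-ROI reference database:

      import numpy as np

      rng = np.random.default_rng(7)
      reference = rng.random((3000, 64 * 64))     # flattened reference ROIs (placeholder data)
      query = rng.random(64 * 64)

      # Pearson correlation of the query against every reference ROI
      r = np.corrcoef(query, reference)[0, 1:]
      top15 = np.argsort(r)[::-1][:15]            # the 15 "most similar" ROIs
      print(top15)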

  21. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Background: Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity searches. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to Combinatorial Extension (CE) and it works over 243,000 times faster, searching 34,000 proteins in 0.34 s on a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. Such search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever-increasing number of published protein structures in this post-genomic era.
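
    SARST's alphabet comes from nearest-neighbor clustering of the Ramachandran map and is scored with purpose-built substitution matrices; the sketch below conveys only the linear-encoding idea, using a crude fixed 60-degree grid over (phi, psi) and difflib as a stand-in string comparator:

      from difflib import SequenceMatcher

      def rama_code(phi_psi):
          """Encode a list of (phi, psi) backbone angles as a text string using a
          crude 60-degree grid (SARST derives its alphabet by clustering instead)."""
          letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij"   # 6x6 grid -> 36 symbols
          out = []
          for phi, psi in phi_psi:
              i = int((phi + 180) // 60) % 6
              j = int((psi + 180) // 60) % 6
              out.append(letters[6 * i + j])
          return "".join(out)

      helix = [(-60.0, -45.0)] * 8            # idealized alpha-helical stretch
      sheet = [(-120.0, 130.0)] * 8           # idealized beta-strand stretch
      s1 = rama_code(helix + sheet)
      s2 = rama_code(helix + sheet[:6])       # similar structure, shorter strand
      print(s1, s2, round(SequenceMatcher(None, s1, s2).ratio(), 2))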

  22. Similarity search of business process models

    NARCIS (Netherlands)

    Dumas, M.; García-Bañuelos, L.; Dijkman, R.M.

    2009-01-01

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the …

  23. Predicting the performance of fingerprint similarity searching.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
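
    The key quantity here is the Kullback-Leibler divergence between the fingerprint feature distributions of the reference set and the screening database. Treating each bit as an independent Bernoulli variable (a simplifying assumption on my part), the divergence is a per-bit sum:

      import numpy as np

      rng = np.random.default_rng(3)
      refs = rng.random((50, 166)) < 0.3      # fingerprints of active reference compounds
      lib = rng.random((10000, 166)) < 0.25   # fingerprints of the screening database

      eps = 1e-6                               # avoid log(0) for never/always-set bits
      p = refs.mean(axis=0).clip(eps, 1 - eps)   # per-bit set frequency, references
      q = lib.mean(axis=0).clip(eps, 1 - eps)    # per-bit set frequency, database

      # KL divergence assuming independent Bernoulli bits; larger values suggest
      # actives will stand out from the database "background"
      kl = np.sum(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q)))
      print(f"D_KL(reference || database) = {kl:.3f} nats")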

  24. Optimal neighborhood indexing for protein similarity search.

    Science.gov (United States)

    Peterlongo, Pierre; Noé, Laurent; Lavenier, Dominique; Nguyen, Van Hoa; Kucherov, Gregory; Giraud, Mathieu

    2008-12-16

    Similarity inference, one of the main bioinformatics tasks, must cope with an exponential growth of biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a 35% reduction of the memory involved in the process, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
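
    The indexing idea can be shown in a few lines: map residues to group symbols, then index k-mers in the reduced (and therefore much smaller) space. The particular grouping and k below are illustrative; the paper derives its own alphabets and the rectangular score matrices that go with them:

      # Illustrative reduced alphabet (one common grouping; the paper derives its own)
      GROUPS = {"A": "AGST", "C": "C", "D": "DENQ", "F": "FWY",
                "H": "H", "I": "ILMV", "K": "KR", "P": "P"}
      REDUCE = {aa: g for g, members in GROUPS.items() for aa in members}

      def reduce_seq(seq):
          return "".join(REDUCE[aa] for aa in seq)

      def kmer_index(seq, k=3):
          """Index positions of reduced k-mers; fewer symbols -> smaller index."""
          red = reduce_seq(seq)
          index = {}
          for i in range(len(red) - k + 1):
              index.setdefault(red[i:i + k], []).append(i)
          return index

      target = "MKVILTAGSTDENQHFWYRC"
      idx = kmer_index(target)
      query = "MIV"                            # hits wherever the reduced k-mer matches
      print(idx.get(reduce_seq(query), []))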

  25. Optimal neighborhood indexing for protein similarity search

    Directory of Open Access Journals (Sweden)

    Nguyen Van

    2008-12-01

    Background: Similarity inference, one of the main bioinformatics tasks, must cope with an exponential growth of biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a 35% reduction of the memory involved in the process, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. Conclusion: We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.

  26. BLAST and FASTA similarity searching for multiple sequence alignment.

    Science.gov (United States)

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry (homology). The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
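
    The expectation values mentioned here come from Karlin-Altschul statistics: for a raw local-alignment score S, a query of length m and a database of n residues, E = K * m * n * exp(-lambda * S). A worked example; the lambda and K shown are typical ungapped BLOSUM62 values quoted from memory, so treat them as placeholders:

      import math

      def evalue(score, m, n, lam=0.318, K=0.134):
          """Karlin-Altschul expectation: chance alignments scoring >= score."""
          return K * m * n * math.exp(-lam * score)

      m, n = 350, 5_000_000           # query length, database residues
      for s in (30, 50, 70):
          print(f"raw score {s:3d}  ->  E = {evalue(s, m, n):.3g}")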

  27. Domain similarity based orthology detection.

    Science.gov (United States)

    Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

    2015-05-13

    Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows quadratically. A current challenge of bioinformatic research, especially given the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of the search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released in Python and C++ and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda.
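
    The cosine measure over domain content reduces to a normalized dot product of domain-count vectors; a minimal sketch with made-up Pfam-style identifiers:

      from collections import Counter
      from math import sqrt

      def domain_cosine(doms_a, doms_b):
          """Cosine similarity between two proteins represented as strings of domains."""
          a, b = Counter(doms_a), Counter(doms_b)
          dot = sum(a[d] * b[d] for d in a.keys() & b.keys())
          norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
          return dot / norm if norm else 0.0

      p1 = ["PF00069", "PF00433", "PF00069"]     # kinase-like architecture (illustrative)
      p2 = ["PF00069", "PF00433"]
      p3 = ["PF07714"]
      print(round(domain_cosine(p1, p2), 3), round(domain_cosine(p1, p3), 3))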

  28. Combined semantic and similarity search in medical image databases

    Science.gov (United States)

    Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin

    2011-03-01

    The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist relies heavily on reference literature or a second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases, combining ontology-based semantic search and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.

  29. SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

    Science.gov (United States)

    Couvin, David; Zozio, Thierry; Rastogi, Nalin

    2017-07-01

    Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of the Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98,049 clustered strains plus 5,807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides the country distribution patterns for each queried spoligotype.
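
    A spoligotype is in effect a 43-position presence/absence pattern, so a natural similarity is the fraction of matching spacers. The abstract does not spell out SpolSimilaritySearch's exact measure, so the one below is an assumption for illustration:

      def spacer_similarity(a, b):
          """Fraction of matching positions between two 43-spacer patterns
          written in binary notation (1 = spacer present)."""
          assert len(a) == len(b) == 43
          return sum(x == y for x, y in zip(a, b)) / 43

      beijing = "0" * 34 + "1" * 9            # classic Beijing-family pattern (spacers 35-43 only)
      probe   = "0" * 33 + "1" * 10
      print(round(spacer_similarity(beijing, probe), 3))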

  30. A Similarity Search Using Molecular Topological Graphs

    Directory of Open Access Journals (Sweden)

    Yoshifumi Fukunishi

    2009-01-01

    A molecular similarity measure has been developed using molecular topological graphs and atomic partial charges. Two kinds of topological graphs were used. One is the ordinary adjacency matrix and the other is a matrix which represents the minimum path length between two atoms of the molecule. The ordinary adjacency matrix is suitable for comparing the local structures of molecules, such as functional groups, and the other matrix is suitable for comparing the global structures of molecules. The combination of these two matrices gave a similarity measure. This method was applied to in silico drug screening, and the results showed that it was effective as a similarity measure.
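
    A sketch of the two matrices: the adjacency matrix is given directly, the minimum-path-length matrix follows from Floyd-Warshall, and a toy overlap score compares two equally sized molecules. The combination rule is my own stand-in; the paper's measure also weights atomic partial charges:

      import numpy as np

      def path_matrix(adj):
          """All-pairs minimum path lengths (Floyd-Warshall) from an adjacency matrix."""
          n = len(adj)
          d = np.where(adj > 0, 1.0, np.inf)
          np.fill_diagonal(d, 0.0)
          for k in range(n):
              d = np.minimum(d, d[:, k:k+1] + d[k:k+1, :])
          return d

      def matrix_similarity(a, b):
          """Toy overlap score between two equally sized distance matrices."""
          return 1.0 / (1.0 + np.abs(a - b).mean())

      # Two 4-atom toy graphs: a chain and a ring
      chain = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]])
      ring  = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]])
      print(round(matrix_similarity(path_matrix(chain), path_matrix(ring)), 3))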

  31. Outsourced similarity search on metric data assets

    KAUST Repository

    Yiu, Man Lung

    2012-02-01

    This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying it to the service provider for similarity queries on the transformed data. Our techniques provide interesting trade-offs between query cost and accuracy. They are then further extended to offer an intuitive privacy guarantee. Empirical studies with real data demonstrate that the techniques are capable of offering privacy while enabling efficient and accurate processing of similarity queries.

  32. Outsourced Similarity Search on Metric Data Assets

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Assent, Ira; Jensen, Christian S.

    2012-01-01

    This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying …

  33. Outsourced similarity search on metric data assets

    KAUST Repository

    Yiu, Man Lung; Assent, Ira; Jensen, Christian Søndergaard; Kalnis, Panos

    2012-01-01

    … for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise …

  34. Similarity relations in visual search predict rapid visual categorization

    Science.gov (United States)

    Mohan, Krithika; Arun, S. P.

    2012-01-01

    How do we perform rapid visual categorization? It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pair-wise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation. PMID:23092947

  35. Semantic similarity measure in biomedical domain leverage web search engine.

    Science.gov (United States)

    Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

    2010-01-01

    Semantic similarity measures play an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous research in semantic-web-related applications has deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenging task. The proposed method exploits page counts returned by the web search engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated by adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provided by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.
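
    The page-count family of scores needs only three hit counts, H(P), H(Q) and H(P AND Q). The sketch below follows the standard WebJaccard and pointwise-mutual-information definitions, with hard-coded counts standing in for live search-engine responses and an assumed index size N:

      import math

      N = 1e10                                  # assumed size of the indexed web (placeholder)

      def web_jaccard(hp, hq, hpq, c=5):
          return 0.0 if hpq <= c else hpq / (hp + hq - hpq)

      def web_pmi(hp, hq, hpq):
          # Pointwise mutual information from page counts (unnormalized variant)
          return math.log2((hpq / N) / ((hp / N) * (hq / N))) if hpq else 0.0

      # Hard-coded page counts standing in for search-engine responses (illustrative)
      hp, hq, hpq = 42_000_000, 8_100_000, 1_300_000
      print(round(web_jaccard(hp, hq, hpq), 4), round(web_pmi(hp, hq, hpq), 2))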

  36. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis.
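
    The workflow described — load similarity-search hits into a relational table, then summarize with SQL — can be mimicked with Python's built-in sqlite3. The schema and rows below are invented for illustration; the unit's seqdb_demo and search_demo schemas differ:

      import sqlite3

      con = sqlite3.connect(":memory:")
      con.execute("""CREATE TABLE hits (
          query TEXT, subject TEXT, taxon TEXT, evalue REAL)""")
      con.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", [
          ("ecoli_recA", "styph_recA", "S. enterica", 1e-120),
          ("ecoli_recA", "human_RAD51", "H. sapiens", 3e-35),
          ("ecoli_lacZ", "human_GLB1", "H. sapiens", 2e-60),
      ])

      # Summarize homologs per query below an E-value cutoff, grouped by taxon
      for row in con.execute("""
              SELECT query, taxon, COUNT(*) AS n, MIN(evalue) AS best
              FROM hits WHERE evalue < 1e-10
              GROUP BY query, taxon ORDER BY query, best"""):
          print(row)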

  37. How Google Web Search copes with very similar documents

    NARCIS (Netherlands)

    W. Mettrop (Wouter); P. Nieuwenhuysen; H. Smulders

    2006-01-01

    A significant portion of the computer files that carry documents, multimedia, programs etc. on the Web are identical or very similar to other files on the Web. How do search engines cope with this? Do they perform some kind of "deduplication"? How should users take into account that …

  38. The HMMER Web Server for Protein Sequence Similarity Search.

    Science.gov (United States)

    Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

    2017-12-08

    Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking to external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user-defined parameters.

  39. Target-nontarget similarity decreases search efficiency and increases stimulus-driven control in visual search.

    Science.gov (United States)

    Barras, Caroline; Kerzel, Dirk

    2017-10-01

    Some points of criticism against the idea that attentional selection is controlled by bottom-up processing were dispelled by the attentional window account. The attentional window account claims that saliency computations during visual search are only performed for stimuli inside the attentional window. Therefore, a small attentional window may avoid attentional capture by salient distractors because it is likely that the salient distractor is located outside the window. In contrast, a large attentional window increases the chances of attentional capture by a salient distractor. Large and small attentional windows have been associated with efficient (parallel) and inefficient (serial) search, respectively. We compared the effect of a salient color singleton on visual search for a shape singleton during efficient and inefficient search. To vary search efficiency, the nontarget shapes were either similar or dissimilar with respect to the shape singleton. We found that interference from the color singleton was larger with inefficient than efficient search, which contradicts the attentional window account. While inconsistent with the attentional window account, our results are predicted by computational models of visual search. Because of target-nontarget similarity, the target was less salient with inefficient than efficient search. Consequently, the relative saliency of the color distractor was higher with inefficient than with efficient search. Accordingly, stronger attentional capture resulted. Overall, the present results show that bottom-up control by stimulus saliency is stronger when search is difficult, which is inconsistent with the attentional window account.

  40. Combination of 2D/3D ligand-based similarity search in rapid virtual screening from multimillion compound repositories. Selection and biological evaluation of potential PDE4 and PDE5 inhibitors.

    Science.gov (United States)

    Dobi, Krisztina; Hajdú, István; Flachner, Beáta; Fabó, Gabriella; Szaszkó, Mária; Bognár, Melinda; Magyar, Csaba; Simon, István; Szisz, Dániel; Lőrincz, Zsolt; Cseh, Sándor; Dormán, György

    2014-05-28

    Rapid in silico selection of target-focused libraries from commercial repositories is an attractive and cost-effective approach. If structures of active compounds are available, rapid 2D similarity searches can be performed on multimillion-compound databases, but the generated library requires further focusing by various 2D/3D chemoinformatics tools. We report here a combination of the 2D approach with a ligand-based 3D method (Screen3D), which applies flexible matching to align reference and target compounds in a dynamic manner and thus to assess their structural and conformational similarity. In the first case study we compared the 2D and 3D similarity scores on an existing dataset derived from the biological evaluation of a PDE5-focused library. Based on the obtained similarity metrics, a fusion score was proposed. The fusion score was applied to refine the 2D similarity search in a second case study, where we aimed at selecting and evaluating a PDE4B-focused library. The application of this fused 2D/3D similarity measure led to an increase of the hit rate from 8.5% (1st round, 47% inhibition at 10 µM) to 28.5% (2nd round, 50% inhibition at 10 µM), and the best two hits had 53 nM inhibitory activities.
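
    The abstract does not give the fusion formula, so the sketch below shows one conventional way to fuse a 2D fingerprint score with a 3D shape score — summing per-compound ranks — purely to illustrate the refinement step, not the paper's own rule:

      def fused_rank(scores_2d, scores_3d):
          """Combine 2D and 3D similarity scores by summing per-compound ranks
          (an assumed fusion rule; the paper defines its own fusion score)."""
          def ranks(scores):                     # rank 0 = best (highest score)
              order = sorted(scores, key=scores.get, reverse=True)
              return {cpd: i for i, cpd in enumerate(order)}
          r2, r3 = ranks(scores_2d), ranks(scores_3d)
          return sorted(scores_2d, key=lambda c: r2[c] + r3[c])

      s2d = {"cpd1": 0.91, "cpd2": 0.88, "cpd3": 0.52}   # e.g. Tanimoto on fingerprints
      s3d = {"cpd1": 0.40, "cpd2": 0.75, "cpd3": 0.71}   # e.g. 3D shape score
      print(fused_rank(s2d, s3d))                        # cpd2 fuses best overall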

  41. Query-dependent banding (QDB) for faster RNA similarity searches.

    Directory of Open Access Journals (Sweden)

    Eric P Nawrocki

    2007-03-01

    When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average-case time complexity of CM alignment from LN^2.4 to LN^1.3 for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.

  42. Applying ligands profiling using multiple extended electron distribution based field templates and feature trees similarity searching in the discovery of new generation of urea-based antineoplastic kinase inhibitors.

    Directory of Open Access Journals (Sweden)

    Eman M Dokla

    This study provides a comprehensive computational procedure for the discovery of novel urea-based antineoplastic kinase inhibitors while focusing on diversification of both chemotype and selectivity pattern. It presents a systematic structural analysis of the different binding motifs of urea-based kinase inhibitors and the corresponding configurations of the kinase enzymes. The computational model depends on simultaneous application of two protocols. The first protocol applies multiple consecutive validated virtual screening filters including SMARTS, a support vector-machine model (ROC = 0.98), a Bayesian model (ROC = 0.86) and structure-based pharmacophore filters based on urea-based kinase inhibitor complexes retrieved from the literature. This is followed by hits profiling against different extended electron distribution (XED) based field templates representing different kinase targets. The second protocol enables cancericidal activity verification by using the algorithm of feature trees (Ftrees) similarity searching against the NCI database. Being a proof-of-concept study, this combined procedure was experimentally validated by its utilization in developing a novel series of urea-based derivatives of strong anticancer activity. This new series is based on the 3-benzylbenzo[d]thiazol-2(3H)-one scaffold, which has interesting chemical feasibility and wide diversification capability. Antineoplastic activity of this series was assayed in vitro against NCI 60 tumor-cell lines, showing very strong inhibition with GI50 values as low as 0.9 µM. Additionally, its mechanism was unleashed using the KINEX™ protein kinase microarray-based small molecule inhibitor profiling platform and cell cycle analysis, showing a peculiar selectivity pattern against Zap70, c-src, Mink1, csk and MeKK2 kinases. Interestingly, it showed activity on syk kinase, confirming recent findings of the high activity of diphenyl-urea-containing compounds against this kinase. Overall, the new series …

  43. MEASURING THE PERFORMANCE OF SIMILARITY PROPAGATION IN A SEMANTIC SEARCH ENGINE

    Directory of Open Access Journals (Sweden)

    S. K. Jayanthi

    2013-10-01

    In the current scenario, web page result personalization plays a vital role. Nearly 80% of users expect the best results on the first page itself, without the persistence to browse further. This research work focuses on two main themes: semantic web search online, and domain-based search offline. The first part is to find an effective method which allows grouping similar results together using a BookShelf data structure and organizing the various clusters. The second focuses on academic domain-based search offline. This paper focuses on finding documents which are similar, and on how a vector space can be used to solve it. More weight is therefore given to the principles and working methodology of similarity propagation. The cosine similarity measure is used for finding the relevancy among documents.
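
    The vector-space step is standard: embed documents as TF-IDF vectors and score relatedness by cosine similarity. A minimal sketch using scikit-learn (assumed available):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      docs = [
          "semantic web search engine ranking",
          "search engine result ranking and clustering",
          "protein structure similarity alignment",
      ]
      tfidf = TfidfVectorizer().fit_transform(docs)
      sim = cosine_similarity(tfidf)            # pairwise document similarities
      print(sim.round(2))                       # docs 0 and 1 group; doc 2 stands apart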

  44. Music Retrieval based on Melodic Similarity

    NARCIS (Netherlands)

    Typke, R.

    2007-01-01

    This thesis introduces a method for measuring melodic similarity for notated music such as MIDI files. This music search algorithm views music as sets of notes that are represented as weighted points in the two-dimensional space of time and pitch. Two point sets can be compared by calculating how …

  45. Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach.

    Science.gov (United States)

    Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor

    2010-11-01

    SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal 5:39-42, 1992; Pongor et al., Nucleic Acids Res 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.

  6. Cloud4Psi: cloud computing for 3D protein structure similarity searching.

    Science.gov (United States)

    Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur

    2014-10-01

    Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments, such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT), are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed a cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. © The Author 2014. Published by Oxford University Press.

  7. Similarity-based pattern analysis and recognition

    CERN Document Server

    Pelillo, Marcello

    2013-01-01

    This accessible text/reference presents a coherent overview of the emerging field of non-Euclidean similarity learning. The book presents a broad range of perspectives on similarity-based pattern analysis and recognition methods, from purely theoretical challenges to practical, real-world applications. The coverage includes both supervised and unsupervised learning paradigms, as well as generative and discriminative models. Topics and features: explores the origination and causes of non-Euclidean (dis)similarity measures, and how they influence the performance of traditional classification algorithms.

  8. Semantic similarity measures in the biomedical domain by leveraging a web search engine.

    Science.gov (United States)

    Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

    2013-07-01

    Various studies of web-related semantic similarity measures have been conducted. However, measuring the semantic similarity between two terms remains a challenging task. Traditional ontology-based methodologies have the limitation that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, corpus-based methodologies can overcome the limitation, and the web is a continuously and enormously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. The similarity scores of different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validated against two datasets (dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen) are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores compared with other methods' measures. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores showed the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to those of physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database.

  9. Similarity searching and scaffold hopping in synthetically accessible combinatorial chemistry spaces.

    Science.gov (United States)

    Boehm, Markus; Wu, Tong-Ying; Claussen, Holger; Lemmen, Christian

    2008-04-24

    Large collections of combinatorial libraries are an integral element in today's pharmaceutical industry. It is of great interest to perform similarity searches against all virtual compounds that are synthetically accessible by any such library. Here we describe the successful application of the new software tool CoLibri on 358 combinatorial libraries based on validated reaction protocols to create a single chemistry space containing over 10^12 possible products. Similarity searching with FTrees-FS allows the systematic exploration of this space without the need to enumerate all product structures. The search result is a set of virtual hits which are synthetically accessible by one or more of the existing reaction protocols. Grouping these virtual hits by their synthetic protocols allows the rapid design and synthesis of multiple follow-up libraries. Such library ideas support hit-to-lead design efforts for tasks like follow-up from high-throughput screening hits or scaffold hopping from one hit to another attractive series.

  10. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results, and how to make use of various kinds of stored search results to address aspects of comparative genomic analysis.

  11. Similarity-based Polymorphic Shellcode Detection

    Directory of Open Access Journals (Sweden)

    Denis Yurievich Gamayunov

    2013-02-01

    Full Text Available In this work a method for polymorphic shellcode detection based on a set of known shellcodes is proposed. The method's main idea is to sequentially apply deobfuscating transformations to the analyzed data and then recognize its similarity to malware samples. The method has been tested on sets of shellcodes generated using Metasploit Framework v.4.1.0 and PELock Obfuscator, and shows 87% precision with a zero false-positive rate.

  12. δ-Similar Elimination to Enhance Search Performance of Multiobjective Evolutionary Algorithms

    Science.gov (United States)

    Aguirre, Hernán; Sato, Masahiko; Tanaka, Kiyoshi

    In this paper, we propose δ-similar elimination to improve the search performance of multiobjective evolutionary algorithms in combinatorial optimization problems. This method eliminates similar individuals in objective space to fairly distribute selection among the different regions of the instantaneous Pareto front. We investigate four elimination methods, analyzing their effects using NSGA-II. In addition, we compare the search performance of NSGA-II enhanced by our method and NSGA-II enhanced by controlled elitism.
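
    A minimal sketch of the core idea, assuming plain Euclidean distance in objective space (the paper investigates four eliminating variants, not reproduced here): individuals within δ of an already-kept individual are dropped before selection.

        # Keep only individuals farther than delta from every kept one.
        import numpy as np

        def delta_similar_elimination(objs: np.ndarray, delta: float) -> list:
            kept = []
            for i, f in enumerate(objs):
                if all(np.linalg.norm(f - objs[j]) > delta for j in kept):
                    kept.append(i)
            return kept  # indices of surviving individuals

        pop = np.array([[1.0, 2.0], [1.01, 2.0], [3.0, 0.5]])
        print(delta_similar_elimination(pop, delta=0.1))  # -> [0, 2]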

  13. Content Based Searching for INIS

    International Nuclear Information System (INIS)

    Jain, V.; Jain, R.K.

    2016-01-01

    Full text: Whatever a user wants is available on the internet, but to retrieve the information efficiently, a multilingual and most-relevant document search engine is a must. Most current search engines are word based or pattern based: they do not consider the meaning of the query posed to them, rely purely on its keywords, offer no support for multilingual queries, and do not dismiss nonrelevant results. Current information-retrieval techniques either rely on an encoding process, using a certain perspective or classification scheme to describe a given item, or perform a full-text analysis, searching for user-specified words. Neither case guarantees content matching, because an encoded description might reflect only part of the content, and the mere occurrence of a word does not necessarily reflect the document's content. For general documents, there doesn't yet seem to be a much better option than lazy full-text analysis, manually going through those endless results pages. In contrast, a new search engine should extract the meaning of the query and then perform the search based on this extracted meaning. It should also employ Interlingua-based machine translation technology to present information in the language of the user's choice. (author)

  14. Image magnification based on similarity analogy

    International Nuclear Information System (INIS)

    Chen Zuoping; Ye Zhenglin; Wang Shuxun; Peng Guohua

    2009-01-01

    To address the high time complexity of the decoding phase in traditional fractal-coding-based image enlargement methods, a novel image magnification algorithm is proposed in this paper. It has the advantage of iteration-free decoding, exploiting the similarity analogy between an image and its zoomed-out and zoomed-in versions. A new pixel selection technique is also presented to further improve the performance of the proposed method. Furthermore, by combining some existing fractal zooming techniques, an efficient image magnification algorithm is obtained, which provides image quality as good as the state of the art while greatly decreasing the time complexity of the decoding phase.

  15. Towards novel organic high-Tc superconductors: Data mining using density of states similarity search

    Science.gov (United States)

    Geilhufe, R. Matthias; Borysov, Stanislav S.; Kalpakchi, Dmytro; Balatsky, Alexander V.

    2018-02-01

    Identifying novel functional materials with desired key properties is an important part of bridging the gap between fundamental research and technological advancement. In this context, high-throughput calculations combined with data-mining techniques have greatly accelerated this process in different areas of research during the past years. The strength of a data-driven approach for materials prediction lies in narrowing down the search space of thousands of materials to a subset of prospective candidates. Recently, the open-access organic materials database OMDB was released, providing electronic structure data for thousands of previously synthesized three-dimensional organic crystals. Based on the OMDB, we report the implementation of a novel density of states similarity search tool which is capable of retrieving materials with a density of states similar to that of a reference material. The tool is based on the approximate nearest neighbor algorithm as implemented in the ANNOY library and can be applied via the OMDB web interface. The approach presented here is wide ranging and can be applied to various problems where the density of states is responsible for certain key properties of a material. As the first application, we report on materials exhibiting electronic structure similarities to the aromatic hydrocarbon p-terphenyl, which was recently discussed as a potential organic high-temperature superconductor exhibiting a transition temperature on the order of 120 K under strong potassium doping. Although the mechanism driving the remarkable transition temperature remains under debate, we argue that the density of states, reflecting the electronic structure of a material, might serve as a crucial ingredient for the observed high Tc. To provide candidates which might exhibit comparable properties, we present 15 purely organic materials with features similar to p-terphenyl within the electronic structure, which also tend to have structural similarities with p-terphenyl.
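
    A hedged sketch of such a lookup with the ANNOY library (which the tool is reported to build on): the discretized DOS vectors and material identifiers below are invented placeholders, not OMDB data.

        # Approximate nearest-neighbor search over stand-in DOS vectors.
        import random
        from annoy import AnnoyIndex

        DIM = 128                             # length of a discretized DOS vector
        index = AnnoyIndex(DIM, "angular")    # cosine-like metric
        materials = ["mat-%04d" % i for i in range(1000)]   # hypothetical IDs
        for i in range(len(materials)):
            index.add_item(i, [random.random() for _ in range(DIM)])
        index.build(50)                       # 50 trees: accuracy/speed tradeoff

        query_dos = [random.random() for _ in range(DIM)]   # e.g. a reference DOS
        for hit in index.get_nns_by_vector(query_dos, 15):
            print(materials[hit])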

  16. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2004-10-01

    Full Text Available Abstract Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM, using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node.

  17. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

    Science.gov (United States)

    Wang, Chunlin; Lefkowitz, Elliot J

    2004-10-28

    Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together

  18. RxnFinder: biochemical reaction search engines using molecular structures, molecular fragments and reaction similarity.

    Science.gov (United States)

    Hu, Qian-Nan; Deng, Zhe; Hu, Huanan; Cao, Dong-Sheng; Liang, Yi-Zeng

    2011-09-01

    Biochemical reactions play a key role in helping sustain life and allowing cells to grow. RxnFinder was developed to search biochemical reactions from the KEGG reaction database using three search criteria: molecular structures, molecular fragments and reaction similarity. RxnFinder is helpful for obtaining reference reactions for biosynthesis and xenobiotics metabolism. RxnFinder is freely available via: http://sdd.whu.edu.cn/rxnfinder. qnhu@whu.edu.cn.

  19. Information filtering based on transferring similarity.

    Science.gov (United States)

    Sun, Duo; Zhou, Tao; Liu, Jian-Guo; Liu, Run-Ran; Jia, Chun-Xiao; Wang, Bing-Hong

    2009-07-01

    In this Brief Report, we propose an index of user similarity, namely, the transferring similarity, which involves all high-order similarities between users. Accordingly, we design a modified collaborative filtering algorithm, which provides remarkably more accurate predictions than standard collaborative filtering. More interestingly, we find that the algorithmic performance approaches its optimal value when the parameter contained in the definition of transferring similarity gets close to its critical value, before which the series expansion of transferring similarity is convergent and after which it is divergent. Our study is complementary to the one reported in [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)], and is relevant to the missing link prediction problem.
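
    A sketch of the idea, under the assumption that the transferring similarity takes the damped power-series form S + aS^2 + a^2S^3 + ... over a user similarity matrix S; this series converges while a stays below the reciprocal of S's largest eigenvalue, matching the critical-value behavior described above. The matrix below is a toy example.

        # Transferring similarity as a truncated damped power series.
        import numpy as np

        def transferring_similarity(S: np.ndarray, a: float, n_terms: int = 50):
            S_T, term = np.zeros_like(S), S.copy()
            for _ in range(n_terms):
                S_T += term
                term = a * term @ S    # next-order similarity, damped by a
            return S_T

        S = np.array([[0.0, 0.8, 0.1], [0.8, 0.0, 0.5], [0.1, 0.5, 0.0]])
        crit = 1.0 / np.max(np.linalg.eigvalsh(S))   # critical damping value
        print(transferring_similarity(S, 0.9 * crit))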

  20. A similarity-based data warehousing environment for medical images.

    Science.gov (United States)

    Teixeira, Jefferson William; Annibal, Luana Peixoto; Felipe, Joaquim Cezar; Ciferri, Ricardo Rodrigues; Ciferri, Cristina Dutra de Aguiar

    2015-11-01

    A core issue of the decision-making process in the medical field is to support the execution of analytical (OLAP) similarity queries over images in data warehousing environments. In this paper, we focus on this issue. We propose imageDWE, a non-conventional data warehousing environment that enables the storage of intrinsic features taken from medical images in a data warehouse and supports OLAP similarity queries over them. To meet this goal, we introduce the concept of a perceptual layer, which is an abstraction used to represent an image dataset according to a given feature descriptor in order to enable similarity search. Based on this concept, we propose imageDW, an extended data warehouse with dimension tables specifically designed to support one or more perceptual layers. We also detail how to build an imageDW and how to load image data into it. Furthermore, we show how to process OLAP similarity queries composed of a conventional predicate and a similarity search predicate that encompasses the specification of one or more perceptual layers. Moreover, we introduce an index technique to improve OLAP query processing over images. We carried out performance tests over a data warehouse environment that consolidated medical images from exams of several modalities. The results demonstrated the feasibility and efficiency of our proposed imageDWE to manage images and to process OLAP similarity queries. The results also demonstrated that the use of the proposed index technique guaranteed a great improvement in query processing. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Efficient blind search for similar-waveform earthquakes in years of continuous seismic data

    Science.gov (United States)

    Yoon, C. E.; Bergen, K.; Rong, K.; Elezabi, H.; Bailis, P.; Levis, P.; Beroza, G. C.

    2017-12-01

    Cross-correlating an earthquake waveform template with continuous seismic data has proven to be a sensitive, discriminating detector of small events missing from earthquake catalogs, but a key limitation of this approach is that it requires advance knowledge of the earthquake signals we wish to detect. To overcome this limitation, we can perform a blind search for events with similar waveforms, comparing waveforms from all possible times within the continuous data (Brown et al., 2008). However, the runtime for naive blind search scales quadratically with the duration of continuous data, making it impractical to process years of continuous data. The Fingerprint And Similarity Thresholding (FAST) detection method (Yoon et al., 2015) enables a comprehensive blind search for similar-waveform earthquakes in a fast, scalable manner by adapting data-mining techniques originally developed for audio and image search within massive databases. FAST converts seismic waveforms into compact "fingerprints", which are efficiently organized and searched within a database. In this way, FAST avoids the unnecessary comparison of dissimilar waveforms. To date, the longest duration of continuous data used for event detection with FAST was 3 months at a single station near Guy-Greenbrier, Arkansas, which revealed microearthquakes closely correlated with stages of hydraulic fracturing (Yoon et al., 2017). In this presentation we introduce an optimized, parallel version of the FAST software with improvements to the fingerprinting algorithm and the ability to detect events using continuous data from a network of stations (Bergen et al., 2016). We demonstrate its ability to detect low-magnitude earthquakes within several years of continuous data at locations of interest in California.

  2. A Framework for Similarity Search with Space-Time Tradeoffs using Locality Sensitive Filtering

    DEFF Research Database (Denmark)

    Christiani, Tobias Lybecker

    2017-01-01

    We present a framework for similarity search based on Locality-Sensitive Filtering (LSF), generalizing the Indyk-Motwani (STOC 1998) Locality-Sensitive Hashing (LSH) framework to support space-time tradeoffs. Given a family of filters, defined as a distribution over pairs of subsets of space that satisfies certain locality-sensitivity properties, we can construct a dynamic data structure that solves the approximate near neighbor problem in $d$-dimensional space with query time $dn^{\rho_q + o(1)}$, update time $dn^{\rho_u + o(1)}$, and space usage $dn + n^{1 + \rho_u + o(1)}$, where $n$ denotes the number of points in the data structure. The space-time tradeoff is tied to the tradeoff between query time and update time (insertions/deletions), controlled by the exponents $\rho_q, \rho_u$ that are determined by the filter family. Locality-sensitive filtering was introduced by Becker et al. (SODA

  3. Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases

    DEFF Research Database (Denmark)

    Assent, Ira; Wichterich, Marc; Meisen, Tobias

    2008-01-01

    Multimedia similarity search in large databases requires efficient query processing. The Earth mover's distance, introduced in computer vision, is successfully used as a similarity model in a number of small-scale applications. Its computational complexity hindered its adoption in large multimedia...... databases. We enable directly indexing the Earth mover's distance in structures such as the R-tree and the VA-file by providing the accurate 'MinDist' function to any bounding rectangle in the index. We exploit the computational structure of the new MinDist to derive a new lower bound for the EMD Min...
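
    For orientation, the one-dimensional Earth mover's distance between two feature histograms can be computed directly with SciPy, as below; the paper's index-level MinDist lower bound is not reproduced here, and the histograms are made up.

        # 1D Earth mover's distance between two toy feature histograms.
        import numpy as np
        from scipy.stats import wasserstein_distance

        bins = np.arange(8)                      # histogram bin positions
        h1 = np.array([0.1, 0.4, 0.3, 0.2, 0, 0, 0, 0])
        h2 = np.array([0, 0, 0.2, 0.3, 0.4, 0.1, 0, 0])
        print(wasserstein_distance(bins, bins, u_weights=h1, v_weights=h2))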

  4. Evidence-based Medicine Search: a customizable federated search engine.

    Science.gov (United States)

    Bracke, Paul J; Howse, David K; Keim, Samuel M

    2008-04-01

    This paper reports on the development of a tool by the Arizona Health Sciences Library (AHSL) for searching clinical evidence that can be customized for different user groups. The AHSL provides services to the University of Arizona's (UA's) health sciences programs and to the University Medical Center. Librarians at AHSL collaborated with UA College of Medicine faculty to create an innovative search engine, Evidence-based Medicine (EBM) Search, that provides users with a simple search interface to EBM resources and presents results organized according to an evidence pyramid. EBM Search was developed with a web-based configuration component that allows the tool to be customized for different specialties. Informal and anecdotal feedback from physicians indicates that EBM Search is a useful tool with potential in teaching evidence-based decision making. While formal evaluation is still being planned, a tool such as EBM Search, which can be configured for specific user populations, may help lower barriers to information resources in an academic health sciences center.

  5. Mathematical programming solver based on local search

    CERN Document Server

    Gardi, Frédéric; Darlay, Julien; Estellon, Bertrand; Megel, Romain

    2014-01-01

    This book covers local search for combinatorial optimization and its extension to mixed-variable optimization. Although not yet understood from the theoretical point of view, local search is the paradigm of choice for tackling large-scale real-life optimization problems. Today's end-users demand interactivity with decision support systems. For optimization software, this means obtaining good-quality solutions quickly. Fast iterative improvement methods, like local search, are suited to satisfying such needs. Here the authors show local search in a new light, in particular presenting a new kind of mathematical programming solver, namely LocalSolver, based on neighborhood search. First, an iconoclastic methodology is presented to design and engineer local search algorithms. The authors' concern about industrializing local search approaches is of particular interest for practitioners. This methodology is applied to solve two industrial problems with high economic stakes. Software based on local search induces ex...

  6. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    International Nuclear Information System (INIS)

    Ginzinger, Simon W.; Coles, Murray

    2009-01-01

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods

  7. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database

    Energy Technology Data Exchange (ETDEWEB)

    Ginzinger, Simon W. [Center of Applied Molecular Engineering, University of Salzburg, Department of Molecular Biology, Division of Bioinformatics (Austria)], E-mail: simon@came.sbg.ac.at; Coles, Murray [Max-Planck-Institute for Developmental Biology, Department of Protein Evolution (Germany)], E-mail: Murray.Coles@tuebingen.mpg.de

    2009-03-15

    We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.

  8. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

    Full Text Available Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, of which ~30% are explicitly annotated as "hypothetical" or "uncharacterized" proteins. Our Comparison of Protein Active-Site Structures (CPASS) v.2 database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR) protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG) to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean-square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores

  9. Privacy Preserving Similarity Based Text Retrieval through Blind Storage

    Directory of Open Access Journals (Sweden)

    Pinki Kumari

    2016-09-01

    Full Text Available Cloud computing is growing rapidly due to its many advantages, and more data owners are interested in outsourcing their data to cloud storage to centralize it. As huge files are stored in the cloud, a keyword-based search process must be provided to data users. At the same time, to protect data privacy, sensitive data is encrypted before being outsourced to the cloud server, but this makes searching over the encrypted data difficult. In this system we propose similarity-based text retrieval from encrypted blind storage blocks. The blind storage system provides additional security because data is stored at random locations in cloud storage. In the existing approach, the data owner cannot encrypt the document data, as encryption was done only at the server end, and everyone could access the data because no private-key concept was applied to maintain data privacy. In our proposed system, the data owner encrypts the data himself using the RSA algorithm. RSA is a public-key cryptosystem and is widely used for sensitive data storage over the Internet. In our system we use a text mining process to identify the index files of user documents. Before encryption we also use NLP (Natural Language Processing) techniques to identify keyword synonyms in the data owner's documents. The text mining process examines the text word by word and collects the literal meaning beyond the word groups that compose each sentence. Those words are examined against the WordNet API so that equivalent words can be identified for use in the index file. Our proposed system provides a more secure and authorized way of retrieving text from cloud storage with access control. Finally, our experimental results show that our system outperforms the existing one.

  10. Proposal for a Similar Question Search System on a Q&A Site

    Directory of Open Access Journals (Sweden)

    Katsutoshi Kanamori

    2014-06-01

    Full Text Available There is a service to help Internet users obtain answers to specific questions when they visit a Q&A site. A Q&A site is very useful for the Internet user, but posted questions are often not answered immediately. This delay in answering occurs because in most cases another site user is answering the question manually. In this study, we propose a system that can present a question that is similar to a question posted by a user. An advantage of this system is that a user can refer to an answer to a similar question. This research measures the similarity of a candidate question based on word and dependency parsing. In an experiment, we examined the effectiveness of the proposed system for questions actually posted on the Q&A site. The result indicates that the system can show the questioner the answer to a similar question. However, the system still has a number of aspects that should be improved.
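
    A toy baseline for the candidate-ranking step, using only word overlap; the proposed system additionally relies on dependency parsing, which is omitted here, and the questions are invented examples.

        # Rank candidate questions by Jaccard word overlap with the posted one.
        def jaccard(q1: str, q2: str) -> float:
            a, b = set(q1.lower().split()), set(q2.lower().split())
            return len(a & b) / len(a | b) if a | b else 0.0

        posted = "how do I reset my router password"
        candidates = ["how to reset a router password",
                      "why is my internet slow"]
        best = max(candidates, key=lambda c: jaccard(posted, c))
        print(best)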

  11. Quantum signature scheme based on a quantum search algorithm

    International Nuclear Information System (INIS)

    Yoon, Chun Seok; Kang, Min Sung; Lim, Jong In; Yang, Hyung Jin

    2015-01-01

    We present a quantum signature scheme based on a two-qubit quantum search algorithm. For secure transmission of signatures, we use a quantum search algorithm that has not been used in previous quantum signature schemes. A two-step protocol secures the quantum channel, and a trusted center guarantees non-repudiation that is similar to other quantum signature schemes. We discuss the security of our protocol. (paper)

  12. Similar words analysis based on POS-CBOW language model

    Directory of Open Access Journals (Sweden)

    Dongru RUAN

    2015-10-01

    Full Text Available Similar words analysis is one of the important aspects of natural language processing, and it has important research and application value in text classification, machine translation and information recommendation. Focusing on the features of Sina Weibo's short texts, this paper presents a language model named POS-CBOW, a continuous bag-of-words language model with a filtering layer and a part-of-speech tagging layer. The proposed approach can adjust the word vectors' similarity according to the cosine similarity and the word vectors' part-of-speech metrics. It can also filter the similar-word sets based on a statistical analysis model. The experimental results show that the similar words analysis algorithm based on the proposed POS-CBOW language model outperforms that based on the traditional CBOW language model.

  13. Multicriteria decision-making method based on a cosine similarity ...

    African Journals Online (AJOL)

    The cosine similarity measure is often used in information retrieval, citation analysis, and automatic classification. However, it scarcely deals with trapezoidal fuzzy information and multicriteria decision-making problems. For this purpose, a cosine similarity measure between trapezoidal fuzzy numbers is proposed based on ...

  14. A Model-Based Approach to Constructing Music Similarity Functions

    Science.gov (United States)

    West, Kris; Lamere, Paul

    2006-12-01

    Several authors have presented systems that estimate the audio similarity of two pieces of music through the calculation of a distance metric, such as the Euclidean distance, between spectral features calculated from the audio, related to the timbre or pitch of the signal. These features can be augmented with other, temporally or rhythmically based features such as zero-crossing rates, beat histograms, or fluctuation patterns to form a more well-rounded music similarity function. It is our contention that perceptual or cultural labels, such as the genre, style, or emotion of the music, are also very important features in the perception of music. These labels help to define complex regions of similarity within the available feature spaces. We demonstrate a machine-learning-based approach to the construction of a similarity metric, which uses this contextual information to project the calculated features into an intermediate space where a music similarity function that incorporates some of the cultural information may be calculated.

  15. Fast Depiction Invariant Visual Similarity for Content Based Image Retrieval Based on Data-driven Visual Similarity using Linear Discriminant Analysis

    Science.gov (United States)

    Wihardi, Y.; Setiawan, W.; Nugraha, E.

    2018-01-01

    In this research, we build a content-based image retrieval system (CBIRS) based on a learned distance/similarity function using Linear Discriminant Analysis (LDA) and Histogram of Oriented Gradients (HoG) features. Our method is invariant to the depiction of an image, covering image-to-image, sketch-to-image, and painting-to-image similarity. LDA decreases execution time compared to state-of-the-art methods, but accuracy still needs improvement. Inaccuracy in our experiments arose because we did not perform a sliding-window search and because of the low number of negative samples of natural-world images.
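
    A hedged sketch of such a pipeline, with random stand-in images and placeholder labels: HoG descriptors are projected by LDA into a low-dimensional space where images can then be compared by plain distance.

        # HoG features projected by LDA; compare images in the learned space.
        import numpy as np
        from skimage.feature import hog
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        images = [np.random.rand(64, 64) for _ in range(20)]   # stand-in images
        labels = np.repeat(np.arange(4), 5)                    # 4 stand-in classes

        X = np.array([hog(im, pixels_per_cell=(16, 16)) for im in images])
        lda = LinearDiscriminantAnalysis(n_components=3).fit(X, labels)
        Z = lda.transform(X)           # compare images by distance in this space
        print(np.linalg.norm(Z[0] - Z[1]))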

  16. Phishing Detection: Analysis of Visual Similarity Based Approaches

    Directory of Open Access Journals (Sweden)

    Ankit Kumar Jain

    2017-01-01

    Full Text Available Phishing is one of the major problems faced by the cyber world and leads to financial losses for both industries and individuals. Detection of phishing attacks with high accuracy has always been a challenging issue. At present, visual similarity based techniques are very useful for detecting phishing websites efficiently. A phishing website looks very similar in appearance to its corresponding legitimate website in order to deceive users into believing that they are browsing the correct website. Visual similarity based phishing detection techniques utilise feature sets like text content, text format, HTML tags, Cascading Style Sheets (CSS), images, and so forth, to make the decision. These approaches compare the suspicious website with the corresponding legitimate website using various features, and if the similarity is greater than a predefined threshold value then it is declared phishing. This paper presents a comprehensive analysis of phishing attacks, their exploitation, some of the recent visual similarity based approaches for phishing detection, and a comparative study of them. Our survey provides a better understanding of the problem, the current solution space, and the scope of future research to deal with phishing attacks efficiently using visual similarity based approaches.

  17. Natural texture retrieval based on perceptual similarity measurement

    Science.gov (United States)

    Gao, Ying; Dong, Junyu; Lou, Jianwen; Qi, Lin; Liu, Jun

    2018-04-01

    A typical texture retrieval system performs feature comparison and might not be able to make human-like judgments of image similarity. Meanwhile, it is commonly known that perceptual texture similarity is difficult to describe using traditional image features. In this paper, we propose a new texture retrieval scheme based on perceptual texture similarity. The key of the proposed scheme is that the prediction of perceptual similarity is performed by learning a non-linear mapping from image feature space to perceptual texture space using a Random Forest. We test the method on a natural texture dataset and apply it to a new wallpapers dataset. Experimental results demonstrate that the proposed texture retrieval scheme with perceptual similarity improves retrieval performance over traditional image features.
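
    The core idea can be sketched as a regression from pairwise feature differences to human ratings; the training data below is synthetic, and representing a texture pair by the absolute feature difference is an assumption, not necessarily the paper's construction.

        # Learn a mapping from feature differences to perceptual similarity.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(0)
        feat_diff = rng.random((200, 32))      # |f(x) - f(y)| per texture pair
        human_score = rng.random(200)          # perceptual ratings (stand-in)

        model = RandomForestRegressor(n_estimators=100).fit(feat_diff, human_score)
        query = rng.random((1, 32))
        print(model.predict(query))            # predicted perceptual similarity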

  18. Space based microlensing planet searches

    Directory of Open Access Journals (Sweden)

    Tisserand Patrick

    2013-04-01

    Full Text Available The discovery of extra-solar planets is arguably the most exciting development in astrophysics during the past 15 years, rivalled only by the detection of dark energy. Two projects unite the communities of exoplanet scientists and cosmologists: the proposed ESA M-class mission EUCLID and the large space mission WFIRST, top ranked by the Astronomy 2010 Decadal Survey report. The latter states that: "Space-based microlensing is the optimal approach to providing a true statistical census of planetary systems in the Galaxy, over a range of likely semi-major axes". It also adds: "This census, combined with that made by the Kepler mission, will determine how common Earth-like planets are over a wide range of orbital parameters". We will present a status report of the results obtained by microlensing on exoplanets and the new objectives of the next generation of ground-based wide-field imager networks. We will finally discuss the fantastic prospect offered by space-based microlensing on the horizon of 2020–2025.

  19. Link-Based Similarity Measures Using Reachability Vectors

    Directory of Open Access Journals (Sweden)

    Seok-Ho Yoon

    2014-01-01

    Full Text Available We present a novel approach for computing link-based similarities among objects accurately by utilizing the link information pertaining to the objects involved. We discuss the problems with previous link-based similarity measures and propose a novel approach for computing link-based similarities that does not suffer from these problems. In the proposed approach each target object is represented by a vector. Each element of the vector corresponds to one of the objects in the given data, and the value of each element denotes the weight for the corresponding object. For this weight value, we propose to utilize the probability of reaching the specific object from the target object, computed using the "Random Walk with Restart" strategy. Then, we define the similarity between two objects as the cosine similarity of the two vectors. In this paper, we provide examples to show that our approach does not suffer from the aforementioned problems. We also evaluate the performance of the proposed methods in comparison with existing link-based measures, qualitatively and quantitatively, with respect to two kinds of data sets, scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures.
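
    A compact sketch of the proposed pipeline on a toy graph: each object's reachability vector is computed by Random Walk with Restart, and two objects are then compared by the cosine of their vectors. The restart probability and iteration count are illustrative choices.

        # Reachability vectors via Random Walk with Restart, then cosine similarity.
        import numpy as np

        def rwr(A: np.ndarray, seed: int, c: float = 0.15, iters: int = 100):
            P = A / A.sum(axis=0, keepdims=True)   # column-stochastic transitions
            r = e = np.eye(A.shape[0])[:, seed]
            for _ in range(iters):
                r = (1 - c) * P @ r + c * e        # walk with restart prob. c
            return r

        A = np.array([[0, 1, 1, 0], [1, 0, 1, 0],
                      [1, 1, 0, 1], [0, 0, 1, 0]], float)
        u, v = rwr(A, 0), rwr(A, 1)
        print(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))  # cosine similarity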

  20. Similarity-Based Interference and the Acquisition of Adjunct Control

    Directory of Open Access Journals (Sweden)

    Juliana Gerard

    2017-10-01

    Full Text Available Previous research on the acquisition of adjunct control has observed non-adultlike behavior for sentences like "John bumped Mary after tripping on the sidewalk." While adults only allow a subject control interpretation for these sentences (that John tripped on the sidewalk), preschool-aged children have been reported to allow a much wider range of interpretations. A number of different tasks have been used with the aim of identifying a grammatical source of children's errors. In this paper, we consider the role of extragrammatical factors. In two comprehension experiments, we demonstrate that error rates go up when the similarity increases between an antecedent and a linearly intervening noun phrase, first with similarity in gender, and next with similarity in number marking. This suggests that difficulties with adjunct control are to be explained (at least in part) by the sentence processing mechanisms that underlie similarity-based interference in adults.

  1. Similarity-based recommendation of new concepts to a terminology

    NARCIS (Netherlands)

    Chandar, Praveen; Yaman, Anil; Hoxha, Julia; He, Zhe; Weng, Chunhua

    2015-01-01

    Terminologies can suffer from poor concept coverage due to delays in addition of new concepts. This study tests a similarity-based approach to recommending concepts from a text corpus to a terminology. Our approach involves extraction of candidate concepts from a given text corpus, which are

  2. Similarity Digest Search: A Survey and Comparative Analysis of Strategies to Perform Known File Filtering Using Approximate Matching

    Directory of Open Access Journals (Sweden)

    Vitor Hugo Galhardo Moia

    2017-01-01

    Full Text Available Digital forensics is a branch of computer science aiming at investigating and analyzing electronic devices in the search for crime evidence. There are several ways to perform this search. Known File Filter (KFF) is one of them, where a list of objects of interest is used to reduce/separate data for analysis. Holding a database of hashes of such objects, the examiner performs lookups for matches against the target device. However, due to limitations of hash functions (their inability to detect similar objects), new methods have been designed, called approximate matching. This sort of function has interesting characteristics for KFF investigations but suffers mainly from high costs when dealing with huge data sets, as the search is usually done by brute force. To mitigate this problem, strategies have been developed to perform lookups more efficiently. In this paper, we present the state of the art of similarity digest search strategies, along with a detailed comparison involving several aspects, such as time complexity, memory requirements, and search precision. Our results show that none of the approaches addresses all of these main aspects at once. Finally, we discuss future directions and present requirements for a new strategy aiming to overcome the current limitations.

  3. A Model-Based Approach to Constructing Music Similarity Functions

    Directory of Open Access Journals (Sweden)

    Lamere Paul

    2007-01-01

    Full Text Available Several authors have presented systems that estimate the audio similarity of two pieces of music through the calculation of a distance metric, such as the Euclidean distance, between spectral features calculated from the audio, related to the timbre or pitch of the signal. These features can be augmented with other, temporally or rhythmically based features such as zero-crossing rates, beat histograms, or fluctuation patterns to form a more well-rounded music similarity function. It is our contention that perceptual or cultural labels, such as the genre, style, or emotion of the music, are also very important features in the perception of music. These labels help to define complex regions of similarity within the available feature spaces. We demonstrate a machine-learning-based approach to the construction of a similarity metric, which uses this contextual information to project the calculated features into an intermediate space where a music similarity function that incorporates some of the cultural information may be calculated.

  4. Algorithm Research of Individualized Travelling Route Recommendation Based on Similarity

    Directory of Open Access Journals (Sweden)

    Xue Shan

    2015-01-01

    Full Text Available Although commercial recommendation systems have made certain achievements in travelling route development, they face a series of challenges because of people's increasing interest in travelling. The core of a recommendation system is its recommendation algorithm, whose strengths largely determine the system's effectiveness. On this basis, this paper analyzes the traditional collaborative filtering algorithm, illustrates its deficiencies, such as rating unicity and rating matrix sparsity, and proposes an improved algorithm combining a user-based multi-similarity algorithm with a user-based element similarity algorithm, so as to compensate for the traditional algorithm's deficiencies within a controllable range. Experimental results have shown that the improved algorithm has obvious advantages in comparison with the traditional one, and is particularly effective in remedying rating matrix sparsity and rating unicity.

  5. Dyniqx: a novel meta-search engine for metadata based cross search

    OpenAIRE

    Zhu, Jianhan; Song, Dawei; Eisenstadt, Marc; Barladeanu, Cristi; Rüger, Stefan

    2008-01-01

    The effect of metadata in collection fusion has not been sufficiently studied. In response to this, we present a novel meta-search engine called Dyniqx for metadata-based cross search. Dyniqx exploits the availability of metadata in academic search services such as PubMed and Google Scholar for fusing search results from heterogeneous search engines. In addition, metadata from these search engines are used for generating dynamic query controls, such as sliders and tick boxes, which are ...

  6. Accelerator-based neutrino oscillation searches

    International Nuclear Information System (INIS)

    Whitehouse, D.A.; Rameika, R.; Stanton, N.

    1993-01-01

    This paper attempts to summarize the neutrino oscillation section of the Workshop on Future Directions in Particle and Nuclear Physics at Multi-GeV Hadron Beam Facilities. There were very lively discussions about the merits of the different oscillation channels, experiments, and facilities, but we believe a substantial consensus emerged. First, the next decade is one of great potential for discovery in neutrino physics, but it is also one of great peril. The possibility that neutrino oscillations explain the solar neutrino and atmospheric neutrino experiments, and the indirect evidence that Hot Dark Matter (HDM) in the form of light neutrinos might make up 30% of the mass of the universe, point to areas where accelerator-based experiments could play a crucial role in piecing together the puzzle. At the same time, the field faces a very uncertain future. The LSND experiment at LAMPF is the only funded neutrino oscillation experiment in the United States, and it is threatened by the abrupt shutdown of LAMPF proposed for fiscal 1994. The future of neutrino physics at the Brookhaven National Laboratory AGS depends on the continuation of High Energy Physics (HEP) funding after the RHIC startup. Most proposed neutrino oscillation searches at Fermilab depend on the completion of the Main Injector project and on the construction of a new neutrino beamline, which is uncertain at this point. The proposed KAON facility at TRIUMF would provide a neutrino beam similar to that at the AGS but with a much increased intensity. The future of KAON is also uncertain. Despite the difficult obstacles present, there is a real possibility that we are on the verge of understanding the masses and mixings of the neutrinos. The physics importance of such a discovery cannot be overstated. The current experimental status and future possibilities are discussed below.

  7. HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    Science.gov (United States)

    O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

    2015-04-01

    The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in-field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. A similarity based agglomerative clustering algorithm in networks

    Science.gov (United States)

    Liu, Zhiyuan; Wang, Xiujuan; Ma, Yinghong

    2018-04-01

    The detection of clusters is beneficial for understanding the organizations and functions of networks. Clusters, or communities, are usually groups of nodes densely interconnected but sparsely linked with any other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The proposed method initially calculates similarities between each pair of nodes, and forms pre-partitions according to the principle that each node is in the same community as its most similar neighbor. After that, each partition is checked against a community criterion; pre-partitions that do not satisfy it are merged with the partitions to which they have the greatest attraction, until no further changes occur. To measure the attraction ability of a partition, we propose an attraction index based on the importance of the linked nodes in the network. Therefore, our proposed method can better exploit the nodes' properties and the network's structure. To test the performance of our algorithm, both synthetic and empirical networks of different scales are tested. Simulation results show that the proposed algorithm obtains superior clustering results compared with six other widely used community detection algorithms.
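
    The first stage can be sketched as below, taking the Jaccard overlap of neighborhoods as the node similarity (a common choice; the paper's exact index and its attraction-based merging stage are not reproduced here).

        # Pre-partitioning: each node joins its most similar neighbor's community.
        import networkx as nx

        G = nx.karate_club_graph()

        def sim(u, v):
            Nu, Nv = set(G[u]) | {u}, set(G[v]) | {v}
            return len(Nu & Nv) / len(Nu | Nv)

        best_friend = {u: max(G[u], key=lambda v: sim(u, v)) for u in G}

        # Union-find to merge each node with its most similar neighbor.
        parent = {u: u for u in G}
        def find(u):
            while parent[u] != u:
                parent[u] = parent[parent[u]]
                u = parent[u]
            return u
        for u, v in best_friend.items():
            parent[find(u)] = find(v)
        print({u: find(u) for u in G})   # pre-partition labels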

  9. Multicriteria Similarity-Based Anomaly Detection Using Pareto Depth Analysis.

    Science.gov (United States)

    Hsiao, Ko-Jen; Xu, Kevin S; Calder, Jeff; Hero, Alfred O

    2016-06-01

    We consider the problem of identifying patterns in a data set that exhibits anomalous behavior, often referred to as anomaly detection. Similarity-based anomaly detection algorithms detect abnormally large amounts of similarity or dissimilarity, e.g., as measured by the nearest neighbor Euclidean distances between a test sample and the training samples. In many application domains, there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such cases, multiple dissimilarity measures can be defined, including nonmetric measures, and one can test for anomalies by scalarizing using a nonnegative linear combination of them. If the relative importance of the different dissimilarity measures are not known in advance, as in many anomaly detection applications, the anomaly detection algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we propose a method for similarity-based anomaly detection using a novel multicriteria dissimilarity measure, the Pareto depth. The proposed Pareto depth analysis (PDA) anomaly detection algorithm uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach is provably better than using linear combinations of the criteria, and shows superior performance on experiments with synthetic and real data sets.

  10. Combination of Pharmacophore Matching, 2D Similarity Search, and In Vitro Biological Assays in the Selection of Potential 5-HT6 Antagonists from Large Commercial Repositories.

    Science.gov (United States)

    Dobi, Krisztina; Flachner, Beáta; Pukáncsik, Mária; Máthé, Enikő; Bognár, Melinda; Szaszkó, Mária; Magyar, Csaba; Hajdú, István; Lőrincz, Zsolt; Simon, István; Fülöp, Ferenc; Cseh, Sándor; Dormán, György

    2015-10-01

    Rapid in silico selection of target-focused libraries from commercial repositories is an attractive and cost-effective approach. If structures of active compounds are available, a rapid 2D similarity search can be performed on multimillion-compound databases, but the generated library requires further focusing. We report here a combination of the 2D approach with pharmacophore matching, which was used for selecting 5-HT6 antagonists. In the first screening round, 12 of the 91 compounds screened showed >85% antagonist efficacy. For the second-round (hit validation) screening phase, pharmacophore models were built, applied, and compared with the routine 2D similarity search. Three pharmacophore models were created based on the structures of the reference compounds and the first-round hit compounds. The pharmacophore search resulted in a high hit rate (40%) and led to novel chemotypes, while the 2D similarity search had a slightly better hit rate (51%) but lacked novelty. To demonstrate the power of the virtual screening cascade, ligand efficiency indices were also calculated and their steady improvement was confirmed. © 2015 John Wiley & Sons A/S.
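
    The 2D similarity search step described here is commonly implemented as a Tanimoto screen over molecular fingerprints. A minimal sketch, assuming fingerprints are given as sets of "on" bit indices (the compound IDs and the threshold default are illustrative):

    ```python
    def tanimoto(fp_a, fp_b):
        """Tanimoto coefficient between fingerprints given as sets of on bits."""
        inter = len(fp_a & fp_b)
        union = len(fp_a) + len(fp_b) - inter
        return inter / union if union else 0.0

    def similarity_screen(query_fp, library, threshold=0.85):
        """Rank library compounds by Tanimoto similarity to the query."""
        hits = [(cid, tanimoto(query_fp, fp)) for cid, fp in library.items()]
        return sorted([h for h in hits if h[1] >= threshold],
                      key=lambda h: h[1], reverse=True)

    library = {"cmpd-1": {1, 5, 9, 12}, "cmpd-2": {1, 5, 9, 40}, "cmpd-3": {2, 7}}
    print(similarity_screen({1, 5, 9, 12}, library, threshold=0.5))
    # [('cmpd-1', 1.0), ('cmpd-2', 0.6)]
    ```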

  11. An Energy-Based Similarity Measure for Time Series

    Directory of Open Access Journals (Sweden)

    Pierre Brunagel

    2007-11-01

    Full Text Available A new similarity measure for time series analysis, called SimilB, based on the cross-ΨB-energy operator (2004), is introduced. ΨB is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED) or the Pearson correlation coefficient (CC), SimilB includes the temporal information and relative changes of the time series by using their first and second derivatives. SimilB is well suited for both nonstationary and stationary time series, and particularly those presenting discontinuities. Some new properties of ΨB are presented. In particular, we show that ΨB as a similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset, and compared to the CC and ED measures.
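
    The cross-energy idea can be sketched numerically. The discrete form below, Ψ_B(x, y) = x'y' - (x y'' + y x'')/2, reduces to the Teager-Kaiser energy x'^2 - x x'' when x = y; the cosine-style normalization is my assumption and the published SimilB definition may differ:

    ```python
    import numpy as np

    def cross_psi_b(x, y):
        """Discrete cross-energy: psi_B(x, y) = x'.y' - (x.y'' + y.x'')/2.

        For x == y this reduces to the Teager-Kaiser energy x'^2 - x.x''.
        Derivatives are approximated with central differences.
        """
        dx, dy = np.gradient(x), np.gradient(y)
        ddx, ddy = np.gradient(dx), np.gradient(dy)
        return dx * dy - 0.5 * (x * ddy + y * ddx)

    def simil_b(x, y):
        """Normalized cross-energy similarity (assumed normalization)."""
        num = np.sum(cross_psi_b(x, y))
        den = np.sqrt(np.sum(cross_psi_b(x, x)) * np.sum(cross_psi_b(y, y)))
        return num / den

    t = np.linspace(0, 1, 500)
    x = np.sin(2 * np.pi * 5 * t)
    print(simil_b(x, x), simil_b(x, 2 * x))  # both 1.0: amplitude-scale invariant
    ```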

  12. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One then speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases the computational costs are the same as in the general family-free case, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expressive powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.
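
    A gene-adjacency similarity in the spirit of this record can be sketched in a few lines: extract the unordered adjacent gene pairs of each genome and compare the resulting sets. The Jaccard normalization here is an illustrative choice, not one of the paper's three measures:

    ```python
    def adjacencies(genome):
        """Unordered adjacent gene pairs of a (linear) genome."""
        return {frozenset(p) for p in zip(genome, genome[1:])}

    def adjacency_similarity(g1, g2):
        """Fraction of conserved adjacencies (Jaccard-normalized)."""
        a1, a2 = adjacencies(g1), adjacencies(g2)
        return len(a1 & a2) / len(a1 | a2) if (a1 or a2) else 1.0

    # Genomes as sequences of gene identifiers (characters here).
    print(adjacency_similarity("abcde", "abced"))  # ab, bc, de conserved -> 0.6
    ```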

  13. Chemical Information in Scirus and BASE (Bielefeld Academic Search Engine)

    Science.gov (United States)

    Bendig, Regina B.

    2009-01-01

    The author sought to determine to what extent the two search engines, Scirus and BASE (Bielefeld Academic Search Engines), would be useful to first-year university students as the first point of searching for chemical information. Five topics were searched and the first ten records of each search result were evaluated with regard to the type of…

  14. MAC/FAC: A Model of Similarity-Based Retrieval

    Science.gov (United States)

    1994-10-01

    MAC/FAC: A Model of Similarity-Based Retrieval. Kenneth D. Forbus, Dedre Gentner, Keith Law. Technical Report #59, The Institute for the Learning Sciences, Northwestern University, October 1994.

  15. Improved personalized recommendation based on a similarity network

    Science.gov (United States)

    Wang, Ximeng; Liu, Yun; Xiong, Fei

    2016-08-01

    A recommender system helps individual users find preferred items rapidly and has attracted extensive attention in recent years. Many successful recommendation algorithms are designed on bipartite networks, such as network-based inference and heat conduction. However, most of these algorithms define resource-allocation methods with an average allocation. That is not reasonable, because average allocation can reflect neither users' choice preferences nor the influence between users, which leads to non-personalized recommendation results. We propose a personalized recommendation approach that combines a similarity function and the bipartite network to generate a similarity network that improves the resource-allocation process. Our model introduces user influence into the recommender system and states that user influence can make the resource-allocation process more reasonable. We use four different metrics to evaluate our algorithms on three benchmark data sets. Experimental results show that the improved recommendation on a similarity network obtains better accuracy and diversity than some competing approaches.

  16. The Evolution of Facultative Conformity Based on Similarity.

    Science.gov (United States)

    Efferson, Charles; Lalive, Rafael; Cacault, Maria Paula; Kistler, Deborah

    2016-01-01

    Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner's optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one's social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their similarity to

  17. The Evolution of Facultative Conformity Based on Similarity.

    Directory of Open Access Journals (Sweden)

    Charles Efferson

    Full Text Available Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner's optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one's social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their

  18. The Evolution of Facultative Conformity Based on Similarity

    Science.gov (United States)

    Efferson, Charles; Lalive, Rafael; Cacault, Maria Paula; Kistler, Deborah

    2016-01-01

    Conformist social learning can have a pronounced impact on the cultural evolution of human societies, and it can shape both the genetic and cultural evolution of human social behavior more broadly. Conformist social learning is beneficial when the social learner and the demonstrators from whom she learns are similar in the sense that the same behavior is optimal for both. Otherwise, the social learner’s optimum is likely to be rare among demonstrators, and conformity is costly. The trade-off between these two situations has figured prominently in the longstanding debate about the evolution of conformity, but the importance of the trade-off can depend critically on the flexibility of one’s social learning strategy. We developed a gene-culture coevolutionary model that allows cognition to encode and process information about the similarity between naive learners and experienced demonstrators. Facultative social learning strategies that condition on perceived similarity evolve under certain circumstances. When this happens, facultative adjustments are often asymmetric. Asymmetric adjustments mean that the tendency to follow the majority when learners perceive demonstrators as similar is stronger than the tendency to follow the minority when learners perceive demonstrators as different. In an associated incentivized experiment, we found that social learners adjusted how they used social information based on perceived similarity, but adjustments were symmetric. The symmetry of adjustments completely eliminated the commonly assumed trade-off between cases in which learners and demonstrators share an optimum versus cases in which they do not. In a second experiment that maximized the potential for social learners to follow their preferred strategies, a few social learners exhibited an inclination to follow the majority. Most, however, did not respond systematically to social information. Additionally, in the complete absence of information about their similarity to

  19. A general framework for regularized, similarity-based image restoration.

    Science.gov (United States)

    Kheradmand, Amin; Milanfar, Peyman

    2014-12-01

    Any image can be represented as a function defined on a weighted graph, in which the underlying structure of the image is encoded in kernel similarity and associated Laplacian matrices. In this paper, we develop an iterative graph-based framework for image restoration based on a new definition of the normalized graph Laplacian. We propose a cost function which consists of a new data fidelity term and a regularization term derived from the specific definition of the normalized graph Laplacian. The normalizing coefficients used in the definition of the Laplacian and the associated regularization term are obtained using fast symmetry-preserving matrix balancing. This results in desired spectral properties for the normalized Laplacian, such as being symmetric and positive semidefinite, and returning the zero vector when applied to a constant image. Our algorithm comprises outer and inner iterations: in each outer iteration, the similarity weights are recomputed using the previous estimate, and the updated objective function is minimized using inner conjugate gradient iterations. This procedure improves the performance of the algorithm for image deblurring, where we do not have access to a good initial estimate of the underlying image. The specific form of the cost function also allows us to carry out a spectral analysis of the solutions of the corresponding linear equations. Furthermore, the proposed approach is general in the sense that we have shown its effectiveness for different restoration problems, including deblurring, denoising, and sharpening. Experimental results verify the effectiveness of the proposed algorithm on both synthetic and real examples.
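
    The balancing step can be illustrated concretely. The sketch below builds a Gaussian kernel similarity matrix, applies a damped symmetric Sinkhorn-style iteration to find the normalizing coefficients, and returns L = I - DWD, which is symmetric, positive semidefinite, and maps constant vectors to zero; the paper's exact balancing algorithm may differ:

    ```python
    import numpy as np

    def balanced_laplacian(z, sigma=0.5, iters=200):
        """Normalized graph Laplacian from a Gaussian kernel on samples z.

        Finds d > 0 so that diag(d) @ W @ diag(d) is (approximately)
        doubly stochastic, then returns L = I - diag(d) W diag(d).
        """
        z = np.atleast_2d(z).T if np.ndim(z) == 1 else z
        d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma ** 2))
        d = np.ones(len(W))
        for _ in range(iters):            # damped symmetric Sinkhorn step
            d = np.sqrt(d / (W @ d))
        K = d[:, None] * W * d[None, :]
        return np.eye(len(W)) - K

    L = balanced_laplacian(np.array([0.1, 0.12, 0.5, 0.52]))
    print(np.abs(L.sum(1)).max())  # rows sum ~0: constant vector in null space
    ```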

  20. Fast protein tertiary structure retrieval based on global surface shape similarity.

    Science.gov (United States)

    Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke

    2008-09-01

    Characterization and identification of similar tertiary structures of proteins provides rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. Real-time protein structure search by 3D Zernike descriptors will open up new possibilities for large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.

  1. Neutrosophic Refined Similarity Measure Based on Cosine Function

    Directory of Open Access Journals (Sweden)

    Said Broumi

    2014-12-01

    Full Text Available In this paper, the cosine similarity measure of neutrosophic refined (multi-) sets is proposed and its properties are studied. The concept of this cosine similarity measure of neutrosophic refined sets is an extension of the improved cosine similarity measure of single-valued neutrosophic sets. Finally, using this cosine similarity measure of neutrosophic refined sets, an application to medical diagnosis is presented.

  2. Personalizing Web Search based on User Profile

    OpenAIRE

    Utage, Sharyu; Ahire, Vijaya

    2016-01-01

    Web search engines are most widely used for information retrieval from the World Wide Web. These search engines help users find the most useful information. When different users search for the same information, a search engine provides the same results without understanding who submitted the query. Personalized web search is a search technique for providing more useful results. This paper models the preferences of users as hierarchical user profiles. A framework called UPS is proposed. It generalizes profiles and m...

  3. A Full-Text-Based Search Engine for Finding Highly Matched Documents Across Multiple Categories

    Science.gov (United States)

    Nguyen, Hung D.; Steele, Gynelle C.

    2016-01-01

    This report demonstrates a full-text-based search engine that works with any Web-based mobile application. The engine has the capability to search databases across multiple categories based on a user's queries and identify the most relevant or similar entries. The search results presented here were found using an Android (Google Co.) mobile device; however, the engine is also compatible with other mobile phones.

  4. XSemantic: An Extension of LCA Based XML Semantic Search

    Science.gov (United States)

    Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai

    One of the most convenient ways to query XML data is keyword search, because it does not require any knowledge of the XML structure or learning a new user interface. However, keyword search is ambiguous. Users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to complete the query in terms of semantics. First, with semantic term expansion, our system is robust to ambiguous keywords by using the domain ontology. Second, to return semantically meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Third, we present a semantic ranking that reflects the degree of similarity as well as the semantic relationship, so that the search results with higher relevance are presented to the users first. On the other hand, as in the LCA and proximity search approaches, we investigated the problem of how much information should be included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define a simple rule without any requirement on schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact, meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.
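
    The LCA computation that underlies such XML keyword search systems is easy to sketch with ElementTree; XSemantic's LCEA rule and semantic ranking are not reproduced here:

    ```python
    import xml.etree.ElementTree as ET

    def lca_of_keywords(root, keywords):
        """Lowest common ancestor of all elements whose text matches a keyword."""
        parent = {c: p for p in root.iter() for c in p}
        def ancestors(e):
            path = [e]
            while e in parent:
                e = parent[e]
                path.append(e)
            return path[::-1]                  # root ... element
        matches = [e for e in root.iter()
                   if e.text and any(k in e.text for k in keywords)]
        if not matches:
            return None
        paths = [ancestors(e) for e in matches]
        lca = None
        for nodes in zip(*paths):              # walk down while all paths agree
            if all(n is nodes[0] for n in nodes):
                lca = nodes[0]
            else:
                break
        return lca

    doc = ET.fromstring(
        "<bib><book><title>XML search</title><author>Lee</author></book>"
        "<book><title>Graphs</title><author>Kim</author></book></bib>")
    print(lca_of_keywords(doc, ["XML", "Lee"]).tag)  # 'book'
    ```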

  5. A measure of association between vectors based on "similarity covariance"

    OpenAIRE

    Pascual-Marqui, Roberto D.; Lehmann, Dietrich; Kochi, Kieko; Kinoshita, Toshihiko; Yamada, Naoto

    2013-01-01

    The "maximum similarity correlation" definition introduced in this study is motivated by the seminal work of Szekely et al on "distance covariance" (Ann. Statist. 2007, 35: 2769-2794; Ann. Appl. Stat. 2009, 3: 1236-1265). Instead of using Euclidean distances "d" as in Szekely et al, we use "similarity", which can be defined as "exp(-d/s)", where the scaling parameter s>0 controls how rapidly the similarity falls off with distance. Scale parameters are chosen by maximizing the similarity corre...

  6. Top-k Keyword Search Over Graphs Based On Backward Search

    Directory of Open Access Journals (Sweden)

    Zeng Jia-Hui

    2017-01-01

    Full Text Available Keyword search is one of the most friendly and intuitive information retrieval methods. Using keyword search to obtain connected subgraphs has many applications in graph-based cognitive computation, and it is a basic technology. This paper focuses on top-k keyword searching over graphs. We implemented a keyword search algorithm which applies the backward search idea. The algorithm first locates the keyword vertices and then applies backward search to find rooted trees that contain the query keywords. The experiment shows that query time is affected by the number of iterations of the algorithm.
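
    A minimal version of the backward search idea: expand backwards from each keyword's vertex group along reversed edges; a vertex reached from every group is the root of a tree connecting all keywords. Edge weights and the paper's top-k ranking are omitted:

    ```python
    from collections import deque

    def backward_search(graph, keyword_vertices):
        """Find answer roots by backward BFS from each keyword-vertex group.

        graph: {u: [v, ...]} directed edges u -> v.
        keyword_vertices: one set of matching vertices per query keyword.
        """
        rev = {}
        for u, vs in graph.items():
            for v in vs:
                rev.setdefault(v, []).append(u)
        reached = []
        for group in keyword_vertices:
            seen, q = set(group), deque(group)
            while q:                       # backward expansion along rev edges
                v = q.popleft()
                for u in rev.get(v, []):
                    if u not in seen:
                        seen.add(u)
                        q.append(u)
            reached.append(seen)
        return set.intersection(*reached)  # vertices that reach every keyword

    g = {"r": ["a", "b"], "a": ["k1"], "b": ["k2"]}
    print(backward_search(g, [{"k1"}, {"k2"}]))  # {'r'}: root connecting both
    ```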

  7. Density-based retrieval from high-similarity image databases

    DEFF Research Database (Denmark)

    Hansen, Michael Edberg; Carstensen, Jens Michael

    2004-01-01

    Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce a me...

  8. Neutrosophic Cubic MCGDM Method Based on Similarity Measure

    Directory of Open Access Journals (Sweden)

    Surapati Pramanik

    2017-06-01

    Full Text Available The notion of the neutrosophic cubic set originates from the hybridization of the concepts of the neutrosophic set and the interval-valued neutrosophic set. We define a similarity measure for neutrosophic cubic sets and prove some of its basic properties.

  9. AREAL FEATURE MATCHING BASED ON SIMILARITY USING CRITIC METHOD

    Directory of Open Access Journals (Sweden)

    J. Kim

    2015-10-01

    Full Text Available In this paper, we propose an areal feature matching method that can be applied to many-to-many matching, which involves matching a simple entity with an aggregate of several polygons, or two aggregates of several polygons, with less user intervention. To this end, an affine transformation is applied to the two datasets by using polygon pairs for which the building name is the same. Then, the two datasets are overlaid, and intersecting polygon pairs are selected as candidate matching pairs. If many polygons intersect at this stage, we calculate the inclusion function between such polygons. When the value is more than 0.4, the polygons are aggregated into single polygons by using a convex hull. Finally, the shape similarity is calculated between the candidate pairs as the linear sum, with weights computed by the CRITIC method, of the position similarity, shape-ratio similarity, and overlap similarity. Candidate pairs for which the shape similarity is more than 0.7 are determined to be matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map of South Korea. As a result, the visual evaluation showed that polygons were well detected by the proposed method, and the statistical evaluation indicates that the proposed method is accurate on our test dataset, with a high F-measure of 0.91.

  10. Areal Feature Matching Based on Similarity Using Critic Method

    Science.gov (United States)

    Kim, J.; Yu, K.

    2015-10-01

    In this paper, we propose an areal feature matching method that can be applied to many-to-many matching, which involves matching a simple entity with an aggregate of several polygons, or two aggregates of several polygons, with less user intervention. To this end, an affine transformation is applied to the two datasets by using polygon pairs for which the building name is the same. Then, the two datasets are overlaid, and intersecting polygon pairs are selected as candidate matching pairs. If many polygons intersect at this stage, we calculate the inclusion function between such polygons. When the value is more than 0.4, the polygons are aggregated into single polygons by using a convex hull. Finally, the shape similarity is calculated between the candidate pairs as the linear sum, with weights computed by the CRITIC method, of the position similarity, shape-ratio similarity, and overlap similarity. Candidate pairs for which the shape similarity is more than 0.7 are determined to be matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map of South Korea. As a result, the visual evaluation showed that polygons were well detected by the proposed method. The statistical evaluation indicates that the proposed method is accurate on our test dataset, with a high F-measure of 0.91.
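
    The CRITIC weighting step can be sketched directly: each criterion's weight combines its contrast (standard deviation) with its conflict with the other criteria (1 - correlation). The min-max normalization below assumes benefit-type criteria and is a common choice, not necessarily the paper's exact preprocessing:

    ```python
    import numpy as np

    def critic_weights(X):
        """CRITIC objective weights for a criteria matrix X (rows: items).

        Each criterion's information is C_j = sigma_j * sum_k (1 - r_jk),
        where r is the Pearson correlation between criteria columns;
        weights are C normalized to sum to 1.
        """
        X = np.asarray(X, dtype=float)
        Xn = (X - X.min(0)) / (X.max(0) - X.min(0))  # min-max per column
        sigma = Xn.std(0, ddof=1)
        r = np.corrcoef(Xn, rowvar=False)
        C = sigma * (1 - r).sum(1)
        return C / C.sum()

    # Columns: position, shape-ratio, overlap similarity of candidate pairs.
    X = [[0.9, 0.7, 0.8], [0.4, 0.9, 0.3], [0.6, 0.2, 0.7], [0.8, 0.5, 0.9]]
    print(critic_weights(X).round(3))
    ```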

  11. HTTP-based Search and Ordering Using ECHO's REST-based and OpenSearch APIs

    Science.gov (United States)

    Baynes, K.; Newman, D. J.; Pilone, D.

    2012-12-01

    Metadata is an important entity in the process of cataloging, discovering, and describing Earth science data. NASA's Earth Observing System (EOS) ClearingHOuse (ECHO) acts as the core metadata repository for EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. By supporting both the ESIP's Federated Search API and its own search and ordering interfaces, ECHO provides multiple capabilities that facilitate ease of discovery and access to its ever-increasing holdings. Users are able to search and export metadata in a variety of formats including ISO 19115, json, and ECHO10. This presentation aims to inform technically savvy clients interested in automating search and ordering of ECHO's metadata catalog. The audience will be introduced to practical and applicable examples of end-to-end workflows that demonstrate finding, sub-setting and ordering data that is bound by keyword, temporal and spatial constraints. Interaction with the ESIP OpenSearch Interface will be highlighted, as will ECHO's own REST-based API.

  12. Case-based reasoning diagnostic technique based on multi-attribute similarity

    Energy Technology Data Exchange (ETDEWEB)

    Makoto, Takahashi [Tohoku University, Miyagi (Japan)]; Akio, Gofuku [Okayama University, Okayama (Japan)]

    2014-08-15

    A case-based diagnostic technique has been developed based on multi-attribute similarity. A specific feature of the developed system is the use of multiple attributes of process signals for similarity evaluation, to retrieve a similar case stored in a case base. The present technique has been applied to measurement data from Monju with some simulated anomalies. The results of numerical experiments showed that the present technique can be utilized as one of the methods for a hybrid-type diagnosis system.
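
    A multi-attribute similarity retrieval of this kind reduces to a weighted aggregation of per-attribute similarities. A toy sketch, with an assumed 1/(1 + |difference|) attribute similarity and invented signal names and cases:

    ```python
    def retrieve_similar_case(query, case_base, weights):
        """Return the stored case most similar to the query signals.

        query / case signals: {attribute: value}; per-attribute similarity
        is 1 / (1 + |difference|), combined as a weighted average.
        """
        def sim(signals):
            s = sum(w / (1.0 + abs(query[a] - signals[a]))
                    for a, w in weights.items())
            return s / sum(weights.values())
        return max(case_base, key=lambda c: sim(c["signals"]))

    case_base = [
        {"label": "sodium leak",  "signals": {"flow": 0.2, "temp": 3.1, "press": 0.4}},
        {"label": "pump anomaly", "signals": {"flow": 1.5, "temp": 0.2, "press": 0.1}},
    ]
    query = {"flow": 1.4, "temp": 0.3, "press": 0.2}
    weights = {"flow": 2.0, "temp": 1.0, "press": 1.0}
    print(retrieve_similar_case(query, case_base, weights)["label"])  # pump anomaly
    ```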

  13. Similarity-Based Unification: A Multi-Adjoint Approach

    Czech Academy of Sciences Publication Activity Database

    Medina, J.; Ojeda-Aciego, M.; Vojtáš, Peter

    2004-01-01

    Vol. 146, No. 1 (2004), p. 43-62 ISSN 0165-0114 Source of funding: V - other public resources Keywords: similarity * fuzzy unification Subject RIV: BA - General Mathematics Impact factor: 0.734, year: 2004

  14. Similarity score computation for minutiae-based fingerprint recognition

    CSIR Research Space (South Africa)

    Khanyile, NP

    2014-09-01

    Full Text Available This paper identifies and analyses the factors that contribute to the similarity between two sets of minutiae points as well as the probability that two sets of minutiae points were extracted from fingerprints of the same finger. Minutiae...

  15. Patient Similarity in Prediction Models Based on Health Data: A Scoping Review

    Science.gov (United States)

    Sharafoddini, Anis; Dubin, Joel A

    2017-01-01

    Background Physicians and health policy makers are required to make predictions during their decision making in various medical problems. Many advances have been made in predictive modeling toward outcome prediction, but these innovations target an average patient and are insufficiently adjustable for individual patients. One developing idea in this field is individualized predictive analytics based on patient similarity. The goal of this approach is to identify patients who are similar to an index patient and derive insights from the records of similar patients to provide personalized predictions. Objective The aim is to summarize and review published studies describing computer-based approaches for predicting patients' future health status based on health data and patient similarity, identify gaps, and provide a starting point for related future research. Methods The method involved (1) conducting the review by performing automated searches in Scopus, PubMed, and ISI Web of Science, selecting relevant studies by first screening titles and abstracts then analyzing full texts, and (2) documenting by extracting publication details and information on context, predictors, missing data, modeling algorithm, outcome, and evaluation methods into a matrix table, synthesizing data, and reporting results. Results After duplicate removal, 1339 articles were screened by title and abstract and 67 were selected for full-text review. In total, 22 articles met the inclusion criteria. Within the included articles, hospitals were the main source of data (n=10). Cardiovascular disease (n=7) and diabetes (n=4) were the dominant patient diseases. Most studies (n=18) used neighborhood-based approaches in devising prediction models. Two studies showed that patient similarity-based modeling outperformed population-based predictive methods. Conclusions Interest in patient similarity-based predictive modeling for diagnosis and prognosis has been growing. In addition to raw/coded health

  16. Branch length similarity entropy-based descriptors for shape representation

    Science.gov (United States)

    Kwon, Ohsung; Lee, Sang-Hee

    2017-11-01

    In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used for shape recognition tasks such as battle tanks, facial expressions, and butterflies. In the present study, we propose new descriptors for recognition (roundness, symmetry, and surface roughness) which are more accurate and faster to compute than the previous descriptors. The roundness represents how closely a shape resembles a circle, the symmetry characterizes how similar a shape is to its flipped counterpart, and the surface roughness quantifies the degree of vertical deviations of a shape boundary. To evaluate the performance of the descriptors, we used a database of leaf images with 12 species. Each species comprised 10-20 leaf images, and the total number of images was 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.

  17. SDL: Saliency-Based Dictionary Learning Framework for Image Similarity.

    Science.gov (United States)

    Sarkar, Rituparna; Acton, Scott T

    2018-02-01

    In image classification, obtaining adequate data to learn a robust classifier has often proven to be difficult in several scenarios. Classification of histological tissue images for health care analysis is a notable application in this context due to the necessity of surgery, biopsy or autopsy. To adequately exploit limited training data in classification, we propose a saliency guided dictionary learning method and subsequently an image similarity technique for histo-pathological image classification. Salient object detection from images aids in the identification of discriminative image features. We leverage the saliency values for the local image regions to learn a dictionary and respective sparse codes for an image, such that the more salient features are reconstructed with smaller error. The dictionary learned from an image gives a compact representation of the image itself and is capable of representing images with similar content, with comparable sparse codes. We employ this idea to design a similarity measure between a pair of images, where local image features of one image, are encoded with the dictionary learned from the other and vice versa. To effectively utilize the learned dictionary, we take into account the contribution of each dictionary atom in the sparse codes to generate a global image representation for image comparison. The efficacy of the proposed method was evaluated using three tissue data sets that consist of mammalian kidney, lung and spleen tissue, breast cancer, and colon cancer tissue images. From the experiments, we observe that our methods outperform the state of the art with an increase of 14.2% in the average classification accuracy over all data sets.

  18. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles

    Science.gov (United States)

    Liu, Rey-Long

    2015-01-01

    Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794

  19. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles.

    Directory of Open Access Journals (Sweden)

    Rey-Long Liu

    Full Text Available Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.
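
    The bibliographic-coupling core of PBC can be sketched as a cosine-normalized overlap of reference sets, with an optional per-reference weight standing in for the passage-context information. The uniform weight below is a placeholder, not PBC's actual passage weighting:

    ```python
    import math

    def coupling_similarity(refs_a, refs_b, passage_weight=None):
        """Cosine-normalized bibliographic coupling between two articles.

        refs_*: sets of cited reference IDs.  passage_weight optionally
        maps a reference to a weight derived from its citation-context
        passages (uniform 1.0 by default).
        """
        w = passage_weight or (lambda ref: 1.0)
        shared = sum(w(r) for r in refs_a & refs_b)
        norm = math.sqrt(sum(w(r) for r in refs_a) * sum(w(r) for r in refs_b))
        return shared / norm if norm else 0.0

    a = {"ref1", "ref2", "ref3", "ref4"}
    b = {"ref2", "ref3", "ref9"}
    print(round(coupling_similarity(a, b), 3))  # 2 shared refs -> 0.577
    ```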

  20. Solution of the weighted symmetric similarity transformations based on quaternions

    Science.gov (United States)

    Mercan, H.; Akyilmaz, O.; Aydin, C.

    2017-12-01

    A new method based on the Gauss-Helmert model of adjustment is presented for the solution of similarity transformations, either 3D or 2D, in the frame of the errors-in-variables (EIV) model. The EIV model assumes that all the variables in the mathematical model are contaminated by random errors. The total least squares estimation technique may be used to solve the EIV model. Accounting for heteroscedastic uncertainty in both the target and the source coordinates, which is the more common and general case in practice, leads to a more realistic estimation of the transformation parameters. The presented algorithm can handle heteroscedastic transformation problems; i.e., the positions of both the target and the source points may have full covariance matrices. Therefore, there is no limitation such as isotropic or homogeneous accuracy for the reference point coordinates. The developed algorithm takes advantage of the quaternion definition, which uniquely represents a 3D rotation matrix. The transformation parameters (scale, translations, and the quaternion, hence the rotation matrix), along with their covariances, are iteratively estimated with rapid convergence. Moreover, a prior least squares (LS) estimate of the unknown transformation parameters is not required to start the iterations. We also show that the developed method can be used to estimate 2D similarity transformation parameters by simply treating the problem as a 3D transformation with zero (0) values assigned to the z-components of both target and source points. The efficiency of the new algorithm is demonstrated with numerical examples and comparisons with the results of previous studies that use the same data set. Simulation experiments for the evaluation and comparison of the proposed and the conventional weighted LS (WLS) methods are also presented.
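
    The quaternion parameterization at the heart of this method is easy to illustrate: a unit quaternion maps to a rotation matrix, and the similarity transform applies scale, rotation and translation. The sketch below (assuming the (w, x, y, z) quaternion convention) shows only this forward model, not the iterative Gauss-Helmert estimation:

    ```python
    import numpy as np

    def quat_to_rot(q):
        """Rotation matrix from a unit quaternion q = (w, x, y, z)."""
        w, x, y, z = q / np.linalg.norm(q)
        return np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])

    def similarity_transform(points, scale, q, t):
        """Apply the 3D similarity transform  x' = scale * R(q) x + t."""
        return scale * points @ quat_to_rot(q).T + t

    pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])  # 90 deg about z
    print(similarity_transform(pts, 2.0, q, np.array([1.0, 0.0, 0.0])))
    ```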

  1. SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

    Science.gov (United States)

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-07-01

    SA-Search is a web tool that can be used to mine protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits fast 3D similarity searches, such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words, viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.

  2. QSAR models based on quantum topological molecular similarity.

    Science.gov (United States)

    Popelier, P L A; Smith, P J

    2006-07-01

    A new method called quantum topological molecular similarity (QTMS) was fairly recently proposed [J. Chem. Inf. Comput. Sci., 41, 2001, 764] to construct a variety of medicinal, ecological and physical organic QSAR/QSPRs. The QTMS method uses quantum chemical topology (QCT) to define electronic descriptors drawn from modern ab initio wave functions of geometry-optimised molecules. It was shown that the current abundance of computing power can be utilised to inject realistic descriptors into QSAR/QSPRs. In this article we study seven datasets of medicinal interest: the dissociation constants (pK(a)) for a set of substituted imidazolines, the pK(a) of imidazoles, the ability of a set of indole derivatives to displace [(3)H]flunitrazepam from binding to bovine cortical membranes, the influenza inhibition constants for a set of benzimidazoles, the interaction constants for a set of amides and the enzyme liver alcohol dehydrogenase, the natriuretic activity of sulphonamide carbonic anhydrase inhibitors, and the toxicity of a series of benzyl alcohols. A partial least squares analysis in conjunction with a genetic algorithm delivered excellent models. They are also able to highlight the active site of the ligand, or the part of the molecule whose structure determines the activity. The advantages and limitations of QTMS are discussed.

  3. Location-based Services using Image Search

    DEFF Research Database (Denmark)

    Vertongen, Pieter-Paulus; Hansen, Dan Witzner

    2008-01-01

    Recent developments in image search have made it sufficiently efficient to be used in real-time applications. GPS has become a popular navigation tool. While GPS information provides reasonably good accuracy, it is not always present in all hand-held devices, nor accurate in all situat...... of the image search engine and database image location knowledge, the location of the query image is determined and associated data can be presented to the user.

  4. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach.

    Science.gov (United States)

    Pasupa, Kitsuchart; Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One powerful machine learning algorithm is the Extreme Learning Machine (ELM), which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single-layer feed-forward neural network in conjunction with 16 different similarity coefficients as activation functions in the hidden layer. It is known that the performance of a conventional ELM is not robust, due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms, i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machines, random forests, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint gives the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6.
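
    A conventional ELM, the baseline this paper builds on, fits in a few lines: random hidden-layer weights, a nonlinear activation, and output weights solved in a single least-squares step. The WS-ELM similarity-coefficient activations and the clustering-based weight assignment are not reproduced; the toy data below are invented:

    ```python
    import numpy as np

    def elm_train(X, Y, n_hidden=50, rng=None):
        """Train a basic ELM: random hidden layer, least-squares output layer."""
        rng = rng or np.random.default_rng(0)
        W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
        b = rng.normal(size=n_hidden)                 # random biases
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden activations
        beta = np.linalg.pinv(H) @ Y                  # output weights, one solve
        return W, b, beta

    def elm_predict(X, model):
        W, b, beta = model
        return 1.0 / (1.0 + np.exp(-(X @ W + b))) @ beta

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 8))                     # e.g. 8 fingerprint features
    y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]  # toy activity labels
    model = elm_train(X, y, n_hidden=100)
    acc = ((elm_predict(X, model) > 0.5) == (y > 0.5)).mean()
    print(f"training accuracy: {acc:.2f}")            # well above chance
    ```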

  5. Computer-based literature search in medical institutions in India

    Directory of Open Access Journals (Sweden)

    Kalita Jayantee

    2007-01-01

    Full Text Available Aim: To study the use of computer-based literature search and its application in clinical training and patient care, as a surrogate marker of evidence-based medicine. Materials and Methods: A questionnaire comprising questions on purpose (presentation, patient management, research), realm (site accessed), nature and frequency of search, effect, infrastructure, formal training in computer-based literature search, and suggestions for further improvement was sent to residents and faculty of a Postgraduate Medical Institute (PGI) and a Medical College. The responses were compared among different subgroups of respondents. Results: Out of 300 subjects approached, 194 responded, of whom 103 were from the PGI and 91 from the Medical College. There were 97 specialty residents, 58 super-specialty residents and 39 faculty members. Computer-based literature search was done at least once a month by 89%, though there was marked variability in frequency and extent. The motivation for computer-based literature search was presentation in 90%, research in 65% and patient management in 60.3%. The benefit of the search was acknowledged in learning and teaching by 80%, in research by 65% and in patient care by 64.4% of respondents. Formal training in computer-based literature search had been received by 41%, of whom 80% were residents. Residents from the PGI did more frequent and more extensive computer-based literature searches, which was attributed to better infrastructure and training. Conclusion: Training and infrastructure are both crucial for computer-based literature search, which may translate into evidence-based medicine.

  6. Attention-based image similarity measure with application to content-based information retrieval

    Science.gov (United States)

    Stentiford, Fred W. M.

    2003-01-01

    Whilst storage and capture technologies are able to cope with huge numbers of images, image retrieval is in danger of rendering many repositories valueless because of the difficulty of access. This paper proposes a similarity measure that imposes only very weak assumptions on the nature of the features used in the recognition process. This approach does not make use of a pre-defined set of feature measurements which are extracted from a query image and used to match those from database images, but instead generates features on a trial and error basis during the calculation of the similarity measure. This has the significant advantage that features that determine similarity can match whatever image property is important in a particular region whether it be a shape, a texture, a colour or a combination of all three. It means that effort is expended searching for the best feature for the region rather than expecting that a fixed feature set will perform optimally over the whole area of an image and over every image in a database. The similarity measure is evaluated on a problem of distinguishing similar shapes in sets of black and white symbols.

  7. A grammar checker based on web searching

    Directory of Open Access Journals (Sweden)

    Joaquim Moré

    2006-05-01

    Full Text Available This paper presents an English grammar and style checker for non-native English speakers. The main characteristic of this checker is the use of an Internet search engine. As the number of web pages written in English is immense, the system hypothesises that a piece of text not found on the Web is probably badly written. The system also hypothesises that the Web will provide examples of how the content of the text segment can be expressed in a grammatically correct and idiomatic way. Thus, when the checker warns the user about the odd nature of a text segment, the Internet engine searches for contexts that can help the user decide whether he/she should correct the segment or not. By means of a search engine, the checker also suggests use of other expressions that appear on the Web more often than the expression he/she actually wrote.

  8. AN OVERVIEW OF SEARCHING AND DISCOVERING WEB BASED INFORMATION RESOURCES

    Directory of Open Access Journals (Sweden)

    Cezar VASILESCU

    2010-01-01

    Full Text Available The Internet has become a daily instrument for most of us, for professional or personal reasons. We hardly remember the times when a computer and a broadband connection were luxury items. More and more people rely on the complicated web network to find the information they need. This paper presents an overview of Internet search related issues and search engines, and describes the parties and the basic mechanism embedded in a search for web-based information resources. It also presents ways to increase the efficiency of web searches, through a better understanding of what search engines ignore in website content.

  9. Toward translational incremental similarity-based reasoning in breast cancer grading

    Science.gov (United States)

    Tutac, Adina E.; Racoceanu, Daniel; Leow, Wee-Keng; Müller, Henning; Putti, Thomas; Cretu, Vladimir

    2009-02-01

    One of the fundamental issues in bridging the gap between the proliferation of Content-Based Image Retrieval (CBIR) systems in the scientific literature and their limited use in the medical community is the characteristic of CBIR of accessing information by images and/or text only. Yet the way physicians reason about patients leads intuitively to a case representation. Hence, a proper solution to overcome this gap is to consider a CBIR approach inspired by Case-Based Reasoning (CBR), which naturally introduces medical knowledge structured by cases. Moreover, in a CBR system, the knowledge is incrementally added and learned. The purpose of this study is to initiate a translational solution from CBIR algorithms to clinical practice, using a CBIR/CBR hybrid approach. Therefore, we advance the idea of translational incremental similarity-based reasoning (TISBR), using combined CBIR and CBR characteristics: incremental learning of medical knowledge, a medical case-based structure of the knowledge (CBR), image usage to retrieve similar cases (CBIR), and the similarity concept (central to both paradigms). For this purpose, three major axes are explored: indexing, case retrieval and search refinement, applied to Breast Cancer Grading (BCG), a powerful breast cancer prognosis exam. The effectiveness of this strategy is currently being evaluated for indexing, on cases provided by the Pathology Department of Singapore National University Hospital. With its current accuracy, TISBR opens interesting perspectives for complex reasoning in future medical research, paving the way to better knowledge traceability and a better acceptance rate of computer-aided diagnosis assistance among practitioners.

  10. Signature-based global searches at CDF

    International Nuclear Information System (INIS)

    Hocker, James Andrew

    2008-01-01

    Data collected in Run II of the Fermilab Tevatron are searched for indications of new electroweak-scale physics. Rather than focusing on particular new physics scenarios, CDF data are analyzed for discrepancies with respect to the Standard Model prediction. Gross features of the data, mass bumps, and significant excesses of events with large summed transverse momentum are examined in a model-independent and quasi-model-independent approach. This global search for new physics in over three hundred exclusive final states in 2 fb⁻¹ of p(bar p) collisions at √s = 1.96 TeV reveals no significant indication of physics beyond the Standard Model.

  11. Object-based target templates guide attention during visual search

    OpenAIRE

    Berggren, Nick; Eimer, Martin

    2018-01-01

    During visual search, attention is believed to be controlled in a strictly feature-based fashion, without any guidance by object-based target representations. To challenge this received view, we measured electrophysiological markers of attentional selection (N2pc component) and working memory (SPCN) in search tasks where two possible targets were defined by feature conjunctions (e.g., blue circles and green squares). Critically, some search displays also contained nontargets with two target f...

  12. Assessment and Comparison of Search capabilities of Web-based Meta-Search Engines: A Checklist Approach

    Directory of Open Access Journals (Sweden)

    Alireza Isfandiyari Moghadam

    2010-03-01

    Full Text Available The present investigation concerns the evaluation, comparison and analysis of the search options existing within web-based meta-search engines. 64 meta-search engines were identified. 19 meta-search engines that were free, accessible and compatible with the objectives of the present study were selected. An author-constructed checklist was used for data collection. Findings indicated that all meta-search engines studied used the AND operator, phrase search, a setting for the number of results displayed, previous search query storage, and help tutorials. Nevertheless, none of them offered search options for hypertext searching or for displaying the size of the pages searched. 94.7% supported features such as truncation, keyword-in-title and URL search, and text summary display. The checklist used in the study could serve as a model for investigating search options in search engines, digital libraries and other internet search tools.

  13. Considerations for the development of task-based search engines

    DEFF Research Database (Denmark)

    Petcu, Paula; Dragusin, Radu

    2013-01-01

    Based on previous experience from working on a task-based search engine, we present a list of suggestions and ideas for an Information Retrieval (IR) framework that could inform the development of next generation professional search systems. The specific task that we start from is the clinicians......' information need in finding rare disease diagnostic hypotheses at the time and place where medical decisions are made. Our experience from the development of a search engine focused on supporting clinicians in completing this task has provided us valuable insights in what aspects should be considered...... by the developers of vertical search engines....

  14. Attribute-Based Proxy Re-Encryption with Keyword Search

    Science.gov (United States)

    Shi, Yanfeng; Liu, Jiqiang; Han, Zhen; Zheng, Qingji; Zhang, Rui; Qiu, Shuo

    2014-01-01

    Keyword search on encrypted data allows one to issue the search token and conduct search operations on encrypted data while still preserving keyword privacy. In the present paper, we consider the keyword search problem further and introduce a novel notion called attribute-based proxy re-encryption with keyword search (ABRKS), which introduces a promising feature: in addition to supporting keyword search on encrypted data, it enables data owners to delegate the keyword search capability to other data users complying with the specific access control policy. To be specific, ABRKS allows (i) the data owner to outsource his encrypted data to the cloud and then ask the cloud to conduct keyword search on the outsourced encrypted data with a given search token, and (ii) the data owner to delegate keyword search capability to other data users in a fine-grained access control manner, by allowing the cloud to re-encrypt stored encrypted data with a re-encryption key (embedding some form of access control policy). We formalize the syntax and security definitions for ABRKS, and propose two concrete constructions for ABRKS: key-policy ABRKS and ciphertext-policy ABRKS. In a nutshell, our constructions can be treated as the integration of technologies in the fields of attribute-based cryptography and proxy re-encryption cryptography. PMID:25549257

  15. Attribute-based proxy re-encryption with keyword search.

    Science.gov (United States)

    Shi, Yanfeng; Liu, Jiqiang; Han, Zhen; Zheng, Qingji; Zhang, Rui; Qiu, Shuo

    2014-01-01

    Keyword search on encrypted data allows one to issue the search token and conduct search operations on encrypted data while still preserving keyword privacy. In the present paper, we consider the keyword search problem further and introduce a novel notion called attribute-based proxy re-encryption with keyword search (ABRKS), which introduces a promising feature: in addition to supporting keyword search on encrypted data, it enables data owners to delegate the keyword search capability to other data users complying with the specific access control policy. To be specific, ABRKS allows (i) the data owner to outsource his encrypted data to the cloud and then ask the cloud to conduct keyword search on the outsourced encrypted data with a given search token, and (ii) the data owner to delegate keyword search capability to other data users in a fine-grained access control manner, by allowing the cloud to re-encrypt stored encrypted data with a re-encryption key (embedding some form of access control policy). We formalize the syntax and security definitions for ABRKS, and propose two concrete constructions for ABRKS: key-policy ABRKS and ciphertext-policy ABRKS. In a nutshell, our constructions can be treated as the integration of technologies in the fields of attribute-based cryptography and proxy re-encryption cryptography.

  16. Impact of Glaucoma and Dry Eye on Text-Based Searching

    Science.gov (United States)

    Sun, Michelle J.; Rubin, Gary S.; Akpek, Esen K.; Ramulu, Pradeep Y.

    2017-01-01

    Purpose: We determine whether visual field loss from glaucoma and/or measures of dry eye severity are associated with difficulty searching, as judged by slower search times on a text-based search task. Methods: Glaucoma patients with bilateral visual field (VF) loss, patients with clinically significant dry eye, and normally sighted controls were enrolled from the Wilmer Eye Institute clinics. Subjects searched three Yellow Pages excerpts for a specific phone number, and search time was recorded. Results: A total of 50 glaucoma subjects, 40 dry eye subjects, and 45 controls completed study procedures. On average, glaucoma patients exhibited 57% longer search times compared to controls (95% confidence interval [CI], 26%-96%). Dry eye subjects demonstrated search times similar to controls, though worse Ocular Surface Disease Index (OSDI) vision-related subscores were associated with longer search times, while search times were not associated with clinical measures of dry eye (P > 0.08 for Schirmer's testing without anesthesia, corneal fluorescein staining, and tear film breakup time). Conclusions: Text-based visual search is slower for glaucoma patients with greater levels of VF loss and for dry eye patients with greater self-reported visual difficulty, and these difficulties may contribute to decreased quality of life in these groups. Translational Relevance: Visual search is impaired in the glaucoma and dry eye groups compared to controls, highlighting the need for compensatory strategies and tools to assist individuals in overcoming their deficiencies. PMID:28670502

  17. Virtual drug screen schema based on multiview similarity integration and ranking aggregation.

    Science.gov (United States)

    Kang, Hong; Sheng, Zhen; Zhu, Ruixin; Huang, Qi; Liu, Qi; Cao, Zhiwei

    2012-03-26

    Current drug virtual screening (VS) methods fall mainly into two categories: ligand/target structure-based virtual screening, and methods utilizing protein-ligand interaction fingerprint information derived from the large number of available complex structures. Since the former focuses on one-sided information while the latter focuses on the whole complex structure, they are complementary and can be boosted by each other. However, a common problem is how to present a comprehensive understanding and evaluation of the various results derived from different VS methods. Furthermore, there is still an urgent need for an efficient approach to fully integrate various VS methods from a comprehensive multiview perspective. In this study, our virtual screening schema based on multiview similarity integration and ranking aggregation was tested comprehensively with statistical evaluations, providing several novel and useful clues on how to perform drug VS from multiple heterogeneous data sources. (1) 18 complex structures of HIV-1 protease with ligands from the PDB were curated as a test data set, and VS was performed with five different drug representations. Ritonavir (1HXW) was selected as the query, and the weighted ranks of the query results were aggregated from multiple views through four similarity integration approaches. (2) Further, one of the ranking aggregation methods was used to integrate the similarity ranks calculated by gene ontology (GO) fingerprint and structural fingerprint on the data set from the Connectivity Map, and two typical HDAC and HSP90 inhibitors were chosen as queries. The results show that rank aggregation can enhance similarity searching in VS when two or more descriptions are involved, and provides a more reasonable similarity ranking. Our study shows that integrated VS based on multiple data fusion can achieve remarkably better performance than individual methods.
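
    To make the ranking-aggregation step concrete, here is a minimal sketch of weighted Borda-count aggregation, one of the simpler rank-aggregation schemes; the function name, the ligand identifiers and the equal-weight default are illustrative assumptions, not details taken from the paper.

        # Weighted Borda-count aggregation of ranked lists from multiple
        # similarity views (e.g., different drug representations).
        # Items missing from a list receive that list's worst rank.

        def borda_aggregate(rankings, weights=None):
            """rankings: list of lists, each ordered best-first."""
            if weights is None:
                weights = [1.0] * len(rankings)
            items = {item for ranking in rankings for item in ranking}
            scores = {item: 0.0 for item in items}
            for ranking, w in zip(rankings, weights):
                worst = len(ranking)
                pos = {item: i for i, item in enumerate(ranking)}
                for item in items:
                    # Borda points: more points for earlier (better) positions.
                    scores[item] += w * (worst - pos.get(item, worst))
            return sorted(items, key=lambda it: scores[it], reverse=True)

        # Three hypothetical views ranking candidate ligands against a query:
        view_a = ["lig3", "lig1", "lig7", "lig2"]
        view_b = ["lig1", "lig3", "lig2"]
        view_c = ["lig7", "lig3", "lig1", "lig5"]
        print(borda_aggregate([view_a, view_b, view_c]))
        # ['lig3', 'lig1', 'lig7', 'lig2', 'lig5']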

  18. IBRI-CASONTO: Ontology-based semantic search engine

    Directory of Open Access Journals (Sweden)

    Awny Sayed

    2017-11-01

    The vast amount of information added at a very fast pace to data repositories creates a challenge in extracting correct and accurate information, and has increased the competition among developers seeking technology that understands researcher intent and the contextual meaning of terms. Competition in developing Arabic semantic search systems is still in its infancy, which can be traced back to the complexity of the Arabic language: it has complex morphological, grammatical and semantic aspects, as it is a highly inflectional and derivational language. In this paper, we present an ontological search engine called IBRI-CASONTO for the Colleges of Applied Sciences, Oman. Our proposed engine supports both the Arabic and English languages and employs two types of search: a keyword-based search and a semantics-based search. IBRI-CASONTO is based on technologies such as Resource Description Framework (RDF) data and an ontological graph. The experiments are presented in two sections: the first shows a comparison between the Entity-Search and the Classical-Search inside IBRI-CASONTO itself, the second compares the Entity-Search of IBRI-CASONTO with currently used search engines, such as Kngine, Wolfram Alpha and the most popular engine nowadays, Google, in order to measure their performance and efficiency.

  19. Search for brown dwarfs in the IRAS data bases

    International Nuclear Information System (INIS)

    Low, F.J.

    1986-01-01

    A report is given on the initial searches for brown dwarf stars in the IRAS data bases. The paper was presented to the workshop on 'Astrophysics of brown dwarfs', Virginia, USA, 1985. To date no brown dwarfs have been discovered in the solar neighbourhood. Opportunities for future searches with greater sensitivity and different wavelengths are outlined. (U.K.)

  20. Knowledge base, information search and intention to adopt innovation

    NARCIS (Netherlands)

    Rijnsoever, van F.J.; Castaldi, C.

    2008-01-01

    Innovation is a process that involves searching for new information. This paper builds upon theoretical insights on individual and organizational learning and proposes a knowledge based model of how actors search for information when confronted with innovation. The model takes into account different

  1. A product feature-based user-centric product search model

    OpenAIRE

    Ben Jabeur, Lamjed; Soulier, Laure; Tamine, Lynda; Mousset, Paul

    2016-01-01

    During the online shopping process, users would search for interesting products and quickly access those that fit with their needs among a long tail of similar or closely related products. Our contribution addresses head queries that are frequently submitted on e-commerce Web sites. Head queries usually target featured products with several variations, accessories, and complementary products. We present in this paper a product feature-based user-centric model for product search involving in a...

  2. SA-Search: a web tool for protein structure mining based on a Structural Alphabet

    OpenAIRE

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-01-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of f...

  3. Research on perturbation based Monte Carlo reactor criticality search

    International Nuclear Information System (INIS)

    Li Zeguang; Wang Kan; Li Yangliu; Deng Jingkang

    2013-01-01

    Criticality search is a very important aspect of reactor physics analysis. Due to the advantages of the Monte Carlo method and the development of computer technologies, Monte Carlo criticality search is becoming more and more necessary and feasible. The traditional Monte Carlo criticality search method suffers from the large number of individual criticality runs required and from the uncertainty and fluctuation of Monte Carlo results. A new Monte Carlo criticality search method based on perturbation calculation is put forward in this paper to overcome these disadvantages. Using only one criticality run to obtain the initial k_eff and the differential coefficients of the concerned parameter, a polynomial estimator of k_eff as a function of that parameter is solved to obtain the critical value of the parameter. The feasibility of this method was tested. The results show that the accuracy and efficiency of the perturbation-based criticality search method are quite encouraging and that the method overcomes the disadvantages of the traditional one. (authors)
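
    To illustrate the idea numerically, the sketch below fits a second-order polynomial k_eff(p) from a single run's value and differential coefficients and solves k_eff(p) = 1 for the critical parameter; all numerical values are invented for illustration and do not come from the paper.

        import numpy as np

        # Sketch: one criticality run yields k_eff at p0 plus the first and
        # second derivatives dk/dp and d2k/dp2 (invented illustrative numbers).
        p0 = 600.0            # e.g., a soluble absorber concentration in ppm
        k0, dk, d2k = 1.0240, -8.0e-5, 1.0e-8

        # Second-order Taylor polynomial of k_eff around p0,
        # expressed in the offset x = p - p0.
        poly = np.polynomial.Polynomial([k0, dk, 0.5 * d2k])

        # Solve k_eff(p) = 1 and keep the physically meaningful real root.
        roots = (poly - 1.0).roots()
        real = [r.real + p0 for r in roots if abs(r.imag) < 1e-12]
        critical_p = min(real, key=lambda p: abs(p - p0))
        print(f"estimated critical parameter: {critical_p:.1f}")  # ~905.8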

  4. Identifying Potential Protein Targets for Toluene Using a Molecular Similarity Search, in Silico Docking and in Vitro Validation

    Science.gov (United States)

    2015-01-01

    performed under standard conditions. Analysis of purified hemoglobin using SDS and native polyacrylamide gel electrophoresis (PAGE) indicated that the...search of T3DB. They represent several families of proteins (calcium-transporting ATPases, sodium/potassium-transporting ATPase, cytochrome P450...

  5. Evidence-based librarianship: searching for the needed EBL evidence.

    Science.gov (United States)

    Eldredge, J D

    2000-01-01

    This paper discusses the challenges of finding the evidence needed to implement Evidence-Based Librarianship (EBL). Focusing first on database coverage of three health sciences librarianship journals, the article examines the information content of different databases, the strategies needed to search for relevant evidence in the library literature via these databases, and the problems associated with searching the grey literature of librarianship. Database coverage, plausible search strategies, and the grey literature of library science all pose challenges to finding the research evidence needed for practicing EBL. Health sciences librarians need to ensure that systems are designed that can track and provide access to the research evidence needed to support EBL.

  6. Multi-objective Search-based Mobile Testing

    OpenAIRE

    Mao, K.

    2017-01-01

    Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis...

  7. A Profile-Based Framework for Factorial Similarity and the Congruence Coefficient.

    Science.gov (United States)

    Hartley, Anselma G; Furr, R Michael

    2017-01-01

    We present a novel profile-based framework for understanding factorial similarity in the context of exploratory factor analysis in general, and for understanding the congruence coefficient (a commonly used index of factor similarity) specifically. First, we introduce the profile-based framework, articulating factorial similarity in terms of 3 intuitive components: general saturation similarity, differential saturation similarity, and configural similarity. We then articulate the congruence coefficient in terms of these components, along with 2 additional profile-based components, and we explain how these components resolve ambiguities that can be, and are, found when using the congruence coefficient. Finally, we present secondary analyses revealing that profile-based components of factorial similarity are indeed linked to experts' actual evaluations of factorial similarity. Overall, the profile-based approach offers new insights into the ways researchers can examine factor similarity and holds the potential to enhance researchers' ability to understand the congruence coefficient.

  8. Neurophysiological Based Methods of Guided Image Search

    National Research Council Canada - National Science Library

    Marchak, Frank

    2003-01-01

    .... We developed a model of visual feature detection, the Neuronal Synchrony Model, based on neurophysiological models of temporal neuronal processing, to improve the accuracy of automatic detection...

  9. Content-based Music Search and Recommendation System

    Science.gov (United States)

    Takegawa, Kazuki; Hijikata, Yoshinori; Nishida, Shogo

    Recently, the total volume of music data on the Internet has increased rapidly. This has increased the user's cost of finding music data that suits their preferences from such a large data set. We propose a content-based music search and recommendation system. This system has an interface for searching and finding music data and an interface for editing a user profile, which is necessary for music recommendation. By exploiting visualization of the feature space of music and visualization of the user profile, the user can search music data and edit the user profile. Furthermore, by exploiting the information that can be acquired from each visualized object in a mutually complementary manner, we make it easier for the user to search music data and edit the user profile. Concretely, the system gives the user information obtained from the user profile when searching music data, and information obtained from the feature space of music when editing the user profile.

  10. In Search of...Brain-Based Education.

    Science.gov (United States)

    Bruer, John T.

    1999-01-01

    Debunks two ideas appearing in brain-based education articles: the educational significance of brain laterality (right brain versus left brain) and claims for a sensitive period of brain development in young children. Brain-based education literature provides a popular but misleading mix of fact, misinterpretation, and fantasy. (47 references) (MLH)

  11. Self-Similarity Based Corresponding-Point Extraction from Weakly Textured Stereo Pairs

    Directory of Open Access Journals (Sweden)

    Min Mao

    2014-01-01

    For low-textured areas in image pairs, there are nearly no points that can be detected by traditional methods, and the information in these areas will not be extracted by classical interest-point detectors. In this paper, a novel weakly-textured-point detection method is presented. Points with weakly textured characteristics are detected using the symmetry concept. The proposed approach considers the gray-level variability of weakly textured local regions. The detection mechanism can be separated into three steps: region-similarity computation, candidate point searching, and refinement of the weakly textured point set. A radius scale selection mechanism and a texture strength concept are used in the second and third steps, respectively. A matching algorithm based on sparse representation (SRM) is used for matching the detected points in different images. Results obtained on image sets with different objects show the high robustness of the method to background and intraclass variations as well as to different photometric and geometric transformations; the points detected by this method also complement the points detected by classical detectors from the literature. We also verify the efficacy of SRM by comparing it with classical algorithms under occlusion and corruption when matching the weakly textured points. Experiments demonstrate the effectiveness of the proposed weakly-textured-point detection algorithm.

  12. Mobile Visual Search Based on Histogram Matching and Zone Weight Learning

    Science.gov (United States)

    Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong

    2018-01-01

    In this paper, we propose a novel image retrieval algorithm for mobile visual search. First, a short visual codebook is generated based on the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging tf-idf weighted histogram matching with the weighting strategy in compact descriptors for visual search (CDVS). Finally, the global descriptor matching score and the local descriptor similarity score are summed to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.
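
    As a rough illustration of the local-descriptor scoring idea, the sketch below computes a tf-idf weighted histogram intersection between two visual-word histograms; the codebook size, the idf smoothing and the intersection normalization are our assumptions, not the CDVS specification.

        import numpy as np

        # Toy tf-idf weighted histogram matching over a visual codebook.
        # Each image is a histogram of visual-word counts; rare words
        # (high idf) contribute more to the similarity score.

        def tfidf_histogram_similarity(query_hist, db_hist, doc_freq, n_docs):
            idf = np.log((n_docs + 1) / (doc_freq + 1))  # smoothed idf
            q = query_hist * idf
            d = db_hist * idf
            # Histogram intersection on the idf-weighted histograms.
            inter = np.minimum(q, d).sum()
            denom = min(q.sum(), d.sum()) or 1.0
            return inter / denom

        rng = np.random.default_rng(0)
        codebook_size, n_docs = 64, 1000
        doc_freq = rng.integers(1, n_docs, codebook_size)
        query = rng.poisson(2.0, codebook_size).astype(float)
        image = rng.poisson(2.0, codebook_size).astype(float)
        print(tfidf_histogram_similarity(query, image, doc_freq, n_docs))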

  13. Update on CERN Search based on SharePoint 2013

    Science.gov (United States)

    Alvarez, E.; Fernandez, S.; Lossent, A.; Posada, I.; Silva, B.; Wagner, A.

    2017-10-01

    CERN’s enterprise search solution, “CERN Search”, provides a central search solution for users and CERN service providers. A total of about 20 million public and protected documents from a wide range of document collections is indexed, including Indico, TWiki, Drupal, SharePoint, JACOW, E-group archives, EDMS, and CERN Web pages. In spring 2015, CERN Search was migrated to a new infrastructure based on SharePoint 2013. In the context of this upgrade, the document pre-processing and indexing process was redesigned and generalised. The new data feeding framework allows the system to profit from new functionality and facilitates its long-term maintenance.

  14. Lung Cancer Signature Biomarkers: tissue specific semantic similarity based clustering of Digital Differential Display (DDD) data

    Directory of Open Access Journals (Sweden)

    Srivastava Mousami

    2012-11-01

    Background: The tissue-specific UniGene sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used a Gene Ontology semantic similarity score to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. This semantic similarity score matrix, based on hierarchical clustering, is represented in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping, which also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Results: Hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panels 1-4). Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked biomarkers: metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia-regulated biomarkers (panel 3) and lung extracellular matrix biomarkers (panel 4). Conclusions: Expression analysis reveals that hypoxia-induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1...

  15. GeNemo: a search engine for web-based functional genomic data.

    Science.gov (United States)

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-08

    A set of new data types has emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundreds of bases to hundreds of thousands of bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Efficient Algorithm for Computing Link-based Similarity in Real World Networks

    DEFF Research Database (Denmark)

    Cai, Yuanzhe; Cong, Gao; Xu, Jia

    2009-01-01

    Similarity calculation has many applications, such as information retrieval and collaborative filtering, among many others. It has been shown that link-based similarity measures, such as SimRank, are very effective in characterizing object similarities in networks, such as the Web, by exploiti...
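
    For reference, the textbook SimRank recurrence s(a,b) = C/(|I(a)||I(b)|) * sum of s(i,j) over in-neighbor pairs, with s(a,a) = 1, can be iterated naively as below; this is the quadratic baseline that optimized algorithms such as the one in this paper improve upon.

        from itertools import product

        # Naive SimRank iteration. This is the textbook O(K * n^2 * d^2)
        # version, not the paper's optimized algorithm.

        def simrank(in_neighbors, C=0.8, iters=10):
            nodes = list(in_neighbors)
            s = {(a, b): 1.0 if a == b else 0.0
                 for a, b in product(nodes, nodes)}
            for _ in range(iters):
                new = {}
                for a, b in product(nodes, nodes):
                    if a == b:
                        new[(a, b)] = 1.0
                        continue
                    Ia, Ib = in_neighbors[a], in_neighbors[b]
                    if not Ia or not Ib:
                        new[(a, b)] = 0.0
                        continue
                    total = sum(s[(i, j)] for i in Ia for j in Ib)
                    new[(a, b)] = C * total / (len(Ia) * len(Ib))
                s = new
            return s

        # Tiny web-like graph given by in-neighbor lists.
        graph = {"u": [], "v": ["u"], "w": ["u"], "x": ["v", "w"]}
        scores = simrank(graph)
        print(round(scores[("v", "w")], 4))  # 0.8: v and w share source u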

  17. Robust object tracking based on self-adaptive search area

    Science.gov (United States)

    Dong, Taihang; Zhong, Sheng

    2018-02-01

    Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer from boundary effects, which result in unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side effect in DCF based trackers. We change the search area according to the prediction of target motion. When the object moves fast, a broad search area can alleviate boundary effects and preserve the probability of locating the object. When the object moves slowly, a narrow search area can prevent the influence of useless background information and improve computational efficiency to attain real-time performance. This strategy can substantially mitigate boundary effects in situations exhibiting fast motion and motion blur, and it can be used in almost all DCF based trackers. Experiments on the OTB benchmark show that the proposed framework improves performance compared with the baseline trackers.

  18. Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.

    Science.gov (United States)

    Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas

    2017-07-24

    Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large amounts of them will eventually be available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and can therefore help to understand the driving factors of microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny-based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles from diverse environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed nearest neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces, in combination with an optimized Earth Mover's Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and, optionally, 16S rRNA copy number correction. The result of a query is the contextualization of the microbiome sample against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including bar-chart-based compositional comparisons and a ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny-based distance metric, while providing numerous advantages over the current state-of-the-art tool.

  19. 2-gram-based Phonetic Feature Generation for Convolutional Neural Network in Assessment of Trademark Similarity

    OpenAIRE

    Ko, Kyung Pyo; Lee, Kwang Hee; Jang, Mi So; Park, Gun Hong

    2018-01-01

    A trademark is a mark used to identify various commodities. If the same or a similar trademark is registered for the same or a similar commodity, purchasers of the goods may be confused. Therefore, in the trademark registration examination process, the examiner judges whether the trademark is the same as, or similar to, other applied-for or registered trademarks. Confusion between trademarks is based on the visual, phonetic or conceptual similarity of the marks. In this paper, we focus specifically o...
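
    A minimal sketch of the 2-gram feature idea, assuming a romanized phonetic transcription over a Latin alphabet: each mark becomes a normalized bigram count matrix that a convolutional network could take as input. The alphabet, normalization and example marks are illustrative assumptions, not the paper's setup.

        import numpy as np

        # Encode a (romanized) phonetic string as a 2-gram count matrix over
        # an assumed symbol alphabet; such a fixed-size matrix can serve as
        # input to a convolutional network.

        ALPHABET = "abcdefghijklmnopqrstuvwxyz"
        IDX = {ch: i for i, ch in enumerate(ALPHABET)}

        def bigram_matrix(phonetic):
            m = np.zeros((len(ALPHABET), len(ALPHABET)), dtype=np.float32)
            chars = [c for c in phonetic.lower() if c in IDX]
            for a, b in zip(chars, chars[1:]):
                m[IDX[a], IDX[b]] += 1.0
            if m.sum() > 0:
                m /= m.sum()  # normalize so strings of any length compare
            return m

        # Two confusable marks should yield nearby feature matrices.
        m1, m2 = bigram_matrix("sonata"), bigram_matrix("sonate")
        print(float(np.abs(m1 - m2).sum()))  # small L1 distance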

  20. An Enhanced Rule-Based Web Scanner Based on Similarity Score

    Directory of Open Access Journals (Sweden)

    LEE, M.

    2016-08-01

    This paper proposes an enhanced rule-based web scanner that achieves better accuracy in detecting web vulnerabilities than existing tools, which have relatively high false alarm rates when web pages are installed in unconventional directory paths. Using the proposed matching method based on a similarity score, the scheme can determine whether two pages have the same vulnerabilities. With this method, the scheme is able to determine that target web pages are vulnerable by comparing them to web pages that are known to have vulnerabilities. Through performance evaluation via various experiments, we show that the proposed scanner reduces the false alarm rate by 12% compared to an existing well-known scanner. The proposed scheme is especially helpful in detecting vulnerabilities in web applications derived from well-known open-source web applications after small customizations, which happens frequently in many small-sized companies.
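
    The paper's exact similarity score is not reproduced here; as a stand-in, the sketch below uses a thresholded Jaccard overlap of page tokens to decide whether a target page matches a page with a known vulnerability. The threshold, tokenization and sample pages are assumptions.

        import re

        # Stand-in page-similarity score: Jaccard overlap of page tokens.
        # Illustrates thresholded similarity matching only, not the
        # scanner's actual scoring function.

        def page_tokens(html):
            return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", html.lower()))

        def similarity(page_a, page_b):
            ta, tb = page_tokens(page_a), page_tokens(page_b)
            return len(ta & tb) / len(ta | tb) if (ta or tb) else 1.0

        known_vulnerable = "<form action='login.php'><input name='user'>"
        target = "<form action='/app/login.php'><input name='user' id='u'>"
        if similarity(known_vulnerable, target) > 0.6:
            print("target page likely shares the known vulnerability")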

  1. EKF-GPR-Based Fingerprint Renovation for Subset-Based Indoor Localization with Adjusted Cosine Similarity.

    Science.gov (United States)

    Yang, Junhua; Li, Yong; Cheng, Wei; Liu, Yang; Liu, Chenxi

    2018-01-22

    Received Signal Strength Indicator (RSSI) localization using fingerprints has become a prevailing approach for indoor localization. However, the fingerprint-collecting work is repetitive and time-consuming, and after the original fingerprint radio map is built, it is laborious to upgrade it. In this paper, we describe a Fingerprint Renovation System (FRS) based on crowdsourcing, which avoids the use of manual labour to obtain the up-to-date fingerprint status. An Extended Kalman Filter (EKF) and Gaussian Process Regression (GPR) are combined in FRS to calculate the current state based on the original fingerprint radio map. In this system, a subset-acquisition method is also introduced to reduce the heavy computation caused by too many reference points (RPs). Meanwhile, adjusted cosine similarity (ACS) is employed in the online phase to solve the issue of outliers produced by cosine similarity. Both experiments and analytical simulation in a real Wireless Fidelity (Wi-Fi) environment indicate that our system yields significant performance improvements: FRS improves accuracy by 19.6% in the surveyed area compared to the un-renovated radio map, and the proposed subset algorithm requires less computation. PMID:29361805
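
    A small sketch of the online phase's adjusted cosine similarity, assuming the usual ACS definition in which each access-point dimension is centered by its mean over the radio map before the cosine is taken; the RSSI numbers are invented for illustration.

        import numpy as np

        # Adjusted cosine similarity (ACS) between an online RSSI vector and
        # each fingerprint: center every AP dimension by its mean over the
        # radio map before computing the cosine, which damps outlier APs.

        def acs(online, radio_map):
            mean = radio_map.mean(axis=0)             # per-AP mean RSSI
            q = online - mean
            F = radio_map - mean
            num = F @ q
            den = np.linalg.norm(F, axis=1) * np.linalg.norm(q) + 1e-12
            return num / den                          # one score per RP

        radio_map = np.array([[-40., -70., -55.],     # RP fingerprints (dBm)
                              [-42., -65., -60.],
                              [-60., -50., -45.]])
        online = np.array([-41., -68., -57.])
        scores = acs(online, radio_map)
        print("best matching RP:", int(scores.argmax()))  # RP 0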

  3. Tag-Based Social Image Search: Toward Relevant and Diverse Results

    Science.gov (United States)

    Yang, Kuiyuan; Wang, Meng; Hua, Xian-Sheng; Zhang, Hong-Jiang

    Recent years have witnessed the great success of social media websites. Tag-based image search is an important approach to accessing the image content of interest on these websites. However, existing ranking methods for tag-based image search frequently return results that are irrelevant or lacking in diversity. This chapter presents a diverse relevance ranking scheme that simultaneously takes relevance and diversity into account by exploring the content of images and their associated tags. First, it estimates the relevance scores of images with respect to the query term based on both the visual information of images and the semantic information of associated tags. Then semantic similarities of social images are estimated based on their tags. Based on the relevance scores and the similarities, the ranking list is generated by a greedy ordering algorithm which optimizes Average Diverse Precision (ADP), a novel measure extended from the conventional Average Precision (AP). Comprehensive experiments and user studies demonstrate the effectiveness of the approach.
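
    The greedy ordering idea can be illustrated with an MMR-style re-ranking that trades relevance against similarity to the images already selected; this is not the chapter's exact ADP-optimizing algorithm, and the scores below are invented.

        # Greedy diverse re-ranking: repeatedly pick the image with the best
        # relevance-minus-redundancy score (MMR-style sketch).

        def diverse_rank(relevance, sim, lam=0.5):
            remaining = set(relevance)
            selected = []
            while remaining:
                def score(i):
                    redundancy = max((sim[i][j] for j in selected), default=0.0)
                    return lam * relevance[i] - (1 - lam) * redundancy
                best = max(remaining, key=score)
                selected.append(best)
                remaining.remove(best)
            return selected

        relevance = {"img1": 0.9, "img2": 0.85, "img3": 0.4}
        sim = {"img1": {"img2": 0.95, "img3": 0.1},
               "img2": {"img1": 0.95, "img3": 0.2},
               "img3": {"img1": 0.1, "img2": 0.2}}
        # ['img1', 'img3', 'img2']: the near-duplicate img2 is demoted.
        print(diverse_rank(relevance, sim))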

  4. A novel line segment detection algorithm based on graph search

    Science.gov (United States)

    Zhao, Hong-dan; Liu, Guo-ying; Song, Xu

    2018-02-01

    To address the problem of extracting line segments from an image, a line segment detection method based on graph search is proposed. After obtaining the edge detection result of the image, candidate straight line segments are obtained in four directions. The adjacency relationships of the candidate segments are depicted by a graph model, on which a depth-first search is employed to determine which adjacent line segments need to be merged. Finally, the least squares method is used to fit the detected straight lines. Comparative experimental results verify that the proposed algorithm achieves better results than the line segment detector (LSD).

  5. Parallel content-based sub-image retrieval using hierarchical searching.

    Science.gov (United States)

    Yang, Lin; Qi, Xin; Xing, Fuyong; Kurc, Tahsin; Saltz, Joel; Foran, David J

    2014-04-01

    The capacity to systematically search through large image collections and ensembles and detect regions exhibiting similar morphological characteristics is central to pathology diagnosis. Unfortunately, the primary methods used to search digitized, whole-slide histopathology specimens are slow and prone to inter- and intra-observer variability. The central objective of this research was to design, develop, and evaluate a content-based image retrieval system to assist doctors in quick and reliable content-based comparative search of similar prostate image patches. Given a representative image patch (sub-image), the algorithm returns a ranked ensemble of image patches throughout the entire whole-slide histology section which exhibit the most similar morphologic characteristics. This is accomplished by first performing hierarchical searching based on a newly developed hierarchical annular histogram (HAH). The set of candidates is then further refined in the second stage of processing by computing a color histogram from eight equally divided segments within each square annular bin defined in the original HAH. A demand-driven master-worker parallelization approach is employed to speed up the searching procedure. Using this strategy, the query patch is broadcast to all worker processes, and each worker process is dynamically assigned an image by the master process to search for and return a ranked list of similar patches in the image. The algorithm was tested using digitized hematoxylin and eosin (H&E) stained prostate cancer specimens. We have achieved excellent image retrieval performance: the recall rate within the first 40 retrieved image patches is approximately 90%. Both the testing data and source code can be downloaded from http://pleiad.umdnj.edu/CBII/Bioinformatics/.
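
    As a sketch of the annular-histogram idea, the code below splits a square patch into concentric square rings (Chebyshev distance from the center) and histograms each ring; the ring layout and bin count are our assumptions, not the exact HAH parameters.

        import numpy as np

        # Sketch of a square annular histogram: concentric square rings
        # around the patch center, one intensity histogram per ring.

        def annular_histograms(patch, n_rings=4, n_bins=8):
            h, w = patch.shape
            cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
            ys, xs = np.mgrid[0:h, 0:w]
            # Chebyshev distance produces square rings.
            d = np.maximum(np.abs(ys - cy), np.abs(xs - cx))
            edges = np.linspace(0, d.max() + 1e-9, n_rings + 1)
            hists = []
            for k in range(n_rings):
                ring = patch[(d >= edges[k]) & (d < edges[k + 1])]
                hist, _ = np.histogram(ring, bins=n_bins, range=(0, 256))
                hists.append(hist / max(ring.size, 1))  # per-ring normalize
            return np.concatenate(hists)

        patch = np.random.default_rng(1).integers(0, 256, (64, 64))
        print(annular_histograms(patch).shape)  # (n_rings * n_bins,)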

  6. Thai Language Sentence Similarity Computation Based on Syntactic Structure and Semantic Vector

    Science.gov (United States)

    Wang, Hongbin; Feng, Yinhan; Cheng, Liang

    2018-03-01

    Sentence similarity computation plays an increasingly important role in text mining, Web page retrieval, machine translation, speech recognition and question answering systems. Thai is a resource-scarce language: unlike Chinese, it has no resources such as HowNet or CiLin, so research on Thai sentence similarity faces particular challenges. To address the problem of Thai sentence similarity computation, this paper proposes a novel method based on syntactic structure and semantic vectors. The method first uses Part-of-Speech (POS) dependencies to calculate the syntactic structure similarity of two sentences, and then uses word vectors to calculate their semantic similarity. Finally, we combine the two to compute the overall similarity of two Thai sentences. The proposed method considers not only semantics but also sentence syntactic structure. Experimental results show that this method is feasible for Thai sentence similarity computation.

  7. A search for hot pulsators similar to PG1159-035 and the central star of K 1-16

    International Nuclear Information System (INIS)

    Bond, H.E.; Grauer, A.D.; Liebert, J.; Fleming, T.; Green, R.F.

    1987-01-01

    The variations of PG1159-035 (GW Vir) were discovered by McGraw et al. This object is the prototype of a new class of pulsating stars located in an instability strip at the left-hand edge of the HR diagram. PG1159-035 and the spectroscopically similar objects PG1707+427 and PG2131+066 display complex non-radial modes with periodicities of order 10 minutes. Grauer and Bond recently discovered that the central star of the planetary nebula Kohoutek 1-16 also pulsates, with dominant periodicities of 25-28 minutes. These four objects display the following characteristics: high effective temperatures (~10^5 K) and moderately high surface gravities (log g ≅ 6-8); He II, C IV, and O VI absorption lines in the optical spectra, often reversed with emission cores; no hydrogen lines clearly detected. The pulsational instability has been attributed to partial ionization of carbon and/or oxygen.

  8. Differential Search Coils Based Magnetometers: Conditioning, Magnetic Sensitivity, Spatial Resolution

    Directory of Open Access Journals (Sweden)

    Timofeeva Maria

    2012-03-01

    A theoretical and experimental comparison of optimized search-coil-based magnetometers, operating either in the flux mode or in the classical Lenz-Faraday mode, is presented. The improvements provided by the flux mode in terms of bandwidth and measuring range are detailed. Theory, SPICE model and measurements are in good agreement. The spatial resolution of the sensor, an important parameter for applications in non-destructive evaluation, is studied. A general expression for the magnetic sensitivity of search-coil sensors is derived. Solutions are proposed to design magnetometers with reduced weight and volume without degrading the magnetic sensitivity. An original differential search-coil-based magnetometer, made of coupled coils, operating in flux mode and connected to a differential transimpedance amplifier, is proposed. It is shown that this structure is better in terms of volume occupancy than magnetometers using two separate coils, without any degradation in magnetic sensitivity. Experimental results are in good agreement with calculations.

  9. Order Tracking Based on Robust Peak Search Instantaneous Frequency Estimation

    International Nuclear Information System (INIS)

    Gao, Y; Guo, Y; Chi, Y L; Qin, S R

    2006-01-01

    Order tracking plays an important role in non-stationary vibration analysis of rotating machinery, especially during run-up or coast-down. An instantaneous frequency estimation (IFE) based order tracking method for rotating machinery is introduced, in which a peak search algorithm applied to the spectrogram from time-frequency analysis is employed to obtain the IFE of vibrations. An improvement to the peak search is proposed, which prevents strong non-order components or noise from disturbing the peak search. Compared with traditional methods of order tracking, IFE-based order tracking is simple to apply and depends only on software. Tests verify the validity of the method. This method is an effective supplement to traditional methods, and its application in condition monitoring and diagnosis of rotating machinery is readily imaginable.

  10. Smart Images Search based on Visual Features Fusion

    International Nuclear Information System (INIS)

    Saad, M.H.

    2013-01-01

    Image search engines attempt to give fast and accurate access to the wide range of images available on the Internet. There have been a number of efforts to build search engines based on image content to enhance search results. Content-Based Image Retrieval (CBIR) systems have attracted great interest since multimedia files, such as images and videos, have dramatically entered our lives throughout the last decade. CBIR allows automatically extracting target images according to the objective visual contents of the image itself, for example its shapes, colors and textures, to provide more accurate ranking of the results. Recent approaches to CBIR differ in terms of which image features are extracted to be used as image descriptors for the matching process. This thesis proposes improvements to the efficiency and accuracy of CBIR systems by integrating different types of image features. The framework addresses efficient retrieval of images in large image collections. A comparative study between recent CBIR techniques is provided. According to this study, image features need to be integrated to provide a more accurate description of image content and better image retrieval accuracy. In this context, this thesis presents new image retrieval approaches that provide more accurate retrieval than previous approaches. The first proposed image retrieval system uses color, texture and shape descriptors to form the global feature vector: it integrates the YCbCr color histogram as a color descriptor, the modified Fourier descriptor as a shape descriptor and a modified edge histogram as a texture descriptor in order to enhance the retrieval results. The second proposed approach integrates this global feature vector with the SURF salient point technique as a local feature. The nearest neighbor matching algorithm with a proposed similarity measure is applied to determine the final image rank. The second approach...

  11. Automated Search-Based Robustness Testing for Autonomous Vehicle Software

    Directory of Open Access Journals (Sweden)

    Kevin M. Betts

    2016-01-01

    Autonomous systems must operate successfully in complex, time-varying spatial environments even when dealing with system faults that may occur during a mission. Consequently, evaluating the robustness, or ability to operate correctly under unexpected conditions, of autonomous vehicle control software is an increasingly important issue in software testing. New methods to automatically generate test cases for robustness testing of autonomous vehicle control software in closed-loop simulation are needed. Search-based testing techniques were used to automatically generate test cases, consisting of initial conditions and fault sequences, intended to challenge the control software more than test cases generated using current methods. Two different search-based testing methods, genetic algorithms and surrogate-based optimization, were used to generate test cases for a simulated unmanned aerial vehicle attempting to fly through an entryway. The effectiveness of the search-based methods in generating challenging test cases was compared to both a truth reference (full combinatorial testing) and the method most commonly used today (Monte Carlo testing). The search-based testing techniques demonstrated better performance than Monte Carlo testing for both test case generation performance metrics: (1) finding the single most challenging test case and (2) finding the set of fifty test cases with the highest mean degree of challenge.

  12. Can social tagged images aid concept-based video search?

    NARCIS (Netherlands)

    Setz, A.T.; Snoek, C.G.M.

    2009-01-01

    This paper seeks to unravel whether commonly available social tagged images can be exploited as a training resource for concept-based video search. Since social tags are known to be ambiguous, overly personalized, and often error prone, we place special emphasis on the role of disambiguation. We

  13. A World Wide Web Region-Based Image Search Engine

    DEFF Research Database (Denmark)

    Kompatsiaris, Ioannis; Triantafyllou, Evangelia; Strintzis, Michael G.

    2001-01-01

    In this paper the development of an intelligent content-based image search engine for the World Wide Web is presented. This system offers a new form of media representation and access to content available on the WWW. Information Web Crawlers continuously traverse the Internet and collect images...

  14. Snippet-based relevance predictions for federated web search

    NARCIS (Netherlands)

    Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd

    How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and

  15. Constraint-based local search for container stowage slot planning

    DEFF Research Database (Denmark)

    Pacino, Dario; Jensen, Rune Møller; Bebbington, Tom

    2012-01-01

    -sea vessels. This paper describes the constraint-based local search algorithm used in the second phase of this approach, where individual containers are assigned to slots in each bay section. The algorithm can solve this problem in an average of 0.18 seconds per bay, corresponding to a 20-second runtime...

  16. Extending the similarity-based XML multicast approach with digital signatures

    DEFF Research Database (Denmark)

    Azzini, Antonia; Marrara, Stefania; Jensen, Meiko

    2009-01-01

    This paper investigates the interplay between similarity-based SOAP message aggregation and digital signature application. An overview of the approaches resulting from the different orderings of the tasks of signature application, verification, similarity aggregation and splitting is provided. Depending on the intersection between similarity-aggregated and signed SOAP message parts, the paper discusses three different cases of signature application and sketches their applicability and performance implications.

  17. Computer-Assisted Search Of Large Textual Data Bases

    Science.gov (United States)

    Driscoll, James R.

    1995-01-01

    "QA" denotes high-speed computer system for searching diverse collections of documents including (but not limited to) technical reference manuals, legal documents, medical documents, news releases, and patents. Incorporates previously available and emerging information-retrieval technology to help user intelligently and rapidly locate information found in large textual data bases. Technology includes provision for inquiries in natural language; statistical ranking of retrieved information; artificial-intelligence implementation of semantics, in which "surface level" knowledge found in text used to improve ranking of retrieved information; and relevance feedback, in which user's judgements of relevance of some retrieved documents used automatically to modify search for further information.

  18. Design of chemical space networks using a Tanimoto similarity variant based upon maximum common substructures.

    Science.gov (United States)

    Zhang, Bijun; Vogt, Martin; Maggiora, Gerald M; Bajorath, Jürgen

    2015-10-01

    Chemical space networks (CSNs) have recently been introduced as an alternative to other coordinate-free and coordinate-based chemical space representations. In CSNs, nodes represent compounds and edges pairwise similarity relationships. In addition, nodes are annotated with compound property information such as biological activity. CSNs have been applied to view biologically relevant chemical space in comparison to random chemical space samples and found to display well-resolved topologies at low edge density levels. The way in which molecular similarity relationships are assessed is an important determinant of CSN topology. Previous CSN versions were based on numerical similarity functions or the assessment of substructure-based similarity. Herein, we report a new CSN design that is based upon combined numerical and substructure similarity evaluation. This has been facilitated by calculating numerical similarity values on the basis of maximum common substructures (MCSs) of compounds, leading to the introduction of MCS-based CSNs (MCS-CSNs). This CSN design combines the advantages of continuous numerical similarity functions with a robust and chemically intuitive substructure-based assessment. Compared to earlier versions of CSNs, MCS-CSNs are characterized by a further improved organization of local compound communities, as exemplified by the delineation of drug-like subspaces in regions of biologically relevant chemical space.
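
    Assuming RDKit is available, a minimal sketch of an MCS-based Tanimoto variant is T = n_mcs / (n_a + n_b - n_mcs) over heavy-atom counts, with the MCS found by rdFMCS; this is one plausible reading of such a similarity function, not necessarily the authors' exact definition.

        from rdkit import Chem
        from rdkit.Chem import rdFMCS

        # MCS-based Tanimoto sketch: similarity = n_mcs / (n_a + n_b - n_mcs),
        # counting heavy atoms, with the maximum common substructure (MCS)
        # computed by RDKit.

        def mcs_tanimoto(smiles_a, smiles_b):
            ma = Chem.MolFromSmiles(smiles_a)
            mb = Chem.MolFromSmiles(smiles_b)
            mcs = rdFMCS.FindMCS([ma, mb])
            n_mcs = mcs.numAtoms
            na, nb = ma.GetNumAtoms(), mb.GetNumAtoms()
            return n_mcs / (na + nb - n_mcs)

        print(mcs_tanimoto("c1ccccc1O", "c1ccccc1N"))  # phenol vs aniline: 0.75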

  19. Computer-aided beam arrangement based on similar cases in radiation treatment-planning databases for stereotactic lung radiation therapy

    International Nuclear Information System (INIS)

    Magome, Taiki; Shioyama, Yoshiyuki; Arimura, Hidetaka

    2013-01-01

    The purpose of this study was to develop a computer-aided method for determination of beam arrangements based on similar cases in a radiotherapy treatment-planning database for stereotactic lung radiation therapy. Similar-case-based beam arrangements were automatically determined based on the following two steps. First, the five most similar cases were searched, based on geometrical features related to the location, size and shape of the planning target volume, lung and spinal cord. Second, five beam arrangements of an objective case were automatically determined by registering five similar cases with the objective case, with respect to lung regions, by means of a linear registration technique. For evaluation of the beam arrangements five treatment plans were manually created by applying the beam arrangements determined in the second step to the objective case. The most usable beam arrangement was selected by sorting the five treatment plans based on eight plan evaluation indices, including the D95, mean lung dose and spinal cord maximum dose. We applied the proposed method to 10 test cases, by using an RTP database of 81 cases with lung cancer, and compared the eight plan evaluation indices between the original treatment plan and the corresponding most usable similar-case-based treatment plan. As a result, the proposed method may provide usable beam arrangements, which have no statistically significant differences from the original beam arrangements (P>0.05) in terms of the eight plan evaluation indices. Therefore, the proposed method could be employed as an educational tool for less experienced treatment planners. (author)

  20. Object-based target templates guide attention during visual search.

    Science.gov (United States)

    Berggren, Nick; Eimer, Martin

    2018-05-03

    During visual search, attention is believed to be controlled in a strictly feature-based fashion, without any guidance by object-based target representations. To challenge this received view, we measured electrophysiological markers of attentional selection (N2pc component) and working memory (sustained posterior contralateral negativity; SPCN) in search tasks where two possible targets were defined by feature conjunctions (e.g., blue circles and green squares). Critically, some search displays also contained nontargets with two target features (incorrect conjunction objects, e.g., blue squares). Because feature-based guidance cannot distinguish these objects from targets, any selective bias for targets will reflect object-based attentional control. In Experiment 1, where search displays always contained only one object with target-matching features, targets and incorrect conjunction objects elicited identical N2pc and SPCN components, demonstrating that attentional guidance was entirely feature-based. In Experiment 2, where targets and incorrect conjunction objects could appear in the same display, clear evidence for object-based attentional control was found. The target N2pc became larger than the N2pc to incorrect conjunction objects from 250 ms poststimulus, and only targets elicited SPCN components. This demonstrates that after an initial feature-based guidance phase, object-based templates are activated when they are required to distinguish target and nontarget objects. These templates modulate visual processing and control access to working memory, and their activation may coincide with the start of feature integration processes. Results also suggest that while multiple feature templates can be activated concurrently, only a single object-based target template can guide attention at any given time. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  1. A semantics-based method for clustering of Chinese web search results

    Science.gov (United States)

    Zhang, Hui; Wang, Deqing; Wang, Li; Bi, Zhuming; Chen, Yong

    2014-01-01

    Information explosion is a critical challenge to the development of modern information systems. In particular, when the application of an information system is over the Internet, the amount of information over the web has been increasing exponentially and rapidly. Search engines, such as Google and Baidu, are essential tools for people to find information on the Internet. Valuable information, however, is still likely to be submerged in the ocean of search results from those tools. By automatically clustering the results into different groups based on subjects, a search engine with a clustering feature allows users to select the most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate the snippets' semantic similarities. Finally, we improve the Chameleon algorithm to cluster the snippets. Extensive experimental results show that the proposed algorithm outperforms the suffix tree clustering method and other traditional clustering methods.

  2. Multilevel Thresholding Segmentation Based on Harmony Search Optimization

    Directory of Open Access Journals (Sweden)

    Diego Oliva

    2013-01-01

    In this paper, a multilevel thresholding (MT) algorithm based on the harmony search algorithm (HSA) is introduced. HSA is an evolutionary method inspired by musicians improvising new harmonies while playing. Unlike other evolutionary algorithms, HSA exhibits interesting search capabilities while keeping a low computational overhead. The proposed algorithm encodes random samples from a feasible search space inside the image histogram as candidate solutions, whose quality is evaluated using the objective functions employed by Otsu's or Kapur's methods. Guided by these objective values, the set of candidate solutions is evolved through the HSA operators until an optimal solution is found. Experimental results demonstrate the high performance of the proposed method for the segmentation of digital images.
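
    A compact sketch of the approach for two thresholds: harmony search evolves candidate threshold vectors and scores them with Otsu's between-class variance computed from the image histogram. The HS parameters (HMS, HMCR, PAR) and the synthetic histogram are illustrative values, not the paper's settings.

        import numpy as np

        rng = np.random.default_rng(42)

        # Otsu-style objective: between-class variance of gray levels split
        # by thresholds (maximize). `hist` is a normalized 256-bin histogram.
        def between_class_variance(hist, thresholds):
            levels = np.arange(256)
            mu_total = (hist * levels).sum()
            cuts = [0] + sorted(int(t) for t in thresholds) + [256]
            var = 0.0
            for lo, hi in zip(cuts[:-1], cuts[1:]):
                w = hist[lo:hi].sum()
                if w <= 0:
                    return 0.0  # empty class: reject this harmony
                mu = (hist[lo:hi] * levels[lo:hi]).sum() / w
                var += w * (mu - mu_total) ** 2
            return var

        def harmony_search(hist, n_thresh=2, hms=20, hmcr=0.9, par=0.3,
                           iters=500):
            # Harmony memory: random candidate threshold vectors.
            hm = rng.integers(1, 255, (hms, n_thresh))
            fit = np.array([between_class_variance(hist, h) for h in hm])
            for _ in range(iters):
                new = np.empty(n_thresh, dtype=int)
                for j in range(n_thresh):
                    if rng.random() < hmcr:            # memory consideration
                        new[j] = hm[rng.integers(hms), j]
                        if rng.random() < par:         # pitch adjustment
                            new[j] = np.clip(new[j] + rng.integers(-5, 6),
                                             1, 254)
                    else:                              # random selection
                        new[j] = rng.integers(1, 255)
                f = between_class_variance(hist, new)
                worst = fit.argmin()
                if f > fit[worst]:                     # replace worst harmony
                    hm[worst], fit[worst] = new, f
            return sorted(int(t) for t in hm[fit.argmax()])

        # Synthetic trimodal image histogram.
        pixels = np.concatenate([rng.normal(60, 10, 30000),
                                 rng.normal(130, 12, 30000),
                                 rng.normal(200, 8, 30000)])
        hist, _ = np.histogram(np.clip(pixels, 0, 255), bins=256,
                               range=(0, 256))
        hist = hist / hist.sum()
        print("thresholds:", harmony_search(hist))  # roughly [95, 167]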

  3. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    Science.gov (United States)

    2002-01-01

    to the OODBMS approach. The ORDBMS approach produced research prototypes such as Postgres [155] and Starburst [67], and commercial products such as... Kemnitz. The POSTGRES Next-Generation Database Management System. Communications of the ACM, 34(10):78-92, 1991. [156] Michael Stonebraker and Dorothy

  4. Custom Search Engines: Tools & Tips

    Science.gov (United States)

    Notess, Greg R.

    2008-01-01

    Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…

  5. A Semidefinite Programming Based Search Strategy for Feature Selection with Mutual Information Measure.

    Science.gov (United States)

    Naghibi, Tofigh; Hoffmann, Sarah; Pfister, Beat

    2015-08-01

    Feature subset selection, as a special case of the general subset selection problem, has been the topic of a considerable number of studies due to the growing importance of data-mining applications. In the feature subset selection problem there are two main issues that need to be addressed: (i) finding an appropriate measure function that can be fairly fast and robustly computed for high-dimensional data, and (ii) a search strategy to optimize the measure over the subset space in a reasonable amount of time. In this article, mutual information between features and class labels is considered to be the measure function. Two series expansions for mutual information are proposed, and it is shown that most heuristic criteria suggested in the literature are truncated approximations of these expansions. It is well known that searching the whole subset space is an NP-hard problem. Here, instead of the conventional sequential search algorithms, we suggest a parallel search strategy based on semidefinite programming (SDP) that can search through the subset space in polynomial time. By exploiting the similarities between the proposed algorithm and an instance of the maximum-cut problem in graph theory, the approximation ratio of this algorithm is derived and compared with the approximation ratio of the backward elimination method. The experiments show that it can be misleading to judge the quality of a measure solely based on classification accuracy, without taking the effect of the non-optimal search strategy into account.
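
    For contrast with the SDP strategy, the conventional filter baseline simply ranks features by their estimated mutual information with the class labels and keeps the top k; a minimal scikit-learn sketch (dataset and k are invented for illustration):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import mutual_info_classif

        # Baseline MI filter: score each feature's mutual information with
        # the labels and keep the top k. This is the conventional approach
        # that the SDP-based parallel search aims to improve upon.

        X, y = make_classification(n_samples=500, n_features=20,
                                   n_informative=5, random_state=0)
        mi = mutual_info_classif(X, y, random_state=0)
        top_k = np.argsort(mi)[::-1][:5]
        print("selected features:", sorted(int(i) for i in top_k))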

  6. An Analysis of the Applicability of Federal Law Regarding Hash-Based Searches of Digital Media

    Science.gov (United States)

    2014-06-01

    similarity matching, Fourth Amendment, federal law, search and seizure, warrant search, consent search, border search. ...containing a white powdery substance labeled flour [53]. 3.3.17 United States v. Heckenkamp, 482 F.3d 1142 (9th Circuit 2007): People have a reasonable

  7. Random walk-based similarity measure method for patterns in complex object

    Directory of Open Access Journals (Sweden)

    Liu Shihu

    2017-04-01

    Full Text Available This paper discusses the similarity of patterns in complex objects. A complex object is composed of both the attribute information of patterns and the relational information between patterns. Bearing in mind the specificity of complex objects, a random walk-based similarity measurement method for patterns is constructed. In this method, the reachability of any two patterns with respect to the relational information is fully studied, so that the similarity of patterns with respect to the relational information can be calculated. On this basis, an integrated similarity measurement method is proposed, and two algorithms show the calculation procedure. This method makes full use of both the attribute information and the relational information. Finally, a synthetic example validates the proposed similarity measurement method.
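
    A minimal sketch of the relational half of such a method follows: random-walk-with-restart proximity over an adjacency matrix, symmetrized into a pattern similarity. The restart probability and the toy graph are assumptions; the paper's integrated measure additionally uses attribute information, which is omitted here.

```python
import numpy as np

def rwr_similarity(adj, restart=0.15):
    """All-pairs random-walk-with-restart proximity.

    adj: (n, n) weighted adjacency matrix of the relational information."""
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                      # avoid division by zero
    P = adj / deg                            # row-stochastic transition matrix
    # Closed form of the stationary RWR distribution for every seed node:
    # row i of R holds the proximity of all patterns to seed pattern i.
    R = restart * np.linalg.inv(np.eye(n) - (1 - restart) * P)
    # Symmetrize to obtain a similarity between patterns i and j.
    return (R + R.T) / 2

# Tiny example: four patterns with weighted relations.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(rwr_similarity(A).round(3))
```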

  8. A study of concept-based similarity approaches for recommending program examples

    Science.gov (United States)

    Hosseini, Roya; Brusilovsky, Peter

    2017-07-01

    This paper investigates a range of concept-based example recommendation approaches that we developed to provide example-based problem-solving support in the domain of programming. The goal of these approaches is to offer students a set of most relevant remedial examples when they have trouble solving a code comprehension problem where students examine a program code to determine its output or the final value of a variable. In this paper, we use the ideas of semantic-level similarity-based linking developed in the area of intelligent hypertext to generate examples for the given problem. To determine the best-performing approach, we explored two groups of similarity approaches for selecting examples: non-structural approaches focusing on examples that are similar to the problem in terms of concept coverage, and structural approaches focusing on examples that are similar to the problem by the structure of the content. We also explored the value of personalized example recommendation based on students' knowledge levels and the learning goal of the exercise. The paper presents the concept-based similarity approaches that we developed, explains the data collection studies, and reports the results of a comparative analysis. The results of our analysis showed better ranking performance for the personalized structural variant of the cosine similarity approach.

  9. Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships.

    Science.gov (United States)

    Zong, Nansu; Lee, Sungin; Ahn, Jinhyun; Kim, Hong-Gee

    2017-08-01

    Keyword-based entity search restricts the search space according to the preference of the search. When the given keywords and preferences are not related to the same biomedical topic, existing biomedical Linked Data search engines fail to deliver satisfactory results. This research aims to tackle this issue by supporting inter-topic search, that is, search whose inputs, keywords and preferences, fall under different topics. This study developed an effective algorithm in which the relations between biomedical entities are used in tandem with a keyword-based entity search engine, Siren. The algorithm, PERank, an adaptation of Personalized PageRank (PPR), takes a pair of inputs, (1) search preferences and (2) entities returned by a keyword-based entity search for a keyword query, and formulates the search results on the fly based on an index of precomputed Individual Personalized PageRank Vectors (IPPVs). Our experiments were performed over ten linked life datasets for two query sets, one with keyword-preference topic correspondence (intra-topic search) and the other without (inter-topic search). The experiments showed that the proposed method achieved better search results, for example a 14% increase in precision for the inter-topic search over the baseline keyword-based search engine. The proposed method improved keyword-based biomedical entity search by supporting inter-topic search, without affecting intra-topic search, based on the relations between different entities. Copyright © 2017 Elsevier Ltd. All rights reserved.
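
    The following sketch illustrates the core re-ranking idea with networkx's personalized PageRank on a toy entity graph. The graph, seed weights, and damping factor are invented for illustration; PERank additionally precomputes the IPPVs for efficiency, which this sketch does not do.

```python
import networkx as nx

# Toy biomedical entity graph: nodes are entities, edges are relations.
G = nx.Graph()
G.add_edges_from([("aspirin", "COX-1"), ("aspirin", "pain"),
                  ("COX-1", "inflammation"), ("pain", "inflammation"),
                  ("inflammation", "ibuprofen")])

# Entities returned by the keyword-based search act as PPR seeds; the
# search preference biases the restart distribution toward its topic.
seeds = {"aspirin": 0.5, "inflammation": 0.5}
scores = nx.pagerank(G, alpha=0.85, personalization=seeds)

# Re-rank candidate results by their personalized score.
for node, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node:14s} {s:.3f}")
```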

  10. Search-based model identification of smart-structure damage

    Science.gov (United States)

    Glass, B. J.; Macalou, A.

    1991-01-01

    This paper describes the use of a combined model and parameter identification approach, based on modal analysis and artificial intelligence (AI) techniques, for identifying damage or flaws in a rotating truss structure incorporating embedded piezoceramic sensors. This smart structure example is representative of a class of structures commonly found in aerospace systems and next generation space structures. Artificial intelligence techniques of classification, heuristic search, and an object-oriented knowledge base are used in an AI-based model identification approach. A finite model space is classified into a search tree, over which a variant of best-first search is used to identify the model whose stored response most closely matches that of the input. Newly-encountered models can be incorporated into the model space. This adaptiveness demonstrates the potential for learning control. Following this output-error model identification, numerical parameter identification is used to further refine the identified model. Given the rotating truss example in this paper, noisy data corresponding to various damage configurations are input to both this approach and a conventional parameter identification method. The combination of the AI-based model identification with parameter identification is shown to lead to smaller parameter corrections than required by the use of parameter identification alone.

  11. Concurrence of rule- and similarity-based mechanisms in artificial grammar learning.

    Science.gov (United States)

    Opitz, Bertram; Hofmann, Juliane

    2015-03-01

    A current theoretical debate regards whether rule-based or similarity-based learning prevails during artificial grammar learning (AGL). Although the majority of findings are consistent with a similarity-based account of AGL, it has been argued that these results were obtained only after limited exposure to study exemplars, and performance on subsequent grammaticality judgment tests has often been barely above chance level. In three experiments, the conditions were investigated under which rule- and similarity-based learning could be applied. Participants were exposed to exemplars of an artificial grammar under different (implicit and explicit) learning instructions. The analysis of receiver operating characteristics (ROC) during a final grammaticality judgment test revealed that explicit but not implicit learning led to rule knowledge. It also demonstrated that this knowledge base is built up gradually, while similarity knowledge governed the initial state of learning. Together these results indicate that rule- and similarity-based mechanisms concur during AGL. Moreover, it could be speculated that two different rule processes operate in parallel: bottom-up learning via gradual rule extraction and top-down learning via rule testing. Crucially, the latter is facilitated by performance feedback that encourages explicit hypothesis testing. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Constraint-Based Local Search for Constrained Optimum Paths Problems

    Science.gov (United States)

    Pham, Quang Dung; Deville, Yves; van Hentenryck, Pascal

    Constrained Optimum Path (COP) problems arise in many real-life applications and are ubiquitous in communication networks. They have been traditionally approached by dedicated algorithms, which are often hard to extend with side constraints and to apply widely. This paper proposes a constraint-based local search (CBLS) framework for COP applications, bringing the compositionality, reuse, and extensibility at the core of CBLS and CP systems. The modeling contribution is the ability to express compositional models for various COP applications at a high level of abstraction, while cleanly separating the model and the search procedure. The main technical contribution is a connected neighborhood based on rooted spanning trees to find high-quality solutions to COP problems. The framework, implemented in COMET, is applied to Resource Constrained Shortest Path (RCSP) problems (with and without side constraints) and to the edge-disjoint paths problem (EDP). Computational results show the potential significance of the approach.

  13. Computational prediction of drug-drug interactions based on drugs functional similarities.

    Science.gov (United States)

    Ferdousi, Reza; Safdari, Reza; Omidi, Yadollah

    2017-06-01

    Therapeutic activities of drugs are often influenced by co-administration of drugs that may cause inevitable drug-drug interactions (DDIs) and inadvertent side effects. Prediction and identification of DDIs are extremely vital for patient safety and the success of treatment modalities. A number of computational methods have been employed for the prediction of DDIs based on drugs' structures and/or functions. Here, we report on a computational method for DDI prediction based on the functional similarity of drugs. The model was built on key biological elements including carriers, transporters, enzymes and targets (CTET). The model was applied to 2189 approved drugs. For each drug, all the associated CTETs were collected, and the corresponding binary vectors were constructed to determine the DDIs. Various similarity measures were examined to detect DDIs. Of these, the inner product-based similarity measures (IPSMs) were found to provide improved prediction values. Altogether, 2,394,766 potential drug-pair interactions were studied, and the model was able to predict over 250,000 unknown potential DDIs. Based on our findings, we propose the current method as a robust, yet simple and fast, universal in silico approach for the identification of DDIs. We envision that this proposed method can be used as a practical technique for the detection of possible DDIs based on the functional similarities of drugs. Copyright © 2017. Published by Elsevier Inc.
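
    A minimal sketch of the CTET vector construction and an inner product-based similarity follows. The drug names and annotations are invented for illustration, and the cosine form is used as one representative IPSM, not necessarily the exact measure of the paper.

```python
import numpy as np

# Hypothetical CTET annotations (carriers/transporters/enzymes/targets).
drug_ctet = {
    "drugA": {"CYP3A4", "P-gp", "COX-2"},
    "drugB": {"CYP3A4", "P-gp", "OATP1B1"},
    "drugC": {"UGT1A1", "OATP1B1"},
}

# Binary vectors over the union of all CTET elements.
elements = sorted(set().union(*drug_ctet.values()))
index = {e: i for i, e in enumerate(elements)}
vecs = {d: np.zeros(len(elements)) for d in drug_ctet}
for d, anns in drug_ctet.items():
    for a in anns:
        vecs[d][index[a]] = 1.0

def inner_product_sim(u, v):
    # Cosine form of an inner product-based similarity measure.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

for a, b in [("drugA", "drugB"), ("drugA", "drugC"), ("drugB", "drugC")]:
    print(a, b, round(inner_product_sim(vecs[a], vecs[b]), 3))
```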

  14. Automated dating of the world’s language families based on lexical similarity

    OpenAIRE

    Holman, E.; Brown, C.; Wichmann, S.; Müller, A.; Velupillai, V.; Hammarström, H.; Sauppe, S.; Jung, H.; Bakker, D.; Brown, P.; Belyaev, O.; Urban, M.; Mailhammer, R.; List, J.; Egorov, D.

    2011-01-01

    This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus is more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Leve...

  15. Composites Similarity Analysis Method Based on Knowledge Set in Composites Quality Control

    OpenAIRE

    Li Haifeng

    2016-01-01

    Similarity analysis of composites is an important part of composites review: it not only supports rechecking in the review process, but also helps composites applicants promptly keep track of relevant research progress and avoid duplication. This paper mainly studies the composites similarity model in composites review. Drawing on practical experience of composites management and based on the author's knowledge set theory, the paper analyzes in depth the knowledge set representation of composites knowledge, impr...

  16. Protein-protein interaction network-based detection of functionally similar proteins within species.

    Science.gov (United States)

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, the evolution of protein families, the progression of co-evolution, and convergent evolution, insights which cannot be obtained by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities within a species were then detected using a modified shortest-path method to compare these subgraphs and find the eligible optimal results. Using seven protein-protein interaction networks and this method, we identified some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment. By analyzing the results, we found that it is sometimes difficult to separate homologous evolution from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that its precision was excellent. Copyright © 2012 Wiley Periodicals, Inc.

  17. Calculating the knowledge-based similarity of functional groups using crystallographic data

    Science.gov (United States)

    Watson, Paul; Willett, Peter; Gillet, Valerie J.; Verdonk, Marcel L.

    2001-09-01

    A knowledge-based method for calculating the similarity of functional groups is described and validated. The method is based on experimental information derived from small molecule crystal structures. These data are used in the form of scatterplots that show the likelihood of a non-bonded interaction being formed between functional group A (the `central group') and functional group B (the `contact group' or `probe'). The scatterplots are converted into three-dimensional maps that show the propensity of the probe at different positions around the central group. Here we describe how to calculate the similarity of a pair of central groups based on these maps. The similarity method is validated using bioisosteric functional group pairs identified in the Bioster database and Relibase. The Bioster database is a critical compilation of thousands of bioisosteric molecule pairs, including drugs, enzyme inhibitors and agrochemicals. Relibase is an object-oriented database containing structural data about protein-ligand interactions. The distributions of the similarities of the bioisosteric functional group pairs are compared with similarities for all the possible pairs in IsoStar, and are found to be significantly different. Enrichment factors are also calculated showing the similarity method is statistically significantly better than random in predicting bioisosteric functional group pairs.

  18. Three hybridization models based on local search scheme for job shop scheduling problem

    Science.gov (United States)

    Balbi Fraga, Tatiana

    2015-05-01

    This work presents three different hybridization models based on the general scheme of local search heuristics, named Hybrid Successive Application, Hybrid Neighborhood, and Hybrid Improved Neighborhood. Although similar approaches may already have been presented in the literature in other contexts, in this work these models are applied to analyze the solution of the job shop scheduling problem, with the heuristics Taboo Search and Particle Swarm Optimization. In addition, we investigate some aspects that must be considered in order to achieve better solutions than those obtained by the original heuristics. The results demonstrate that the algorithms derived from these three hybrid models are more robust than the original algorithms and able to obtain better results than those found by Taboo Search alone.

  19. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    Science.gov (United States)

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.

  20. Gas load forecasting based on optimized fuzzy c-mean clustering analysis of selecting similar days

    Directory of Open Access Journals (Sweden)

    Qiu Jing

    2017-08-01

    Full Text Available Traditional fuzzy c-means (FCM) clustering, as used in short-term load forecasting, easily falls into local optima and is sensitive to the initial cluster centers. In this paper, we propose to use the global search capability of the particle swarm optimization (PSO) algorithm to avoid these shortcomings, and to use the optimized FCM to select days similar to the forecast day as training samples for support vector machines. This not only strengthens the regularity of the training samples, but also ensures the consistency of their data characteristics. Experimental results show that the prediction accuracy of this forecasting model is better than that of BP neural network and support vector machine (SVM) algorithms.

  1. Similarity-Based Prediction of Travel Times for Vehicles Traveling on Known Routes

    DEFF Research Database (Denmark)

    Tiesyte, Dalia; Jensen, Christian Søndergaard

    2008-01-01

    The use of centralized, real-time position tracking is proliferating in the areas of logistics and public transportation. Real-time positions can be used to provide up-to-date information to a variety of users, and they can also be accumulated for uses in subsequent data analyses. In particular, historical data in combination with real-time data may be used to predict the future travel times of vehicles more accurately, thus improving the experience of the users who rely on such information. We propose a Nearest-Neighbor Trajectory (NNT) technique that identifies the historical trajectory of vehicles that travel along known routes. In empirical studies with real data from buses, we evaluate how well the proposed distance functions are capable of predicting future vehicle movements. Second, we propose a main-memory index structure that enables incremental similarity search and that is capable...
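
    A small sketch of the nearest-neighbor idea follows, under strong simplifying assumptions (equal-length travel-time vectors and mean absolute difference as the distance function); the paper evaluates several distance functions and adds an incremental index, both omitted here.

```python
import numpy as np

def trajectory_distance(a, b):
    # Mean pointwise difference between two equal-length travel-time profiles.
    return float(np.mean(np.abs(a - b)))

def predict_remaining(partial, history, k=3):
    """Predict the rest of a trip from the k most similar historical runs.

    partial: travel times observed so far on the current trip.
    history: list of complete travel-time vectors for the same route."""
    m = len(partial)
    dists = [trajectory_distance(partial, h[:m]) for h in history]
    nearest = np.argsort(dists)[:k]
    return np.mean([history[i][m:] for i in nearest], axis=0)

# Hypothetical history: per-segment travel times (minutes) of past bus runs.
history = [np.array([4, 5, 6, 7, 8]), np.array([5, 6, 7, 9, 10]),
           np.array([4, 4, 5, 6, 7]), np.array([6, 7, 8, 10, 12])]
print(predict_remaining(np.array([4, 5, 6]), history, k=2))
```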

  2. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    Directory of Open Access Journals (Sweden)

    Ming Che Lee

    2014-01-01

    Full Text Available This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  3. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    Science.gov (United States)

    Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  4. A grammar-based semantic similarity algorithm for natural language sentences.

    Science.gov (United States)

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng

    2014-01-01

    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  5. Web-based information search and retrieval: effects of strategy use and age on search success.

    Science.gov (United States)

    Stronge, Aideen J; Rogers, Wendy A; Fisk, Arthur D

    2006-01-01

    The purpose of this study was to investigate the relationship between strategy use and search success on the World Wide Web (i.e., the Web) for experienced Web users. An additional goal was to extend understanding of how the age of the searcher may influence strategy use. Current investigations of information search and retrieval on the Web have provided an incomplete picture of Web strategy use because participants have not been given the opportunity to demonstrate their knowledge of Web strategies while also searching for information on the Web. Using both behavioral and knowledge-engineering methods, we investigated searching behavior and system knowledge for 16 younger adults (M = 20.88 years of age) and 16 older adults (M = 67.88 years). Older adults were less successful than younger adults in finding correct answers to the search tasks. Knowledge engineering revealed that the age-related effect resulted from ineffective search strategies and amount of Web experience rather than age per se. Our analysis led to the development of a decision-action diagram representing search behavior for both age groups. Older adults had more difficulty than younger adults when searching for information on the Web. However, this difficulty was related to the selection of inefficient search strategies, which may have been attributable to a lack of knowledge about available Web search strategies. Actual or potential applications of this research include training Web users to search more effectively and suggestions to improve the design of search engines.

  6. Exploring personalized searches using tag-based user profiles and resource profiles in folksonomy.

    Science.gov (United States)

    Cai, Yi; Li, Qing; Xie, Haoran; Min, Huaqin

    2014-10-01

    With the increase in resource-sharing websites such as YouTube and Flickr, many shared resources have arisen on the Web. Personalized searches have become more important and challenging since users demand higher retrieval quality. To achieve this goal, personalized searches need to take users' personalized profiles and information needs into consideration. Collaborative tagging (also known as folksonomy) systems allow users to annotate resources with their own tags, which provides a simple but powerful way for organizing, retrieving and sharing different types of social resources. In this article, we examine the limitations of previous tag-based personalized searches. To handle these limitations, we propose a new method to model user profiles and resource profiles in collaborative tagging systems. We use a normalized term frequency to indicate the preference degree of a user on a tag. A novel search method using such profiles of users and resources is proposed to facilitate the desired personalization in resource searches. In our framework, instead of the keyword matching or similarity measurement used in previous works, the relevance measurement between a resource and a user query (termed the query relevance) is treated as a fuzzy satisfaction problem of a user's query requirements. We implement a prototype system called the Folksonomy-based Multimedia Retrieval System (FMRS). Experiments using the FMRS data set and the MovieLens data set show that our proposed method outperforms baseline methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. In Silico target fishing: addressing a "Big Data" problem by ligand-based similarity rankings with data fusion.

    Science.gov (United States)

    Liu, Xian; Xu, Yuan; Li, Shanshan; Wang, Yulan; Peng, Jianlong; Luo, Cheng; Luo, Xiaomin; Zheng, Mingyue; Chen, Kaixian; Jiang, Hualiang

    2014-01-01

    Ligand-based in silico target fishing can be used to identify the potential interacting target of bioactive ligands, which is useful for understanding the polypharmacology and safety profile of existing drugs. The underlying principle of the approach is that known bioactive ligands can be used as reference to predict the targets for a new compound. We tested a pipeline enabling large-scale target fishing and drug repositioning, based on simple fingerprint similarity rankings with data fusion. A large library containing 533 drug relevant targets with 179,807 active ligands was compiled, where each target was defined by its ligand set. For a given query molecule, its target profile is generated by similarity searching against the ligand sets assigned to each target, for which individual searches utilizing multiple reference structures are then fused into a single ranking list representing the potential target interaction profile of the query compound. The proposed approach was validated by 10-fold cross validation and two external tests using data from DrugBank and Therapeutic Target Database (TTD). The use of the approach was further demonstrated with some examples concerning the drug repositioning and drug side-effects prediction. The promising results suggest that the proposed method is useful for not only finding promiscuous drugs for their new usages, but also predicting some important toxic liabilities. With the rapid increasing volume and diversity of data concerning drug related targets and their ligands, the simple ligand-based target fishing approach would play an important role in assisting future drug design and discovery.
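
    The core of such a pipeline can be sketched with RDKit: fingerprint the query, score it against each target's ligand set, and fuse the per-ligand similarities (here with a simple MAX rule). The SMILES strings and target names are toy assumptions; real libraries hold thousands of ligands per target.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

def fp(smiles):
    # Morgan (circular) fingerprint as a bit vector.
    return AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(smiles), 2, nBits=2048)

# Hypothetical ligand sets defining two targets.
targets = {
    "targetA": ["CCO", "CCN", "CCOC"],
    "targetB": ["c1ccccc1", "c1ccccc1O", "c1ccccc1N"],
}

query = fp("c1ccccc1CCO")

# MAX fusion: score each target by the best Tanimoto over its ligand set.
profile = {
    t: max(DataStructs.TanimotoSimilarity(query, fp(s)) for s in ligands)
    for t, ligands in targets.items()
}
for t, s in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(t, round(s, 3))
```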

  8. An inter-comparison of similarity-based methods for organisation and classification of groundwater hydrographs

    Science.gov (United States)

    Haaf, Ezra; Barthel, Roland

    2018-04-01

    Classification and similarity-based methods, which have recently received major attention in the field of surface water hydrology, namely through the PUB (prediction in ungauged basins) initiative, have not yet been applied to groundwater systems. However, it can be hypothesised that the principle of "similar systems responding similarly to similar forcing" applies in subsurface hydrology as well. One fundamental prerequisite for testing this hypothesis, and eventually for applying the principle to make "predictions for ungauged groundwater systems", is the availability of efficient methods to quantify the similarity of groundwater system responses, i.e. groundwater hydrographs. In this study, a large, spatially extensive, as well as geologically and geomorphologically diverse dataset from Southern Germany and Western Austria was used to test and compare a set of 32 grouping methods, which have previously only been used individually in local-scale studies. The resulting groupings are compared to a heuristic visual classification, which serves as a baseline. A performance ranking of these classification methods is carried out, and differences in the homogeneity of grouping results are shown, whereby selected groups are related to hydrogeological indices and geological descriptors. This exploratory empirical study shows that the choice of grouping method has a large impact on the object distribution within groups, as well as on the homogeneity of patterns captured in groups. The study provides a comprehensive overview of a large number of grouping methods, which can guide researchers when attempting similarity-based groundwater hydrograph classification.

  9. 3D Facial Similarity Measure Based on Geodesic Network and Curvatures

    Directory of Open Access Journals (Sweden)

    Junli Zhao

    2014-01-01

    Full Text Available Automated 3D facial similarity measurement is a challenging and valuable research topic in anthropology and computer graphics. It is widely used in various fields, such as criminal investigation, kinship confirmation, and face recognition. This paper proposes a 3D facial similarity measure method based on a combination of geodesic and curvature features. Firstly, a geodesic network is generated for each face, with geodesics and iso-geodesics determined, and these network points are adopted as the correspondence across face models. Then, four metrics associated with curvatures, that is, the mean curvature, Gaussian curvature, shape index, and curvedness, are computed for each network point by using a weighted average of its neighborhood points. Finally, correlation coefficients according to these metrics are computed, respectively, as the similarity measures between two 3D face models. Experiments on 3D facial models of different persons, and on different 3D facial models of the same person, are carried out and compared with a subjective face similarity study. The results show that the geodesic network plays an important role in 3D facial similarity measurement. The similarity measure defined by the shape index is broadly consistent with human subjective evaluation, and it can measure 3D face similarity more objectively than the other indices.

  10. A new similarity index for nonlinear signal analysis based on local extrema patterns

    Science.gov (United States)

    Niknazar, Hamid; Motie Nasrabadi, Ali; Shamsollahi, Mohammad Bagher

    2018-02-01

    Common similarity measures of time domain signals, such as cross-correlation and Symbolic Aggregate approximation (SAX), are not appropriate for nonlinear signal analysis because of the high sensitivity of nonlinear systems to initial points. Therefore, a similarity measure for nonlinear signal analysis must be invariant to initial points and quantify the similarity by considering the main dynamics of signals. The statistical behavior of local extrema (SBLE) method was previously proposed to address this problem. The SBLE similarity index uses quantized amplitudes of local extrema to quantify the dynamical similarity of signals by considering patterns of sequential local extrema. By adding the time information of local extrema and fuzzifying the quantized values, this work proposes a new similarity index for nonlinear and long-term signal analysis that extends the SBLE method. These new features provide more information about signals while reducing noise sensitivity through fuzzification. A number of practical tests were performed to demonstrate the ability of the method in nonlinear signal clustering and classification on synthetic data. In addition, epileptic seizure detection based on electroencephalography (EEG) signal processing was performed using the proposed similarity index to demonstrate its potential as a real-world application tool.

  11. Similarity measurement method of high-dimensional data based on normalized net lattice subspace

    Institute of Scientific and Technical Information of China (English)

    Li Wenfa; Wang Gongming; Li Ke; Huang Su

    2017-01-01

    The performance of conventional similarity measurement methods is seriously affected by the curse of dimensionality of high-dimensional data. The reason is that the data difference between sparse and noisy dimensions occupies a large proportion of the similarity, leading to dissimilarity between any results. A similarity measurement method for high-dimensional data based on a normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding intervals. Only components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0, 1], which makes it suitable for similarity analysis after dimensionality reduction.
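
    A rough sketch of the interval-mapping idea described above, under stated assumptions: ten intervals per dimension and a simple average of per-dimension closeness over the dimensions that land in the same or an adjacent interval. The paper's exact normalization may differ.

```python
import numpy as np

def lattice_similarity(x, y, data_min, data_max, n_intervals=10):
    """Similarity counting only dimensions whose components fall in the
    same or an adjacent interval of a per-dimension grid."""
    width = (data_max - data_min) / n_intervals
    width[width == 0] = 1.0                       # guard degenerate dimensions
    ix = np.floor((x - data_min) / width).clip(0, n_intervals - 1)
    iy = np.floor((y - data_min) / width).clip(0, n_intervals - 1)
    close = np.abs(ix - iy) <= 1                  # same or adjacent interval
    if not close.any():
        return 0.0
    # Per-dimension closeness in [0, 1], averaged over qualifying dimensions.
    d = np.abs(x - y)[close] / (2 * width[close])
    return float(np.mean(1.0 - np.clip(d, 0.0, 1.0)))

rng = np.random.default_rng(1)
X = rng.random((5, 100))                          # toy 100-dimensional data
mins, maxs = X.min(axis=0), X.max(axis=0)
print(round(lattice_similarity(X[0], X[1], mins, maxs), 3))
```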

  12. Color-Based Image Retrieval from High-Similarity Image Databases

    DEFF Research Database (Denmark)

    Hansen, Michael Adsetts Edberg; Carstensen, Jens Michael

    2003-01-01

    Many image classification problems can fruitfully be thought of as image retrieval in a "high similarity image database" (HSID) characterized by being tuned towards a specific application and having a high degree of visual similarity between entries that should be distinguished. We introduce...... a method for HSID retrieval using a similarity measure based on a linear combination of Jeffreys-Matusita (JM) distances between distributions of color (and color derivatives) estimated from a set of automatically extracted image regions. The weight coefficients are estimated based on optimal retrieval...... performance. Experimental results on the difficult task of visually identifying clones of fungal colonies grown in a petri dish and categorization of pelts show a high retrieval accuracy of the method when combined with standardized sample preparation and image acquisition....

  13. Experimental Study of Dowel Bar Alternatives Based on Similarity Model Test

    Directory of Open Access Journals (Sweden)

    Chichun Hu

    2017-01-01

    Full Text Available In this study, a small-scale accelerated loading test based on similarity theory and the Accelerated Pavement Analyzer was developed to evaluate dowel bars with different materials and cross-sections. A jointed concrete specimen containing one dowel was designed as a scaled model for the test, and each specimen was subjected to 864 thousand loading cycles. Deflections between jointed slabs were measured with dial indicators, and strains of the dowel bars were monitored with strain gauges. The load transfer efficiency, differential deflection, and dowel-concrete bearing stress for each case were calculated from these measurements. The test results indicated that the effect of the dowel modulus on load transfer efficiency can be characterized based on the similarity model test developed in the study. Moreover, the round steel dowel was found to have similar performance to the larger FRP dowel, and elliptical dowels can be preferentially considered in practice.

  14. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    Science.gov (United States)

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  15. Creating Usage Context-Based Object Similarities to Boost Recommender Systems in Technology Enhanced Learning

    Science.gov (United States)

    Niemann, Katja; Wolpers, Martin

    2015-01-01

    In this paper, we introduce a new way of detecting semantic similarities between learning objects by analysing their usage in web portals. Our approach relies on the usage-based relations between the objects themselves rather then on the content of the learning objects or on the relations between users and learning objects. We then take this new…

  16. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measures of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, we first design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information into the importance measure of the features to substitute for document and term frequency. On this basis, we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  17. A DE-Based Scatter Search for Global Optimization Problems

    Directory of Open Access Journals (Sweden)

    Kun Li

    2015-01-01

    Full Text Available This paper proposes a hybrid scatter search (SS) algorithm for continuous global optimization problems by incorporating the evolution mechanism of differential evolution (DE) into the reference set update procedure of SS to act as the new solution generation method. This hybrid algorithm is called a DE-based SS (SSDE) algorithm. Since different kinds of mutation operators of DE have been proposed in the literature and have shown different search abilities for different kinds of problems, four traditional mutation operators are adopted in the hybrid SSDE algorithm. To adaptively select the mutation operator that is most appropriate to the current problem, an adaptive mechanism for the candidate mutation operators is developed. In addition, to enhance the exploration ability of SSDE, a reinitialization method is adopted to create a new population and subsequently construct a new reference set whenever the search process of SSDE is trapped in a local optimum. Computational experiments on benchmark problems show that the proposed SSDE is competitive or superior to some state-of-the-art algorithms in the literature.
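
    For reference, two of the classic DE mutation operators that such a hybrid can choose between are sketched below in numpy; the adaptive operator-selection mechanism and the reference-set bookkeeping of SSDE are not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

def de_rand_1(pop, i, F=0.5):
    # DE/rand/1: perturb one random solution with the difference of two others.
    r1, r2, r3 = rng.choice([k for k in range(len(pop)) if k != i],
                            size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def de_best_1(pop, fitness, i, F=0.5):
    # DE/best/1: perturb the current best solution instead.
    best = pop[int(np.argmin(fitness))]
    r1, r2 = rng.choice([k for k in range(len(pop)) if k != i],
                        size=2, replace=False)
    return best + F * (pop[r1] - pop[r2])

# Reference set of 10 five-dimensional solutions for a toy sphere function.
pop = rng.uniform(-5, 5, size=(10, 5))
fitness = (pop ** 2).sum(axis=1)
print(de_rand_1(pop, 0))
print(de_best_1(pop, fitness, 0))
```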

  18. Applying Cuckoo Search for analysis of LFSR based cryptosystem

    Directory of Open Access Journals (Sweden)

    Maiya Din

    2016-09-01

    Full Text Available Cryptographic techniques are employed for minimizing security hazards to sensitive information. To make systems more robust, the cyphers or crypts in use need to be analysed, for which cryptanalysts require ways to automate the process so that cryptographic systems can be tested more efficiently. Evolutionary algorithms provide one such resort, as they are capable of finding a global optimal solution very quickly. The Cuckoo Search (CS) algorithm has been used effectively in the cryptanalysis of conventional systems such as Vigenere and Transposition cyphers. The Linear Feedback Shift Register (LFSR) is a crypto primitive used extensively in the design of cryptosystems. In this paper, we analyse an LFSR-based cryptosystem using Cuckoo Search to find the correct initial states of the LFSR used. Primitive polynomials of degree 11, 13, 17 and 19 are considered to analyse text crypts of length 200, 300 and 400 characters. Optimal solutions were obtained for the following CS parameters: Levy distribution parameter (β = 1.5) and alien-egg discovery probability (pa = 0.25).
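
    To make the search target concrete, here is a minimal Fibonacci LFSR keystream generator in Python; a cuckoo search attack would treat the initial state as the unknown to optimize, scoring candidates by how well their keystream decrypts the crypt. The degree-4 register and tap positions (for x^4 + x + 1) are a toy stand-in for the degree 11-19 polynomials in the paper.

```python
def lfsr_stream(state, taps, n):
    """Generate n keystream bits from a Fibonacci LFSR.

    state: initial register bits (what a cryptanalyst searches for).
    taps:  0-based positions whose XOR forms the feedback bit."""
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[-1])                 # output the last stage
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = [fb] + state[:-1]             # shift right, insert feedback
    return out

# Degree-4 toy register with primitive polynomial x^4 + x + 1.
print(lfsr_stream([1, 0, 0, 1], taps=(0, 3), n=15))
```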

  19. Querying archetype-based EHRs by search ontology-based XPath engineering.

    Science.gov (United States)

    Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich

    2018-05-11

    Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free-text searches, which lead to false positive results, precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts of the XML document structure. Such a query specification method is practically introduced and evaluated by applying concrete research questions, formulated in natural language, to a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the use of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content are necessary for further improvement of precision and recall. A key limitation is that the application of the introduced process requires skills in ontology and software development. In the future, the time-consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects search terms to certain parts of XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering through the specification of complex XPath expressions without deep syntax knowledge of XPath.

  20. IntelliGO: a new vector-based semantic similarity measure including annotation origin

    Directory of Open Access Journals (Sweden)

    Devignes Marie-Dominique

    2010-12-01

    previously published measures. Conclusions: The IntelliGO similarity measure provides a customizable and comprehensive method for quantifying gene similarity based on GO annotations. It also displays a robust set-discriminating power, which suggests it will be useful for functional clustering. Availability: An on-line version of the IntelliGO similarity measure is available at: http://bioinfo.loria.fr/Members/benabdsi/intelligo_project/

  1. MinHash-Based Fuzzy Keyword Search of Encrypted Data across Multiple Cloud Servers

    Directory of Open Access Journals (Sweden)

    Jingsha He

    2018-05-01

    Full Text Available To enhance the efficiency of data searching, most data owners store their data files on different cloud servers in the form of cipher-text. Thus, efficient search using fuzzy keywords becomes a critical issue in such a cloud computing environment. This paper proposes a method that aims at improving the efficiency of cipher-text retrieval and lowering the storage overhead of fuzzy keyword search. In contrast to traditional approaches, the proposed method reduces the complexity of MinHash-based fuzzy keyword search by using MinHash fingerprints to avoid the need to construct a fuzzy keyword set. The method utilizes Jaccard similarity to rank the retrieval results, reducing the amount of similarity computation and saving considerable time and space overhead. It also accommodates multiple user queries through re-encryption technology and updates user permissions dynamically. Security analysis demonstrates that the method provides better privacy preservation, and experimental results show that it improves retrieval time and lowers storage overhead.
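
    The MinHash/Jaccard core of such a scheme can be sketched in a few lines: hash the character n-grams of each keyword many times and keep the minima, so that two signatures agree per slot with probability equal to the Jaccard similarity of the gram sets. The salting trick and the 3-gram tokenization are illustrative assumptions; the encryption, indexing, and multi-server aspects are omitted.

```python
def grams(word, n=3):
    # Character n-grams make keyword comparison tolerant of typos ("fuzzy").
    word = f"##{word}##"
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def minhash_signature(tokens, n_hashes=64):
    # One salted hash per slot; the minimum over the token set means two
    # signatures agree in a slot with probability Jaccard(set1, set2).
    return [min(hash(t + str(i)) for t in tokens) for i in range(n_hashes)]

def estimated_jaccard(sig_a, sig_b):
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

sig1 = minhash_signature(grams("confidential"))
sig2 = minhash_signature(grams("confidental"))   # misspelled query keyword
print(estimated_jaccard(sig1, sig2))             # high despite the typo
```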

  2. A Feature-Based Structural Measure: An Image Similarity Measure for Face Recognition

    Directory of Open Access Journals (Sweden)

    Noor Abdalrazak Shnain

    2017-08-01

    Full Text Available Facial recognition is one of the most challenging and interesting problems within the field of computer vision and pattern recognition. During the last few years, it has gained special attention due to its importance in relation to current issues such as security, surveillance systems and forensics analysis. Despite this high level of attention to facial recognition, the success is still limited by certain conditions; there is no method which gives reliable results in all situations. In this paper, we propose an efficient similarity index that resolves the shortcomings of the existing measures of feature and structural similarity. This measure, called the Feature-Based Structural Measure (FSM), combines the best features of the well-known SSIM (structural similarity index measure) and FSIM (feature similarity index measure) approaches, striking a balance between performance for similar and dissimilar images of human faces. In addition to the statistical structural properties provided by SSIM, edge detection is incorporated in FSM as a distinctive structural feature. Its performance is tested for a wide range of PSNR (peak signal-to-noise ratio) values, using the ORL (Olivetti Research Laboratory, now AT&T Laboratory Cambridge) and FEI (Faculty of Industrial Engineering, São Bernardo do Campo, São Paulo, Brazil) databases. The proposed measure is tested under conditions of Gaussian noise; simulation results show that the proposed FSM outperforms the well-known SSIM and FSIM approaches in its efficiency of similarity detection and recognition of human faces.

  3. Genetic similarity among commercial oil palm materials based on microsatellite markers

    Directory of Open Access Journals (Sweden)

    Diana Arias

    2012-08-01

    Full Text Available Microsatellite markers are used to determine genetic similarities among individuals and might be used in various applications in breeding programs. For example, knowing the genetic similarity relationships of commercial planting materials helps to better understand their responses to environmental, agronomic and plant health factors. This study assessed 17 microsatellite markers in 9 crosses (D x P) of Elaeis guineensis Jacq. from various commercial companies in Malaysia, France, Costa Rica and Colombia, in order to find possible genetic differences and/or similarities. Seventy-seven alleles were obtained, with an average of 4.5 alleles per primer and a range of 2-8 amplified alleles. The results show a significant reduction of alleles, compared to the number of alleles reported for wild oil palm populations. The obtained dendrogram shows the formation of two groups based on their genetic similarity. Group A, with ~76% similarity, contains the commercial material of 3 codes of Deli x La Mé crosses produced in France and Colombia, and group B, with ~66% genetic similarity, includes all the materials produced by commercial companies in Malaysia, France, Costa Rica and Colombia.

  4. Weighted similarity-based clustering of chemical structures and bioactivity data in early drug discovery.

    Science.gov (United States)

    Perualila-Tan, Nolen Joy; Shkedy, Ziv; Talloen, Willem; Göhlmann, Hinrich W H; Moerbeke, Marijke Van; Kasim, Adetayo

    2016-08-01

    The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical strategy in compound selection involves compound clustering based on chemical similarity to obtain representative chemically diverse compounds (not incorporating potency information). In this paper, we propose an integrative clustering approach that makes use of both biological (compound efficacy) and chemical (structural features) data sources for the purpose of discovering a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, serving as a generic input to any clustering algorithm. This new analysis workflow is a semi-supervised method: after the determination of clusters, a secondary analysis finds differentially expressed genes associated with the derived integrated cluster(s) to further explain the compound-induced biological effects inside the cell. In this paper, datasets from two drug development oncology projects are used to illustrate the usefulness of the weighted similarity-based clustering approach to integrate multi-source high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using this proposed integrative approach.
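
    A compact sketch of similarity-level integration with complementary weights, on toy data: build one similarity matrix per source, blend them with a weight w, and feed the result to any clustering algorithm (here, average-linkage hierarchical clustering). The weight, data shapes, and number of clusters are arbitrary assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
n = 20
chem = rng.random((n, 50)) > 0.7      # toy binary structural fingerprints
bio = rng.random((n, 10))             # toy bioactivity profiles

# One similarity matrix per data source.
S_chem = 1 - squareform(pdist(chem, metric="jaccard"))
S_bio = np.corrcoef(bio)

# Similarity-level integration with complementary weights.
w = 0.5
S = w * S_chem + (1 - w) * S_bio

# The weighted similarity is a generic input to any clustering algorithm;
# here it becomes a distance for average-linkage hierarchical clustering.
D = 1 - (S + S.T) / 2
np.fill_diagonal(D, 0.0)
labels = fcluster(linkage(squareform(D, checks=False), method="average"),
                  t=4, criterion="maxclust")
print(labels)
```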

  5. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators of protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We propose a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of the different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. A Novel Relevance Feedback Approach Based on Similarity Measure Modification in an X-Ray Image Retrieval System Based on Fuzzy Representation Using Fuzzy Attributed Relational Graph

    Directory of Open Access Journals (Sweden)

    Hossien Pourghassem

    2011-04-01

    Full Text Available Relevance feedback approaches are used to improve the performance of content-based image retrieval systems. In this paper, a novel relevance feedback approach based on similarity measure modification in an X-ray image retrieval system based on fuzzy representation using a fuzzy attributed relational graph (FARG) is presented. In this approach, the optimum weight of each feature in the feature vector is calculated using the similarity rate between the query image and the relevant and irrelevant images in the user feedback. The calculated weight is used to tune the fuzzy graph matching algorithm as a modifier parameter in the similarity measure. The standard deviation of the retrieved image features is applied to calculate the optimum weight. The proposed image retrieval system uses a FARG for the representation of images, a fuzzy graph matching algorithm as the similarity measure, and a semantic classifier based on a merging scheme for determining the search space in the image database. To evaluate the relevance feedback approach in the proposed system, a standard X-ray image database consisting of 10000 images in 57 classes is used. The improvement of the evaluation parameters shows the proficiency and efficiency of the proposed system.
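
    The weighting idea can be sketched independently of the graph machinery: after a feedback round, features that vary little across the user-marked relevant images are deemed discriminative and receive large (inverse standard deviation) weights. This is a plain-vector simplification of the abstract's description, not the FARG matching itself.

```python
import numpy as np

def update_weights(relevant_features, eps=1e-6):
    """Re-weight features from one round of user feedback.

    relevant_features: (n_relevant, n_features) matrix of feature vectors
    of the images the user marked relevant. A feature with a small standard
    deviation across relevant images is discriminative, so it gets a large
    weight, echoing the standard-deviation-based weighting in the abstract."""
    std = relevant_features.std(axis=0)
    w = 1.0 / (std + eps)
    return w / w.sum()                       # normalize to sum to 1

def weighted_similarity(q, x, w):
    # Weighted L1 distance turned into a similarity in (0, 1].
    return float(np.exp(-np.sum(w * np.abs(q - x))))

rng = np.random.default_rng(4)
feedback = rng.random((5, 8))                # 5 relevant images, 8 features
w = update_weights(feedback)
print(w.round(3))
print(weighted_similarity(rng.random(8), rng.random(8), w))
```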

  7. DOSim: An R package for similarity between diseases based on Disease Ontology

    Science.gov (United States)

    2011-01-01

    Background: The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, has expanded our understanding of the relationships between different diseases, and allows them to be classified. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. Results: We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease features of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. Conclusions: DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and

  8. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  9. Parallel Harmony Search Based Distributed Energy Resource Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Ceylan, Oguzhan [ORNL]; Liu, Guodong [ORNL]; Tomsovic, Kevin [University of Tennessee, Knoxville (UTK)]

    2015-01-01

    This paper presents a harmony search based parallel optimization algorithm to minimize voltage deviations in three phase unbalanced electrical distribution systems and to maximize active power outputs of distributed energy resources (DR). The main contribution is to reduce the adverse impacts on the voltage profile as photovoltaic (PV) output or electric vehicle (EV) charging changes throughout the day. The IEEE 123-bus distribution test system is modified by adding DRs and EVs under different load profiles. The simulation results show that by using parallel computing techniques, heuristic methods may be used as an alternative optimization tool in electrical power distribution systems operation.
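
    The record gives no pseudocode for the parallel variant, but the serial harmony search core it builds on is compact. A sketch under stated assumptions, with the voltage-deviation objective replaced by a stand-in quadratic; all parameter values are illustrative only:

    import random

    def harmony_search(f, bounds, hms=10, hmcr=0.9, par=0.3, iters=2000):
        """Minimise f over box `bounds` with a basic harmony search.
        hms: harmony memory size, hmcr: memory considering rate,
        par: pitch adjusting rate."""
        dim = len(bounds)
        memory = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
        scores = [f(x) for x in memory]
        for _ in range(iters):
            new = []
            for d, (lo, hi) in enumerate(bounds):
                if random.random() < hmcr:                 # reuse a memorised value
                    v = random.choice(memory)[d]
                    if random.random() < par:              # small pitch adjustment
                        v += random.uniform(-1, 1) * 0.01 * (hi - lo)
                else:                                      # fresh random value
                    v = random.uniform(lo, hi)
                new.append(min(max(v, lo), hi))
            worst = max(range(hms), key=scores.__getitem__)
            s = f(new)
            if s < scores[worst]:                          # replace worst harmony
                memory[worst], scores[worst] = new, s
        best = min(range(hms), key=scores.__getitem__)
        return memory[best], scores[best]

    # e.g. minimise a voltage-deviation-like quadratic over three variables
    print(harmony_search(lambda x: sum(v * v for v in x), [(-5, 5)] * 3))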

  10. Supporting ontology-based keyword search over medical databases.

    Science.gov (United States)

    Kementsietsidis, Anastasios; Lim, Lipyeow; Wang, Min

    2008-11-06

    The proliferation of medical terms poses a number of challenges in the sharing of medical information among different stakeholders. Ontologies are commonly used to establish relationships between different terms, yet their role in querying has not been investigated in detail. In this paper, we study the problem of supporting ontology-based keyword search queries on a database of electronic medical records. We present several approaches to support this type of query, study the advantages and limitations of each approach, and summarize the lessons learned as best practices.

  11. Anomaly Detection in Nanofibrous Materials by CNN-Based Self-Similarity

    Directory of Open Access Journals (Sweden)

    Paolo Napoletano

    2018-01-01

    Full Text Available Automatic detection and localization of anomalies in nanofibrous materials help to reduce the cost of the production process and the time of the post-production visual inspection process. Amongst all the monitoring methods, those exploiting Scanning Electron Microscope (SEM) imaging are the most effective. In this paper, we propose a region-based method for the detection and localization of anomalies in SEM images, based on Convolutional Neural Networks (CNNs) and self-similarity. The method evaluates the degree of abnormality of each subregion of an image under consideration by computing a CNN-based visual similarity with respect to a dictionary of anomaly-free subregions belonging to a training set. The proposed method outperforms the state of the art.
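
    The specific CNN and dictionary construction are not detailed in this record; the following sketch shows only the scoring step, where a sub-region's abnormality is one minus its best cosine similarity to a dictionary of anomaly-free patch descriptors. Random vectors stand in for real CNN features:

    import numpy as np

    def anomaly_score(patch_feat, dictionary):
        """Score a sub-region by its distance to the most similar
        anomaly-free patch in the training dictionary.
        patch_feat: (d,) CNN descriptor; dictionary: (n, d) descriptors."""
        a = patch_feat / np.linalg.norm(patch_feat)
        D = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
        sims = D @ a                      # cosine similarity to each entry
        return 1.0 - sims.max()           # low best-similarity -> anomalous

    # sliding-window use: score every patch, threshold to localise defects
    rng = np.random.default_rng(0)
    dictionary = rng.normal(size=(500, 128))   # stand-in for CNN features
    patch = rng.normal(size=128)
    print(anomaly_score(patch, dictionary))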

  12. Multispecies Coevolution Particle Swarm Optimization Based on Previous Search History

    Directory of Open Access Journals (Sweden)

    Danping Wang

    2017-01-01

    Full Text Available A hybrid coevolution particle swarm optimization algorithm with a dynamic multispecies strategy based on K-means clustering and a nonrevisit strategy based on a Binary Space Partitioning fitness tree (called MCPSO-PSH) is proposed. Previous search history memorized in the Binary Space Partitioning fitness tree can effectively restrain the individuals' revisit phenomenon. The whole population is partitioned into several subspecies, and cooperative coevolution is realized by an information communication mechanism between subspecies, which can enhance the global search ability of particles and avoid premature convergence to a local optimum. To demonstrate the power of the method, the proposed algorithm is compared with state-of-the-art algorithms on 10 basic benchmark functions (10-dimensional and 30-dimensional), 10 CEC2005 benchmark functions (30-dimensional), and a real-world problem (multilevel image segmentation). Experimental results show that MCPSO-PSH displays a competitive performance compared to the other swarm-based or evolutionary algorithms in terms of solution accuracy and statistical tests.

  13. Behavioral similarity measurement based on image processing for robots that use imitative learning

    Science.gov (United States)

    Sterpin B., Dante G.; Martinez S., Fernando; Jacinto G., Edwar

    2017-02-01

    In the field of artificial societies, particularly those based on memetics, imitative behavior is essential for the development of cultural evolution. Applying this concept to robotics, through imitative learning, a robot can acquire behavioral patterns from another robot. Assuming that the learning process must have an instructor and, at least, an apprentice, obtaining a quantitative measurement of their behavioral similarity would be potentially useful, especially in artificial social systems focused on cultural evolution. In this paper the motor behavior of both kinds of robots, for two simple tasks, is represented by 2D binary images, which are processed in order to measure their behavioral similarity. The results shown here were obtained by comparing several similarity measurement methods for binary images.
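
    The record compares several binary-image similarity measures without naming a winner; the Jaccard index is one standard choice for such trajectory images. A compact illustration:

    import numpy as np

    def behavioral_similarity(img_a, img_b):
        """Jaccard index between two binary images that trace robot
        trajectories: |A AND B| / |A OR B|, in [0, 1]."""
        a, b = img_a.astype(bool), img_b.astype(bool)
        union = np.logical_or(a, b).sum()
        return 1.0 if union == 0 else np.logical_and(a, b).sum() / union

    instructor = np.zeros((64, 64), dtype=bool); instructor[32, 10:50] = True
    apprentice = np.zeros((64, 64), dtype=bool); apprentice[33, 12:50] = True
    print(behavioral_similarity(instructor, apprentice))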

  14. Structure modulates similarity-based interference in sluicing: An eye tracking study.

    Directory of Open Access Journals (Sweden)

    Jesse A. Harris

    2015-12-01

    Full Text Available In cue-based content-addressable approaches to memory, a target and its competitors are retrieved in parallel from memory via a fast, associative cue-matching procedure under a severely limited focus of attention. Such a parallel matching procedure could in principle ignore the serial order or hierarchical structure characteristic of linguistic relations. I present an eye tracking while reading experiment that investigates whether the sentential position of a potential antecedent modulates the strength of similarity-based interference, a well-studied effect in which increased similarity in features between a target and its competitors results in slower and less accurate retrieval overall. The manipulation trades on an independently established Locality bias in sluiced structures to associate a wh-remnant (which ones) in clausal ellipsis with the most local correlate (some wines), as in The tourists enjoyed some wines, but I don't know which ones. The findings generally support cue-based parsing models of sentence processing that are subject to similarity-based interference in retrieval, and provide additional support to the growing body of evidence that retrieval is sensitive to both the structural position of a target antecedent and its competitors, and the specificity of retrieval cues.

  15. Optimizing top precision performance measure of content-based image retrieval by learning similarity function

    KAUST Repository

    Liang, Ru-Ze

    2017-04-24

    In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, no existing similarity learning method has been proposed to optimize the top precision measure. To fill this gap, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem whose objective function combines the losses of the relevant images ranked behind the top-ranked irrelevant image with the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. Experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when top precision is used as the performance measure.

  16. Optimizing top precision performance measure of content-based image retrieval by learning similarity function

    KAUST Repository

    Liang, Ru-Ze; Shi, Lihui; Wang, Haoxiang; Meng, Jiandong; Wang, Jim Jing-Yan; Sun, Qingquan; Gu, Yi

    2017-01-01

    In this paper we study the problem of content-based image retrieval. In this problem, the most popular performance measure is the top precision measure, and the most important component of a retrieval system is the similarity function used to compare a query image against a database image. However, no existing similarity learning method has been proposed to optimize the top precision measure. To fill this gap, we propose a novel similarity learning method to maximize the top precision measure. We model this problem as a minimization problem whose objective function combines the losses of the relevant images ranked behind the top-ranked irrelevant image with the squared Frobenius norm of the similarity function parameter. This minimization problem is solved as a quadratic programming problem. Experiments over two benchmark data sets show the advantages of the proposed method over other similarity learning methods when top precision is used as the performance measure.

  17. New Architectures for Presenting Search Results Based on Web Search Engines Users Experience

    Science.gov (United States)

    Martinez, F. J.; Pastor, J. A.; Rodriguez, J. V.; Lopez, Rosana; Rodriguez, J. V., Jr.

    2011-01-01

    Introduction: The Internet is a dynamic environment which is continuously being updated. Search engines have been, currently are, and in all probability will continue to be the most popular systems in this information cosmos. Method: In this work, special attention has been paid to the series of changes made to search engines up to this point,…

  18. Neutrosophic Similarity Score Based Weighted Histogram for Robust Mean-Shift Tracking

    Directory of Open Access Journals (Sweden)

    Keli Hu

    2017-10-01

    Full Text Available Visual object tracking is a critical task in computer vision, and tracking an object is challenging; background clutter, for instance, is one of the most difficult problems. The mean-shift tracker is quite popular because of its efficiency and performance in a range of conditions. However, background clutter also degrades its performance. In this article, we propose a novel weighted histogram based on the neutrosophic similarity score to help the mean-shift tracker discriminate the target from the background. The neutrosophic set (NS) is a new branch of philosophy for dealing with incomplete, indeterminate, and inconsistent information. In this paper, we utilize the single valued neutrosophic set (SVNS), a subclass of NS, to improve the mean-shift tracker. First, two criteria are considered: the object feature similarity and the background feature similarity; each bin of the weight histogram is represented in the SVNS domain via three membership functions, T (truth), I (indeterminacy), and F (falsity). Second, the neutrosophic similarity score function is introduced to fuse those two criteria and to build the final weight histogram. Finally, a novel neutrosophic weighted mean-shift tracker is proposed. The proposed tracker is compared with several mean-shift based trackers on a dataset of 61 public sequences. The results reveal that our method outperforms the other trackers, especially when confronting background clutter.

  19. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. In order to reduce the interference from the uncertainty of natural language, a similarity measurement between normal cloud models is adopted for text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently accomplish conversion between qualitative concepts and quantitative data. Through the conversion from a text set to a text information table based on the VSM model, the qualitative text concepts extracted from the same category are jumped up into a whole-category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. Comparison among different text classifiers over different feature selection sets fully proves that not only does CCJU-TC have a strong ability to adapt to different text features, but its classification performance is also better than that of the traditional classifiers.

  20. Gains Based Remedies: the misguided search for a doctrine

    Directory of Open Access Journals (Sweden)

    Tom Stafford

    2016-12-01

    Full Text Available In this article Tom Stafford (Paralegal at Clyde & Co LLP) examines the phenomenon of “Gains Based Remedies”. These are awards that, unlike classical damages awards, which are calculated by reference to the loss suffered by the claimant, correlate to the gain made by the defendant. A couple of common examples include an account of profits for breach of trust claims, or the “disgorgement” damages that were awarded in AG v Blake. These awards are, however, available for a spectrum of varied wrongs. Their seeming lack of unity has often baffled commentators who have tried to search for an underpinning doctrine. One particularly renowned commentary is that of Professor Edelman, who suggests that these wrongs can be understood by being broken down into one of two categories: awards which seek to deter wrongdoing, and awards which reverse a wrongful transfer of value. The purpose of this article is to discuss the flaws of this view of the law, and to suggest that, in fact, any search for a doctrinal underpinning to Gains Based Remedies is misguided. The cases in which these awards are granted have only one feature common to all: the claimant's loss is, for whatever reason, difficult or impossible to assess. For that reason, the courts use the only other measure of the wrong available: the defendant's gain.

  1. A Dynamic Neighborhood Learning-Based Gravitational Search Algorithm.

    Science.gov (United States)

    Zhang, Aizhu; Sun, Genyun; Ren, Jinchang; Li, Xiaodong; Wang, Zhenjie; Jia, Xiuping

    2018-01-01

    Balancing exploration and exploitation according to evolutionary states is crucial to meta-heuristic search (M-HS) algorithms. Owing to its simplicity in theory and effectiveness in global optimization, the gravitational search algorithm (GSA) has attracted increasing attention in recent years. However, the tradeoff between exploration and exploitation in GSA is achieved mainly by adjusting the size of an archive, named Kbest, which stores the superior agents after fitness sorting in each iteration. Since the global property of Kbest remains unchanged throughout the evolutionary process, GSA emphasizes exploitation over exploration and suffers from rapid loss of diversity and premature convergence. To address these problems, we propose a dynamic neighborhood learning (DNL) strategy to replace the Kbest model and thereby present a DNL-based GSA (DNLGSA). The method incorporates local and global neighborhood topologies to enhance exploration and obtain an adaptive balance between exploration and exploitation. The local neighborhoods are dynamically formed based on evolutionary states. To delineate the evolutionary states, two convergence criteria, named limit value and population diversity, are introduced. Moreover, a mutation operator is designed for escaping from local optima on the basis of the evolutionary states. The proposed algorithm was evaluated on 27 benchmark problems with different characteristics and various difficulties. The results reveal that DNLGSA exhibits competitive performance when compared with a variety of state-of-the-art M-HS algorithms. Moreover, the incorporation of the local neighborhood topology reduces the number of calculations of gravitational force and thus alleviates the high computational cost of GSA.

  2. Novel citation-based search method for scientific literature: application to meta-analyses.

    Science.gov (United States)

    Janssens, A Cecile J W; Gwinn, M

    2015-10-13

    Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of co-citation with one or more "known" articles before reviewing their eligibility. In two independent studies, we aimed to reproduce the results of literature searches for sets of published meta-analyses (n = 10 and n = 42). For each meta-analysis, we extracted co-citations for the randomly selected 'known' articles from the Web of Science database, counted their frequencies and screened all articles with a score above a selection threshold. In the second study, we extended the method by retrieving direct citations for all selected articles. In the first study, we retrieved 82% of the studies included in the meta-analyses while screening only 11% as many articles as were screened for the original publications. Articles that we missed were published in non-English languages, published before 1975, published very recently, or available only as conference abstracts. In the second study, we retrieved 79% of included studies while screening half the original number of articles. Citation searching appears to be an efficient and reasonably accurate method for finding articles similar to one or more articles of interest for meta-analysis and reviews.
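
    A small illustration of the ranking step described above, assuming co-citation data has already been fetched into a plain dictionary; the variable names and toy data are hypothetical:

    from collections import Counter

    def rank_by_cocitation(known_articles, citing_map):
        """Rank candidate articles by how often they are cited together
        with the 'known' articles. citing_map: citing paper -> set of
        references it cites (as retrieved from, e.g., Web of Science)."""
        score = Counter()
        known = set(known_articles)
        for refs in citing_map.values():
            if refs & known:                     # this paper co-cites a known one
                for r in refs - known:
                    score[r] += 1
        return score.most_common()               # screen those above a threshold

    citing = {"p1": {"known1", "a", "b"}, "p2": {"known1", "a"}, "p3": {"c", "d"}}
    print(rank_by_cocitation(["known1"], citing))   # 'a' ranks above 'b'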

  3. Learning material recommendation based on case-based reasoning similarity scores

    Science.gov (United States)

    Masood, Mona; Mokmin, Nur Azlina Mohamed

    2017-10-01

    Personalized learning material recommendation is important in any Intelligent Tutoring System (ITS). Case-based Reasoning (CBR) is an artificial intelligence algorithm that has been widely used in the development of ITS applications. This study developed an ITS application that applies the CBR algorithm. The application has the ability to recommend the most suitable learning material to a specific student based on information in the student profile. To test the application's ability to recommend learning material, two versions of the application were created: the first version displayed the most suitable learning material and the second version displayed the least preferable learning material. The results show that the application successfully assigned students to the most suitable learning material.

  4. Research on electric and thermal characteristics of plasma torch based on similarity theory

    International Nuclear Information System (INIS)

    Cheng Changming; Tang Deli; Lan Wei

    2007-01-01

    The configuration and working principle of a DC non-transferred plasma torch are introduced. Based on similarity theory, the connections between the electric and thermal characteristics and operational parameters such as gas flow rate and arc power are investigated. Calculations and experiments are compared, and the results indicate that the calculated results are in agreement with the experimental ones. The formulas can be used for plasma torch improvement and optimization. (authors)

  5. LASSO-ligand activity by surface similarity order: a new tool for ligand based virtual screening.

    Science.gov (United States)

    Reid, Darryl; Sadjad, Bashir S; Zsoldos, Zsolt; Simon, Aniko

    2008-01-01

    Virtual Ligand Screening (VLS) has become an integral part of the drug discovery process for many pharmaceutical companies. Ligand similarity searches provide a very powerful method of screening large databases of ligands to identify possible hits. If these hits belong to new chemotypes the method is deemed even more successful. eHiTS LASSO uses a new interacting surface point types (ISPT) molecular descriptor that is generated from the 3D structure of the ligand but, unlike most 3D descriptors, is conformation independent. Combined with a neural network machine learning technique, LASSO screens molecular databases at an ultra-fast speed of 1 million structures in under 1 min on a standard PC. Results obtained from eHiTS LASSO trained on relatively small training sets of just 2, 4 or 8 actives are presented using the diverse Directory of Useful Decoys (DUD) dataset. It is shown that over a wide range of receptor families, eHiTS LASSO is consistently able to enrich screened databases and provides scaffold hopping ability.

  6. LASSO—ligand activity by surface similarity order: a new tool for ligand based virtual screening

    Science.gov (United States)

    Reid, Darryl; Sadjad, Bashir S.; Zsoldos, Zsolt; Simon, Aniko

    2008-06-01

    Virtual Ligand Screening (VLS) has become an integral part of the drug discovery process for many pharmaceutical companies. Ligand similarity searches provide a very powerful method of screening large databases of ligands to identify possible hits. If these hits belong to new chemotypes the method is deemed even more successful. eHiTS LASSO uses a new interacting surface point types (ISPT) molecular descriptor that is generated from the 3D structure of the ligand but, unlike most 3D descriptors, is conformation independent. Combined with a neural network machine learning technique, LASSO screens molecular databases at an ultra-fast speed of 1 million structures in under 1 min on a standard PC. Results obtained from eHiTS LASSO trained on relatively small training sets of just 2, 4 or 8 actives are presented using the diverse Directory of Useful Decoys (DUD) dataset. It is shown that over a wide range of receptor families, eHiTS LASSO is consistently able to enrich screened databases and provides scaffold hopping ability.

  7. Similar-Case-Based Optimization of Beam Arrangements in Stereotactic Body Radiotherapy for Assisting Treatment Planners

    Directory of Open Access Journals (Sweden)

    Taiki Magome

    2013-01-01

    Full Text Available Objective. To develop a similar-case-based optimization method for beam arrangements in lung stereotactic body radiotherapy (SBRT) to assist treatment planners. Methods. First, cases that are similar to an objective case were automatically selected based on geometrical features related to the planning target volume (PTV): PTV location, PTV shape, lung size, and spinal cord position. Second, initial beam arrangements were determined by registration of similar cases with the objective case using a linear registration technique. Finally, beam directions of the objective case were locally optimized based on the cost function, which takes into account the radiation absorption in normal tissues and organs at risk. The proposed method was evaluated with 10 test cases and a treatment planning database including 81 cases, by using 11 planning evaluation indices such as tumor control probability and normal tissue complication probability (NTCP). Results. The procedure for the local optimization of beam arrangements improved the quality of treatment plans, with significant differences (P < 0.05) in the homogeneity index and conformity index for the PTV, and in V10, V20, mean dose, and NTCP for the lung. Conclusion. The proposed method could be usable as a computer-aided treatment planning tool for the determination of beam arrangements in SBRT.

  8. Dynamics based alignment of proteins: an alternative approach to quantify dynamic similarity

    Directory of Open Access Journals (Sweden)

    Lyngsø Rune

    2010-04-01

    Full Text Available Background The dynamic motions of many proteins are central to their function. It therefore follows that the dynamic requirements of a protein are evolutionarily constrained. In order to assess and quantify this, one needs to compare the dynamic motions of different proteins. Comparing the dynamics of distinct proteins may also provide insight into how protein motions are modified by variations in sequence and, consequently, by structure. The optimal way of comparing complex molecular motions is, however, far from trivial. The majority of comparative molecular dynamics studies performed to date relied upon prior sequence or structural alignment to define which residues were equivalent in 3-dimensional space. Results Here we discuss an alternative methodology for comparative molecular dynamics that does not require any prior alignment information. We show it is possible to align proteins based solely on their dynamics and that we can use these dynamics-based alignments to quantify the dynamic similarity of proteins. Our method was tested on 10 representative members of the PDZ domain family. Conclusions As a result of creating pair-wise dynamics-based alignments of PDZ domains, we have found evolutionarily conserved patterns in their backbone dynamics. The dynamic similarity of PDZ domains is highly correlated with their structural similarity as calculated with Dali. However, significant differences in their dynamics can be detected, indicating that sequence has a more refined role to play in protein dynamics than just dictating the overall fold. We suggest that the method should be generally applicable.

  9. Memoryless cooperative graph search based on the simulated annealing algorithm

    International Nuclear Information System (INIS)

    Hou Jian; Yan Gang-Feng; Fan Zhen

    2011-01-01

    We have studied the problem of reaching a globally optimal segment in a graph-like environment with a single or a group of autonomous mobile agents. Firstly, two efficient simulated-annealing-like algorithms are given for a single agent to solve the problem in a partially known environment and an unknown environment, respectively. We show that under both proposed control strategies, the agent will eventually converge to a globally optimal segment with probability 1. Secondly, we use multi-agent searching to simultaneously reduce the computational complexity and accelerate convergence, based on the algorithms we have given for a single agent. By exploiting graph partitioning, a gossip-consensus-based scheme is presented to update the key parameter, the radius of the graph, ensuring that the agents spend much less time finding a globally optimal segment. (interdisciplinary physics and related areas of science and technology)

  10. Information retrieval for children based on the aggregated search paradigm

    NARCIS (Netherlands)

    Duarte Torres, Sergio

    This report presents research to develop information services for children by expanding and adapting current information retrieval technologies according to the search characteristics and needs of children. Concretely, we employ the aggregated search paradigm as the theoretical framework.

  11. Managing Online Search Statistics with dBASE III Plus.

    Science.gov (United States)

    Speer, Susan C.

    1987-01-01

    Describes a computer program designed to manage statistics about online searches which reports the number of searches by vendor, purpose, and librarian; calculates charges to departments and individuals; and prints monthly invoices to users with standing accounts. (CLB)

  12. Personalized Profile Based Search Interface With Ranked and Clustered Display

    National Research Council Canada - National Science Library

    Kumar, Sachin; Oztekin, B. U; Ertoz, Levent; Singhal, Saurabh; Han, Euihong; Kumar, Vipin

    2001-01-01

    We have developed an experimental meta-search engine, which takes the snippets from traditional search engines and presents them to the user either in the form of clusters, indices or a re-ranked list...

  13. A Two-Stage Composition Method for Danger-Aware Services Based on Context Similarity

    Science.gov (United States)

    Wang, Junbo; Cheng, Zixue; Jing, Lei; Ota, Kaoru; Kansen, Mizuo

    Context-aware systems detect a user's physical and social context based on sensor networks and provide services that adapt to the user accordingly. Representing, detecting, and managing contexts are important issues in context-aware systems. Composition of contexts is a useful method for these tasks, since it can detect a context by automatically composing small pieces of information to discover a service. Danger-aware services are a kind of context-aware service that needs a description of the relations between a user and his/her surrounding objects and between users. However, when the existing composition methods are applied to danger-aware services, they show the following shortcomings: (1) they do not provide an explicit method for representing the composition of multiple users' contexts, and (2) there is no flexible reasoning mechanism based on the similarity of contexts, so they can only provide services that exactly follow the predefined context reasoning rules. Therefore, in this paper, we propose a two-stage composition method based on context similarity to solve the above problems. The first stage is composition of the useful information to represent the context of a single user. The second stage is composition of multiple users' contexts to provide services by considering the relations between users. Finally, the danger degree of the detected context is computed by using the context similarity between the detected context and the predefined context. Context is dynamically represented based on two-stage composition rules and a situation-theory-based ontology, which combines the advantages of ontology and situation theory. We implemented the system in an indoor ubiquitous environment and evaluated it through two experiments with the support of subjects. The experimental results show that the method is effective and that the accuracy of danger detection is acceptable for a danger-aware system.

  14. Novel Agent Based-approach for Industrial Diagnosis: A Combined use Between Case-based Reasoning and Similarity Measure

    Directory of Open Access Journals (Sweden)

    Fatima Zohra Benkaddour

    2016-12-01

    Full Text Available In the spunlace nonwovens industry, the maintenance task is very complex; it requires collaboration between experts and operators. In this paper, we propose a new approach integrating agent-based modelling with case-based reasoning that utilizes similarity measures and a preferences module. The main purpose of our study is to compare and evaluate the most suitable similarity measure for our case. Furthermore, operators, who are usually geographically dispersed, have to collaborate and negotiate to achieve mutual agreements, especially when their proposals (diagnoses) lead to a conflicting situation. The experimentation shows that the suggested agent-based approach is very interesting and efficient for operators and experts who collaborate in the INOTIS enterprise.

  15. Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity

    Directory of Open Access Journals (Sweden)

    Hugo Gonçalo Oliveira

    2018-02-01

    Full Text Available Identifying similar and related words is not only key in natural language understanding but also a suitable task for assessing the quality of computational resources that organise words and meanings of a language, compiled by different means. This paper, which aims to be a reference for those interested in computing word similarity in Portuguese, presents several approaches for this task and is motivated by the recent availability of state-of-the-art distributional models of Portuguese words, which add to several lexical knowledge bases (LKBs) available for this language for a longer time. These resources were exploited to answer word similarity tests, which have also recently become available for Portuguese. We conclude that there are several valid approaches for this task, but no single one outperforms all the others in every test. Distributional models seem to capture relatedness better, while LKBs are better suited to computing genuine similarity; in general, however, better results are obtained when knowledge from different sources is combined.
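
    On the distributional side, the workhorse is cosine similarity between embedding vectors. A toy sketch with three-dimensional vectors standing in for a real Portuguese embedding model:

    import numpy as np

    def cosine(u, v):
        """Distributional word similarity: cosine of the angle between
        two embedding vectors (1 = identical direction)."""
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # toy vectors standing in for a Portuguese word2vec/fastText model
    vinho, uva, carro = (np.array(x, float) for x in
                         ([0.9, 0.1, 0.3], [0.8, 0.2, 0.35], [0.1, 0.9, 0.2]))
    print(cosine(vinho, uva), cosine(vinho, carro))  # related pair scores higher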

  16. Biobotic insect swarm based sensor networks for search and rescue

    Science.gov (United States)

    Bozkurt, Alper; Lobaton, Edgar; Sichitiu, Mihail; Hedrick, Tyson; Latif, Tahmid; Dirafzoon, Alireza; Whitmire, Eric; Verderber, Alexander; Marin, Juan; Xiong, Hong

    2014-06-01

    The potential benefits of distributed robotics systems in applications requiring situational awareness, such as search-and-rescue in emergency situations, are indisputable. The efficiency of such systems requires robotic agents capable of coping with uncertain and dynamic environmental conditions. For example, after an earthquake, a tremendous effort is spent for days to reach surviving victims, and robotic swarms or other distributed robotic systems could play a great role in achieving this faster. However, current technology falls short of offering centimeter-scale mobile agents that can function effectively under such conditions. Insects, the inspiration of many robotic swarms, exhibit an unmatched ability to navigate through such environments while successfully maintaining control and stability. We have benefitted from recent developments in neural engineering and neuromuscular stimulation research to fuse the locomotory advantages of insects with the latest developments in wireless networking technologies, enabling biobotic insect agents to function as search-and-rescue agents. Our research efforts towards this goal include development of biobot electronic backpack technologies, establishment of biobot tracking testbeds to evaluate locomotion control efficiency, investigation of biobotic control strategies with Gromphadorhina portentosa cockroaches and Manduca sexta moths, establishment of a localization and communication infrastructure, modeling and controlling collective motion by learning deterministic and stochastic motion models, topological motion modeling based on these models, and the development of a swarm robotic platform to be used as a testbed for our algorithms.

  17. Neural circuits of eye movements during performance of the visual exploration task, which is similar to the responsive search score task, in schizophrenia patients and normal subjects

    International Nuclear Information System (INIS)

    Nemoto, Yasundo; Matsuda, Tetsuya; Matsuura, Masato

    2004-01-01

    Abnormal exploratory eye movements have been studied as a biological marker for schizophrenia. Using functional MRI (fMRI), we investigated the brain activations of 12 healthy and 8 schizophrenic subjects during performance of a visual exploration task similar to the responsive search score task, to clarify the neural basis of the abnormal exploratory eye movements. Performance data, such as the number of eye movements, the reaction time, and the percentage of correct answers, showed no significant differences between the two groups. Only the normal subjects showed activations at the bilateral thalamus and the left anterior medial frontal cortex during the visual exploration tasks. In contrast, only the schizophrenic subjects showed activations at the right anterior cingulate gyrus during the same tasks. The activation of different locations in the two groups, the left anterior medial frontal cortex in normal subjects and the right anterior cingulate gyrus in schizophrenic subjects, is explained by the features of the visual tasks. Hypoactivation of the bilateral thalamus supports a dysfunctional filtering theory of schizophrenia. (author)

  18. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

    Science.gov (United States)

    Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

    2018-05-15

    Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment-based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistic. Compared to other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating phylogenetic trees, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset sizes. The K2 and K2* approaches are implemented as an R package and are freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). yueljiang@163.com. Supplementary data are available at Bioinformatics online.
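
    The record does not spell out K2's internals, so the sketch below only illustrates the general recipe of Kendall-based alignment-free comparison: build k-mer frequency profiles and correlate them with Kendall's tau (via SciPy); the function names are hypothetical:

    from itertools import product
    from scipy.stats import kendalltau

    def kmer_profile(seq, k=3):
        """Frequency of every k-mer over the DNA alphabet, in a fixed order."""
        counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in counts:
                counts[kmer] += 1
        return [counts[m] for m in sorted(counts)]

    def kendall_similarity(s1, s2, k=3):
        """Alignment-free similarity: Kendall correlation between the
        k-mer frequency profiles of the two sequences."""
        tau, _ = kendalltau(kmer_profile(s1, k), kmer_profile(s2, k))
        return tau

    print(kendall_similarity("ACGTACGTACGT", "ACGTACGAACGT"))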

  19. Patch Similarity Modulus and Difference Curvature Based Fourth-Order Partial Differential Equation for Image Denoising

    Directory of Open Access Journals (Sweden)

    Yunjiao Bai

    2015-01-01

    Full Text Available The traditional fourth-order nonlinear diffusion denoising model suffers from isolated speckles and the loss of fine details in the processed image. For this reason, a new fourth-order partial differential equation based on the patch similarity modulus and the difference curvature is proposed for image denoising. First, based on the intensity similarity of neighboring pixels, this paper presents a new edge indicator called the patch similarity modulus, which is strongly robust to noise. Furthermore, the difference curvature, which can effectively distinguish between edges and noise, is incorporated into the denoising algorithm to steer the diffusion process by adaptively adjusting the size of the diffusion coefficient. The experimental results show that the proposed algorithm can not only preserve edges and texture details but also avoid isolated speckles and the staircase effect while filtering out noise, and it performs better on images with abundant details. Additionally, the subjective visual quality and objective evaluation indices of the denoised images obtained by the proposed algorithm are higher than those of the related methods.

  20. Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.

    Science.gov (United States)

    Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan

    2015-11-01

    Wireless sensor networks are engaged in various data-gathering applications. The major bottleneck in wireless data-gathering systems is the finite energy of the sensor nodes; by conserving the on-board energy, the life span of a wireless sensor network can be extended considerably. Since data communication is the dominant energy-consuming activity of a wireless sensor network, data reduction serves best in conserving nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce data communication. Forming clusters of nodes with similar data is an effective way to exploit spatial correlation among neighboring sensors, while sending only a subset of the data and estimating the rest from it is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data-similar iso-clusters with minimal communication overhead. Intra-cluster communication is reduced using an adaptive normalized-least-mean-squares-based dual prediction framework, and the cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both intra-cluster and inter-cluster communication while maintaining optimal accuracy of the collected data. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
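
    A sketch of the dual-prediction idea under stated assumptions: a scalar sensor stream and a plain NLMS filter; the tolerance and step size are illustrative, not the paper's values. The node transmits only when the shared predictor misses by more than the tolerance, so both ends stay synchronized:

    import numpy as np

    def nlms_dual_prediction(samples, order=4, mu=0.5, tol=0.5, eps=1e-8):
        """Node and cluster head run identical NLMS predictors; the node
        transmits a reading only when the shared prediction misses it by
        more than `tol`. Returns the number of transmissions."""
        w = np.zeros(order)                    # shared filter weights
        history = np.zeros(order)              # last `order` readings
        transmitted = 0
        for x in samples:
            pred = w @ history
            if abs(x - pred) > tol:            # prediction failed: send real value
                transmitted += 1
                err = x - pred                 # both sides update identically
                w += mu * err * history / (history @ history + eps)
                history = np.roll(history, 1); history[0] = x
            else:                              # receiver uses pred; stay in sync
                history = np.roll(history, 1); history[0] = pred
        return transmitted

    readings = 20 + np.sin(np.linspace(0, 6 * np.pi, 200)) * 2
    print(nlms_dual_prediction(readings), "of", len(readings), "samples sent")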

  1. Similarities and differences in coatings for magnesium-based stents and orthopaedic implants

    Directory of Open Access Journals (Sweden)

    Jun Ma

    2014-07-01

    Full Text Available Magnesium (Mg)-based biodegradable materials are promising candidates for the new generation of implantable medical devices, particularly cardiovascular stents and orthopaedic implants. Mg-based cardiovascular stents represent the most innovative stent technology to date. However, these products still do not fully meet clinical requirements, owing to fast degradation rates, late restenosis, and thrombosis. Thus, various surface coatings have been introduced to protect Mg-based stents from rapid corrosion and to improve biocompatibility. Similarly, different coatings have been used for orthopaedic implants, e.g., plates and pins for bone fracture fixation or interference screws for tendon-bone or ligament-bone insertion, to improve biocompatibility and corrosion resistance. Metal coatings, nanoporous inorganic coatings and permanent polymers have been proven to enhance corrosion resistance; however, inflammation and foreign body reactions have also been reported. By contrast, biodegradable polymers are more biocompatible in general and are favoured over permanent materials. Drugs are also loaded into biodegradable polymers to improve their performance. The key similarities and differences in coatings for Mg-based stents and orthopaedic implants are summarized.

  2. Neural Substrates of Similarity and Rule-based Strategies in Judgment

    Directory of Open Access Journals (Sweden)

    Bettina von Helversen

    2014-10-01

    Full Text Available Making accurate judgments is a core human competence and a prerequisite for success in many areas of life. Plenty of evidence exists that people can employ different judgment strategies to solve identical judgment problems. In categorization, it has been demonstrated that similarity-based and rule-based strategies are associated with activity in different brain regions. Building on this research, the present work tests whether solving two identical judgment problems recruits different neural substrates depending on people's judgment strategies. Combining cognitive modeling of judgment strategies at the behavioral level with functional magnetic resonance imaging (fMRI, we compare brain activity when using two archetypal judgment strategies: a similarity-based exemplar strategy and a rule-based heuristic strategy. Using an exemplar-based strategy should recruit areas involved in long-term memory processes to a larger extent than a heuristic strategy. In contrast, using a heuristic strategy should recruit areas involved in the application of rules to a larger extent than an exemplar-based strategy. Largely consistent with our hypotheses, we found that using an exemplar-based strategy led to relatively higher BOLD activity in the anterior prefrontal and inferior parietal cortex, presumably related to retrieval and selective attention processes. In contrast, using a heuristic strategy led to relatively higher activity in areas in the dorsolateral prefrontal and the temporal-parietal cortex associated with cognitive control and information integration. Thus, even when people solve identical judgment problems, different neural substrates can be recruited depending on the judgment strategy involved.

  3. Reducing 4D CT artifacts using optimized sorting based on anatomic similarity.

    Science.gov (United States)

    Johnston, Eric; Diehn, Maximilian; Murphy, James D; Loo, Billy W; Maxim, Peter G

    2011-05-01

    Four-dimensional (4D) computed tomography (CT) has been widely used as a tool to characterize respiratory motion in radiotherapy. The two most commonly used 4D CT algorithms sort images by the associated respiratory phase or displacement into a predefined number of bins, and are prone to image artifacts at transitions between bed positions. The purpose of this work is to demonstrate a method of reducing motion artifacts in 4D CT by incorporating anatomic similarity into phase or displacement based sorting protocols. Ten patient datasets were retrospectively sorted using both the displacement and phase based sorting algorithms. Conventional sorting methods allow selection of only the nearest-neighbor image in time or displacement within each bin. In our method, for each bed position either the displacement or the phase defines the center of a bin range about which several candidate images are selected. The two-dimensional correlation coefficients between slices bordering the interface between adjacent couch positions are then calculated for all candidate pairings. Two slices have a high correlation if they are anatomically similar. Candidates from each bin are then selected to maximize the slice correlation over the entire data set using Dijkstra's shortest-path algorithm. To assess the reduction of artifacts, two thoracic radiation oncologists independently compared the resorted 4D datasets pairwise with conventionally sorted datasets, blinded to the sorting method, to choose which had the least motion artifacts. Agreement between reviewers was evaluated using the weighted kappa score. Anatomically based image selection resulted in 4D CT datasets with significantly reduced motion artifacts with both displacement (P = 0.0063) and phase sorting (P = 0.00022). There was good agreement between the two reviewers, with complete agreement 34 times and complete disagreement 6 times. Optimized sorting using anatomic similarity significantly reduces 4D CT motion artifacts.
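
    The record cites Dijkstra's algorithm; on the layered candidate graph described above, a stage-by-stage dynamic program computes the same shortest path. A simplified sketch in which each candidate is reduced to a single boundary slice per interface (real 4D CT bookkeeping is more involved):

    import numpy as np

    def corr(slice_a, slice_b):
        """2D correlation coefficient between two boundary slices."""
        return float(np.corrcoef(slice_a.ravel(), slice_b.ravel())[0, 1])

    def select_candidates(stages):
        """stages[i] = list of candidate boundary slices for couch position i.
        Pick one candidate per position so the summed correlation across
        adjacent interfaces is maximal (shortest path with cost 1 - corr)."""
        n = len(stages)
        cost = [np.zeros(len(s)) for s in stages]
        back = [np.zeros(len(s), int) for s in stages]
        for i in range(1, n):
            for j, cand in enumerate(stages[i]):
                edges = [cost[i - 1][k] + (1 - corr(prev, cand))
                         for k, prev in enumerate(stages[i - 1])]
                back[i][j] = int(np.argmin(edges))
                cost[i][j] = min(edges)
        path = [int(np.argmin(cost[-1]))]       # cheapest final candidate
        for i in range(n - 1, 0, -1):           # trace predecessors back
            path.append(int(back[i][path[-1]]))
        return path[::-1]                       # chosen candidate per position

    rng = np.random.default_rng(0)
    stages = [[rng.random((8, 8)) for _ in range(3)] for _ in range(4)]
    print(select_candidates(stages))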

  4. Linear-fitting-based similarity coefficient map for tissue dissimilarity analysis in T2*-w magnetic resonance imaging

    International Nuclear Information System (INIS)

    Yu Shao-De; Wu Shi-Bin; Xie Yao-Qin; Wang Hao-Yu; Wei Xin-Hua; Chen Xin; Pan Wan-Long; Hu Jiani

    2015-01-01

    Similarity coefficient mapping (SCM) aims to improve the morphological evaluation of T2*-weighted magnetic resonance imaging (T2*-w MRI). However, how to interpret the generated SCM map is still an open question. Moreover, is it possible to extract tissue dissimilarity information based on the theory behind SCM? The primary purpose of this paper is to address these two questions. First, the theory of SCM was interpreted from the perspective of linear fitting. Then, a term was embedded for tissue dissimilarity information. Finally, our method was validated with sixteen human brain image series from multi-echo T2*-w MRI. The generated maps were investigated in terms of signal-to-noise ratio (SNR) and perceived visual quality, and then interpreted from intra- and inter-tissue intensity. Experimental results show that both the perceptibility of anatomical structures and the tissue contrast are improved. More importantly, tissue similarity or dissimilarity can be quantified and cross-validated from pixel intensity analysis. This method benefits image enhancement, tissue classification, malformation detection and morphological evaluation. (paper)

  5. Generalized sample entropy analysis for traffic signals based on similarity measure

    Science.gov (United States)

    Shang, Du; Xu, Mengjia; Shang, Pengjian

    2017-05-01

    Sample entropy is a prevailing method used to quantify the complexity of a time series. In this paper, a modified method of generalized sample entropy combined with surrogate data analysis is proposed as a new measure to assess the complexity of a complex dynamical system such as traffic signals. The method, based on a similarity distance, presents a different way of matching signal patterns and reveals distinct complexity behaviors. Simulations are conducted on synthetic data and traffic signals to provide a comparative study showing the power of the new method. Compared with previous sample entropy and surrogate data analysis, the new method has two main advantages. The first is that it overcomes the limitation on the relationship between the dimension parameter and the length of the series. The second is that the modified sample entropy functions can be used to quantitatively distinguish time series from different complex systems by the similarity measure.
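
    The generalized, similarity-distance variant is specific to the paper, but the baseline it modifies is the classic sample entropy. A sketch of that baseline:

    import numpy as np

    def sample_entropy(x, m=2, r=None):
        """Classic sample entropy: -log of the ratio of (m+1)-point to
        m-point template matches within tolerance r (Chebyshev distance).
        Lower values indicate a more regular signal."""
        x = np.asarray(x, float)
        if r is None:
            r = 0.2 * x.std()
        def count_matches(m):
            templ = np.array([x[i:i + m] for i in range(len(x) - m)])
            total = 0
            for i in range(len(templ)):        # exclude self-matches via i+1:
                d = np.max(np.abs(templ[i + 1:] - templ[i]), axis=1)
                total += int(np.sum(d <= r))
            return total
        a, b = count_matches(m + 1), count_matches(m)
        return -np.log(a / b) if a > 0 and b > 0 else np.inf

    rng = np.random.default_rng(1)
    print(sample_entropy(np.sin(np.arange(300) * 0.1)))   # regular -> low
    print(sample_entropy(rng.normal(size=300)))           # noisy  -> high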

  6. Dimensionality Reduction of Hyperspectral Image with Graph-Based Discriminant Analysis Considering Spectral Similarity

    Directory of Open Access Journals (Sweden)

    Fubiao Feng

    2017-03-01

    Full Text Available Recently, graph embedding has drawn great attention for dimensionality reduction in hyperspectral imagery. For example, locality preserving projection (LPP) utilizes a typical Euclidean distance in a heat kernel to create an affinity matrix and projects the high-dimensional data into a lower-dimensional space. However, the Euclidean distance is not sufficiently correlated with the intrinsic spectral variation of a material, which may result in an inappropriate graph representation. In this work, a graph-based discriminant analysis with spectral similarity (denoted GDA-SS) measurement is proposed, which fully considers how spectral curves change across bands. Experimental results based on real hyperspectral images demonstrate that the proposed method is superior to traditional methods, such as supervised LPP, and the state-of-the-art sparse graph-based discriminant analysis (SGDA).

  7. Permutation based decision making under fuzzy environment using Tabu search

    Directory of Open Access Journals (Sweden)

    Mahdi Bashiri

    2012-04-01

    Full Text Available One of the techniques used for Multiple Criteria Decision Making (MCDM) is permutation. In the classical form of permutation, it is assumed that the weights and decision matrix components are crisp. However, when group decision making is under consideration and the decision makers cannot agree on crisp values for the weights and decision matrix components, fuzzy numbers should be used. In this article, the fuzzy permutation technique for MCDM problems is explained. The main deficiency of permutation is its long computational time, so a Tabu Search (TS) based algorithm is proposed to reduce it. A numerical example illustrates the proposed approach clearly. Then, some benchmark instances extracted from the literature are solved by the proposed TS. The analyses of the results show the proper performance of the proposed method.

  8. Quality assessment of protein model-structures based on structural and functional similarities.

    Science.gov (United States)

    Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata

    2012-09-21

    Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with this problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model it from a sequence or partial structural information, e.g. contact maps. Consequently, biologists have the ability to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA (Gene Ontology-Based Assessment) is a novel protein model quality assessment program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high-quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by the Gene Ontology, combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures: one is a single-model quality assessment method, the other a modification of it that provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8.

  9. Collaborative Filtering Recommendation Based on Trust Model with Fused Similar Factor

    Directory of Open Access Journals (Sweden)

    Ye Li

    2017-01-01

    Full Text Available Recommender systems are beneficial to e-commerce sites, providing customers with product information and recommendations; recommendation systems are currently widely used in many fields. In an era of information explosion, the key challenge for a recommender system is to obtain valid information from the tremendous amount available and produce high-quality recommendations. However, when facing a large amount of information, the traditional collaborative filtering algorithm usually suffers from a high degree of sparseness, which ultimately leads to low-accuracy recommendations. To tackle this issue, we propose a novel algorithm named Collaborative Filtering Recommendation Based on Trust Model with Fused Similar Factor, which is based on the trust model and combined with user similarity. The novel algorithm takes into account the degree of interest overlap between two users and outperforms recommendation based on the trust model alone on the criteria of precision, recall, diversity and coverage. Additionally, the proposed model can effectively improve the efficiency of the collaborative filtering algorithm and achieve high performance.
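
    The trust-fusion part is specific to the paper; the sketch below shows only the user-similarity half of such a recommender, i.e., plain cosine-similarity collaborative filtering on a toy rating matrix:

    import numpy as np

    def user_similarity(R):
        """Cosine similarity between users' rating vectors (0 = unrated)."""
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        return (R / norms) @ (R / norms).T

    def predict(R, user, item, k=2):
        """Predict a rating as the similarity-weighted mean of the k most
        similar users who rated the item."""
        S = user_similarity(R)
        raters = [u for u in range(len(R)) if u != user and R[u, item] > 0]
        raters.sort(key=lambda u: -S[user, u])
        top = raters[:k]
        if not top:
            return 0.0
        w = np.array([S[user, u] for u in top])
        return float(w @ R[top, item] / w.sum())

    R = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 5, 4],
                  [0, 1, 5, 4]], float)
    print(predict(R, user=1, item=1))   # leans on user 0, the most similar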

  10. Molecular basis sets - a general similarity-based approach for representing chemical spaces.

    Science.gov (United States)

    Raghavendra, Akshay S; Maggiora, Gerald M

    2007-01-01

    A new method, based on generalized Fourier analysis, is described that utilizes the concept of "molecular basis sets" to represent chemical space within an abstract vector space. The basis vectors in this space are abstract molecular vectors. Inner products among the basis vectors are determined using an ansatz that associates molecular similarities between pairs of molecules with their corresponding inner products. Moreover, the fact that similarities between pairs of molecules are, in essentially all cases, nonzero implies that the abstract molecular basis vectors are nonorthogonal; but since the similarity of a molecule with itself is unity, the molecular vectors are normalized to unity. A symmetric orthogonalization procedure, which optimally preserves the character of the original set of molecular basis vectors, is used to construct appropriate orthonormal basis sets. Molecules can then be represented, in general, by sets of orthonormal "molecule-like" basis vectors within a proper Euclidean vector space. However, the dimension of the space can become quite large. Thus, the work presented here assesses the effect of basis set size on a number of properties, including the average squared error and average norm of molecular vectors represented in the space; the results clearly show the expected reduction in average squared error and increase in average norm as the basis set size is increased. Several distance-based statistics are also considered. These include the distribution of distances and their differences with respect to basis sets of differing size, and several comparative distance measures such as Spearman rank correlation and Kruskal stress. All of the measures show that, even though the dimension can be high, the chemical spaces they represent nonetheless behave in a well-controlled and reasonable manner. Other abstract vector spaces analogous to that described here can also be constructed, provided that the appropriate inner products can be directly computed.
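
    The symmetric orthogonalization step has a compact linear-algebra form: with S the matrix of pairwise inner products (similarities), the transformation S^(-1/2) yields the orthonormal basis closest to the original vectors in a least-squares sense. A sketch with a hypothetical 3x3 similarity matrix:

    import numpy as np

    def symmetric_orthogonalize(S):
        """Lowdin orthogonalization: S^(-1/2) maps the nonorthogonal
        molecular basis vectors to the nearest orthonormal set."""
        vals, vecs = np.linalg.eigh(S)
        vals = np.clip(vals, 1e-10, None)         # guard small/negative modes
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    # hypothetical pairwise similarities among 3 molecules (unit diagonal)
    S = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.4],
                  [0.2, 0.4, 1.0]])
    X = symmetric_orthogonalize(S)
    print(np.round(X @ S @ X, 3))                 # identity: new basis orthonormal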

  11. Proceedings of the ECIR 2012 Workshop on Task-Based and Aggregated Search (TBAS2012)

    DEFF Research Database (Denmark)

    2012-01-01

    Task-based search aims to understand the user's current task and desired outcomes, and how this may provide useful context for the Information Retrieval (IR) process. An example of task-based search is situations where additional user information on e.g. the purpose of the search or what the user...

  12. A Systematic Understanding of Successful Web Searches in Information-Based Tasks

    Science.gov (United States)

    Zhou, Mingming

    2013-01-01

    The purpose of this study is to research how Chinese university students solve information-based problems. With the Search Performance Index as the measure of search success, participants were divided into high, medium and low-performing groups. Based on their web search logs, these three groups were compared along five dimensions of the search…

  13. Odour-based discrimination of similarity at the major histocompatibility complex in birds.

    Science.gov (United States)

    Leclaire, Sarah; Strandh, Maria; Mardon, Jérôme; Westerdahl, Helena; Bonadonna, Francesco

    2017-01-11

    Many animals are known to preferentially mate with partners that are dissimilar at the major histocompatibility complex (MHC) in order to maximize the antigen binding repertoire (or disease resistance) in their offspring. Although several mammals, fish and lizards use odour cues to assess MHC similarity with potential partners, the ability of birds to assess MHC similarity using olfactory cues has not yet been explored. Here we used a behavioural binary choice test and high-throughput sequencing of MHC class IIB to determine whether blue petrels can discriminate MHC similarity based on odour cues alone. Blue petrels are seabirds with a particularly good sense of smell; they have reciprocal mate choice and are known to preferentially mate with MHC-dissimilar partners. Incubating males preferentially approached the odour of the more MHC-dissimilar female, whereas incubating females showed the opposite preference. Given their mating pattern, females were, however, expected to show a preference for the odour of the more MHC-dissimilar male. Further studies are needed to determine whether, as in women and female mice, the preference varies with the reproductive cycle in blue petrel females. Our results provide the first evidence that birds can use odour cues alone to assess MHC dissimilarity. © 2017 The Author(s).

  14. A Segment-Based Trajectory Similarity Measure in the Urban Transportation Systems.

    Science.gov (United States)

    Mao, Yingchi; Zhong, Haishi; Xiao, Xianjian; Li, Xiaofang

    2017-03-06

    With the rapid spread of built-in GPS handheld smart devices, the trajectory data from GPS sensors has grown explosively. Trajectory data has spatio-temporal characteristics and rich information. Trajectory data processing techniques can mine the patterns of human activities and the moving patterns of vehicles in intelligent transportation systems. A trajectory similarity measure is one of the most important issues in trajectory data mining (clustering, classification, frequent pattern mining, etc.). Unfortunately, the main similarity measure algorithms for trajectory data have been found to be inaccurate, highly sensitive to sampling methods, and of low robustness to noisy data. To solve these problems, three distances and their corresponding computation methods are proposed in this paper. The point-segment distance decreases the sensitivity to point sampling methods. The prediction distance optimizes the temporal distance using the features of trajectory data. The segment-segment distance introduces a trajectory shape factor into the similarity measurement to improve accuracy. The three kinds of distance are integrated with the traditional dynamic time warping (DTW) algorithm to propose a new segment-based dynamic time warping algorithm (SDTW). The experimental results show that the SDTW algorithm exhibits about 57%, 86%, and 31% better accuracy than the longest common subsequence (LCSS) algorithm, the edit distance on real sequence (EDR) algorithm, and DTW, respectively, and that its sensitivity to noisy data is lower than that of those algorithms.
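
    For reference, the baseline that SDTW extends is classic dynamic time warping. A minimal DTW over 2D point sequences is sketched below; the paper's point-segment, prediction, and segment-segment distances are not reproduced here.

      import numpy as np

      def dtw(a, b):
          # Classic dynamic time warping distance between two trajectories,
          # each given as a sequence of (x, y) points.
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  cost = np.hypot(a[i - 1][0] - b[j - 1][0],
                                  a[i - 1][1] - b[j - 1][1])
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m]

      t1 = [(0, 0), (1, 1), (2, 2), (3, 3)]
      t2 = [(0, 0), (1, 2), (2, 3), (4, 4)]
      print(dtw(t1, t2))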

  15. Effect of similar elements on improving glass-forming ability of La-Ce-based alloys

    International Nuclear Information System (INIS)

    Zhang Tao; Li Ran; Pang Shujie

    2009-01-01

    To date, the effect of unlike component elements on the glass-forming ability (GFA) of alloys has been studied extensively, and it is generally recognized that the main constituent elements of alloys with high GFA usually differ greatly in atomic size and atomic interaction (large negative heat of mixing). In our recent work, a series of rare earth metal-based alloy compositions with superior GFA were found through the approach of coexisting similar constituent elements. The quinary (La0.5Ce0.5)65Al10(Co0.6Cu0.4)25 bulk metallic glass (BMG) was synthesized by tilt-pour casting in rod form with a diameter of up to 32 mm; its glass-forming ability is significantly higher than that of ternary Ln-Al-TM alloys (Ln = La or Ce; TM = Co or Cu), whose critical diameters for glass formation are several millimeters. We suggest that the strong frustration of crystallization achieved by using the coexistence of La-Ce and Co-Cu to complicate the competing crystalline phases helps to construct BMG compositions with superior GFA. The results of our present work indicate that similar elements (elements with similar atomic size and chemical properties) have a significant effect on the GFA of alloys.

  16. Face Recognition Performance Improvement using a Similarity Score of Feature Vectors based on Probabilistic Histograms

    Directory of Open Access Journals (Sweden)

    SRIKOTE, G.

    2016-08-01

    Full Text Available This paper proposes an improved face recognition algorithm to identify mismatched face pairs in cases of incorrect decisions. The primary feature of this method is to deploy a similarity score with respect to Gaussian components between two previously unseen faces. Unlike conventional vector distance measurement, our algorithm also considers the plot of the summation of the similarity index versus face feature vector distance. A mixture of Gaussian models of labeled faces is also widely applicable to different biometric system parameters. Comparative evaluations show that the efficiency of the proposed algorithm is superior to that of the conventional algorithm by an average accuracy of up to 1.15% and 16.87% when compared with 3x3 Multi-Region Histogram (MRH) direct-bag-of-features and Principal Component Analysis (PCA)-based face recognition systems, respectively. The experimental results show that the similarity score is more discriminative for face recognition than feature distance. Experimental results on the Labeled Faces in the Wild (LFW) data set demonstrate that our algorithm is suitable for real-world probe-to-gallery identification in face recognition systems. Moreover, the proposed method can also be applied to other recognition systems, where it additionally improves recognition scores.

  17. Uncovering highly obfuscated plagiarism cases using fuzzy semantic-based similarity model

    Directory of Open Access Journals (Sweden)

    Salha M. Alzahrani

    2015-07-01

    Full Text Available Highly obfuscated plagiarism cases contain unseen and obfuscated texts, which pose difficulties for existing plagiarism detection methods. A fuzzy semantic-based similarity model for uncovering obfuscated plagiarism is presented and compared with five state-of-the-art baselines. Semantic relatedness between words is studied based on part-of-speech (POS) tags and WordNet-based similarity measures. Fuzzy rules are introduced to assess the semantic distance between source and suspicious texts of short length, implementing the semantic relatedness between words as a membership function of a fuzzy set. In order to minimize the number of false positives and false negatives, a learning method that combines a permission threshold and a variation threshold is used to decide true plagiarism cases. The proposed model and the baselines are evaluated on 99,033 ground-truth annotated cases extracted from different datasets, including 11,621 (11.7%) handmade paraphrases, 54,815 (55.4%) artificial plagiarism cases, and 32,578 (32.9%) plagiarism-free cases. We conduct extensive experimental verifications, including a study of the effects of different segmentation schemes and parameter settings. Results are assessed using precision, recall, F-measure and granularity on stratified 10-fold cross-validation data. Statistical analysis using paired t-tests shows that the proposed approach's improvement over the baselines is statistically significant, which demonstrates the competence of the fuzzy semantic-based model in detecting plagiarism cases beyond literal plagiarism. Additionally, an analysis of variance (ANOVA) test shows the effectiveness of the different segmentation schemes used with the proposed approach.
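
    A minimal sketch of the core ingredient, assuming NLTK's WordNet interface: pairwise word relatedness in [0, 1] used as a fuzzy membership degree and aggregated over a text pair. The aggregation rule is deliberately simple and illustrative, not the paper's fuzzy rule base or thresholds.

      # Requires: pip install nltk, then nltk.download('wordnet') once.
      from nltk.corpus import wordnet as wn

      def word_relatedness(w1, w2):
          # Best WordNet path similarity over all synset pairs, in [0, 1];
          # used directly as a fuzzy membership degree.
          scores = [s1.path_similarity(s2) or 0.0
                    for s1 in wn.synsets(w1) for s2 in wn.synsets(w2)]
          return max(scores, default=0.0)

      def fuzzy_text_similarity(source_words, suspicious_words):
          # For each suspicious word, take its best match in the source text,
          # then average the memberships (one simple aggregation choice).
          memberships = [max((word_relatedness(s, w) for s in source_words),
                             default=0.0)
                         for w in suspicious_words]
          return sum(memberships) / len(memberships) if memberships else 0.0

      print(fuzzy_text_similarity(["car", "fast"], ["automobile", "quick"]))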

  18. Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science

    Science.gov (United States)

    Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.

    2006-12-01

    The goal for search engines is to return results that are both accurate and complete: a search engine should find only what you really want, and find everything you really want. Search engines (even meta search engines) lack semantics. Search rests simply on string matching between the user's query term and the resource database, and the semantics associated with the search string are not captured. For example, if an atmospheric scientist searches for "pressure"-related web resources, most search engines return inaccurate results such as web resources related to blood pressure. This presentation describes Noesis, a meta-search engine and resource aggregator that uses domain ontologies to provide scoped search capabilities. Noesis uses domain ontologies to help the user scope the search query, ensuring that the search results are both accurate and complete. The domain ontologies guide the user in refining the search query, reducing the burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator: it categorizes search results from different online resources, such as education materials, publications, datasets and web search engines, that might be of interest to the user.

  19. Part-based deep representation for product tagging and search

    Science.gov (United States)

    Chen, Keqing

    2017-06-01

    Despite previous studies, tagging and indexing product images remain challenging due to the large inner-class variation of the products. In traditional methods, quantized hand-crafted features such as SIFTs are extracted as the representation of the product images, which are not discriminative enough to handle the inner-class variation. For discriminative image representation, this paper first presents a novel deep convolutional neural network (DCNN) architecture pre-trained on a large-scale general image dataset. Compared to traditional features, our DCNN representation offers more discriminative power with fewer dimensions. Moreover, we incorporate a part-based model into the framework to overcome the negative effects of bad alignment and cluttered background, further enhancing the descriptive ability of the deep representation. Finally, we collect and contribute a well-labeled shoe image database, the TBShoes, on which we apply the part-based deep representation for product image tagging and search, respectively. The experimental results highlight the advantages of the proposed part-based deep representation.

  20. PTree: pattern-based, stochastic search for maximum parsimony phylogenies

    Directory of Open Access Journals (Sweden)

    Ivan Gregor

    2013-06-01

    Full Text Available Phylogenetic reconstruction is vital to analyzing the evolutionary relationship of genes within and across populations of different species. Nowadays, with next generation sequencing technologies producing sets comprising thousands of sequences, robust identification of the tree topology, which is optimal according to standard criteria such as maximum parsimony, maximum likelihood or posterior probability, with phylogenetic inference methods is a computationally very demanding task. Here, we describe a stochastic search method for a maximum parsimony tree, implemented in a software package we named PTree. Our method is based on a new pattern-based technique that enables us to infer intermediate sequences efficiently where the incorporation of these sequences in the current tree topology yields a phylogenetic tree with a lower cost. Evaluation across multiple datasets showed that our method is comparable to the algorithms implemented in PAUP* or TNT, which are widely used by the bioinformatics community, in terms of topological accuracy and runtime. We show that our method can process large-scale datasets of 1,000–8,000 sequences. We believe that our novel pattern-based method enriches the current set of tools and methods for phylogenetic tree inference. The software is available under: http://algbio.cs.uni-duesseldorf.de/webapps/wa-download/.

  1. PTree: pattern-based, stochastic search for maximum parsimony phylogenies.

    Science.gov (United States)

    Gregor, Ivan; Steinbrück, Lars; McHardy, Alice C

    2013-01-01

    Phylogenetic reconstruction is vital to analyzing the evolutionary relationship of genes within and across populations of different species. Nowadays, with next generation sequencing technologies producing sets comprising thousands of sequences, robust identification of the tree topology, which is optimal according to standard criteria such as maximum parsimony, maximum likelihood or posterior probability, with phylogenetic inference methods is a computationally very demanding task. Here, we describe a stochastic search method for a maximum parsimony tree, implemented in a software package we named PTree. Our method is based on a new pattern-based technique that enables us to infer intermediate sequences efficiently where the incorporation of these sequences in the current tree topology yields a phylogenetic tree with a lower cost. Evaluation across multiple datasets showed that our method is comparable to the algorithms implemented in PAUP* or TNT, which are widely used by the bioinformatics community, in terms of topological accuracy and runtime. We show that our method can process large-scale datasets of 1,000-8,000 sequences. We believe that our novel pattern-based method enriches the current set of tools and methods for phylogenetic tree inference. The software is available under: http://algbio.cs.uni-duesseldorf.de/webapps/wa-download/.
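
    PTree's pattern-based inference of intermediate sequences is not detailed in the record, but the quantity it optimizes is. The sketch below is the standard Fitch algorithm, which scores one character on a fixed rooted binary tree under maximum parsimony (tree shape and states are toy data):

      def fitch_score(tree, leaf_states):
          # Fitch parsimony: nested tuples encode the tree, leaves are names.
          def visit(node):
              if isinstance(node, str):  # leaf: its observed state set
                  return {leaf_states[node]}, 0
              left, right = node
              ls, lc = visit(left)
              rs, rc = visit(right)
              inter = ls & rs
              if inter:  # child state sets agree: no substitution charged
                  return inter, lc + rc
              return ls | rs, lc + rc + 1  # disagreement costs one change

          return visit(tree)[1]

      tree = (("A", "B"), ("C", "D"))
      states = {"A": "G", "B": "G", "C": "T", "D": "G"}
      print(fitch_score(tree, states))  # 1 substitution suffices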

  2. Developing a Grid-based search and categorization tool

    CERN Document Server

    Haya, Glenn; Vigen, Jens

    2003-01-01

    Grid technology has the potential to improve the accessibility of digital libraries. The participants in Project GRACE (Grid Search And Categorization Engine) are in the process of developing a search engine that will allow users to search through heterogeneous resources stored in geographically distributed digital collections. What differentiates this project from current search tools is that GRACE will be run on the European Data Grid, a large distributed network, and will not have a single centralized index as current web search engines do. In some cases, the distributed approach offers advantages over the centralized approach since it is more scalable, can be used on otherwise inaccessible material, and can provide advanced search options customized for each data source.

  3. LDA-Based Unified Topic Modeling for Similar TV User Grouping and TV Program Recommendation.

    Science.gov (United States)

    Pyo, Shinjee; Kim, Eunhui; Kim, Munchurl

    2015-08-01

    Social TV is a social media service via TV and social networks through which TV users exchange their experiences about TV programs that they are viewing. For social TV service, two technical aspects are envisioned: grouping of similar TV users to create social TV communities and recommending TV programs based on group and personal interests for personalizing TV. In this paper, we propose a unified topic model based on grouping of similar TV users and recommending TV programs as a social TV service. The proposed unified topic model employs two latent Dirichlet allocation (LDA) models. One is a topic model of TV users, and the other is a topic model of the description words for viewed TV programs. The two LDA models are then integrated via a topic proportion parameter for TV programs, which enforces the grouping of similar TV users and associated description words for watched TV programs at the same time in a unified topic modeling framework. The unified model identifies the semantic relation between TV user groups and TV program description word groups so that more meaningful TV program recommendations can be made. The unified topic model also overcomes an item ramp-up problem such that new TV programs can be reliably recommended to TV users. Furthermore, from the topic model of TV users, TV users with similar tastes can be grouped as topics, which can then be recommended as social TV communities. To verify our proposed method of unified topic-modeling-based TV user grouping and TV program recommendation for social TV services, in our experiments, we used real TV viewing history data and electronic program guide data from a seven-month period collected by a TV poll agency. The experimental results show that the proposed unified topic model yields an average 81.4% precision for 50 topics in TV program recommendation and its performance is an average of 6.5% higher than that of the topic model of TV users only. For TV user prediction with new TV programs, the average
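
    The paper's unified model couples two LDA models through a shared topic-proportion parameter, which this sketch does not reproduce. As a starting point, it fits a single plain LDA model with gensim over toy per-user "viewing history" documents (words invented), after which users with similar topic mixtures could be grouped into communities:

      # Requires: pip install gensim
      from gensim import corpora, models

      histories = [
          ["news", "politics", "economy", "debate"],
          ["football", "league", "goal", "match"],
          ["election", "politics", "news", "policy"],
          ["match", "tennis", "goal", "league"],
      ]

      dictionary = corpora.Dictionary(histories)
      corpus = [dictionary.doc2bow(doc) for doc in histories]

      lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                            passes=20, random_state=7)

      # Each user's topic mixture; similar mixtures suggest one community.
      for i, bow in enumerate(corpus):
          print(i, lda.get_document_topics(bow))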

  4. A Multi-Model Stereo Similarity Function Based on Monogenic Signal Analysis in Poisson Scale Space

    Directory of Open Access Journals (Sweden)

    Jinjun Li

    2011-01-01

    Full Text Available A stereo similarity function based on local multi-model monogenic image feature descriptors (LMFD) is proposed to match interest points and estimate the disparity map for stereo images. Local multi-model monogenic image features include the local orientation and instantaneous phase of the gray monogenic signal, the local color phase of the color monogenic signal, and local mean colors in the multiscale color monogenic signal framework. The gray monogenic signal, which extends the analytic signal to gray-level images using the Dirac operator and the Laplace equation, consists of the local amplitude, local orientation, and instantaneous phase of a 2D image signal. The color monogenic signal extends the monogenic signal to color images based on Clifford algebras. The local color phase can be estimated by computing the geometric product between the color monogenic signal and a unit reference vector in RGB color space. Experimental results on synthetic and natural stereo images demonstrate the performance of the proposed approach.

  5. Image-based metal artifact reduction in x-ray computed tomography utilizing local anatomical similarity

    Science.gov (United States)

    Dong, Xue; Yang, Xiaofeng; Rosenfield, Jonathan; Elder, Eric; Dhabaan, Anees

    2017-03-01

    X-ray computed tomography (CT) has been widely used in radiation therapy treatment planning in recent years. However, metal implants such as dental fillings and hip prostheses can cause severe bright and dark streaking artifacts in reconstructed CT images. These artifacts decrease image contrast and degrade HU accuracy, leading to inaccuracies in target delineation and dose calculation. In this work, a metal artifact reduction method is proposed based on the intrinsic anatomical similarity between neighboring CT slices. Neighboring CT slices from the same patient exhibit similar anatomical features. Exploiting this anatomical similarity, a gamma map is calculated as a weighted summation of relative HU error and distance error for each pixel in an artifact-corrupted CT image relative to a neighboring, artifact-free image. The minimum value in the gamma map for each pixel is used to identify an appropriate pixel from the artifact-free CT slice to replace the corresponding artifact-corrupted pixel. With the proposed method, the mean CT HU error was reduced from 360 HU and 460 HU to 24 HU and 34 HU on head and pelvis CT images, respectively. Dose calculation accuracy also improved, as the dose difference was reduced from greater than 20% to less than 4%. Using 3%/3mm criteria, the gamma analysis failure rate was reduced from 23.25% to 0.02%. An image-based metal artifact reduction method is proposed that replaces corrupted image pixels with pixels from neighboring CT slices free of metal artifacts. This method is shown to be capable of suppressing streaking artifacts, thereby improving HU and dose calculation accuracy.
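
    A rough sketch of the pixel-replacement rule as described, with assumed tolerances (hu_tol, dist_tol) and quadrature weighting standing in for the paper's weighted summation; a real implementation would restrict the loop to pixels flagged as artifact-corrupted.

      import numpy as np

      def gamma_replace(corrupt, reference, win=5, hu_tol=50.0, dist_tol=3.0):
          # For each pixel of the artifact-corrupted slice, scan a small window
          # of the neighboring artifact-free slice and substitute the pixel
          # minimizing a combined relative-HU / distance error ("gamma") value.
          h, w = corrupt.shape
          out = corrupt.astype(float).copy()
          r = win // 2
          for y in range(h):
              for x in range(w):
                  y0, y1 = max(0, y - r), min(h, y + r + 1)
                  x0, x1 = max(0, x - r), min(w, x + r + 1)
                  patch = reference[y0:y1, x0:x1].astype(float)
                  yy, xx = np.mgrid[y0:y1, x0:x1]
                  hu_err = (patch - out[y, x]) / hu_tol        # relative HU error
                  d_err = np.hypot(yy - y, xx - x) / dist_tol  # distance error
                  gamma = np.sqrt(hu_err ** 2 + d_err ** 2)
                  iy, ix = np.unravel_index(np.argmin(gamma), gamma.shape)
                  out[y, x] = patch[iy, ix]
          return out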

  6. Machine Fault Detection Based on Filter Bank Similarity Features Using Acoustic and Vibration Analysis

    Directory of Open Access Journals (Sweden)

    Mauricio Holguín-Londoño

    2016-01-01

    Full Text Available Vibration and acoustic analysis actively support the nondestructive and noninvasive fault diagnostics of rotating machines at early stages. Nonetheless, the acoustic signal is less used because of its vulnerability to external interference, hindering an efficient and robust analysis for condition monitoring (CM). This paper presents a novel methodology to characterize different failure signatures from rotating machines using either acoustic or vibration signals. Firstly, the signal is decomposed into several narrow-band spectral components applying different filter bank methods such as empirical mode decomposition, wavelet packet transform, and Fourier-based filtering. Secondly, a feature set is built using a proposed similarity measure termed the cumulative spectral density index, which is used to estimate the mutual statistical dependence between each bandwidth-limited component and the raw signal. Finally, a classification scheme is carried out to distinguish the different types of faults. The methodology is tested in two laboratory experiments, including turbine blade degradation and rolling element bearing faults. The robustness of our approach is validated by contaminating the signal with several levels of additive white Gaussian noise; the high-performance outcomes make the use of vibration, acoustic, and vibroacoustic measurements comparable across different applications. As a result, the proposed fault detection based on filter bank similarity features is a promising methodology for implementation in the CM of rotating machinery, even with measurements of low signal-to-noise ratio.

  7. Traditional and robust vector selection methods for use with similarity based models

    International Nuclear Information System (INIS)

    Hines, J. W.; Garvey, D. R.

    2006-01-01

    Vector selection, or instance selection as it is often called in the data mining literature, performs a critical task in the development of nonparametric, similarity based models. Nonparametric, similarity based modeling (SBM) is a form of 'lazy learning' which constructs a local model 'on the fly' by comparing a query vector to historical, training vectors. For large training sets the creation of local models may become cumbersome, since each training vector must be compared to the query vector. To alleviate this computational burden, varying forms of training vector sampling may be employed with the goal of selecting a subset of the training data such that the samples are representative of the underlying process. This paper describes one such SBM, namely auto-associative kernel regression (AAKR), and presents five traditional vector selection methods and one robust vector selection method that may be used to select prototype vectors from a larger data set in model training. The five traditional vector selection methods considered are min-max, vector ordering, combination min-max and vector ordering, fuzzy c-means clustering, and Adeli-Hung clustering. Each method is described in detail and compared using artificially generated data and data collected from the steam system of an operating nuclear power plant. (authors)
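
    A minimal AAKR sketch under an assumed Gaussian kernel: the query is corrected toward a weighted average of the prototype (memory) vectors chosen by one of the vector selection methods above; the bandwidth and data are illustrative.

      import numpy as np

      def aakr_predict(memory, query, bandwidth=1.0):
          # Kernel-weighted average of stored prototype vectors: prototypes
          # close to the query dominate the corrected estimate.
          d = np.linalg.norm(memory - query, axis=1)
          w = np.exp(-(d ** 2) / (2.0 * bandwidth ** 2))
          if w.sum() == 0.0:
              return query  # query too far from every prototype
          return (w[:, None] * memory).sum(axis=0) / w.sum()

      # Prototype vectors selected from the training set by vector selection.
      memory = np.array([[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [5.0, 5.0]])
      print(aakr_predict(memory, np.array([1.05, 2.0]), bandwidth=0.5))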

  8. An improved DPSO with mutation based on similarity algorithm for optimization of transmission lines loading

    International Nuclear Information System (INIS)

    Shayeghi, H.; Mahdavi, M.; Bagheri, A.

    2010-01-01

    The static transmission network expansion planning (STNEP) problem plays a principal role in power system planning and should be evaluated carefully. Up to now, various methods have been presented to solve the STNEP problem, but only one of them considers the lines' adequacy rate at the end of the planning horizon and optimizes the problem by discrete particle swarm optimization (DPSO). DPSO is a population-based intelligence algorithm that performs well on large-scale, discrete and non-linear optimization problems like STNEP. However, as the algorithm runs, the particles become more and more similar and cluster around the best particle in the swarm, making the swarm converge prematurely to a local solution. To overcome these drawbacks while considering the lines' adequacy rate, this paper implements expansion planning by merging a line-loading parameter into the STNEP and inserting investment cost into the fitness function constraints, using an improved DPSO algorithm. The proposed improved DPSO introduces a new concept, collectivity, based on the similarity between each particle and the current global best particle in the swarm, which prevents the premature convergence of DPSO around a local solution. The proposed method has been tested on Garver's network and a real transmission network in Iran, and compared with the DPSO-based method for solving the TNEP problem. The results show that, by preventing premature convergence, the proposed improved DPSO-based method increases network adequacy considerably at almost the same expansion cost. Also, the convergence curves of both methods show that the precision of the proposed algorithm for solving the STNEP problem exceeds that of the DPSO approach.

  9. Harmony Search Based Parameter Ensemble Adaptation for Differential Evolution

    Directory of Open Access Journals (Sweden)

    Rammohan Mallipeddi

    2013-01-01

    Full Text Available In the differential evolution (DE) algorithm, depending on the characteristics of the problem at hand and the available computational resources, different strategies combined with different parameter sets may be effective. In addition, a single, well-tuned combination of strategies and parameters may not guarantee optimal performance, because different strategies combined with different parameter settings can be appropriate during different stages of the evolution. Therefore, various adaptive/self-adaptive techniques have been proposed to adapt the DE strategies and parameters during the course of evolution. In this paper, we propose a new parameter adaptation technique for DE based on an ensemble approach and the harmony search (HS) algorithm. In the proposed method, an ensemble of parameters is randomly sampled to form the initial harmony memory. The parameter ensemble evolves during the optimization process through the HS algorithm, and each parameter combination in the harmony memory is evaluated by testing it on the DE population. The performance of the proposed adaptation method is evaluated using two recently proposed strategies (DE/current-to-pbest/bin and DE/current-to-gr_best/bin) as basic DE frameworks. Numerical results demonstrate the effectiveness of the proposed adaptation technique compared to state-of-the-art DE-based algorithms on a set of challenging test problems (CEC 2005).
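
    In the paper, HS evolves DE parameter combinations; the sketch below is just the underlying harmony search loop (memory consideration, pitch adjustment, random selection) minimizing a toy function, with HMS, HMCR, PAR, and bandwidth values chosen arbitrarily.

      import random

      def harmony_search(f, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05, iters=2000):
          # Minimize f over box bounds with a basic harmony search.
          dim = len(bounds)
          hm = [[random.uniform(*bounds[d]) for d in range(dim)] for _ in range(hms)]
          cost = [f(h) for h in hm]
          for _ in range(iters):
              new = []
              for d in range(dim):
                  if random.random() < hmcr:  # memory consideration
                      x = random.choice(hm)[d]
                      if random.random() < par:  # pitch adjustment
                          x += random.uniform(-bw, bw) * (bounds[d][1] - bounds[d][0])
                  else:  # random selection
                      x = random.uniform(*bounds[d])
                  new.append(min(max(x, bounds[d][0]), bounds[d][1]))
              c = f(new)
              worst = max(range(hms), key=lambda i: cost[i])
              if c < cost[worst]:  # replace the worst stored harmony
                  hm[worst], cost[worst] = new, c
          best = min(range(hms), key=lambda i: cost[i])
          return hm[best], cost[best]

      sphere = lambda v: sum(x * x for x in v)
      print(harmony_search(sphere, [(-5, 5)] * 3))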

  10. Parameter Search Algorithms for Microwave Radar-Based Breast Imaging: Focal Quality Metrics as Fitness Functions.

    Science.gov (United States)

    O'Loughlin, Declan; Oliveira, Bárbara L; Elahi, Muhammad Adnan; Glavin, Martin; Jones, Edward; Popović, Milica; O'Halloran, Martin

    2017-12-06

    Inaccurate estimation of average dielectric properties can have a tangible impact on microwave radar-based breast images. Despite this, recent patient imaging studies have used a fixed estimate although this is known to vary from patient to patient. Parameter search algorithms are a promising technique for estimating the average dielectric properties from the reconstructed microwave images themselves without additional hardware. In this work, qualities of accurately reconstructed images are identified from point spread functions. As the qualities of accurately reconstructed microwave images are similar to the qualities of focused microscopic and photographic images, this work proposes the use of focal quality metrics for average dielectric property estimation. The robustness of the parameter search is evaluated using experimental dielectrically heterogeneous phantoms on the three-dimensional volumetric image. Based on a very broad initial estimate of the average dielectric properties, this paper shows how these metrics can be used as suitable fitness functions in parameter search algorithms to reconstruct clear and focused microwave radar images.
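
    One widely used focal quality metric is the normalized variance; the sketch below uses it as the fitness in an exhaustive search over candidate average permittivities. The reconstruct callable is a hypothetical stand-in for the user's beamformer, and the candidate grid is invented.

      import numpy as np

      def normalized_variance(img):
          # Focal quality metric: sharper, better-focused images score higher.
          mu = float(img.mean())
          return float(img.var()) / mu if mu > 0 else 0.0

      def estimate_permittivity(reconstruct, candidates):
          # Parameter search: reconstruct the volume for each candidate
          # average permittivity and keep the best-focused result.
          scored = [(normalized_variance(np.abs(reconstruct(eps))), eps)
                    for eps in candidates]
          return max(scored)[1]

      # Usage sketch (reconstruct is the user's imaging pipeline):
      #   eps_best = estimate_permittivity(reconstruct, np.linspace(6, 14, 17))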

  11. Three dimensional pattern recognition using feature-based indexing and rule-based search

    Science.gov (United States)

    Lee, Jae-Kyu

    In flexible automated manufacturing, robots can perform routine operations as well as recover from atypical events, provided that process-relevant information is available to the robot controller. Real-time vision is among the most versatile sensing tools, yet the reliability of machine-based scene interpretation can be questionable. The effort described here is focused on the development of machine-based vision methods to support autonomous nuclear fuel manufacturing operations in hot cells. This thesis presents a method to efficiently recognize 3D objects from 2D images based on feature-based indexing. Object recognition is the identification of correspondences between parts of a current scene and stored views of known objects, using chains of segments or indexing vectors. To create indexed object models, characteristic model image features are extracted during preprocessing. Feature vectors representing model object contours are acquired from several points of view around each object and stored. Recognition is the process of matching stored views with features or patterns detected in a test scene. Two sets of algorithms were developed: one for preprocessing and indexed database creation, and one for pattern searching and matching during recognition. At recognition time, the indexing vectors with the highest match probability are retrieved from the model image database using a nearest neighbor search algorithm. The nearest neighbor search predicts the best possible match candidates. Extended searches are guided by a search strategy that employs knowledge-base (KB) selection criteria. The knowledge-based system simplifies the recognition process and minimizes the number of iterations and memory usage. Novel contributions include the use of a feature-based indexing data structure together with a knowledge base. Both components improve the efficiency of the recognition process by better structuring the database of object features and reducing database size
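
    The recognition stage hinges on a nearest-neighbor search over the stored indexing vectors. A minimal sketch with a k-d tree (via scipy) retrieving the best match candidates for a test-scene feature vector follows; vector contents and labels are toy data.

      import numpy as np
      from scipy.spatial import cKDTree

      # Indexing vectors extracted from model views during preprocessing;
      # each row describes a chain of contour segments for one stored view.
      model_vectors = np.array([[0.20, 0.80, 0.40],
                                [0.90, 0.10, 0.30],
                                [0.25, 0.75, 0.50]])
      view_labels = ["mug/front", "valve/top", "mug/side"]

      tree = cKDTree(model_vectors)

      # Feature vector detected in the test scene: query the k nearest stored
      # views as the highest-probability match candidates, then hand them to
      # the rule-based search for verification.
      scene_vector = [0.22, 0.78, 0.45]
      dists, idx = tree.query(scene_vector, k=2)
      for d, i in zip(dists, idx):
          print(view_labels[i], round(float(d), 3))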

  12. Job Search Methods: Consequences for Gender-based Earnings Inequality.

    Science.gov (United States)

    Huffman, Matt L.; Torres, Lisa

    2001-01-01

    Data from adults in Atlanta, Boston, and Los Angeles (n=1,942) who searched for work using formal (ads, agencies) or informal (networks) methods indicated that type of method used did not contribute to the gender gap in earnings. Results do not support formal job search as a way to reduce gender inequality. (Contains 55 references.) (SK)

  13. Evidence-based medicine - searching the medical literature. Part 1.

    African Journals Online (AJOL)

    Ann Burgess

    password for access via HINARI, use that to log in. Then you can retrieve articles from the 6000 journals that will be available to you. You cannot retrieve the full text from journals that do not allow free access or HINARI access. How to search the literature on the internet: before you start your search, take a moment to think ...

  14. On the network-based emulation of human visual search

    NARCIS (Netherlands)

    Gerrissen, J.F.

    1991-01-01

    We describe the design of a computer emulator of human visual search. The emulator mechanism is eventually meant to support ergonomic assessment of the effect of display structure and protocol on search performance. As regards target identification and localization, it mimics a number of

  15. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

    Directory of Open Access Journals (Sweden)

    Manzini Giovanni

    2007-07-01

    Full Text Available Abstract Background Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity, and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness are tested on various data sets, yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available, and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, which naturally complements the many theoretical and preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC

  16. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment.

    Science.gov (United States)

    Ferragina, Paolo; Giancarlo, Raffaele; Greco, Valentina; Manzini, Giovanni; Valiente, Gabriel

    2007-07-13

    Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity, and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness are tested on various data sets, yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available, and no comparison of USM with existing methods, both based on alignments and not, seems to be available. We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, which naturally complements the many theoretical and preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC (Receiver Operating Curve) analysis, aims at
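
    Of the three approximations, NCD is the simplest to implement; a minimal version with zlib as the compressor is sketched below (the paper compares 25 compressors, which this sketch does not attempt).

      import zlib

      def ncd(x: bytes, y: bytes) -> float:
          # Normalized Compression Distance: small for similar strings,
          # close to 1 for unrelated ones.
          cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
          cxy = len(zlib.compress(x + y))
          return (cxy - min(cx, cy)) / max(cx, cy)

      s1 = b"ACGTACGTACGTACGT" * 20
      s2 = b"ACGTACGTACGAACGT" * 20   # near-copy of s1
      s3 = b"TTGGCCAATTGGCCAA" * 20   # unrelated repeat pattern
      print(ncd(s1, s2), ncd(s1, s3))  # the related pair scores lower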

  17. A Secured Cognitive Agent based Multi-strategic Intelligent Search System

    Directory of Open Access Journals (Sweden)

    Neha Gulati

    2018-04-01

    Full Text Available The Search Engine (SE) is the most widely preferred information retrieval tool. Despite the vast scale of user involvement with SEs, their limited ability to understand the user's (searcher's) context and emotions places a high cognitive, perceptual and learning load on the user to maintain the search momentum. In this regard, the present work discusses a Cognitive Agent (CA) based approach to support the user in the Web-based search process. The work suggests a framework called Secured Cognitive Agent based Multi-strategic Intelligent Search System (CAbMsISS) to assist the user in the search process. It helps to reduce the contextual and emotional mismatch between the SE and the user. Performance analysis after implementation of the proposed framework shows that CAbMsISS improves Query Retrieval Time (QRT) and the effectiveness of retrieving relevant results compared to a Present Search Engine (PSE). Supplementary to this, it also provides search suggestions when the user accesses a resource previously tagged with negative emotions. Overall, the goal of the system is to enhance the search experience and keep the user motivated. The framework provides suggestions through a search log that tracks the queries searched, resources accessed and emotions experienced during the search. The implemented framework also considers user security. Keywords: BDI model, Cognitive Agent, Emotion, Information retrieval, Intelligent search, Search Engine

  18. An efficient similarity measure for content based image retrieval using memetic algorithm

    Directory of Open Access Journals (Sweden)

    Mutasem K. Alsmadi

    2017-06-01

    Full Text Available Content-based image retrieval (CBIR) systems work by retrieving images related to a query image (QI) from huge databases. Existing CBIR systems extract only limited feature sets, which confines their retrieval efficacy. In this work, an extensive set of robust and important features was extracted from the image database and stored in a feature repository. This feature set is composed of a color signature together with shape and color-texture features. Features are extracted from the given QI in the same fashion. A novel similarity evaluation is then performed between the features of the QI and those of the database images using a meta-heuristic called a memetic algorithm (a genetic algorithm with great deluge). Our proposed CBIR system is assessed by querying a number of images from the test dataset, and the efficiency of the system is evaluated by calculating precision-recall values for the results. The results were superior to those of other state-of-the-art CBIR systems with regard to precision.
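
    The record names the local-search half of the memetic algorithm: great deluge. Below is one common variant of its acceptance rule on a toy maximization problem; the rain schedule, step size, and coupling to the genetic algorithm are assumptions, not the paper's settings.

      import random

      def great_deluge(f, x0, bounds, rain=0.05, step=0.2, iters=2000, seed=1):
          # Maximize f: accept a neighbor only if its fitness clears a rising
          # "water level"; the level creeps toward each accepted fitness.
          rng = random.Random(seed)
          x, fx = list(x0), f(x0)
          level = fx
          best, fbest = list(x), fx
          for _ in range(iters):
              cand = [min(max(v + rng.uniform(-step, step), lo), hi)
                      for v, (lo, hi) in zip(x, bounds)]
              fc = f(cand)
              if fc > level:                    # above water: move there
                  x, fx = cand, fc
                  level += rain * (fc - level)  # "rain" raises the level
                  if fc > fbest:
                      best, fbest = list(cand), fc
          return best, fbest

      peak = lambda v: -sum(t * t for t in v)   # toy objective, max at origin
      print(great_deluge(peak, [3.0, -2.0], [(-5, 5)] * 2))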

  19. A Similarity-Based Approach for Audiovisual Document Classification Using Temporal Relation Analysis

    Directory of Open Access Journals (Sweden)

    Ferrane Isabelle

    2011-01-01

    Full Text Available Abstract We propose a novel approach to video classification based on the analysis of the temporal relationships between the basic events in audiovisual documents. Starting from basic segmentation results, we define a new representation method called the Temporal Relation Matrix (TRM). Each document is then described by a set of TRMs, whose analysis makes higher-level events stand out. This representation was first designed to analyze audiovisual documents in order to find events that characterize their content and structure. The aim of this work is to use the representation to compute a similarity measure between two documents. Approaches to audiovisual document classification are presented and discussed. Experiments on a set of 242 video documents show the efficiency of our proposals.

  20. A hybrid algorithm for selecting head-related transfer function based on similarity of anthropometric structures

    Science.gov (United States)

    Zeng, Xiang-Yang; Wang, Shu-Guang; Gao, Li-Ping

    2010-09-01

    As the basic data for virtual auditory technology, the head-related transfer function (HRTF) has many applications in room acoustic modeling, spatial hearing and multimedia. How to individualize HRTFs quickly and effectively remains an open problem. Based on the similarity and relatedness of anthropometric structures, a hybrid HRTF customization algorithm combining principal component analysis (PCA), multiple linear regression (MLR) and database matching (DM) is presented in this paper. The HRTFs selected by both the best match and the worst match were used to obtain binaurally auralized sounds, which were then used in subjective listening experiments, and the results compared. For the horizontal plane, the localization results show that the selection of HRTFs can enhance localization accuracy and also mitigate the problem of front-back confusion.

  1. The effective thermal conductivity of porous media based on statistical self-similarity

    International Nuclear Information System (INIS)

    Kou Jianlong; Wu Fengmin; Lu Hangjun; Xu Yousheng; Song Fuquan

    2009-01-01

    A fractal model is presented based on the thermal-electrical analogy technique and the statistical self-similarity of fractal saturated porous media. A dimensionless effective thermal conductivity of saturated fractal porous media is studied through its relationship with the geometrical parameters of the porous media, with no empirical constant. The study shows that the dimensionless effective thermal conductivity decreases with increasing porosity (φ) and pore area fractal dimension (D_f) when k_s/k_g > 1, and the opposite trend is observed when k_s/k_g < 1. The model predictions are compared with existing experimental data, and the results show good agreement.

  2. On the Power and Limits of Sequence Similarity Based Clustering of Proteins Into Families

    DEFF Research Database (Denmark)

    Wiwie, Christian; Röttger, Richard

    2017-01-01

    Over the last decades, we have observed an ongoing tremendous growth of available sequencing data fueled by the advancements in wet-lab technology. The sequencing information is only the beginning of the actual understanding of how organisms survive and prosper. It is, for instance, equally important to also unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We used the data to investigate the behavior of the tools' parameters, underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters...

  3. Human-based percussion and self-similarity detection in electroacoustic music

    Science.gov (United States)

    Mills, John Anderson, III

    Electroacoustic music is music that uses electronic technology for the compositional manipulation of sound, and is a unique genre of music for many reasons. Analyzing electroacoustic music requires special measures, some of which are integrated into the design of a preliminary percussion analysis tool set for electroacoustic music. This tool set is designed to incorporate the human processing of music and sound. Models of the human auditory periphery are used as a front end to the analysis algorithms. The audio properties of percussivity and self-similarity are chosen as the focus because these properties are computable and informative. A collection of human judgments about percussion was undertaken to acquire clearly specified, sound-event dimensions that humans use as a percussive cue. A total of 29 participants was asked to make judgments about the percussivity of 360 pairs of synthesized snare-drum sounds. The grouped results indicate that of the dimensions tested rise time is the strongest cue for percussivity. String resonance also has a strong effect, but because of the complex nature of string resonance, it is not a fundamental dimension of a sound event. Gross spectral filtering also has an effect on the judgment of percussivity but the effect is weaker than for rise time and string resonance. Gross spectral filtering also has less effect when the stronger cue of rise time is modified simultaneously. A percussivity-profile algorithm (PPA) is designed to identify those instants in pieces of music that humans also would identify as percussive. The PPA is implemented using a time-domain, channel-based approach and psychoacoustic models. The input parameters are tuned to maximize performance at matching participants' choices in the percussion-judgment collection. After the PPA is tuned, the PPA then is used to analyze pieces of electroacoustic music. Real electroacoustic music introduces new challenges for the PPA, though those same challenges might affect

  4. Structural similarity-based predictions of protein interactions between HIV-1 and Homo sapiens

    Directory of Open Access Journals (Sweden)

    Gomez Shawn M

    2010-04-01

    Full Text Available Abstract Background In the course of infection, viruses such as HIV-1 must enter a cell, travel to sites where they can hijack host machinery to transcribe their genes and translate their proteins, assemble, and then leave the cell again, all while evading the host immune system. Thus, successful infection depends on the pathogen's ability to manipulate the biological pathways and processes of the organism it infects. Interactions between HIV-encoded and human proteins provide one means by which HIV-1 can connect into cellular pathways to carry out these survival processes. Results We developed and applied a computational approach to predict interactions between HIV and human proteins based on structural similarity of 9 HIV-1 proteins to human proteins having known interactions. Using functional data from RNAi studies as a filter, we generated over 2000 interaction predictions between HIV proteins and 406 unique human proteins. Additional filtering based on Gene Ontology cellular component annotation reduced the number of predictions to 502 interactions involving 137 human proteins. We find numerous known interactions as well as novel interactions showing significant functional relevance based on supporting Gene Ontology and literature evidence. Conclusions Understanding the interplay between HIV-1 and its human host will help in understanding the viral lifecycle and the ways in which this virus is able to manipulate its host. The results shown here provide a potential set of interactions that are amenable to further experimental manipulation as well as potential targets for therapeutic intervention.

  5. Permission-based Index Clustering for Secure Multi-User Search

    OpenAIRE

    Eirini C. Micheli; Giorgos Margaritis; Stergios V. Anastasiadis

    2015-01-01

    Secure keyword search in shared infrastructures prevents stored documents from leaking sensitive information to unauthorized users. A shared index provides confidentiality if it is exclusively used by users authorized to search all the indexed documents. We introduce the Lethe indexing workflow to improve query and update efficiency in secure keyword search. The Lethe workflow clusters together documents with similar sets of authorized users, and creates shared indices for configurable docume...
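
    The record truncates before the clustering details. As a toy illustration of the general idea, assuming similarity between documents is measured by the Jaccard overlap of their authorized user sets, a greedy one-pass grouping might look like the sketch below; the threshold and merge rule are invented, and a real secure design must additionally enforce that each shared index is searchable only by users authorized for all of its documents.

      def jaccard(a, b):
          # Set overlap in [0, 1]; identical permission sets score 1.
          return len(a & b) / len(a | b) if (a | b) else 1.0

      def cluster_by_permissions(doc_users, threshold=0.5):
          # Greedy one-pass grouping: a document joins the first cluster whose
          # representative user set is similar enough, otherwise it starts one.
          clusters = []  # pairs of (representative user set, list of doc ids)
          for doc, users in doc_users.items():
              for rep, members in clusters:
                  if jaccard(rep, users) >= threshold:
                      members.append(doc)
                      rep |= users  # widen the representative set
                      break
              else:
                  clusters.append((set(users), [doc]))
          return clusters

      docs = {"d1": {"alice", "bob"},
              "d2": {"alice", "bob", "carol"},
              "d3": {"dave"}}
      for rep, members in cluster_by_permissions(docs):
          print(sorted(rep), members)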

  6. Looking Similar Promotes Group Stability in a Game-Based Virtual Community.

    Science.gov (United States)

    Lortie, Catherine L; Guitton, Matthieu J

    2012-08-01

    Online support groups are popular Web-based resources that provide tailored information and peer support through virtual communities and fulfill the users' needs for empowerment and belonging. However, the therapeutic potential of online support groups is at present limited by the lack of systematic research on the cognitive mechanisms underlying social group cohesion in virtual communities. We might increase the benefits of participation in online support groups if we gain more insight into the factors that promote long-term commitment to peer support. One approach to foster the therapeutic potential of online support groups could be to increase social selection based on visual similarity. We performed a case study using the popular virtual setting of "World of Warcraft" (Blizzard Entertainment, Irvine, CA). We monitored the social dynamics of a virtual community composed of avatars whose appearance was identical during a period of 3 months, biweekly, for a total of 24 measures. We observed that this homogeneous community displayed a very high level of group stability over time in terms of the total number of members, the number of members that stayed the same, and the number of arrivals and departures, despite the fact that belonging to a heterogeneous group typically favors the success of the group with respect to game progression. Our results confirm that appearance can trigger social selection in online virtual communities. Displaying a similar appearance could be one way to strengthen social bonds among peers who share various health and well-being issues. Thus, the therapeutic potential of online support groups could be promoted through visual cohesion.

  7. Analysis on the Correlation of Traffic Flow in Hainan Province Based on Baidu Search

    Science.gov (United States)

    Chen, Caixia; Shi, Chun

    2018-03-01

    Internet search data records users' search attention and consumer demand, providing the necessary database for a Hainan traffic flow model. Based on the Baidu Index and taking Hainan traffic flow as an example, this paper conducts both qualitative and quantitative analyses of the relationship between search keywords from the Baidu Index and actual Hainan tourist traffic flow, and builds a multiple regression model in SPSS.

  8. GeoSearcher: Location-Based Ranking of Search Engine Results.

    Science.gov (United States)

    Watters, Carolyn; Amoudi, Ghada

    2003-01-01

    Discussion of Web queries with geospatial dimensions focuses on an algorithm that assigns location coordinates dynamically to Web sites based on the URL. Describes a prototype search system that uses the algorithm to re-rank search engine results for queries with a geospatial dimension, thus providing an alternative ranking order for search engine…

  9. Topology optimization based on the harmony search method

    International Nuclear Information System (INIS)

    Lee, Seung-Min; Han, Seog-Young

    2017-01-01

    A new topology optimization scheme based on a Harmony search (HS) as a metaheuristic method was proposed and applied to static stiffness topology optimization problems. To apply the HS to topology optimization, the variables in HS were transformed to those in topology optimization. Compliance was used as an objective function, and harmony memory was defined as the set of the optimized topology. Also, a parametric study for Harmony memory considering rate (HMCR), Pitch adjusting rate (PAR), and Bandwidth (BW) was performed to find the appropriate range for topology optimization. Various techniques were employed such as a filtering scheme, simple average scheme and harmony rate. To provide a robust optimized topology, the concept of the harmony rate update rule was also implemented. Numerical examples are provided to verify the effectiveness of the HS by comparing the optimal layouts of the HS with those of Bidirectional evolutionary structural optimization (BESO) and Artificial bee colony algorithm (ABCA). The following conclusions could be made: (1) The proposed topology scheme is very effective for static stiffness topology optimization problems in terms of stability, robustness and convergence rate. (2) The suggested method provides a symmetric optimized topology despite the fact that the HS is a stochastic method like the ABCA. (3) The proposed scheme is applicable and practical in manufacturing since it produces a solid-void design of the optimized topology. (4) The suggested method appears to be very effective for large scale problems like topology optimization.

  10. Topology optimization based on the harmony search method

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Seung-Min; Han, Seog-Young [Hanyang University, Seoul (Korea, Republic of)

    2017-06-15

    A new topology optimization scheme based on a Harmony search (HS) as a metaheuristic method was proposed and applied to static stiffness topology optimization problems. To apply the HS to topology optimization, the variables in HS were transformed to those in topology optimization. Compliance was used as an objective function, and harmony memory was defined as the set of the optimized topology. Also, a parametric study for Harmony memory considering rate (HMCR), Pitch adjusting rate (PAR), and Bandwidth (BW) was performed to find the appropriate range for topology optimization. Various techniques were employed such as a filtering scheme, simple average scheme and harmony rate. To provide a robust optimized topology, the concept of the harmony rate update rule was also implemented. Numerical examples are provided to verify the effectiveness of the HS by comparing the optimal layouts of the HS with those of Bidirectional evolutionary structural optimization (BESO) and Artificial bee colony algorithm (ABCA). The following conclusions could be made: (1) The proposed topology scheme is very effective for static stiffness topology optimization problems in terms of stability, robustness and convergence rate. (2) The suggested method provides a symmetric optimized topology despite the fact that the HS is a stochastic method like the ABCA. (3) The proposed scheme is applicable and practical in manufacturing since it produces a solid-void design of the optimized topology. (4) The suggested method appears to be very effective for large scale problems like topology optimization.

  11. The Cellular Differential Evolution Based on Chaotic Local Search

    Directory of Open Access Journals (Sweden)

    Qingfeng Ding

    2015-01-01

    To avoid immature convergence and tune the selection pressure in the differential evolution (DE) algorithm, a new differential evolution algorithm based on cellular automata and chaotic local search (CLS), called ccDE, is proposed. To balance the exploration-exploitation tradeoff of differential evolution, the interaction among individuals is limited to cellular neighbors instead of being governed by the control parameters of the canonical DE. To improve the optimizing performance of DE, the CLS helps by exploring a large region to avoid immature convergence in the early evolutionary stage and exploiting a small region to refine the final solutions in the later evolutionary stage. Moreover, to improve the convergence characteristics and maintain the population diversity, the binomial crossover operator of the canonical DE is replaced by an orthogonal crossover operator without a crossover rate. The performance of ccDE is evaluated on a set of 14 bound-constrained numerical optimization problems and compared with the canonical DE and several DE variants. The simulation results show that ccDE has better performance in terms of convergence rate and solution accuracy than the other optimizers.
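
    To make the CLS component concrete, the sketch below perturbs an incumbent solution with a logistic map whose radius shrinks over the run, exploring widely early and refining late; the cellular-neighborhood DE and orthogonal crossover are not shown, and the radius schedule is an assumption.

```python
import numpy as np

def chaotic_local_search(objective, x_best, low, high,
                         n_steps=50, radius=0.1, seed=1):
    """Refine x_best with chaotic perturbations from a logistic map (sketch)."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(0.01, 0.99, size=x_best.size)   # chaotic state in (0, 1)
    best, f_best = x_best.copy(), objective(x_best)
    for k in range(n_steps):
        z = 4.0 * z * (1.0 - z)                     # logistic map, chaotic at r = 4
        shrink = radius * (1.0 - k / n_steps)       # large steps early, small late
        cand = np.clip(best + shrink * (high - low) * (2.0 * z - 1.0), low, high)
        f = objective(cand)
        if f < f_best:
            best, f_best = cand, f
    return best, f_best

# Example: refine a rough solution of the sphere function.
best, f = chaotic_local_search(lambda v: float(np.sum(v ** 2)),
                               np.array([0.4, -0.3]),
                               np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
```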

  12. Using ontology-based semantic similarity to facilitate the article screening process for systematic reviews.

    Science.gov (United States)

    Ji, Xiaonan; Ritter, Alan; Yen, Po-Yin

    2017-05-01

    Systematic Reviews (SRs) are utilized to summarize evidence from high-quality studies and are considered the preferred source for evidence-based practice (EBP). However, conducting SRs can be time- and labor-intensive due to the high cost of article screening. In previous studies, we demonstrated utilizing established (lexical) article relationships to facilitate the identification of relevant articles in an efficient and effective manner. Here we propose to enhance article relationships with background semantic knowledge derived from Unified Medical Language System (UMLS) concepts and ontologies. We developed a pipelined semantic concept representation process to represent articles from an SR in an optimized and enriched semantic space of UMLS concepts. Throughout the process, we leveraged concepts and concept relations encoded in biomedical ontologies (SNOMED-CT and MeSH) within the UMLS framework to derive the concept features of each article. Article relationships (similarities) were established and represented as a semantic article network, which was readily applied to assist with the article screening process. We incorporated the concept of active learning to simulate an interactive article recommendation process, and evaluated the performance on 15 completed SRs. We used work saved over sampling at 95% recall (WSS95) as the performance measure. We compared the WSS95 performance of our ontology-based semantic approach to existing lexical feature approaches and corpus-based semantic approaches, and found that we had better WSS95 in most SRs. We also had the highest average WSS95 of 43.81% and the highest total WSS95 of 657.18%. We demonstrated using ontology-based semantics to facilitate the identification of relevant articles for SRs. Effective concepts and concept relations derived from UMLS ontologies can be utilized to establish article semantic relationships. Our approach provided promising performance and can easily be applied to other SR topics.
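
    The WSS95 measure used here has a standard definition (introduced by Cohen and colleagues in 2006): the fraction of articles the reviewer is spared from screening at 95% recall, minus the 5% that random sampling would save. A minimal computation over a ranked list of relevance labels:

```python
import numpy as np

def wss_at_recall(ranked_labels, recall=0.95):
    """Work saved over sampling at a given recall level (WSS@95 by default).

    ranked_labels: 0/1 relevance labels in the order the screening
    method ranks the articles (most promising first).
    """
    y = np.asarray(ranked_labels)
    n, n_pos = y.size, int(y.sum())
    needed = int(np.ceil(recall * n_pos))            # relevant articles to find
    # Smallest prefix of the ranking that contains `needed` relevant articles.
    cutoff = int(np.searchsorted(np.cumsum(y), needed)) + 1
    saved = n - cutoff                               # articles never screened
    return saved / n - (1.0 - recall)

# Toy ranking with the relevant articles concentrated near the top.
print(wss_at_recall([1, 1, 0, 1, 0, 0, 1, 0, 0, 0]))   # 0.25
```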

  13. Measuring time series regularity using nonlinear similarity-based sample entropy

    International Nuclear Information System (INIS)

    Xie Hongbo; He Weixing; Liu Hui

    2008-01-01

    Sample Entropy (SampEn), a measure quantifying regularity and complexity, is believed to be effective for analyzing diverse settings that include both deterministic chaotic and stochastic processes, and is particularly useful in the analysis of physiological signals involving relatively small amounts of data. However, the similarity definition of vectors is based on the Heaviside function, whose hard, discontinuous boundary may cause problems for the validity and accuracy of SampEn. The Sigmoid function is a smoothed and continuous version of the Heaviside function. To overcome the problems SampEn encounters, a modified SampEn (mSampEn) based on the nonlinear Sigmoid function was proposed. The performance of mSampEn was tested on independent identically distributed (i.i.d.) uniform random numbers, the MIX stochastic model, the Rössler map, and the Hénon map. The results showed that mSampEn was superior to SampEn in several respects, including giving a valid entropy estimate for small parameter values, better relative consistency, robustness to noise, and less dependence on record length when characterizing time series generated by either deterministic or stochastic systems with different regularities.
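
    A compact sketch of the sigmoid-kernel idea follows; the slope value and the use of all N - m + 1 templates are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np

def msampen(x, m=2, r=0.2, slope=0.05):
    """Sample entropy with a sigmoid similarity kernel instead of Heaviside."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()                                 # tolerance relative to signal scale

    def phi(m):
        templ = np.lib.stride_tricks.sliding_window_view(x, m)
        n = templ.shape[0]
        # Chebyshev distance between every pair of templates.
        d = np.abs(templ[:, None, :] - templ[None, :, :]).max(axis=2)
        # Smooth similarity: near-boundary distances contribute gradually.
        sim = 1.0 / (1.0 + np.exp(np.clip((d - tol) / slope, -50.0, 50.0)))
        np.fill_diagonal(sim, 0.0)                    # exclude self-matches
        return sim.sum() / (n * (n - 1))

    return -np.log(phi(m + 1) / phi(m))

rng = np.random.default_rng(0)
print(msampen(rng.standard_normal(300)))              # white noise: high entropy
```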

  14. Predicting microRNA-disease associations using label propagation based on linear neighborhood similarity.

    Science.gov (United States)

    Li, Guanghui; Luo, Jiawei; Xiao, Qiu; Liang, Cheng; Ding, Pingjian

    2018-05-12

    Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127. This was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases. Copyright © 2018. Published by Elsevier Inc.
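
    The two steps behind linear-neighborhood label propagation can be sketched generically: learn weights that reconstruct each entity from its nearest neighbors in some feature space, then diffuse known association labels through those weights. The LPLNS preprocessing of interaction likelihood profiles is not shown, and the clipping of negative weights is a stabilizing assumption.

```python
import numpy as np

def linear_neighborhood_similarity(X, k=5, reg=1e-3):
    """Weights reconstructing each row of X from its k nearest rows (sketch)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]          # k nearest, excluding the point itself
        Z = X[nbrs] - X[i]
        G = Z @ Z.T + reg * np.eye(k)          # regularized local Gram matrix
        w = np.linalg.solve(G, np.ones(k))     # least-squares reconstruction weights
        w = np.clip(w, 0.0, None)              # nonnegativity: stabilizing assumption
        W[i, nbrs] = w / (w.sum() + 1e-12)
    return W

def label_propagation(W, Y, alpha=0.9, n_iter=200):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y; rows of Y hold known
    association labels (all zeros for new miRNAs or diseases)."""
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * (W @ F) + (1 - alpha) * Y
    return F
```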

  15. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for the construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for the construction of phylogenetic trees has not been established. We have developed a method for the construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi and Firmicutes, as well as nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of the phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  16. Alephweb: a search engine based on the federated structure ...

    African Journals Online (AJOL)

    Revue d'Information Scientifique et Technique, Vol 7, No 1 (1997).

  17. A modified harmony search based method for optimal rural radial ...

    African Journals Online (AJOL)

    International Journal of Engineering, Science and Technology, Vol 2, No 3 (2010).

  18. The Impact of Similarity-Based Interference in Processing Wh-Questions in Aphasia

    Directory of Open Access Journals (Sweden)

    Shannon Mackenzie

    2014-04-01

    than subject-extracted questions because the former are in non-canonical word order. Finally, the Intervener hypothesis suggests that only object-extracted Which-questions should be problematic, particularly for participants with language disorders (e.g., Friedmann & Novogrodsky, 2011). An intervener is an NP that has properties similar to other NPs in the sentence and thus gives rise to similarity-based interference. Only object-extracted Which-questions contain an intervener (e.g., the fireman in (2b)), which interferes with the chain consisting of the displaced Which-phrase, Which mailman, and its direct object gap occurring after the verb. Briefly, only the Intervener Hypothesis was supported by our rich data set, and this was observed unambiguously for our participants with Broca's aphasia. As an example (see Figure 1), we observed a significantly greater proportion of gazes to the incorrect referent (i.e., the intervening NP) in object-extracted Which-questions relative to Who-questions, beginning in the Verb-gap time window and extending throughout the remainder of the sentence and into the response period following it. These patterns indicate lasting similarity-based interference effects during real-time sentence processing. The implications of our findings for extant accounts of sentence processing disruptions will be discussed, including accounts that attribute sentence comprehension impairments to memory-based interference.

  19. Search Method Based on Figurative Indexation of Folksonomic Features of Graphic Files

    Directory of Open Access Journals (Sweden)

    Oleg V. Bisikalo

    2013-11-01

    In this paper a search method based on the figurative indexation of folksonomic characteristics of graphic files is described. The method takes extralinguistic information into account and is based on a model of human figurative thinking. The paper describes the creation of a method for searching image files based on their formal clues, including folksonomic ones.

  20. Application of hybrid artificial fish swarm algorithm based on similar fragments in VRP

    Science.gov (United States)

    Che, Jinnuo; Zhou, Kang; Zhang, Xueyu; Tong, Xin; Hou, Lingyun; Jia, Shiyu; Zhen, Yiting

    2018-03-01

    Focusing on the decrease in convergence speed and calculation precision near the end of the optimization process in the Artificial Fish Swarm Algorithm (AFSA), as well as the instability of its results, a hybrid AFSA based on similar fragments is proposed. Traditional AFSA enjoys obvious advantages in solving complex optimization problems like the Vehicle Routing Problem (VRP), but it has a few limitations, such as low convergence speed, low precision and instability of results. In this paper, two improvements are introduced. On the one hand, the definition of the distance for artificial fish is changed and the vision field of the artificial fish is increased, which improves speed and precision when solving VRP. On the other hand, the artificial bee colony algorithm (ABC) is mixed into AFSA: the population of artificial fish is initialized by the ABC, which alleviates the instability of the results to some extent. The experimental results demonstrate that the optimal solutions of the hybrid AFSA approach the best known solutions of the standard benchmark database more closely than those of the other two algorithms. In conclusion, the hybrid algorithm can effectively address the instability of results and the decrease of convergence speed and calculation precision at the end of the process.

  1. Detecting and classifying method based on similarity matching of Android malware behavior with profile.

    Science.gov (United States)

    Jang, Jae-Wook; Yun, Jaesung; Mohaisen, Aziz; Woo, Jiyoung; Kim, Huy Kang

    2016-01-01

    Mass-market mobile security threats have increased recently due to the growth of mobile technologies and the popularity of mobile devices. Accordingly, techniques have been introduced for identifying, classifying, and defending against mobile threats utilizing static, dynamic, on-device, and off-device techniques. Static techniques are easy to evade, while dynamic techniques are expensive. On-device techniques are prone to evasion, while off-device techniques need to be always online. To address some of these shortcomings, we introduce Andro-profiler, a hybrid behavior-based analysis and classification system for mobile malware. Andro-profiler's main goals are efficiency, scalability, and accuracy. To that end, Andro-profiler classifies malware by exploiting behavior profiles extracted from the integrated system logs, including system calls. Andro-profiler executes a malicious application on an emulator in order to generate the integrated system logs, and creates human-readable behavior profiles by analyzing them. By comparing the behavior profile of a malicious application with the representative behavior profile of each malware family using a weighted similarity matching technique, Andro-profiler detects and classifies it into malware families. The experimental results demonstrate that Andro-profiler is scalable, performs well in detecting and classifying malware with accuracy greater than 98%, outperforms the existing state-of-the-art work, and is capable of identifying 0-day mobile malware samples.
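
    As a rough illustration of weighted similarity matching between behavior profiles, the sketch below computes a weighted multiset (Jaccard-style) similarity; the feature names, weights, and the similarity form itself are illustrative assumptions, not Andro-profiler's actual profile format.

```python
from collections import Counter

def profile_similarity(profile_a, profile_b, weights=None):
    """Weighted multiset similarity between two behavior profiles (sketch)."""
    a, b = Counter(profile_a), Counter(profile_b)
    weights = weights or {}
    inter = sum(min(a[f], b[f]) * weights.get(f, 1.0) for f in a.keys() & b.keys())
    union = sum(max(a[f], b[f]) * weights.get(f, 1.0) for f in a.keys() | b.keys())
    return inter / union if union else 0.0

# Hypothetical behavior features; SMS abuse is weighted as more suspicious.
sample = ["send_sms", "send_sms", "read_contacts", "net_conn"]
family = ["send_sms", "read_contacts", "net_conn", "net_conn"]
print(profile_similarity(sample, family, {"send_sms": 3.0}))
```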

  2. A low-cost approach to electronic excitation energies based on the driven similarity renormalization group

    Science.gov (United States)

    Li, Chenyang; Verma, Prakash; Hannon, Kevin P.; Evangelista, Francesco A.

    2017-08-01

    We propose an economical state-specific approach to evaluate electronic excitation energies based on the driven similarity renormalization group truncated to second order (DSRG-PT2). Starting from a closed-shell Hartree-Fock wave function, a model space is constructed that includes all single or single and double excitations within a given set of active orbitals. The resulting VCIS-DSRG-PT2 and VCISD-DSRG-PT2 methods are introduced and benchmarked on a set of 28 organic molecules [M. Schreiber et al., J. Chem. Phys. 128, 134110 (2008)]. Taking CC3 results as reference values, mean absolute deviations of 0.32 and 0.22 eV are observed for VCIS-DSRG-PT2 and VCISD-DSRG-PT2 excitation energies, respectively. Overall, VCIS-DSRG-PT2 yields results with accuracy comparable to those from time-dependent density functional theory using the B3LYP functional, while VCISD-DSRG-PT2 gives excitation energies comparable to those from equation-of-motion coupled cluster with singles and doubles.

  3. Determination of optimal samples for robot calibration based on error similarity

    Directory of Open Access Journals (Sweden)

    Tian Wei

    2015-06-01

    Industrial robots are used for automatic drilling and riveting. The absolute position accuracy of an industrial robot is one of the key performance indexes in aircraft assembly, and can be improved through error compensation to meet aircraft assembly requirements. The achievable accuracy and the difficulty of implementing accuracy compensation are closely related to the choice of sampling points. Therefore, based on the error-similarity error compensation method, a method for choosing sampling points on a uniform grid is proposed. A simulation is conducted to analyze the influence of the sample point locations on error compensation. In addition, the grid steps of the sampling points are optimized using a statistical analysis method. The method is used to generate grids and optimize the grid steps of a Kuka KR-210 robot. The experimental results show that the method for planning sampling data can be used to effectively optimize the sampling grid. After error compensation, the position accuracy of the robot meets the position accuracy requirements.

  4. Sample similarity analysis of angles of repose based on experimental results for DEM calibration

    Science.gov (United States)

    Tan, Yuan; Günthner, Willibald A.; Kessler, Stephan; Zhang, Lu

    2017-06-01

    As a fundamental material property, the particle-particle friction coefficient is usually calculated from the angle of repose, which can be obtained experimentally. In the present study, the bottomless cylinder test was carried out to investigate this friction coefficient for a biomass material, i.e. willow chips. Because of the irregular particle shape and varying particle size distribution, calculation of the angle becomes less straightforward and decisive. In previous studies only one section of the uneven slope is chosen in most cases, although standard methods for defining a representative section are barely found. Hence, we present an efficient and reliable method based on 3D scanning, which is used to digitize the surface of a heap and generate its point cloud. Two tangential lines of any selected section are then calculated through linear least-squares regression (LLSR), such that the left and right angles of repose of a pile can be derived. As the next step, a number of sections are stochastically selected and the calculations repeated correspondingly to obtain a sample of angles, which is plotted in Cartesian coordinates as a scatter diagram. Subsequently, different samples are acquired through various selections of sections. By analyzing the similarities and differences of these samples, the reliability of the proposed method is verified. The results provide a realistic criterion for reducing the deviation between experiment and simulation caused by the random selection of a single angle, which will be compared with simulation results in future work.
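
    The LLSR step reduces to fitting one line per flank of a heap cross-section and converting its slope to an angle. A minimal sketch, assuming each section is given as (x, z) points through the apex:

```python
import numpy as np

def angles_of_repose(xz_profile):
    """Left/right repose angles of one heap cross-section via least squares."""
    pts = np.asarray(xz_profile, dtype=float)
    apex = pts[np.argmax(pts[:, 1])]                       # highest point of the section
    angles = []
    for side in (pts[pts[:, 0] < apex[0]], pts[pts[:, 0] > apex[0]]):
        slope, _ = np.polyfit(side[:, 0], side[:, 1], 1)   # LLSR tangential line
        angles.append(np.degrees(np.arctan(abs(slope))))
    return tuple(angles)                                   # (left, right)

# Toy symmetric heap with ~35 degree flanks.
x = np.linspace(-1, 1, 101)
z = 0.7 * (1 - np.abs(x))
print(angles_of_repose(np.column_stack([x, z])))
```

    Repeating this over many randomly chosen sections yields the sample of angles whose scatter the authors analyze.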

  5. Signal-noise separation based on self-similarity testing in 1D-timeseries data

    Science.gov (United States)

    Bourdin, Philippe A.

    2015-08-01

    The continuous improvement of the resolution delivered by modern instrumentation is a cost-intensive part of any new space- or ground-based observatory. Typically, scientists later reduce the resolution of the obtained raw data, for example in the spatial, spectral, or temporal domain, in order to suppress the effects of noise in the measurements. In practice, only simple methods are used that merely smear out the noise, instead of trying to remove it, so that the noise can no longer be seen. In high-precision 1D time-series data, this usually results in an unwanted loss of quality and corruption of power spectra in selected frequency ranges. Novel methods exist that are based on non-local averaging, which would conserve much of the initial resolution, but these methods have so far focused on 2D or 3D data. We present here a method specialized for 1D time series, e.g. as obtained by magnetic field measurements from the recently launched MMS satellites. To identify the noise, we use self-similarity testing and non-local averaging in order to separate different types of noise and signal, like the instrument noise, non-correlated fluctuations in the signal from heliospheric sources, and correlated fluctuations such as harmonic waves or shock fronts. In power spectra of test data, we are able to restore significant parts of a previously known signal from a noisy measurement. This method also works at high frequencies, where the background noise may contribute more spectral power than the signal itself. We offer an easy-to-use software tool set, which enables scientists to use this novel technique on their own noisy data. This allows the maximum possible capacity of the instrumental hardware to be used and helps to enhance the quality of the obtained scientific results.
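
    A minimal 1-D non-local averaging filter conveys the idea: each sample is replaced by a weighted mean over samples whose surrounding patches look similar, so self-similar structure (waves, shock fronts) survives while uncorrelated noise averages out. This is a generic sketch, not the authors' tool set; patch size, search window, and filtering strength are assumptions.

```python
import numpy as np

def nonlocal_denoise_1d(x, patch=5, search=50, h=0.5):
    """Non-local averaging for a 1-D time series (sketch)."""
    x = np.asarray(x, dtype=float)
    n, r = x.size, patch // 2
    pad = np.pad(x, r, mode="reflect")
    patches = np.lib.stride_tricks.sliding_window_view(pad, patch)  # one per sample
    out = np.empty(n)
    for i in range(n):
        j0, j1 = max(0, i - search), min(n, i + search + 1)
        d2 = ((patches[j0:j1] - patches[i]) ** 2).mean(axis=1)      # patch distances
        w = np.exp(-d2 / h ** 2)                                    # self gets weight 1
        out[i] = (w * x[j0:j1]).sum() / w.sum()
    return out

t = np.linspace(0, 6 * np.pi, 1000)
noisy = np.sin(t) + 0.3 * np.random.default_rng(0).standard_normal(t.size)
clean = nonlocal_denoise_1d(noisy)
```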

  6. Development and evaluation of a biomedical search engine using a predicate-based vector space model.

    Science.gov (United States)

    Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

    2013-10-01

    Although the biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query, using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex data structures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf weighting and a boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher for the predicate-based approach, indicating that predicates are better suited than keywords for searching and laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.
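
    A toy predicate-based vector space can be built by treating each (subject, relation, object) triple as the indexing unit and weighting it with tf-idf; the paper's adjusted tf-idf and boost function are not reproduced, and the triples below are invented for illustration.

```python
import math
from collections import Counter

def predicate_tfidf(docs):
    """tf-idf vectors over predicate triples instead of keywords (sketch).

    docs: list of documents, each a list of (subject, relation, object)
    triples extracted beforehand by some relation-extraction step.
    """
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))   # document frequency per triple
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (1 + math.log(c)) * math.log(n / df[t])
                     for t, c in tf.items() if df[t] < n})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse tf-idf dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [
    [("p53", "INHIBITS", "tumor growth"), ("p53", "ACTIVATES", "apoptosis")],
    [("p53", "INHIBITS", "tumor growth"), ("BRCA1", "ASSOCIATED_WITH", "breast cancer")],
    [("aspirin", "TREATS", "inflammation")],
]
vecs = predicate_tfidf(docs)
print(cosine(vecs[0], vecs[1]))   # shared triple yields nonzero similarity
```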

  7. A behavioral similarity measure between labeled Petri nets based on principal transition sequences

    NARCIS (Netherlands)

    Wang, J.; He, T.; Wen, L.; Wu, N.; Hofstede, ter A.H.M.; Su, J.; Meersman, R.; Dillon, T.S.; Herrero, P.

    2010-01-01

    Being able to determine the degree of similarity between process models is important for management, reuse, and analysis of business process models. In this paper we propose a novel method to determine the degree of similarity between process models, which exploits their semantics. Our approach is

  8. Semantic similarity-based alignment between clinical archetypes and SNOMED CT: an application to observations.

    Science.gov (United States)

    Meizoso García, María; Iglesias Allones, José Luis; Martínez Hernández, Diego; Taboada Iglesias, María Jesús

    2012-08-01

    One of the main challenges of eHealth is the semantic interoperability of health systems, which will only be possible if the capture, representation and access of patient data are standardized. Clinical data models, such as OpenEHR archetypes, define data structures that are agreed by experts to ensure the accuracy of health information. In addition, they provide an option to normalize clinical data by binding the terms used in the model definition to standard medical vocabularies. Nevertheless, the effort needed to establish associations between archetype terms and standard terminology concepts is considerable. Therefore, the purpose of this study is to provide an automated approach to bind OpenEHR archetype terms to the external terminology SNOMED CT, with the capability to do so at a semantic level. This research uses lexical techniques and external terminological tools in combination with context-based techniques, which use information about structural and semantic proximity to identify similarities between terms and thus find alignments between them. The proposed approach exploits both the structural context of archetypes and the terminology context, in which concepts are logically defined through their relationships (hierarchical and definitional) to other concepts. A set of 25 OBSERVATION archetypes with 477 bound terms was used to test the method. Of these, 342 terms (74.6%) were linked, with 96.1% precision, 71.7% recall and 1.23 SNOMED CT concepts on average per mapping. It was found that about one third of the archetype clinical information is grouped logically. Context-based techniques take advantage of this to increase recall and to validate 30.4% of the bindings produced by lexical techniques. This research shows that it is possible to automatically map archetype terms to a standard terminology with high precision and recall, with the help of appropriate contextual and semantic information from both models.

  9. Algorithm for axial fuel optimization based on progressive steps of tabu search

    International Nuclear Information System (INIS)

    Martin del Campo, C.; Francois, J.L.

    2003-01-01

    The development of an algorithm for the axial optimization of boiling water reactor (BWR) fuel is presented. The algorithm is based on a serial optimization process in which the best solution at each stage is the starting point of the following stage. The objective function of each stage is adapted to orient the search toward better values of one or two parameters, leaving the rest as constraints. As the optimization stages advance, the fineness of the evaluation of the investigated designs is increased. The algorithm consists of three stages: in the first, genetic algorithms are used, and in the two following ones, tabu search. The objective function of the first stage seeks to minimize the average enrichment of the assembly and to meet the energy generation specified for the operation cycle, without violating any of the design-basis limits. In the following stages, the objective function seeks to minimize the power peaking factor (PPF) and to maximize the shutdown margin (SDM), taking as constraints the average enrichment obtained for the best design in the first stage together with the other restrictions. The third stage, very similar to the previous one, begins with the design of the previous stage but carries out a search of the shutdown margin at different exposure steps with three-dimensional (3D) calculations. An application to the design of the fresh assembly for the fourth fuel reload of the Unit 1 reactor of the Laguna Verde power plant (U1-CLV) is presented. The obtained results show an advance in the handling of optimization methods and in the construction of the objective functions to be used in the different design stages of the fuel assemblies. (Author)

  10. Sample similarity analysis of angles of repose based on experimental results for DEM calibration

    Directory of Open Access Journals (Sweden)

    Tan Yuan

    2017-01-01

    As a fundamental material property, the particle-particle friction coefficient is usually calculated from the angle of repose, which can be obtained experimentally. In the present study, the bottomless cylinder test was carried out to investigate this friction coefficient for a biomass material, i.e. willow chips. Because of the irregular particle shape and varying particle size distribution, calculation of the angle becomes less straightforward and decisive. In previous studies only one section of the uneven slope is chosen in most cases, although standard methods for defining a representative section are barely found. Hence, we present an efficient and reliable method based on 3D scanning, which is used to digitize the surface of a heap and generate its point cloud. Two tangential lines of any selected section are then calculated through linear least-squares regression (LLSR), such that the left and right angles of repose of a pile can be derived. As the next step, a number of sections are stochastically selected and the calculations repeated correspondingly to obtain a sample of angles, which is plotted in Cartesian coordinates as a scatter diagram. Subsequently, different samples are acquired through various selections of sections. By analyzing the similarities and differences of these samples, the reliability of the proposed method is verified. The results provide a realistic criterion for reducing the deviation between experiment and simulation caused by the random selection of a single angle, which will be compared with simulation results in future work.

  11. Physiological Differences and Similarities in Asthma and COPD—Based on Respiratory Function Testing—

    Directory of Open Access Journals (Sweden)

    Michiaki Mishima

    2009-01-01

    Physiological differences and similarities in asthma and COPD are documented based on respiratory function testing. (1) Airflow reversibility is usually important for the diagnosis of asthma; however, patients with long disease histories may show poor reversibility. The reversibility test in COPD is useful for predicting the treatment response. (2) In some stable asthmatic patients without attacks, a concave downslope of the flow-volume curve is present. In severe COPD, the flow in the second half of the curve is smaller than that of resting breathing. (3) Inspiratory capacity (IC) is a good estimator of air trapping and a good predictor of exercise capacity in COPD or persistent asthma. (4) Peak expiratory flow (PEF) can be an important aid in both the diagnosis and monitoring of asthma. PEF is not used in COPD because the main disorder is in the peripheral airway. (5) Measurements of airway responsiveness may help in diagnosing asthma; however, many COPD cases also show it. (6) The impulse oscillation system (IOS) revealed that the predominant airway disorders in asthma and COPD are central and peripheral respiratory resistance, respectively; however, some asthma patients have larger values for the peripheral component. (7) Dlco reflects the extent of pathological emphysema and is useful for the follow-up of COPD, whereas Dlco is not decreased in asthma. (8) A patient with a widened A-aDO2 and alveolar hypoventilation may develop life-threatening hypoxia in a severe asthma attack or severe COPD. When PaCO2 exceeds PaO2, the patient should immediately be treated by mechanical ventilation.

  12. Mapping Rice Cropping Systems in Vietnam Using an NDVI-Based Time-Series Similarity Measurement Based on DTW Distance

    Directory of Open Access Journals (Sweden)

    Xudong Guan

    2016-01-01

    Normalized Difference Vegetation Index (NDVI) data derived from Moderate Resolution Imaging Spectroradiometer (MODIS) time series have been widely used in crop and rice classification. The cloudy and rainy weather characteristic of the monsoon season greatly reduces the likelihood of obtaining high-quality optical remote sensing images. In addition, the diverse crop-planting systems in Vietnam also hinder the comparison of NDVI among different crop stages. To address these problems, we apply a Dynamic Time Warping (DTW) distance-based similarity measure and use the entire yearly NDVI time series to reduce the inaccuracy of classification from a single image. We first de-noise the NDVI time series using S-G filtering based on the TIMESAT software. Then, a standard NDVI time-series base for rice growth is established based on field survey data and Google Earth sample data. The NDVI time series for each pixel is constructed, and its DTW distance to the standard rice-growth NDVI time series is calculated. We then apply thresholds to extract rice-growing areas. A qualitative assessment using statistical data and a spatial assessment using sampled data from the rice-cropping map reveal high mapping accuracy at the national scale, with an R2 against the statistical data as high as 0.809; however, the mapping accuracy decreased at the provincial scale due to the smaller rice-planting area per province. An analysis of the results indicates that the 500-m resolution MODIS data are limited in mapping scattered rice parcels. The results demonstrate that the DTW-based similarity measure of the NDVI time series can be effectively used to map large-area rice cropping systems with diverse cultivation processes.
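
    The DTW distance at the heart of this approach is a short dynamic program; a pixel is labeled rice when the distance between its NDVI series and the standard rice-growth curve falls below a threshold. A minimal sketch (the 46-point composite length and the curves are assumptions):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D time series."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = a.size, b.size
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A pixel whose curve is a phase-shifted copy of the reference still matches.
t = np.linspace(0, 2 * np.pi, 46)            # ~8-day composites over one year
reference = 0.4 + 0.3 * np.sin(t)            # standard rice-growth NDVI curve
pixel = 0.4 + 0.3 * np.sin(t - 0.3)
print(dtw_distance(pixel, reference))        # small despite the time shift
```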

  13. [Baking method of Platycladi Cacumen Carbonisatum based on similarity of UPLC fingerprints].

    Science.gov (United States)

    Shan, Mingqiu; Chen, Chao; Yao, Xiaodong; Ding, Anwei

    2010-09-01

    To establish a baking method for Platycladi Cacumen Carbonisatum and provide a new approach to research on carbonized herbs, samples were prepared in an oven for different times at different temperatures. The fingerprints of the samples were then determined by UPLC and their similarities to the standard fingerprint were compared. The similarities of three samples, baked at 230 degrees C for 20 min, at 230 degrees C for 30 min, and at 240 degrees C for 20 min, were above 0.96. According to the similarities of the fingerprints, and in view of the appearance of the samples, Platycladi Cacumen Carbonisatum should be baked at 230 degrees C for 20 min.

  14. Enhancing Image Retrieval System Using Content Based Search ...

    African Journals Online (AJOL)

    The output shows more efficiency in retrieval because, instead of performing the search on the entire image database, the image category option directs the retrieval engine to the specified category. Also, there is provision to update or modify the different image categories in the image database as the need arises. Keywords: ...

  15. Automated dating of the world’s language families based on lexical similarity

    NARCIS (Netherlands)

    Holman, E.W.; Brown, C.H.; Wichmann, S.; Müller, A.; Velupillai, V.; Hammarström, H.; Sauppe, S.; Jung, H.; Bakker, D.; Brown, P.; Belyaev, O.; Urban, M.; Mailhammer, R.; List, J.-M.; Egorov, D.

    2011-01-01

    This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects:

  16. Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

    Science.gov (United States)

    Alghoson, Abdullah

    2017-01-01

    Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…

  17. Teaching AI Search Algorithms in a Web-Based Educational System

    Science.gov (United States)

    Grivokostopoulou, Foteini; Hatzilygeroudis, Ioannis

    2013-01-01

    In this paper, we present a way of teaching AI search algorithms in a web-based adaptive educational system. Teaching is based on interactive examples and exercises. Interactive examples, which use visualized animations to present AI search algorithms in a step-by-step way with explanations, are used to make learning more attractive. Practice…

  18. Novel citation-based search method for scientific literature: application to meta-analyses

    NARCIS (Netherlands)

    Janssens, A.C.J.W.; Gwinn, M.

    2015-01-01

    Background: Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of

  19. Perturbation based Monte Carlo criticality search in density, enrichment and concentration

    International Nuclear Information System (INIS)

    Li, Zeguang; Wang, Kan; Deng, Jingkang

    2015-01-01

    Highlights: • A new perturbation-based Monte Carlo criticality search method is proposed. • The method can obtain accurate results with only one individual criticality run. • The method is used to solve density, enrichment and concentration search problems. • Results show the feasibility and good performance of this method. • The relationship between the results’ accuracy and the perturbation order is discussed. - Abstract: Criticality search is a very important aspect of reactor physics analysis. Due to the advantages of the Monte Carlo method and the development of computer technologies, Monte Carlo criticality search is becoming more and more necessary and feasible. Existing Monte Carlo criticality search methods need a large number of individual criticality runs and may give unstable results because of the uncertainties of criticality results. In this paper, a new perturbation-based Monte Carlo criticality search method is proposed and discussed. This method needs only one individual criticality calculation with perturbation tallies to estimate the keff response function, using the initial keff and the differential coefficient results, and solves polynomial equations to obtain the criticality search results. The new perturbation-based Monte Carlo criticality search method is implemented in the Monte Carlo code RMC, and criticality searches in density, enrichment and concentration are carried out. Results show that this method is very promising in terms of accuracy and efficiency, and has advantages compared with other criticality search methods.
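
    The workflow amounts to building a low-order polynomial keff(Δx) from one run's perturbation tallies and solving it for criticality. A sketch with assumed Taylor coefficients (the tally estimation itself is not shown):

```python
import numpy as np

def criticality_search(k0, dk, target=1.0):
    """Solve k(dx) = k0 + dk[0]*dx + dk[1]*dx**2 + ... = target for dx.

    k0: keff from the single Monte Carlo run; dk: differential
    coefficients estimated by perturbation tallies (assumed inputs).
    Returns the smallest real root, i.e. the parameter change that
    makes the system critical.
    """
    coeffs = list(reversed(dk)) + [k0 - target]   # highest order first for np.roots
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    if real.size == 0:
        raise ValueError("no real solution within the perturbation model")
    return real[np.argmin(np.abs(real))]

# Example: k0 = 1.02, first-order coefficient -0.05, second-order +0.002.
print(criticality_search(1.02, [-0.05, 0.002]))   # parameter change toward critical
```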

  20. A Study on Control System Design Based on ARM Sea Target Search System

    Directory of Open Access Journals (Sweden)

    Lin Xinwei

    2015-01-01

    The infrared detector is used for sea target search, which can assist humans in searching for suspicious objects at night and under poor visibility conditions, improving search efficiency. This paper applies interrupt and stack techniques to solve the data-loss problems that may be caused by one-to-many multi-byte protocol communication. Meanwhile, the hardware and software design of the system is implemented based on an industrial-grade ARM control chip and the uC/OS-II embedded operating system. The control system in the sea target search system is the information exchange and control center of the whole system, and it solves the problem of controlling the shooting angle of the infrared detector during target search. After testing, the control system operates stably and reliably, and realizes the rotation and control functions of the pan/tilt platform during automatic search, manual search and tracking.

  1. Image quality assessment based on inter-patch and intra-patch similarity.

    Directory of Open Access Journals (Sweden)

    Fei Zhou

    In this paper, we propose a full-reference (FR) image quality assessment (IQA) scheme, which evaluates image fidelity from two aspects: inter-patch similarity and intra-patch similarity. The scheme is performed in a patch-wise fashion so that a quality map can be obtained. On one hand, we investigate the disparity between one image patch and its adjacent ones. This disparity is visually described by an inter-patch feature, where the hybrid effect of luminance masking and contrast masking is taken into account. The inter-patch similarity is further measured by modifying the normalized correlation coefficient (NCC). On the other hand, we also attach importance to the impact of the image content within one patch on the IQA problem. For the intra-patch feature, we consider image curvature as an important complement to image gradient. According to the local image content, the intra-patch similarity is measured by adaptively comparing image curvature and gradient. Besides, a nonlinear integration of the inter-patch and intra-patch similarities is presented to obtain an overall score of image quality. The experiments conducted on six publicly available image databases show that our scheme achieves better performance in comparison with several state-of-the-art schemes.

  2. Authentication of commercial spices based on the similarities between gas chromatographic fingerprints.

    Science.gov (United States)

    Matsushita, Takaya; Zhao, Jing Jing; Igura, Noriyuki; Shimoda, Mitsuya

    2018-06-01

    A simple and solvent-free method was developed for the authentication of commercial spices. The similarities between gas chromatographic fingerprints were measured using similarity indices and multivariate data analyses, as morphological differentiation between dried powders and small spice particles is challenging. The volatile compounds present in 11 spices (i.e. allspice, anise, black pepper, caraway, clove, coriander, cumin, dill, fennel, star anise, and white pepper) were extracted by headspace solid-phase microextraction and analysed by gas chromatography-mass spectrometry. The largest 10 peaks were selected from each total ion chromatogram, and a total of 65 volatiles were tentatively identified. The similarity indices (i.e. the congruence coefficients) were calculated using the data matrices of the relative peak areas of the identified compounds to differentiate between two sets of fingerprints. Where pairs of similar fingerprints produced high congruence coefficients (>0.80), distinctive volatile markers were employed to distinguish between these samples. In addition, hierarchical cluster analysis and principal component analysis were performed to visualise the similarity among fingerprints, and the analysed spices were grouped and characterised according to their distinctive major components. This method is suitable for screening unknown spices, and can therefore be employed to evaluate the quality and authenticity of various spices. © 2017 Society of Chemical Industry.
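
    The congruence coefficient referred to here is Tucker's: a cosine-style index computed on non-centered vectors of relative peak areas. A minimal sketch with invented fingerprints:

```python
import numpy as np

def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two fingerprints.

    x, y: relative peak areas of the same tentatively identified
    volatiles in two chromatograms (zeros where a peak is absent).
    Unlike Pearson's r, the vectors are not mean-centered.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Two anise-like profiles agree closely (> 0.80); pepper does not.
anise_a = np.array([0.55, 0.20, 0.10, 0.05, 0.0])
anise_b = np.array([0.50, 0.25, 0.12, 0.03, 0.0])
pepper  = np.array([0.05, 0.0, 0.10, 0.30, 0.45])
print(congruence_coefficient(anise_a, anise_b))   # ~0.99
print(congruence_coefficient(anise_a, pepper))    # much lower
```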

  3. A new measure for functional similarity of gene products based on Gene Ontology

    Directory of Open Access Journals (Sweden)

    Lengauer Thomas

    2006-06-01

    Background: Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results: We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures, simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for identifying functionally related proteins independent of evolutionary relationships. The method is also applied to estimating the functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion: The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool for identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view of the functional relationships between gene products or protein families.

  4. IBRI-CASONTO: Ontology-based semantic search engine

    OpenAIRE

    Awny Sayed; Amal Al Muqrishi

    2017-01-01

    The vast availability of information, added at a very fast pace, in data repositories creates a challenge in extracting correct and accurate information. This has increased the competition among developers to gain access to technology that seeks to understand the researcher's intent and the contextual meaning of terms. The competition to develop Arabic semantic search systems is still in its infancy, and the reason could be traced back to the complexity of Arabic ...

  5. Extension of frequency-based dissimilarity for retrieving similar plasma waveforms

    International Nuclear Information System (INIS)

    Hochin, Teruhisa; Koyama, Katsumasa; Nakanishi, Hideya; Kojima, Mamoru

    2008-01-01

    Computer-aided assistance in finding waveforms similar to a given waveform has become indispensable for accelerating data analysis in plasma experiments. For slowly-varying waveforms and those having time-sectional oscillation patterns, methods using the Fourier series coefficients of waveforms in calculating the dissimilarity have successfully improved the performance of similar-waveform retrieval. This paper treats severely-varying waveforms and proposes two extensions to the dissimilarity of waveforms. The first extension captures how the importance of the Fourier series coefficients of waveforms varies with frequency. The second extension considers the outlines of waveforms. The correctness of the extended dissimilarity is experimentally evaluated using the metrics of information retrieval evaluation, i.e. precision and recall. The experimental results show that the extended dissimilarity improves the correctness of the similarity retrieval of plasma waveforms.

  6. Efficient similarity-based data clustering by optimal object to cluster reallocation.

    Science.gov (United States)

    Rossignol, Mathias; Lagrange, Mathieu; Cont, Arshia

    2018-01-01

    We present an iterative flat hard clustering algorithm designed to operate on arbitrary similarity matrices, with the only constraint that these matrices be symmetric. Although functionally very close to kernel k-means, our proposal performs a maximization of average intra-class similarity, instead of a squared-distance minimization, in order to remain closer to the semantics of similarities. We show that this approach permits relaxing some conditions on usable affinity matrices, such as positive semi-definiteness, as well as opening possibilities for the computational optimization required for large datasets. Systematic evaluation on a variety of data sets shows that, compared with kernel k-means and spectral clustering methods, the proposed approach gives equivalent or better performance while running much faster. Most notably, it significantly reduces memory access, which makes it a good choice for large data collections. Material enabling the reproducibility of the results is made available online.
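
    A greedy variant of the reallocation idea is easy to state: repeatedly move each object to the cluster with which its average similarity is highest. The sketch below is this naive variant, not the authors' optimal reallocation scheme.

```python
import numpy as np

def similarity_clustering(S, k, n_iter=100, seed=0):
    """Flat hard clustering on a symmetric similarity matrix (sketch)."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    labels = rng.permutation(n) % k          # start with k non-empty clusters
    for _ in range(n_iter):
        changed = False
        for i in range(n):
            # Mean similarity of object i to each cluster (empty clusters excluded).
            scores = np.array([S[i, labels == c].mean()
                               if np.any(labels == c) else -np.inf
                               for c in range(k)])
            best = int(np.argmax(scores))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:                      # converged: no object wants to move
            break
    return labels

# Toy block-structured similarity matrix with two clear groups.
S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
print(similarity_clustering(S, k=2))
```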

  7. Semiconductor-based experiments for neutrinoless double beta decay search

    International Nuclear Information System (INIS)

    Barnabé Heider, Marik

    2012-01-01

    Three experiments are employing semiconductor detectors in the search for neutrinoless double beta (0νββ) decay: COBRA, Majorana and GERDA. COBRA is studying the prospects of using CdZnTe detectors in terms of achievable energy resolution and background suppression. These detectors contain several ββ emitters, and the most promising for 0νββ-decay search is 116Cd. Majorana and GERDA will use isotopically enriched high-purity Ge detectors to search for 0νββ decay of 76Ge. Their aim is to achieve a background ⩽10^-3 counts/(kg⋅y⋅keV) at the Q-value of the decay, a substantial improvement compared to the present state of the art. Majorana will operate Ge detectors in electroformed-Cu vacuum cryostats. A first cryostat housing a natural-Ge detector array is currently under preparation. In contrast, GERDA operates bare Ge detectors submerged in liquid argon. The construction of the GERDA experiment is completed, and a commissioning run started in June 2010. A string of natural-Ge detectors is being operated to test the complete experimental setup and to determine the background before submerging the detectors enriched in 76Ge. An overview and a comparison of these three experiments are presented together with the latest results and developments.

  8. Distance and Density Similarity Based Enhanced k-NN Classifier for Improving Fault Diagnosis Performance of Bearings

    Directory of Open Access Journals (Sweden)

    Sharif Uddin

    2016-01-01

    An enhanced k-nearest neighbor (k-NN) classification algorithm is presented, which uses a density-based similarity measure in addition to a distance-based similarity measure to improve diagnostic performance in bearing fault diagnosis. Due to its use of a distance-based similarity measure alone, the classification accuracy of traditional k-NN deteriorates in the case of overlapping samples and outliers, and is highly susceptible to the neighborhood size, k. This study addresses these limitations by proposing the use of both distance- and density-based measures of similarity between training and test samples. The proposed k-NN classifier is used to enhance the diagnostic performance of a bearing fault diagnosis scheme, which classifies different fault conditions based upon hybrid feature vectors extracted from acoustic emission (AE) signals. Experimental results demonstrate that the proposed scheme, which uses the enhanced k-NN classifier, yields better diagnostic performance and is more robust to variations in the neighborhood size, k.
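
    One plausible reading of the distance-plus-density combination is sketched below: a neighbor's vote is weighted up when it is close to the query and lies in a dense region of the training set. Both weighting terms are assumptions, since the record does not spell out the exact formulas.

```python
import numpy as np

def enhanced_knn_predict(X_train, y_train, x, k=5):
    """k-NN vote weighted by distance- and density-based similarity (sketch)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nbrs = np.argsort(d)[:k]
    # Local density of each neighbor: inverse mean distance to its own k peers,
    # so isolated points (likely outliers) get down-weighted.
    density = np.empty(k)
    for idx, j in enumerate(nbrs):
        dj = np.linalg.norm(X_train - X_train[j], axis=1)
        density[idx] = 1.0 / (np.sort(dj)[1:k + 1].mean() + 1e-12)
    w = density / (d[nbrs] + 1e-12)            # high density and small distance win
    votes = {}
    for j, wj in zip(nbrs, w):
        votes[y_train[j]] = votes.get(y_train[j], 0.0) + wj
    return max(votes, key=votes.get)

# Two synthetic fault classes; the query point sits near the second cluster.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(enhanced_knn_predict(X, y, np.array([3.5, 3.5])))   # expected: 1
```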

  9. Pep-3D-Search: a method for B-cell epitope prediction based on mimotope analysis.

    Science.gov (United States)

    Huang, Yan Xin; Bao, Yong Li; Guo, Shu Yan; Wang, Yan; Zhou, Chun Guang; Li, Yu Xin

    2008-12-16

    The prediction of conformational B-cell epitopes is one of the most important goals in immunoinformatics. The solution to this problem, even if approximate, would help in designing experiments to precisely map the residues of interaction between an antigen and an antibody. Consequently, this area of research has received considerable attention from immunologists, structural biologists and computational biologists. Phage-displayed random peptide libraries are powerful tools used to obtain mimotopes that are selected by binding to a given monoclonal antibody (mAb) in a similar way to the native epitope. These mimotopes can be considered as functional epitope mimics. Mimotope-analysis-based methods can predict not only linear but also conformational epitopes, and this has been the focus of much research in recent years. Though some algorithms based on mimotope analysis have been proposed, the precise localization of the interaction site mimicked by the mimotopes is still a challenging task. In this study, we propose a method for B-cell epitope prediction based on mimotope analysis called Pep-3D-Search. Given the 3D structure of an antigen and a set of mimotopes (or a motif sequence derived from the set of mimotopes), Pep-3D-Search can be used in two modes: mimotope or motif. To evaluate the performance of Pep-3D-Search in predicting epitopes from a set of mimotopes, 10 epitopes defined by crystallography were compared with the predicted results from Pep-3D-Search: the average Matthews correlation coefficient (MCC), sensitivity and precision were 0.1758, 0.3642 and 0.6948. Compared with other available prediction algorithms, Pep-3D-Search showed comparable MCC, specificity and precision, and could provide novel, rational results. To verify the capability of Pep-3D-Search to align a motif sequence to a 3D structure for predicting epitopes, 6 test cases were used. The predictive performance of Pep-3D-Search was demonstrated to be superior to that of other similar programs.

  10. New Methodology for Measuring Semantic Functional Similarity Based on Bidirectional Integration

    Science.gov (United States)

    Jeong, Jong Cheol

    2013-01-01

    1.2 billion users on Facebook, 17 million articles in Wikipedia, and 190 million tweets per day have demanded a significant increase in information processing through the Internet in recent years. Similarly, the life sciences and bioinformatics have also faced the issue of processing big data due to the explosion of publicly available genomic information…

  11. User-assisted Object Detection by Segment Based Similarity Measures in Mobile Laser Scanner Data

    NARCIS (Netherlands)

    Oude Elberink, S.J.; Kemboi, B.J.

    2014-01-01

    This paper describes a method that aims to find all instances of a certain object in Mobile Laser Scanner (MLS) data. In a user-assisted approach, a sample segment of an object is selected, and all similar objects are to be found. By selecting samples from multiple classes, a classification can be

  12. Multi-Attribute Decision Making Based on Several Trigonometric Hamming Similarity Measures under Interval Rough Neutrosophic Environment

    Directory of Open Access Journals (Sweden)

    Surapati Pramanik

    2018-03-01

    In this paper, sine, cosine and cotangent similarity measures of interval rough neutrosophic sets are proposed, and some properties of the proposed measures are discussed. We propose multi-attribute decision-making approaches based on the proposed similarity measures. To demonstrate their applicability, a numerical example is solved.

  13. MetaboSearch: tool for mass-based metabolite identification using multiple databases.

    Directory of Open Access Journals (Sweden)

    Bin Zhou

    Searching metabolites against databases according to their masses is often the first step in metabolite identification for a mass spectrometry-based untargeted metabolomics study. Major metabolite databases include the Human Metabolome Database (HMDB), the Madison Metabolomics Consortium Database (MMCD), Metlin, and LIPID MAPS. Since each of these databases covers only a fraction of the metabolome, integration of the search results from these databases is expected to yield more comprehensive coverage. However, manual combination of multiple search results is generally difficult when identification of hundreds of metabolites is desired. We have implemented a web-based software tool that enables simultaneous mass-based search against the four major databases and integration of the results. In addition, more complete chemical identifier information for the metabolites is retrieved by cross-referencing multiple databases. The search results are merged based on IUPAC International Chemical Identifier (InChI) keys. Besides a simple list of m/z values, the software can accept ion annotation information as input for enhanced metabolite identification. The performance of the software is demonstrated on mass spectrometry data acquired in both positive and negative ionization modes. Compared with search results from individual databases, MetaboSearch provides better coverage of the metabolome and more complete chemical identifier information. The software tool is available at http://omics.georgetown.edu/MetaboSearch.html.
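
    Mass-based lookup itself is a tolerance search over a mass-sorted table, typically with a ppm window; cross-database merging then keys on InChI identifiers. A sketch with placeholder entries and masses:

```python
import bisect

def mass_search(query_mz, db, ppm=10.0):
    """Return database entries whose mass matches query_mz within a ppm window.

    db: list of (mass, name) tuples pre-sorted by mass; the entries
    below are illustrative placeholders, not real database records.
    """
    tol = query_mz * ppm * 1e-6
    masses = [m for m, _ in db]
    lo = bisect.bisect_left(masses, query_mz - tol)
    hi = bisect.bisect_right(masses, query_mz + tol)
    return db[lo:hi]

# Tiny mock database sorted by monoisotopic mass (placeholder values).
db = sorted([(180.06339, "hexose"), (146.10553, "lysine fragment"),
             (180.06480, "isobaric candidate"), (255.23295, "palmitamide")])
print(mass_search(180.0634, db, ppm=10))   # both 180.063x candidates match
```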

  14. Graphics-based intelligent search and abstracting using Data Modeling

    Science.gov (United States)

    Jaenisch, Holger M.; Handley, James W.; Case, Carl T.; Songy, Claude G.

    2002-11-01

    This paper presents an autonomous text- and context-mining algorithm that converts text documents into point clouds for visual search cues. This algorithm is applied to the task of data-mining a scriptural database comprising the Old and New Testaments from the Bible and the Book of Mormon, Doctrine and Covenants, and the Pearl of Great Price. Results are generated which graphically show the scripture that represents the average concept of the database and the mining of the documents down to the verse level.

  15. Contextual Cueing in Multiconjunction Visual Search Is Dependent on Color- and Configuration-Based Intertrial Contingencies

    Science.gov (United States)

    Geyer, Thomas; Shi, Zhuanghua; Muller, Hermann J.

    2010-01-01

    Three experiments examined memory-based guidance of visual search using a modified version of the contextual-cueing paradigm (Jiang & Chun, 2001). The target, if present, was a conjunction of color and orientation, with target (and distractor) features randomly varying across trials (multiconjunction search). Under these conditions, reaction times…

  16. The role of space and time in object-based visual search

    NARCIS (Netherlands)

    Schreij, D.B.B.; Olivers, C.N.L.

    2013-01-01

    Recently we have provided evidence that observers more readily select a target from a visual search display if the motion trajectory of the display object suggests that the observer has dealt with it before. Here we test the prediction that this object-based memory effect on search breaks down if

  17. Effect of Reading Ability and Internet Experience on Keyword-Based Image Search

    Science.gov (United States)

    Lei, Pei-Lan; Lin, Sunny S. J.; Sun, Chuen-Tsai

    2013-01-01

    Image searches are now crucial for obtaining information, constructing knowledge, and building successful educational outcomes. We investigated how reading ability and Internet experience influence keyword-based image search behaviors and performance. We categorized 58 junior-high-school students into four groups of high/low reading ability and…

  18. Supervised learning of tools for content-based search of image databases

    Science.gov (United States)

    Delanoy, Richard L.

    1996-03-01

    A computer environment, called the Toolkit for Image Mining (TIM), is being developed with the goal of enabling users with diverse interests and varied computer skills to create search tools for content-based image retrieval and other pattern matching tasks. Search tools are generated using a simple paradigm of supervised learning that is based on the user pointing at mistakes of classification made by the current search tool. As mistakes are identified, a learning algorithm uses the identified mistakes to build up a model of the user's intentions, construct a new search tool, apply the search tool to a test image, display the match results as feedback to the user, and accept new inputs from the user. Search tools are constructed in the form of functional templates, which are generalized matched filters capable of knowledge-based image processing. The ability of this system to learn the user's intentions from experience contrasts with other existing approaches to content-based image retrieval that base searches on the characteristics of a single input example or on a predefined and semantically-constrained textual query. Currently, TIM is capable of learning spectral and textural patterns, but should be adaptable to the learning of shapes, as well. Possible applications of TIM include not only content-based image retrieval, but also quantitative image analysis, the generation of metadata for annotating images, data prioritization or data reduction in bandwidth-limited situations, and the construction of components for larger, more complex computer vision algorithms.
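
    The mistake-driven training loop described above can be condensed into a few lines. The sketch below is a schematic of the general relevance-feedback paradigm, assuming a generic `fit` routine and a hypothetical `user_flags_mistakes` oracle; it is not TIM's functional-template machinery.

    ```python
    def train_search_tool(seed_examples, test_image, fit, user_flags_mistakes,
                          max_rounds=10):
        """Iteratively refine a search tool from user-identified mistakes.

        fit(labeled) -> model with a .predict(image) method;
        user_flags_mistakes(matches) -> [(feature_vector, correct_label), ...]
        for the locations the user points at as misclassified.
        """
        labeled = list(seed_examples)
        model = fit(labeled)
        for _ in range(max_rounds):
            matches = model.predict(test_image)       # apply current search tool
            mistakes = user_flags_mistakes(matches)   # user points at errors
            if not mistakes:                          # user accepts the result
                break
            labeled.extend(mistakes)                  # fold corrections back in
            model = fit(labeled)                      # rebuild the search tool
        return model
    ```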

  19. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

    Directory of Open Access Journals (Sweden)

    Hutchison Clyde A

    2006-01-01

    Background: Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results: "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion: Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.

  20. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

    Science.gov (United States)

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-19

    Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
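
    The core screening step, flagging clusters that mix annotated genes with anonymous ORFs, reduces to a simple set test. A minimal sketch, assuming the clusters have already been computed and each member carries an `annotated` flag (data layout hypothetical):

    ```python
    def find_mixed_cogs(clusters):
        """Return clusters containing both annotated genes and anonymous ORFs.

        clusters: dict mapping COG id -> list of members, each a dict like
        {"orf": "ECs0001", "annotated": bool}. Mixed clusters are candidate
        sites of missed gene calls or spurious annotations.
        """
        mixed = {}
        for cog_id, members in clusters.items():
            flags = {member["annotated"] for member in members}
            if flags == {True, False}:    # some members annotated, some not
                mixed[cog_id] = members
        return mixed
    ```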

  1. Identification among morphologically similar Argyreia (Convolvulaceae) based on leaf anatomy and phenetic analyses.

    Science.gov (United States)

    Traiperm, Paweena; Chow, Janene; Nopun, Possathorn; Staples, G; Swangpol, Sasivimon C

    2017-12-01

    The genus Argyreia Lour. is one of the species-rich Asian genera in the family Convolvulaceae. Several species complexes were recognized in which taxon delimitation was imprecise, especially when examining herbarium materials without fully developed open flowers. The main goal of this study is to investigate and describe leaf anatomy for some morphologically similar Argyreia using epidermal peeling, leaf and petiole transverse sections, and scanning electron microscopy. Phenetic analyses including cluster analysis and principal component analysis were used to investigate the similarity of these morpho-types. Anatomical differences observed between the morpho-types include epidermal cell walls and the trichome types on the leaf epidermis. Additional differences in the leaf and petiole transverse sections include the epidermal cell shape of the adaxial leaf blade, the leaf margins, and the petiole transverse sectional outline. The phenogram from cluster analysis using the UPGMA method represented four groups with an R value of 0.87. Moreover, the important quantitative and qualitative leaf anatomical traits of the four groups were confirmed by the principal component analysis of the first two components. The results from phenetic analyses confirmed the anatomical differentiation between the morpho-types. Leaf anatomical features regarded as particularly informative for morpho-type differentiation can be used to supplement macro morphological identification.
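
    For readers who wish to reproduce this style of phenetic analysis, the UPGMA step maps directly onto standard tooling. A minimal sketch using SciPy on a placeholder trait matrix (rows are specimens, columns are scored anatomical characters); the cophenetic correlation printed here is analogous to, though not necessarily identical with, the R value reported above.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import cophenet, fcluster, linkage
    from scipy.spatial.distance import pdist

    traits = np.random.rand(20, 8)      # placeholder: 20 specimens, 8 characters
    dists = pdist(traits, metric="euclidean")
    tree = linkage(dists, method="average")        # "average" linkage is UPGMA

    r, _ = cophenet(tree, dists)                   # cophenetic correlation
    groups = fcluster(tree, t=4, criterion="maxclust")  # cut into four groups
    print(f"cophenetic r = {r:.2f}", groups)
    ```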

  2. Similarity-based distortion of visual short-term memory is due to perceptual averaging.

    Science.gov (United States)

    Dubé, Chad; Zhou, Feng; Kahana, Michael J; Sekuler, Robert

    2014-03-01

    A task-irrelevant stimulus can distort recall from visual short-term memory (VSTM). Specifically, reproduction of a task-relevant memory item is biased in the direction of the irrelevant memory item (Huang & Sekuler, 2010a). The present study addresses the hypothesis that such effects reflect the influence of neural averaging under conditions of uncertainty about the contents of VSTM (Alvarez, 2011; Ball & Sekuler, 1980). We manipulated subjects' attention to relevant and irrelevant study items whose similarity relationships were held constant, while varying how similar the study items were to a subsequent recognition probe. On each trial, subjects were shown one or two Gabor patches, followed by the probe; their task was to indicate whether the probe matched one of the study items. A brief cue told subjects which Gabor, first or second, would serve as that trial's target item. Critically, this cue appeared either before, between, or after the study items. A distributional analysis of the resulting mnemometric functions showed an inflation in probability density in the region spanning the spatial frequency of the average of the two memory items. This effect, due to an elevation in false alarms to probes matching the perceptual average, was diminished when cues were presented before both study items. These results suggest that (a) perceptual averages are computed obligatorily and (b) perceptual averages are relied upon to a greater extent when item representations are weakened. Implications of these results for theories of VSTM are discussed.

  3. Pulse shape analysis based on similarity and neural network with digital-analog fusion method

    International Nuclear Information System (INIS)

    Mardiyanto, M.P.; Uritani, A.; Sakai, H.; Kawarabayashi, J.; Iguchi, T.

    2000-01-01

    Measurements of ²²Na γ-rays demonstrated that the correction process performed well when the similarity values were fused with the pulse heights measured by the analog system: at least four improvements in the energy spectrum characteristics were recognized, namely an increased peak-to-valley ratio, a larger photopeak area, sharper photopeaks without discarding any events, and the appearance of the 1,275 keV γ-ray photopeak. The use of a slow digitizer was the main limitation of this method; however, it can easily be overcome with a faster digitizer. The fusion method was also applied to the separation of mixed beta-gamma spectra. Mixed beta-gamma spectra of a ¹³⁷Cs-⁹⁰Sr mixed source could be separated well. The ¹³⁷Cs energy spectrum obtained from an independent measurement was compared with the separated spectrum: the FWHM values agreed quite well, although the two spectra differed slightly in peak-to-valley ratio. This separation method is simple and useful, so it can be applied to many other similar applications.

  4. A path-based measurement for human miRNA functional similarities using miRNA-disease associations

    Science.gov (United States)

    Ding, Pingjian; Luo, Jiawei; Xiao, Qiu; Chen, Xiangtao

    2016-09-01

    Beyond sequence and expression similarity, miRNA functional similarity is important for biological research and for many applications such as miRNA clustering, miRNA function prediction, miRNA synergism identification, and disease miRNA prioritization. However, existing methods typically rely on predicted miRNA targets, which suffer from high false-positive and false-negative rates, and it is difficult to achieve high reliability of miRNA functional similarity from miRNA-disease associations alone. There is therefore an increasing need to improve the measurement of miRNA functional similarity. In this study, we develop a novel path-based method for calculating miRNA functional similarity from miRNA-disease associations, called MFSP. Compared with other methods, our method obtains higher average functional similarity for intra-family and intra-cluster groups, and lower average functional similarity for inter-family and inter-cluster miRNA pairs. In addition, smaller p-values are achieved when applying the Wilcoxon rank-sum test and the Kruskal-Wallis test to the different miRNA groups. The relationship between miRNA functional similarity and other information sources is exhibited. Furthermore, the miRNA functional network constructed with MFSP is a scale-free, small-world network. Moreover, the higher AUC for miRNA-disease prediction indicates the ability of MFSP to uncover miRNA functional similarity.
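
    As a stand-in for MFSP's path-based measure, whose full definition is not given in the abstract, the sketch below scores two miRNAs by the diseases they share in a bipartite miRNA-disease association graph. Treat it as an illustrative baseline rather than the published algorithm, and the toy associations as hypothetical.

    ```python
    def mirna_similarity(m1, m2, assoc):
        """Jaccard similarity over disease neighbourhoods in a bipartite
        miRNA-disease graph: a crude proxy for path-based functional similarity.

        assoc: dict mapping miRNA -> set of associated diseases.
        """
        d1, d2 = assoc.get(m1, set()), assoc.get(m2, set())
        if not d1 or not d2:
            return 0.0
        return len(d1 & d2) / len(d1 | d2)

    # Toy associations (illustrative, not curated data):
    assoc = {
        "hsa-mir-21":  {"breast cancer", "lung cancer", "glioma"},
        "hsa-mir-155": {"breast cancer", "lymphoma", "lung cancer"},
    }
    print(mirna_similarity("hsa-mir-21", "hsa-mir-155", assoc))  # 0.5
    ```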

  5. Keyword-based Ciphertext Search Algorithm under Cloud Storage

    Directory of Open Access Journals (Sweden)

    Ren Xunyi

    2016-01-01

    With the development of network storage services, cloud storage offers high scalability, low cost, unrestricted access, and ease of management. These advantages lead more and more small and medium enterprises to outsource large quantities of data to a third party, freeing them from the costs of construction and maintenance, so the model has broad market prospects. However, many cloud storage service providers cannot guarantee data security, which results in leakage of user data and forces many users back to traditional storage methods. This has become one of the important factors hindering the development of cloud storage. In this article, a keyword index is established by extracting keywords from the ciphertext data; the encrypted data and the encrypted index are then uploaded to the cloud server together. Users retrieve the relevant ciphertext by searching the encrypted index, which addresses the data leakage problem.
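
    A common way to realize such an encrypted keyword index is to store keyed hashes of keywords rather than the keywords themselves, so the server can match search tokens without learning the plaintext terms. A minimal sketch with Python's standard library follows; the scheme is simplified for illustration, and a production design would also need to hide access patterns and protect the index structure.

    ```python
    import hashlib
    import hmac
    import os

    def keyword_token(key: bytes, keyword: str) -> bytes:
        """Deterministic search token: HMAC of the keyword under a secret key."""
        return hmac.new(key, keyword.lower().encode(), hashlib.sha256).digest()

    def build_index(key: bytes, docs: dict) -> dict:
        """Map keyword tokens -> document ids; uploaded with the ciphertexts."""
        index = {}
        for doc_id, keywords in docs.items():
            for kw in keywords:
                index.setdefault(keyword_token(key, kw), set()).add(doc_id)
        return index

    key = os.urandom(32)                  # held by the data owner only
    index = build_index(key, {"doc1": ["cloud", "storage"], "doc2": ["cloud"]})
    # The server matches tokens without ever seeing the plaintext keyword:
    print(index.get(keyword_token(key, "cloud")))   # {'doc1', 'doc2'}
    ```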

  6. iPixel: a visual content-based and semantic search engine for retrieving digitized mammograms by using collective intelligence.

    Science.gov (United States)

    Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A

    2012-09-01

    Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.
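
    The region-level comparison described above can be pictured as a distance between small feature vectors. A minimal sketch, with hypothetical field names, comparing two segmented images by region count, extreme region sizes, and mean region intensity; this illustrates the idea, not iPixel's implementation.

    ```python
    import math

    def region_features(regions):
        """Summarize a segmented image; regions: non-empty list of dicts
        with 'size' (pixels) and 'mean_intensity' fields (names hypothetical)."""
        sizes = [r["size"] for r in regions]
        mean_intensity = sum(r["mean_intensity"] for r in regions) / len(regions)
        return (len(regions), max(sizes), min(sizes), mean_intensity)

    def image_distance(regions_a, regions_b):
        """Euclidean distance between region-feature vectors; in practice the
        features would be normalized so no single one dominates."""
        return math.dist(region_features(regions_a), region_features(regions_b))
    ```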

  7. Similarity-based grouping to support teachers on collaborative activities in exploratory learning environments

    OpenAIRE

    Gutierrez-Santos, Sergio; Mavrikis, M.; Geraniou, E.; Poulovassilis, Alexandra

    2016-01-01

    This paper describes a computer-based tool that helps teachers group their students for collaborative activities in the classroom, the challenge being to organise groups of students based on their recent work so that their collaboration results in meaningful interactions. Students first work on an exploratory task individually, and then the computer suggests possible groupings of students to the teacher. The complexity of the tasks is such that teachers would require too long a time to create...

  8. Incremental Learning of Context Free Grammars by Parsing-Based Rule Generation and Rule Set Search

    Science.gov (United States)

    Nakamura, Katsuhiko; Hoshina, Akemi

    This paper discusses recent improvements and extensions to the Synapse system for inductive inference of context free grammars (CFGs) from sample strings. Synapse uses incremental learning, rule generation based on bottom-up parsing, and search over rule sets. The form of production rules in the previous system is extended from Revised Chomsky Normal Form A→βγ to Extended Chomsky Normal Form, which also includes A→B, where each of β and γ is either a terminal or nonterminal symbol. From the result of bottom-up parsing, a rule generation mechanism synthesizes the minimum production rules required for parsing positive samples. Instead of the inductive CYK algorithm of the previous version of Synapse, the improved version uses a novel rule generation method, called "bridging," which bridges the missing part of the derivation tree for a positive string. The improved version also employs a novel search strategy, called serial search, in addition to minimum rule set search. Synthesis of grammars by the serial search is faster than by the minimum set search in most cases; on the other hand, the generated CFGs are generally larger than those from the minimum set search, and the system can find no appropriate grammar for some CFLs by the serial search. The paper shows experimental results of incremental learning of several fundamental CFGs and compares the methods of rule generation and search strategies.
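
    The bottom-up parsing on which Synapse's rule generation rests is essentially the CYK algorithm. Below is a minimal CYK recognizer for a grammar in Chomsky Normal Form, offered as background illustration rather than Synapse's code; the extended rule forms described above require only small additional cases.

    ```python
    def cyk_recognize(rules, start, s):
        """CYK recognition for a CNF grammar.

        rules: list of (lhs, rhs) with rhs a 1-tuple (terminal) or a
        2-tuple of nonterminals. Returns True iff s derives from start.
        """
        n = len(s)
        table = [[set() for _ in range(n + 1)] for _ in range(n)]
        for i, ch in enumerate(s):                    # length-1 spans
            for lhs, rhs in rules:
                if rhs == (ch,):
                    table[i][1].add(lhs)
        for span in range(2, n + 1):                  # longer spans, bottom-up
            for i in range(n - span + 1):
                for k in range(1, span):              # split point
                    for lhs, rhs in rules:
                        if (len(rhs) == 2 and rhs[0] in table[i][k]
                                and rhs[1] in table[i + k][span - k]):
                            table[i][span].add(lhs)
        return start in table[0][n]

    # a^n b^n grammar: S -> A X | A B, X -> S B, A -> 'a', B -> 'b'
    g = [("S", ("A", "X")), ("S", ("A", "B")), ("X", ("S", "B")),
         ("A", ("a",)), ("B", ("b",))]
    print(cyk_recognize(g, "S", "aabb"), cyk_recognize(g, "S", "abab"))  # True False
    ```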

  9. SU-F-P-51: Similarity Analysis of the Linear Accelerator Machines Based On Clinical Simulation

    International Nuclear Information System (INIS)

    Li, K

    2016-01-01

    Purpose: To evaluate the clinical rationale for treating the Truebeam and Trilogy Linac machines from Varian Medical Systems as exchangeable treatment modalities in the same radiation oncology department. Methods: Intensity Modulated Radiotherapy (IMRT) plans for different diseases were selected for this study. The disease sites included brain, head and neck, breast, lung, and prostate. The parameters selected were the energy delivered, expressed as Monitor Units (MU); dose coverage of the target, reflected by the prescription isodose volume (PIV); dose spillage, described by the volume of the 50% isodose line of the prescription; and dose homogeneity, represented by the maximum dose (MaxD) and minimum dose (MinD) of the target volume (TV) and critical structure (CS). Each percentage difference between the values of these parameters formed an element of a matrix, called the Similarity Comparison Matrix (SCM). The elements of the SCM were then simplified by a dimensional conversion algorithm, which was used to express the clinical similarity between two machines as a single value. Results: For the selected clinical cases, the average percentage differences between Trilogy and Truebeam were 0.28% for MU with a standard deviation (SD) of 0.66%, 0.23% for PIV with SD 0.20%, 0.31% for the volume at 50% of the prescription dose with SD 0.78%, 0.26% for MaxD of the TV with SD 0.35%, −0.04% for MinD of the TV with SD 0.51%, −0.53% for MaxD of the CS with SD 0.92%, and 3.31% for MinD of the CS with SD 2.89%. The sum, product, geometric mean, and harmonic mean of the matrix elements were 19.0%, 0.00%, 0.19%, and 0.00%. Conclusion: A method to compare two machines at the clinical level was developed and reference values were calculated for decision-making in clinical practice; this strategy could be extended to different clinical applications.

  10. Eddy diffusion coefficients and their upper limits based on application of the similarity theory

    Directory of Open Access Journals (Sweden)

    M. N. Vlasov

    2015-07-01

    The equation for the diffusion velocity in the mesosphere and the lower thermosphere (MLT) includes the terms for molecular and eddy diffusion. These terms are very similar. For the first time, we show that, by using the similarity theory, the same formula can be obtained for the eddy diffusion coefficient as the commonly used formula derived by Weinstock (1981). The latter was obtained by taking, as a basis, the integral function for diffusion derived by Taylor (1921) and the three-dimensional Kolmogorov kinetic energy spectrum. The exact identity of both formulas means that the eddy diffusion and heat transport coefficients used in the equations, both for diffusion and thermal conductivity, must meet a criterion that restricts the outer eddy scale to being much less than the scale height of the atmosphere. This requirement is the same as the requirement that the free path of molecules must be much smaller than the scale height of the atmosphere. A further result of this criterion is that the eddy diffusion coefficients Ked, inferred from measurements of energy dissipation rates, cannot exceed the maximum value of 3.2 × 10⁶ cm² s⁻¹ for the maximum value of the energy dissipation rate of 2 W kg⁻¹ measured in the mesosphere and the lower thermosphere (MLT). This means that eddy diffusion coefficients larger than the maximum value correspond to eddies with outer scales so large that it is impossible to use these coefficients in eddy diffusion and eddy heat transport equations. The application of this criterion to the different experimental data shows that some reported eddy diffusion coefficients do not meet this criterion. For example, the large values of these coefficients (1 × 10⁷ cm² s⁻¹) estimated in the Turbulent Oxygen Mixing Experiment (TOMEX) do not correspond to this criterion. The Ked values inferred at high latitudes by Lübken (1997) meet this criterion for summer and winter polar data, but the Ked values for summer at low latitudes

  11. SU-F-P-51: Similarity Analysis of the Linear Accelerator Machines Based On Clinical Simulation

    Energy Technology Data Exchange (ETDEWEB)

    Li, K [Associates In Medical Physics, Lanham, MD (United States); John R Marsh Cancer Center (United States)

    2016-06-15

    Purpose: To evaluate the clinical rationale for treating the Truebeam and Trilogy Linac machines from Varian Medical Systems as exchangeable treatment modalities in the same radiation oncology department. Methods: Intensity Modulated Radiotherapy (IMRT) plans for different diseases were selected for this study. The disease sites included brain, head and neck, breast, lung, and prostate. The parameters selected were the energy delivered, expressed as Monitor Units (MU); dose coverage of the target, reflected by the prescription isodose volume (PIV); dose spillage, described by the volume of the 50% isodose line of the prescription; and dose homogeneity, represented by the maximum dose (MaxD) and minimum dose (MinD) of the target volume (TV) and critical structure (CS). Each percentage difference between the values of these parameters formed an element of a matrix, called the Similarity Comparison Matrix (SCM). The elements of the SCM were then simplified by a dimensional conversion algorithm, which was used to express the clinical similarity between two machines as a single value. Results: For the selected clinical cases, the average percentage differences between Trilogy and Truebeam were 0.28% for MU with a standard deviation (SD) of 0.66%, 0.23% for PIV with SD 0.20%, 0.31% for the volume at 50% of the prescription dose with SD 0.78%, 0.26% for MaxD of the TV with SD 0.35%, −0.04% for MinD of the TV with SD 0.51%, −0.53% for MaxD of the CS with SD 0.92%, and 3.31% for MinD of the CS with SD 2.89%. The sum, product, geometric mean, and harmonic mean of the matrix elements were 19.0%, 0.00%, 0.19%, and 0.00%. Conclusion: A method to compare two machines at the clinical level was developed and reference values were calculated for decision-making in clinical practice; this strategy could be extended to different clinical applications.
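
    The Similarity Comparison Matrix itself is straightforward to compute once paired plan parameters are available. A minimal sketch on synthetic data (array layout hypothetical):

    ```python
    import numpy as np

    def similarity_comparison_matrix(machine_a, machine_b):
        """Element-wise percentage differences between paired plan parameters.

        machine_a, machine_b: arrays of shape (n_plans, n_parameters) holding
        MU, PIV, 50% isodose volume, MaxD/MinD of TV and CS, and so on.
        """
        return 100.0 * (machine_a - machine_b) / machine_b

    rng = np.random.default_rng(0)
    truebeam = rng.uniform(1.0, 2.0, size=(5, 7))
    trilogy = truebeam * rng.normal(1.0, 0.01, size=(5, 7))   # ~1% differences
    scm = similarity_comparison_matrix(trilogy, truebeam)
    print(scm.mean(axis=0))   # per-parameter mean percentage difference
    print(scm.std(axis=0))    # per-parameter standard deviation
    ```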

  12. Identification of polycystic ovary syndrome potential drug targets based on pathobiological similarity in the protein-protein interaction network

    OpenAIRE

    Huang, Hao; He, Yuehan; Li, Wan; Wei, Wenqing; Li, Yiran; Xie, Ruiqiang; Guo, Shanshan; Wang, Yahui; Jiang, Jing; Chen, Binbin; Lv, Junjie; Zhang, Nana; Chen, Lina; He, Weiming

    2016-01-01

    Polycystic ovary syndrome (PCOS) is one of the most common endocrinological disorders in reproductive aged women. PCOS and Type 2 Diabetes (T2D) are closely linked in multiple levels and possess high pathobiological similarity. Here, we put forward a new computational approach based on the pathobiological similarity to identify PCOS potential drug target modules (PPDT-Modules) and PCOS potential drug targets in the protein-protein interaction network (PPIN). From the systems level and biologi...

  13. Modeling the angular motion dynamics of spacecraft with a magnetic attitude control system based on experimental studies and dynamic similarity

    Science.gov (United States)

    Kulkov, V. M.; Medvedskii, A. L.; Terentyev, V. V.; Firsyuk, S. O.; Shemyakov, A. O.

    2017-12-01

    The problem of spacecraft attitude control using electromagnetic systems interacting with the Earth's magnetic field is considered. A set of dimensionless parameters has been formed to investigate the spacecraft orientation regimes based on dynamically similar models. The results of experimental studies of small spacecraft with a magnetic attitude control system can be extrapolated to the in-orbit spacecraft motion control regimes by using the methods of the dimensional and similarity theory.

  14. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Background: Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes that provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results: To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. Using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection, and several others. Conclusions: On average, with the use of popular learning machines including the Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier, and Random Forest, Recursive Feature Addition outperformed the other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to a random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.
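
    A schematic reading of Recursive Feature Addition, greedily adding the gene that most improves cross-validated accuracy and breaking ties in favour of low correlation with already-selected genes, is sketched below. The choice of classifier and of correlation as the similarity measure is illustrative, not the authors' exact implementation.

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    def recursive_feature_addition(X, y, n_features=10):
        """Greedily add the feature giving the best cross-validated accuracy,
        preferring features weakly correlated with those already chosen."""
        selected = []
        for _ in range(n_features):
            best, best_key = None, None
            for j in range(X.shape[1]):
                if j in selected:
                    continue
                acc = cross_val_score(GaussianNB(), X[:, selected + [j]], y,
                                      cv=3).mean()
                redundancy = max((abs(np.corrcoef(X[:, j], X[:, k])[0, 1])
                                  for k in selected), default=0.0)
                key = (acc, -redundancy)      # high accuracy, low redundancy
                if best_key is None or key > best_key:
                    best, best_key = j, key
            selected.append(best)
        return selected
    ```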

  15. Clustering by Partitioning around Medoids using Distance-Based Similarity Measures on Interval-Scaled Variables

    Directory of Open Access Journals (Sweden)

    D. L. Nkweteyim

    2018-03-01

    This paper reports the results of a study of the partitioning around medoids (PAM) clustering algorithm applied to four datasets, both standardized and not, of varying sizes and numbers of clusters. The angular distance proximity measure, in addition to the two more traditional proximity measures, the Euclidean distance and the Manhattan distance, was used to compute object-object similarity. The data comprise three widely available datasets and one constructed from publicly available climate data. Results replicate some well-known facts about the PAM algorithm, namely that the quality of the generated clusters tends to be much better for small datasets, that the silhouette value is a good, even if not perfect, guide to the optimal number of clusters, and that human intervention is required to interpret the generated clusters. Additionally, the results indicate that the angular distance measure, which traditionally has not been widely used in clustering, outperforms both the Euclidean and Manhattan distance metrics in certain situations.
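
    The three proximity measures, and a naive PAM-style medoid swap, are easy to state compactly. The sketch below is illustrative rather than the study's implementation; in particular, real PAM uses a more careful BUILD initialisation than the one assumed here.

    ```python
    import numpy as np

    def euclidean(a, b):
        return np.linalg.norm(a - b)

    def manhattan(a, b):
        return np.abs(a - b).sum()

    def angular(a, b):
        """Angular distance: the normalized angle between the two vectors."""
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

    def pam(X, k, dist, iters=20):
        """Naive PAM: greedy medoid swaps that lower total within-cluster cost."""
        D = np.array([[dist(a, b) for b in X] for a in X])
        cost = lambda ms: D[:, ms].min(axis=1).sum()
        medoids = list(range(k))                    # crude initialisation
        for _ in range(iters):
            improved = False
            for mi in range(k):
                for cand in range(len(X)):
                    if cand in medoids:
                        continue
                    trial = medoids.copy()
                    trial[mi] = cand
                    if cost(trial) < cost(medoids):
                        medoids, improved = trial, True
            if not improved:
                break
        return medoids, D[:, medoids].argmin(axis=1)   # medoids and labels
    ```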

  16. The Search for Extension: 7 Steps to Help People Find Research-Based Information on the Internet

    Science.gov (United States)

    Hill, Paul; Rader, Heidi B.; Hino, Jeff

    2012-01-01

    For Extension's unbiased, research-based content to be found by people searching the Internet, it needs to be organized in a way conducive to the ranking criteria of a search engine. With proper web design and search engine optimization techniques, Extension's content can be found, recognized, and properly indexed by search engines and…

  17. Pathfinder: multiresolution region-based searching of pathology images using IRM.

    OpenAIRE

    Wang, J. Z.

    2000-01-01

    The fast growth of digitized pathology slides has created great challenges in research on image database retrieval. The prevalent retrieval technique involves human-supplied text annotations to describe slide contents. These pathology images typically have very high resolution, making it difficult to search based on image content. In this paper, we present Pathfinder, an efficient multiresolution region-based searching system for high-resolution pathology image libraries. The system uses wave...

  18. Color Image Segmentation Based on Statistics of Location and Feature Similarity

    Science.gov (United States)

    Mori, Fumihiko; Yamada, Hiromitsu; Mizuno, Makoto; Sugano, Naotoshi

    The process of “image segmentation and extracting remarkable regions” is an important research subject for image understanding, yet algorithms based on global features are rarely found. The requirement for such an image segmentation algorithm is to reduce over-segmentation and over-unification as much as possible. We developed an algorithm using the multidimensional convex hull based on density as the global feature. Concretely, we propose a new algorithm in which regions are expanded according to region statistics, such as the mean, standard deviation, maximum, and minimum of pixel location, brightness, and color elements, with the statistics updated as the region grows. We also introduce a new concept of conspicuity degree and apply it to 21 various images to examine its effectiveness. The remarkable object regions extracted by the presented system coincided strongly with those pointed out by the sixty-four subjects who participated in the psychological experiment.
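
    A minimal sketch of statistics-driven region growing on a grayscale image conveys the flavour of the approach: neighbours are admitted while their intensity stays near the running region mean, and the statistics are updated as the region grows. The published algorithm also tracks location, color, and extreme-value statistics, which are omitted here; the threshold parameters are assumptions.

    ```python
    from collections import deque

    import numpy as np

    def grow_region(img, seed, k=2.0, std_floor=5.0):
        """Grow a region from `seed`, admitting 4-neighbours whose intensity
        lies within k standard deviations of the running region mean."""
        h, w = img.shape
        member = np.zeros((h, w), dtype=bool)
        member[seed] = True
        values = [float(img[seed])]
        queue = deque([seed])
        while queue:
            y, x = queue.popleft()
            mean = np.mean(values)
            std = max(np.std(values), std_floor)  # floor keeps growth from stalling
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and not member[ny, nx]:
                    if abs(float(img[ny, nx]) - mean) <= k * std:
                        member[ny, nx] = True     # admit and update statistics
                        values.append(float(img[ny, nx]))
                        queue.append((ny, nx))
        return member
    ```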

  19. Prioritizing sites for conservation based on similarity to historical baselines and feasibility of protection.

    Science.gov (United States)

    Popejoy, Traci; Randklev, Charles R; Neeson, Thomas M; Vaughn, Caryn C

    2018-05-08

    The shifting baseline syndrome concept advocates for the use of historical knowledge to inform conservation baselines, but does not address the feasibility of restoring sites to those baselines. In many regions, conservation feasibility varies among sites due to differences in resource availability, statutory power, and land-owner participation. We use zooarchaeological records to identify a historical baseline of the freshwater mussel community's composition before Euro-American influence at a river-reach scale. We evaluate how the community reference position and the feasibility of conservation might enable identification of sites where conservation actions would preserve historically representative communities and be likely to succeed. We first present a conceptual model that incorporates community information and landscape factors to link the best conservation areas to potential cost and conservation benefits. Using fuzzy ordination, we identify modern mussel beds that are most like the historical baseline. We then quantify the housing density and land use near each reach to estimate feasibility of habitat restoration. Using our conceptual framework, we identify reaches that have high conservation value (i.e., reaches that contain the best mussel beds) and where restoration actions would be most likely to succeed. Reaches above Lake Belton in central Texas, U.S.A. were most similar in species composition and relative abundance to zooarchaeological sites. A subset of these mussel beds occurred in locations where conservation actions appear to be most feasible. This study demonstrates how to use zooarchaeological data (biodiversity data often readily available) and estimates of conservation feasibility to inform conservation priorities at a local spatial scale.

  20. Refining locations of the 2005 Mukacheve, West Ukraine, earthquakes based on similarity of their waveforms

    Science.gov (United States)

    Gnyp, Andriy

    2009-06-01

    Based on the results of applying correlation analysis to records of the 2005 Mukacheve group of recurrent events and their subsequent relocation relative to the reference event of 7 July 2005, it is concluded that all the events most likely occurred on the same rupture plane. Station terms have been estimated for seismic stations of the Transcarpathians, accounting for the variation of seismic velocities beneath their locations compared to the travel time tables used in the study. From a methodological standpoint, the potential and usefulness of correlation analysis of seismic records for more detailed studies of seismic processes, tectonics, and geodynamics of the Carpathian region have been demonstrated.
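
    The waveform-similarity measure underlying such relocation studies is typically the peak of the normalized cross-correlation between two event records, with the lag of the peak giving the relative arrival-time shift. A minimal NumPy sketch on a synthetic wavelet (illustrative, not the author's processing chain):

    ```python
    import numpy as np

    def waveform_similarity(a, b):
        """Peak normalized cross-correlation and its lag in samples.

        Values near 1 at some lag suggest a common source region and
        mechanism; the lag feeds relative relocation.
        """
        a = (a - a.mean()) / (a.std() * len(a))
        b = (b - b.mean()) / b.std()
        cc = np.correlate(a, b, mode="full")
        lag = int(cc.argmax()) - (len(b) - 1)
        return float(cc.max()), lag

    t = np.linspace(0.0, 1.0, 500)
    w = np.sin(40 * t) * np.exp(-3 * t)            # synthetic wavelet
    print(waveform_similarity(w, np.roll(w, 25)))  # peak near 1, |lag| = 25
    ```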

  1. SciRide Finder: a citation-based paradigm in biomedical literature search.

    Science.gov (United States)

    Volanakis, Adam; Krawczyk, Konrad

    2018-04-18

    There are more than 26 million peer-reviewed biomedical research items according to Medline/PubMed. This breadth of information is indicative of the progress in biomedical sciences on the one hand, but an overload for scientists performing literature searches on the other. A major portion of scientific literature search is to find statements, numbers and protocols that can be cited to build an evidence-based narrative for a new manuscript. Because science builds on prior knowledge, such information has likely been written out and cited in an older manuscript. Thus, Cited Statements, pieces of text from scientific literature supported by citing other peer-reviewed publications, carry a significant amount of condensed information on prior art. Based on this principle, we propose a literature search service, SciRide Finder (finder.sciride.org), which constrains the search corpus to such Cited Statements only. We demonstrate that Cited Statements can carry different information to that found in titles/abstracts and full text, giving access to alternative literature search results than those of traditional search engines. We further show how presenting search results as a list of Cited Statements allows researchers to easily find information to build an evidence-based narrative for their own manuscripts.

  2. Path Searching Based Fault Automated Recovery Scheme for Distribution Grid with DG

    Science.gov (United States)

    Xia, Lin; Qun, Wang; Hui, Xue; Simeng, Zhu

    2016-12-01

    Path searching based on distribution network topology has proven effective in protection-setting software, and a path-searching method that accounts for DG power sources is likewise applicable to the automatic generation and division of planned islands after a fault. This paper applies a path-searching algorithm to the automatic division of planned islands after faults: starting from the fault-isolation switch and ending at each power source, the algorithm uses the line loads traversed by the search path, together with the important loads integrated along the optimized path, to form an optimized island-division scheme in which each DG serves as the power source for an island balanced against its local important loads. Finally, COBASE software and the distribution network automation software in use are applied to illustrate the effectiveness of such an automatic restoration program.
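
    The island-division step can be viewed as a reachability search on the network graph: starting from the de-energized area behind the fault-isolation switch, each DG claims the nodes it can reach while its capacity covers their load. A minimal sketch with a hypothetical adjacency/load model (the paper's load prioritization is simplified to BFS order):

    ```python
    from collections import deque

    def plan_island(grid, dg_node, dg_capacity, loads, isolated):
        """Grow a planned island from a DG by breadth-first search.

        grid: adjacency dict node -> neighbours; loads: node -> kW demand;
        isolated: the de-energized nodes behind the fault switch. Nodes join
        in BFS order (important loads could be prioritized instead) while
        total demand stays within the DG capacity.
        """
        island, demand = set(), 0.0
        queue = deque([dg_node])
        while queue:
            node = queue.popleft()
            if node in island or node not in isolated:
                continue
            if demand + loads.get(node, 0.0) > dg_capacity:
                continue                 # skip loads the DG cannot carry
            island.add(node)
            demand += loads.get(node, 0.0)
            queue.extend(grid.get(node, ()))
        return island, demand
    ```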

  3. Global polar geospatial information service retrieval based on search engine and ontology reasoning

    Science.gov (United States)

    Chen, Nengcheng; E, Dongcheng; Di, Liping; Gong, Jianya; Chen, Zeqiang

    2007-01-01

    In order to improve the access precision of polar geospatial information services on the web, a new methodology for retrieving global spatial information services based on geospatial service search and ontology reasoning is proposed. The geospatial service search finds coarse candidate services on the web, and ontology reasoning refines them. The proposed framework includes standardized distributed geospatial web services, a geospatial service search engine, an extended UDDI registry, and a multi-protocol geospatial information service client. Key technologies addressed include service discovery based on a search engine and service ontology modeling and reasoning in the Antarctic geospatial context. Finally, an Antarctic multi-protocol OWS portal prototype based on the proposed methodology is introduced.

  4. Q-learning-based adjustable fixed-phase quantum Grover search algorithm

    International Nuclear Information System (INIS)

    Guo Ying; Shi Wensha; Wang Yijun; Hu, Jiankun

    2017-01-01

    We demonstrate that the rotation phase can be suitably chosen to increase the efficiency of the phase-based quantum search algorithm, leading to a dynamic balance between iterations and success probabilities of the fixed-phase quantum Grover search algorithm with Q-learning for a given number of solutions. In this search algorithm, the proposed Q-learning algorithm, in essence a model-free reinforcement learning strategy, performs a matching between the fraction of marked items λ and the rotation phase α. After establishing the policy function α = π(λ), we complete the fixed-phase Grover algorithm, where the phase parameter is selected via the learned policy. Simulation results show that the Q-learning-based Grover search algorithm (QLGA) requires fewer iterations and yields higher success probabilities. Compared with conventional Grover algorithms, it avoids local optima, thereby enabling success probabilities to approach one.
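
    For background, the trade-off between iterations and success probability is explicit in the standard Grover algorithm (rotation phases equal to π). With a fraction λ of marked items, the textbook relation is the following; the fixed-phase variant generalizes it, with the learned policy α = π(λ) tuning the phase for each λ.

    ```latex
    % Standard Grover iteration (rotation phases equal to \pi); the
    % fixed-phase algorithm generalizes this relation.
    \[
      P_{\mathrm{success}}(k) \;=\; \sin^{2}\!\bigl((2k+1)\,\theta\bigr),
      \qquad \sin\theta \;=\; \sqrt{\lambda},
    \]
    \[
      k_{\mathrm{opt}} \;=\; \left\lfloor \frac{\pi}{4\theta} \right\rfloor .
    \]
    ```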

  5. A Novel Approach to Semantic Similarity Measurement Based on a Weighted Concept Lattice: Exemplifying Geo-Information

    Directory of Open Access Journals (Sweden)

    Jia Xiao

    2017-11-01

    The measurement of semantic similarity has been widely recognized as having a fundamental and key role in information science and information systems. Although various models have been proposed to measure semantic similarity, these models cannot effectively quantify the weights of the relevant factors that impact the judgement of semantic similarity, such as the attributes of concepts, application context, and concept hierarchy. In this paper, we propose a novel approach that comprehensively considers the effects of various factors on semantic similarity judgment, which we name semantic similarity measurement based on a weighted concept lattice (SSMWCL). A feature model and a network model are integrated in SSMWCL. Based on the feature model, the combined weight of each attribute of the concepts is calculated by merging its information entropy and inclusion-degree importance in a specific application context. By establishing the weighted concept lattice, the relative hierarchical depths of the concepts under comparison are computed according to the principle of the network model. The integration of the feature model and network model enables SSMWCL to account for differences between concepts more comprehensively in semantic similarity measurement. Additionally, a workflow of SSMWCL is designed to demonstrate these procedures, and a case study of geo-information is conducted to assess the approach.

  6. Similarity-based interference in a working memory numerical updating task: age-related differences between younger and older adults.

    Science.gov (United States)

    Pelegrina, Santiago; Borella, Erika; Carretti, Barbara; Lechuga, M Teresa

    2012-01-01

    Similarity among representations held simultaneously in working memory (WM) is a factor which increases interference and hinders performance. The aim of the current study was to investigate age-related differences between younger and older adults in a working memory numerical updating task, in which the similarity between information held in WM was manipulated. Results showed a higher susceptibility of older adults to similarity-based interference when accuracy, and not response times, was considered. It was concluded that older adults' WM difficulties appear to be due to the availability of stored information, which, in turn, might be related to the ability to generate distinctive representations and to the process of binding such representations to their context when similar information has to be processed in WM.

  7. Feature-Based Correlation and Topological Similarity for Interbeat Interval Estimation Using Ultrawideband Radar.

    Science.gov (United States)

    Sakamoto, Takuya; Imasaka, Ryohei; Taki, Hirofumi; Sato, Toru; Yoshioka, Mototaka; Inoue, Kenichi; Fukuda, Takeshi; Sakai, Hiroyuki

    2016-04-01

    The objectives of this paper are to propose a method that can accurately estimate the human heart rate (HR) using an ultrawideband (UWB) radar system, and to determine the performance of the proposed method through measurements. The proposed method uses the feature points of a radar signal to estimate the HR efficiently and accurately. Fourier- and periodicity-based methods are inappropriate for estimation of instantaneous HRs in real time because heartbeat waveforms are highly variable, even within the beat-to-beat interval. We define six radar waveform features that enable correlation processing to be performed quickly and accurately. In addition, we propose a feature topology signal that is generated from a feature sequence without using amplitude information. This feature topology signal is used to find unreliable feature points, and thus, to suppress inaccurate HR estimates. Measurements were taken using UWB radar, while simultaneously performing electrocardiography measurements in an experiment that was conducted on nine participants. The proposed method achieved an average root-mean-square error in the interbeat interval of 7.17 ms for the nine participants. The results demonstrate the effectiveness and accuracy of the proposed method. The significance of this study for biomedical research is that the proposed method will be useful in the realization of a remote vital signs monitoring system that enables accurate estimation of HR variability, which has been used in various clinical settings for the treatment of conditions such as diabetes and arterial hypertension.

  8. Knowledge-based personalized search engine for the Web-based Human Musculoskeletal System Resources (HMSR) in biomechanics.

    Science.gov (United States)

    Dao, Tien Tuan; Hoang, Tuan Nha; Ta, Xuan Hien; Tho, Marie Christine Ho Ba

    2013-02-01

    Human musculoskeletal system resources of the human body are valuable for learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot respond to the need for useful, accurate, reliable and good-quality human musculoskeletal resources related to medical processes, pathological knowledge and practical expertise. In the present work, an advanced knowledge-based personalized search engine was developed. Our search engine was based on a client-server multi-layer multi-agent architecture and the principle of semantic web services to acquire dynamically accurate and reliable HMSR information through a semantic processing and visualization approach. A security-enhanced mechanism was applied to protect the medical information. A multi-agent crawler was implemented to develop a content-based database of HMSR information. A new semantic-based PageRank score with related mathematical formulas was also defined and implemented. As a result, semantic web service descriptions were presented in OWL, WSDL and OWL-S formats. Operational scenarios with related web-based interfaces for personal computers and mobile devices were presented and analyzed. A functional comparison between our knowledge-based search engine, a conventional search engine and a semantic search engine showed the originality and robustness of our knowledge-based personalized search engine. In fact, our knowledge-based personalized search engine allows different users, such as orthopedic patients and experts, healthcare system managers, or medical students, to access remotely useful, accurate, reliable and good-quality HMSR information for their learning and medical purposes.

  9. Structure-Based Search for New Inhibitors of Cholinesterases

    Directory of Open Access Journals (Sweden)

    Barbara Malawska

    2013-03-01

    Cholinesterases are important biological targets responsible for regulation of cholinergic transmission, and their inhibitors are used for the treatment of Alzheimer’s disease. To design new cholinesterase inhibitors, different structure-based design strategies were followed, including the modification of compounds from a previously developed library and a fragment-based design approach. This led to the selection of heterodimeric structures as potential inhibitors. Synthesis and biological evaluation of selected candidates confirmed that the designed compounds were acetylcholinesterase inhibitors with IC50 values in the mid-nanomolar to low micromolar range, and some of them were also butyrylcholinesterase inhibitors.

  10. Development of health information search engine based on metadata and ontology.

    Science.gov (United States)

    Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

    2014-04-01

    The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.

  11. Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms

    International Nuclear Information System (INIS)

    Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee; Lo, Joseph Y.; Floyd, Carey E.

    2007-01-01

    The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses
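
    The canonical entropy-based measure such a scheme can employ is the mutual information between the gray-level distributions of the query region and a knowledge-base region. A minimal sketch over two equal-sized patches (illustrative; the study compared eight related measures, which the abstract does not enumerate):

    ```python
    import numpy as np

    def mutual_information(a, b, bins=32):
        """Mutual information between gray-level distributions of two patches.

        Higher MI means one patch's intensities say more about the other's:
        an entropy-based similarity usable for content-based retrieval.
        """
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px, py = pxy.sum(axis=1), pxy.sum(axis=0)
        nz = pxy > 0
        return float((pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])).sum())

    rng = np.random.default_rng(1)
    patch = rng.random((64, 64))
    print(mutual_information(patch, patch))                 # high: identical
    print(mutual_information(patch, rng.random((64, 64))))  # near 0: unrelated
    ```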

  12. ONTOLOGY BASED MEANINGFUL SEARCH USING SEMANTIC WEB AND NATURAL LANGUAGE PROCESSING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    K. Palaniammal

    2013-10-01

    The semantic web extends the current World Wide Web by adding facilities for the machine-understood description of meaning. The ontology-based search model is used to enhance the efficiency and accuracy of information retrieval. Ontology is the core technology for the semantic web and the mechanism for representing formal, shared domain descriptions. In this paper, we propose ontology-based meaningful search using semantic web and Natural Language Processing (NLP) techniques in the educational domain. First we build the educational ontology; then we present the semantic search system. The search model consists of three parts: embedded spell-checking, synonym expansion using the WordNet API, and ontology querying using the SPARQL language. The results are sensitive to both spelling corrections and synonymous context. This approach provides more accurate results and complete details for the selected field in a single page.
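
    The ontology-querying leg of such a pipeline can be pictured with rdflib: after spell-checking and WordNet expansion produce candidate terms, the terms are folded into a SPARQL query over the educational ontology. A minimal sketch; the ontology file name and the use of rdfs:label are assumptions for illustration, not the paper's actual schema.

    ```python
    from rdflib import Graph

    g = Graph()
    g.parse("education.owl")   # hypothetical educational ontology file

    # Candidate terms after spell-check and WordNet expansion (illustrative):
    terms = ["course", "class", "module"]

    filters = " || ".join(
        f'CONTAINS(LCASE(STR(?label)), "{t}")' for t in terms)
    query = f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?s ?label WHERE {{
      ?s rdfs:label ?label .
      FILTER ({filters})
    }}"""

    for row in g.query(query):
        print(row.s, row.label)
    ```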

  13. Probabilistic hypergraph based hash codes for social image search

    Institute of Scientific and Technical Information of China (English)

    Yi XIE; Hui-min YU; Roland HU

    2014-01-01

    With the rapid development of the Internet, recent years have seen the explosive growth of social media. This brings great challenges in performing efficient and accurate image retrieval on a large scale. Recent work shows that using hashing methods to embed high-dimensional image features and tag information into Hamming space provides a powerful way to index large collections of social images. By learning hash codes through a spectral graph partitioning algorithm, spectral hashing (SH) has shown promising performance among various hashing approaches. However, it is incomplete to model the relations among images only by pairwise simple graphs which ignore the relationship in a higher order. In this paper, we utilize a probabilistic hypergraph model to learn hash codes for social image retrieval. A probabilistic hypergraph model offers a higher order representation among social images by connecting more than two images in one hyperedge. Unlike a normal hypergraph model, a probabilistic hypergraph model considers not only the grouping information, but also the similarities between vertices in hyperedges. Experiments on Flickr image datasets verify the performance of our proposed approach.

  14. Particle Swarm Optimization and harmony search based clustering and routing in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Veena Anand

    2017-01-01

    Wireless Sensor Networks (WSNs) suffer from limited and non-rechargeable energy resources, a challenge that has led to the development of various clustering and routing algorithms. This paper proposes an approach for improving network lifetime by using Particle Swarm Optimization-based clustering and Harmony Search-based routing in WSNs. Globally optimal cluster heads are selected, and gateway nodes are introduced to decrease the energy consumption of the cluster heads (CHs) while sending aggregated data to the Base Station (BS). Next, a Harmony Search-based local search strategy finds the best routing path from the gateway nodes to the Base Station. Finally, the proposed algorithm is presented.

  15. Beyond the search surface: visual search and attentional engagement.

    Science.gov (United States)

    Duncan, J; Humphreys, G

    1992-05-01

    Treisman (1991) described a series of visual search studies testing feature integration theory against an alternative (Duncan & Humphreys, 1989) in which feature and conjunction search are basically similar. Here the latter account is noted to have 2 distinct levels: (a) a summary of search findings in terms of stimulus similarities, and (b) a theory of how visual attention is brought to bear on relevant objects. Working at the 1st level, Treisman found that even when similarities were calibrated and controlled, conjunction search was much harder than feature search. The theory, however, can only really be tested at the 2nd level, because the 1st is an approximation. An account of the findings is developed at the 2nd level, based on the 2 processes of input-template matching and spreading suppression. New data show that, when both of these factors are controlled, feature and conjunction search are equally difficult. Possibilities for unification of the alternative views are considered.

  16. Prospects for SUSY discovery based on inclusive searches with the ATLAS detector

    International Nuclear Information System (INIS)

    Ventura, Andrea

    2009-01-01

    The search for Supersymmetry (SUSY) among the possible scenarios of new physics is one of the most relevant goals of the ATLAS experiment running at CERN's Large Hadron Collider. In the present work the expected prospects for discovering SUSY with the ATLAS detector are reviewed, in particular for the first fb⁻¹ of collected integrated luminosity. All studies and results reported here are based on inclusive search analyses realized with Monte Carlo signal and background data simulated through the ATLAS apparatus.

  17. Similar digit-based working memory in deaf signers and hearing non-signers despite digit span differences

    Directory of Open Access Journals (Sweden)

    Josefine Andin

    2013-12-01

    Full Text Available Similar working memory (WM for lexical items has been demonstrated for signers and non-signers while short-term memory (STM is regularly poorer in deaf than hearing individuals. In the present study, we investigated digit-based WM and STM in Swedish and British deaf signers and hearing non-signers. To maintain good experimental control we used printed stimuli throughout and held response mode constant across groups. We showed that deaf signers have similar digit-based WM performance, despite shorter digit spans, compared to well-matched hearing non-signers. We found no difference between signers and non-signers on STM span for letters chosen to minimize phonological similarity or in the effects of recall direction. This set of findings indicates that similar WM for signers and non-signers can be generalized from lexical items to digits and suggests that poorer STM in deaf signers compared to hearing non-signers may be due to differences in phonological similarity across the language modalities of sign and speech.

  18. Ontology-based Semantic Search Engine for Healthcare Services

    OpenAIRE

    Jotsna Molly Rajan; M. Deepa Lakshmi

    2012-01-01

    With the development of Web Services, the retrieval of relevant services has become a challenge. The keyword-based discovery mechanism using UDDI and WSDL is insufficient due to the retrieval of a large amount of irrelevant information. Also, keywords are insufficient in expressing semantic concepts, since a single concept can be referred to using syntactically different terms. Hence, service capabilities need to be manually analyzed, which led to the development of the Semantic Web for automatic...

  19. Cooperative Search and Rescue with Artificial Fishes Based on Fish-Swarm Algorithm for Underwater Wireless Sensor Networks

    Science.gov (United States)

    Zhao, Wei; Tang, Zhenmin; Yang, Yuwang; Wang, Lei; Lan, Shaohua

    2014-01-01

    This paper presents a searching control approach for cooperating mobile sensor networks. We use a density function to represent the frequency of distress signals issued by victims. The movement of the mobile nodes in the mission space is similar to the behavior of a fish swarm in water, so we treat each mobile node as an artificial fish node and define its operations by a probabilistic model over a limited range. A fish-swarm based algorithm is designed that requires only local information at each fish node and maximizes the joint detection probability of distress signals. Optimization of the formation is also considered for the searching control approach and is carried out by the fish-swarm algorithm. Simulation results cover two schemes, a preset route and random walks, and show that the control scheme has adaptive and effective properties. PMID:24741341

  20. Cooperative Search and Rescue with Artificial Fishes Based on Fish-Swarm Algorithm for Underwater Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Wei Zhao

    2014-01-01

    Full Text Available This paper presents a searching control approach for cooperating mobile sensor networks. We use a density function to represent the frequency of distress signals issued by victims. The movement of the mobile nodes in the mission space is similar to the behavior of a fish swarm in water, so we treat each mobile node as an artificial fish node and define its operations by a probabilistic model over a limited range. A fish-swarm based algorithm is designed that requires only local information at each fish node and maximizes the joint detection probability of distress signals. Optimization of the formation is also considered for the searching control approach and is carried out by the fish-swarm algorithm. Simulation results cover two schemes, a preset route and random walks, and show that the control scheme has adaptive and effective properties.
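
    To make the probabilistic movement model concrete, the toy Python sketch below implements one artificial-fish "prey" step: each node samples candidate points within its visual range and moves toward the best one under a stand-in distress-signal density; the density function, visual range, trial count, and step size are illustrative assumptions rather than the paper's model.

      import numpy as np

      rng = np.random.default_rng(4)

      def density(p):
          # Stand-in distress-signal density with one victim cluster at (7, 7).
          return np.exp(-np.sum((p - np.array([7.0, 7.0])) ** 2) / 8.0)

      def fish_step(pos, visual=2.0, trials=8, step=0.5):
          # Sample candidates within the visual range; move toward the best
          # one if it improves the local density, else take a random walk.
          best, best_d = None, density(pos)
          for _ in range(trials):
              cand = pos + rng.uniform(-visual, visual, size=2)
              if density(cand) > best_d:
                  best, best_d = cand, density(cand)
          if best is None:
              return pos + rng.uniform(-step, step, size=2)
          return pos + step * (best - pos) / np.linalg.norm(best - pos)

      swarm = rng.uniform(0, 10, size=(5, 2))      # five fish nodes in a 10 x 10 area
      for _ in range(100):
          swarm = np.array([fish_step(p) for p in swarm])
      print(np.round(swarm.mean(axis=0), 1))       # swarm gathers near the peak (7, 7)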

  1. Personalized Search

    CERN Document Server

    AUTHOR|(SzGeCERN)749939

    2015-01-01

    As the volume of electronically available information grows, relevant items become harder to find. This work presents an approach to personalizing search results in scientific publication databases, focusing on re-ranking search results from existing search engines like Solr or ElasticSearch. It also includes the development of Obelix, a new recommendation system used to re-rank search results. The project was proposed and performed at CERN, using the scientific publications available on the CERN Document Server (CDS). This work experiments with re-ranking using offline and online evaluation of users and documents in CDS. The experiments conclude that the personalized search results outperform both latest-first and word-similarity ranking in terms of click position in the search results for global search in CDS.

  2. Design of a Bioactive Small Molecule that Targets the Myotonic Dystrophy Type 1 RNA Via an RNA Motif-Ligand Database & Chemical Similarity Searching

    Science.gov (United States)

    Parkesh, Raman; Childs-Disney, Jessica L.; Nakamori, Masayuki; Kumar, Amit; Wang, Eric; Wang, Thomas; Hoskins, Jason; Tran, Tuan; Housman, David; Thornton, Charles A.; Disney, Matthew D.

    2012-01-01

    Myotonic dystrophy type 1 (DM1) is a triplet repeating disorder caused by expanded CTG repeats in the 3′ untranslated region of the dystrophia myotonica protein kinase (DMPK) gene. The transcribed repeats fold into an RNA hairpin with multiple copies of a 5′CUG/3′GUC motif that binds the RNA splicing regulator muscleblind-like 1 protein (MBNL1). Sequestration of MBNL1 by expanded r(CUG) repeats causes splicing defects in a subset of pre-mRNAs including the insulin receptor, the muscle-specific chloride ion channel, Sarco(endo)plasmic reticulum Ca2+ ATPase 1 (Serca1/Atp2a1), and cardiac troponin T (cTNT). Based on these observations, small molecule ligands that specifically target expanded DM1 repeats could serve as therapeutics. In the present study, computational screening was employed to improve the efficacy of pentamidine and Hoechst 33258 ligands that have been shown previously to target the DM1 triplet repeat. A series of inhibitors of the RNA-protein complex with low micromolar IC50s, which are >20-fold more potent than the query compounds, were identified. Importantly, a bis-benzimidazole identified from the Hoechst query improves DM1-associated pre-mRNA splicing defects in cell and mouse models of DM1 (when dosed with 1 mM and 100 mg/kg, respectively). Since Hoechst 33258 was identified as a DM1 binder through analysis of an RNA motif-ligand database, these studies suggest that lead ligands targeting RNA with improved biological activity can be identified by using a synergistic approach that combines analysis of known RNA-ligand interactions with virtual screening. PMID:22300544

  3. A GIS-based Quantitative Approach for the Search of Clandestine Graves, Italy.

    Science.gov (United States)

    Somma, Roberta; Cascio, Maria; Silvestro, Massimiliano; Torre, Eliana

    2018-05-01

    Previous research on the RAG color-coded prioritization systems for the discovery of clandestine graves has not considered all the factors influencing the burial site choice within a GIS project. The goal of this technical note was to discuss a GIS-based quantitative approach for the search of clandestine graves. The method is based on cross-referenced RAG maps with cumulative suitability factors to host a burial, leading to the editing of different search scenarios for ground searches showing high-(Red), medium-(Amber), and low-(Green) priority areas. The application of this procedure allowed several outcomes to be determined: If the concealment occurs at night, then the "search scenario without the visibility" will be the most effective one; if the concealment occurs in daylight, then the "search scenario with the DSM-based visibility" will be most appropriate; the different search scenarios may be cross-referenced with offender's confessions and eyewitnesses' testimonies to verify the veracity of their statements. © 2017 American Academy of Forensic Sciences.

  4. Nearby Search Indekos Based Android Using A Star (A*) Algorithm

    Science.gov (United States)

    Siregar, B.; Nababan, EB; Rumahorbo, JA; Andayani, U.; Fahmi, F.

    2018-03-01

    Indekos, or rented rooms, are temporary residences rented for months or years. Academicians who come from out of town need such a temporary residence during their education, teaching, or duties, but they often have difficulty finding an Indekos because of a lack of information about them; in addition, newcomers do not know the area around the campus and want the shortest path from the Indekos to the campus. This problem can be solved by implementing the A Star (A*) algorithm, a shortest-path algorithm, in an application that finds the shortest path from the campus to an Indekos, with the faculties on campus serving as the starting points of the search. Using the faculties as starting points allows students to choose where their search for an Indekos begins. The mobile-based application facilitates searching anytime and anywhere. Based on the experimental results, the A* algorithm can find the shortest path with 86.67% accuracy.
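
    The following minimal Python sketch shows the A* idea on a toy 4-connected grid with a Manhattan-distance heuristic; the grid, start and goal cells, and uniform step cost are illustrative assumptions, not the paper's campus-to-Indekos road graph.

      import heapq
      import itertools

      def a_star(grid, start, goal):
          # 0 = free cell, 1 = blocked; moves are 4-connected with unit cost.
          def h(p):
              # Manhattan distance is admissible here, so the result is optimal.
              return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

          tie = itertools.count()                  # tie-breaker for heap ordering
          open_heap = [(h(start), next(tie), start, None)]
          came_from, g_cost = {}, {start: 0}
          while open_heap:
              _, _, node, parent = heapq.heappop(open_heap)
              if node in came_from:                # already expanded with best cost
                  continue
              came_from[node] = parent
              if node == goal:                     # walk parents back to the start
                  path = [node]
                  while came_from[path[-1]] is not None:
                      path.append(came_from[path[-1]])
                  return path[::-1]
              g = g_cost[node]
              x, y = node
              for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                  if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                          and grid[nxt[0]][nxt[1]] == 0
                          and g + 1 < g_cost.get(nxt, float("inf"))):
                      g_cost[nxt] = g + 1
                      heapq.heappush(open_heap, (g + 1 + h(nxt), next(tie), nxt, node))
          return None                              # no route exists

      print(a_star([[0, 0, 0],
                    [1, 1, 0],
                    [0, 0, 0]], (0, 0), (2, 0)))   # shortest detour around the wall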

  5. Content-based video indexing and searching with wavelet transformation

    Science.gov (United States)

    Stumpf, Florian; Al-Jawad, Naseer; Du, Hongbo; Jassim, Sabah

    2006-05-01

    Biometric databases form an essential tool in the fight against international terrorism, organised crime and fraud. Various government and law enforcement agencies have their own biometric databases consisting of combination of fingerprints, Iris codes, face images/videos and speech records for an increasing number of persons. In many cases personal data linked to biometric records are incomplete and/or inaccurate. Besides, biometric data in different databases for the same individual may be recorded with different personal details. Following the recent terrorist atrocities, law enforcing agencies collaborate more than before and have greater reliance on database sharing. In such an environment, reliable biometric-based identification must not only determine who you are but also who else you are. In this paper we propose a compact content-based video signature and indexing scheme that can facilitate retrieval of multiple records in face biometric databases that belong to the same person even if their associated personal data are inconsistent. We shall assess the performance of our system using a benchmark audio visual face biometric database that has multiple videos for each subject but with different identity claims. We shall demonstrate that retrieval of relatively small number of videos that are nearest, in terms of the proposed index, to any video in the database results in significant proportion of that individual biometric data.

  6. A unified architecture for biomedical search engines based on semantic web technologies.

    Science.gov (United States)

    Jalali, Vahid; Matash Borujerdi, Mohammad Reza

    2011-04-01

    There has been huge growth in the volume of published biomedical research in recent years, and many medical search engines have been designed and developed to address the ever-growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of a common architecture for the utilized ontologies and the overall retrieval process hampers the evaluation of different search engines and the interoperability between them under unified conditions. In this paper, a unified architecture for medical search engines is introduced. The proposed model contains standard schemas, declared in semantic web languages, for the ontologies and documents used by search engines. Unified models for the annotation and retrieval processes are other parts of the introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture. The search engine is evaluated using two test collections, and results are reported in terms of precision vs. recall and mean average precision for the different approaches used by this search engine.

  7. An Analysis of Literature Searching Anxiety in Evidence-Based Medicine Education

    Directory of Open Access Journals (Sweden)

    Hui-Chin Chang

    2014-01-01

    Full Text Available Introduction. Evidence-Based Medicine (EBM) is becoming a cornerstone of lifelong learning for healthcare personnel worldwide. This study aims to evaluate literature searching anxiety among graduate students practicing EBM. Method. The study participants were 48 graduate students enrolled in the EBM course at a medical university in central Taiwan. Student's t-test, Pearson correlation, multivariate regression, and interviews were used to evaluate the students' literature searching anxiety; the questionnaire was the Literature Searching Anxiety Rating Scale (LSARS). Results. The disclosed sources of anxiety were uncertainty in database selection, literature evaluation and selection, requests for technical assistance, use of computer programs, English, and EBM education programs. Class performance was negatively related to the LSARS score; however, the correlation was statistically insignificant after adjustment for gender, degree program, age category, and experience of publication. Conclusion. This study helps in understanding the causes and extent of anxiety in order to plan a better teaching program that improves users' searching skills and their capability to utilize information, while also providing user-friendly facilities for evidence searching. In short, we need to upgrade learners' searching skills and reduce their anxiety, and we also need to stress auxiliary teaching programs for those with prevalent and profound anxiety during literature searching.

  8. Similarity-based Fisherfaces

    DEFF Research Database (Denmark)

    Delgado-Gomez, David; Fagertun, Jens; Ersbøll, Bjarne Kjær

    2009-01-01

    databases (XM2VTS, AR and Equinox) show consistently good results. The proposed algorithm achieves Equal Error Rate (EER) and Half-Total Error Rate (HTER) values in the ranges of 0.41-1.67% and 0.1-1.95%, respectively. Our approach yields results comparable to the top two winners in recent contests reported...

  9. Architecture for knowledge-based and federated search of online clinical evidence.

    Science.gov (United States)

    Coiera, Enrico; Walther, Martin; Nguyen, Ken; Lovell, Nigel H

    2005-10-24

    It is increasingly difficult for clinicians to keep up-to-date with the rapidly growing biomedical literature. Online evidence retrieval methods are now seen as a core tool to support evidence-based health practice. However, standard search engine technology is not designed to manage the many different types of evidence sources that are available or to handle the very different information needs of various clinical groups, who often work in widely different settings. The objectives of this paper are (1) to describe the design considerations and system architecture of a wrapper-mediator approach to federated search system design, including the use of knowledge-based, meta-search filters, and (2) to analyze the implications of system design choices on performance measurements. A trial was performed to evaluate the technical performance of a federated evidence retrieval system, which provided access to eight distinct online resources, including e-journals, PubMed, and electronic guidelines. The Quick Clinical system architecture utilized a universal query language to reformulate queries internally and utilized meta-search filters to optimize search strategies across resources. We recruited 227 family physicians from across Australia who used the system to retrieve evidence in a routine clinical setting over a 4-week period. The total search time for a query was recorded, along with the duration of individual queries sent to different online resources. Clinicians performed 1662 searches over the trial. The average search duration was 4.9 ± 3.2 s (N = 1662 searches). Mean search duration to the individual sources was between 0.05 s and 4.55 s. Average system time (ie, system overhead) was 0.12 s. The relatively small system overhead compared to the average time it takes to perform a search for an individual source shows that the system achieves a good trade-off between performance and reliability. Furthermore, despite the additional effort required to incorporate the

  10. Analysis of Search Engines and Meta Search Engines' Position by University of Isfahan Users Based on Rogers' Diffusion of Innovation Theory

    Directory of Open Access Journals (Sweden)

    Maryam Akbari

    2012-10-01

    Full Text Available The present study analyzed the adoption of search engines and meta search engines by University of Isfahan users during 2009-2010, based on Rogers' diffusion of innovation theory. The main aim of the research was to study the rate of adoption and to recognize the potentials and effective tools in the adoption of search engines and meta search engines among University of Isfahan users. The research method was a descriptive survey. The population comprised all postgraduate students of the University of Isfahan; 351 students were selected as the sample, categorized by a stratified random sampling method, and a questionnaire was used for collecting data. The collected data were analyzed using SPSS 16 with both descriptive and analytic statistics: frequency, percentage, and mean for the descriptive part, and the t-test and the Kruskal-Wallis non-parametric test (H-test) for the analytic part. The findings of the t-test and Kruskal-Wallis test indicated that the mean adoption of search engines and meta search engines did not differ statistically by gender, level of education, or faculty; the adoption process for special search engines differed by gender but not by level of education or faculty. Other results indicated that among general search engines Google had the highest adoption rate, among special search engines Google Scholar, and among meta search engines Mamma. Findings also showed that friends played an important role in how students adopted general search engines, while professors played an important role in how students adopted special search engines and meta search engines. Moreover, the place where students became most acquainted with search engines and meta search engines was the university. The findings showed that the adoption-rate curve was neither normal nor S-shaped.

  11. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

    Directory of Open Access Journals (Sweden)

    Yunyun Liang

    2015-01-01

    Full Text Available Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant for prediction of protein structural class, and it mainly uses the protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction based solely on the PSSM has played a key role in improving prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. A 700-dimensional (700D) feature vector is constructed, and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
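
    As a sketch of the dimension-reduction step alone, the Python/NumPy fragment below projects 700D feature vectors onto their top 224 principal components via SVD; the random matrix stands in for the paper's PSSM-derived features, and the sample count is an arbitrary assumption.

      import numpy as np

      def pca_reduce(X, n_components):
          # Center the features, then project onto the top right-singular vectors.
          Xc = X - X.mean(axis=0)
          U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
          return Xc @ Vt[:n_components].T

      rng = np.random.default_rng(0)
      X = rng.normal(size=(500, 700))   # stand-in for 500 sequences x 700D features
      Z = pca_reduce(X, 224)
      print(Z.shape)                    # (500, 224)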

  12. Research on the optimization strategy of web search engine based on data mining

    Science.gov (United States)

    Chen, Ronghua

    2018-04-01

    With the wide application of search engines, websites have become an important way for people to obtain information, and their content is growing in an explosive manner, which makes it very difficult for people to find the information they need; existing search engines cannot fully meet this need, so there is an urgent need for websites to provide personalized information services, and data mining technology offers a breakthrough for this new challenge. In order to improve the accuracy with which people find information on websites, a website search engine optimization strategy based on data mining is proposed and verified by a website search engine optimization experiment. The results show that the proposed strategy improves the accuracy with which people find information and reduces the time needed to find it. It has important practical value.

  13. Addressing special structure in the relevance feedback learning problem through aspect-based image search

    NARCIS (Netherlands)

    M.J. Huiskes (Mark)

    2004-01-01

    In this paper we focus on a number of issues regarding special structure in the relevance feedback learning problem, most notably the effects of image selection based on partial relevance on the clustering behavior of examples. We propose a simple scheme, aspect-based image search, which

  14. Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE.

    Science.gov (United States)

    Demelo, Jonathan; Parsons, Paul; Sedig, Kamran

    2017-02-02

    Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts.

  15. Neurally and ocularly informed graph-based models for searching 3D environments.

    Science.gov (United States)

    Jangraw, David C; Wang, Jun; Lance, Brent J; Chang, Shih-Fu; Sajda, Paul

    2014-08-01

    As we move through an environment, we are constantly making assessments, judgments and decisions about the things we encounter. Some are acted upon immediately, but many more become mental notes or fleeting impressions-our implicit 'labeling' of the world. In this paper, we use physiological correlates of this labeling to construct a hybrid brain-computer interface (hBCI) system for efficient navigation of a 3D environment. First, we record electroencephalographic (EEG), saccadic and pupillary data from subjects as they move through a small part of a 3D virtual city under free-viewing conditions. Using machine learning, we integrate the neural and ocular signals evoked by the objects they encounter to infer which ones are of subjective interest to them. These inferred labels are propagated through a large computer vision graph of objects in the city, using semi-supervised learning to identify other, unseen objects that are visually similar to the labeled ones. Finally, the system plots an efficient route to help the subjects visit the 'similar' objects it identifies. We show that by exploiting the subjects' implicit labeling to find objects of interest instead of exploring naively, the median search precision is increased from 25% to 97%, and the median subject need only travel 40% of the distance to see 84% of the objects of interest. We also find that the neural and ocular signals contribute in a complementary fashion to the classifiers' inference of subjects' implicit labeling. In summary, we show that neural and ocular signals reflecting subjective assessment of objects in a 3D environment can be used to inform a graph-based learning model of that environment, resulting in an hBCI system that improves navigation and information delivery specific to the user's interests.

  16. Modified Three-Step Search Block Matching Motion Estimation and Weighted Finite Automata based Fractal Video Compression

    Directory of Open Access Journals (Sweden)

    Shailesh Kamble

    2017-08-01

    Full Text Available The major challenge with fractal image/video coding techniques is that they require more encoding time; therefore, how to reduce the encoding time remains the open research question in fractal coding. Block matching motion estimation algorithms are used to reduce the computations performed in the process of encoding. The objective of the proposed work is to develop an approach for video coding using a modified three-step search (MTSS) block matching algorithm and weighted finite automata (WFA) coding, with a specific focus on reducing the encoding time. The MTSS block matching algorithm is used for computing motion vectors between two frames, i.e., the displacement of pixels, and WFA is used for coding as it behaves like fractal coding (FC). WFA represents an image (frame) or motion-compensated prediction error based on the fractal idea that the image has self-similarity in itself. The self-similarity is sought from the symmetry of an image, so the encoding algorithm divides an image into multiple levels of quad-tree segmentation and creates an automaton from the sub-images. The proposed MTSS block matching algorithm is based on the combination of rectangular and hexagonal search patterns and is compared with the existing New Three-Step Search (NTSS), Three-Step Search (TSS), and Efficient Three-Step Search (ETSS) block matching estimation algorithms. The performance of the proposed MTSS block matching algorithm is evaluated on the basis of two parameters, i.e., mean absolute difference (MAD) and average search points required per frame, with the mean absolute difference (MAD) distortion function used as the block distortion measure (BDM). Finally, the developed approaches, namely, MTSS and WFA, MTSS and FC, and plain FC (applied on every frame), are compared with each other. The experiments are carried out on standard uncompressed video databases, namely, akiyo, bus, mobile, suzie, traffic, football, soccer, ice, etc.
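
    For reference, the classic three-step search that MTSS modifies fits in a few lines of Python; the square 9-point pattern below is the standard TSS (the rectangular/hexagonal MTSS pattern is not reproduced), and the synthetic frames, block position, and block size are illustrative assumptions.

      import numpy as np

      def sad(a, b):                               # block distortion measure
          return np.abs(a - b).sum()

      def three_step_search(ref, cur, by, bx, block=16, step=4):
          # Classic TSS: test 9 points around the current best, then halve the step.
          target = cur[by:by + block, bx:bx + block]
          best, best_cost = (0, 0), sad(ref[by:by + block, bx:bx + block], target)
          while step >= 1:
              cy, cx = best
              for dy in (-step, 0, step):
                  for dx in (-step, 0, step):
                      y, x = by + cy + dy, bx + cx + dx
                      if 0 <= y <= ref.shape[0] - block and 0 <= x <= ref.shape[1] - block:
                          cost = sad(ref[y:y + block, x:x + block], target)
                          if cost < best_cost:
                              best_cost, best = cost, (cy + dy, cx + dx)
              step //= 2
          return best                              # offset of the best match in ref

      yy, xx = np.mgrid[0:64, 0:64]
      ref = np.sin(xx / 5.0) + np.cos(yy / 7.0)    # smooth synthetic frame
      cur = np.roll(ref, (2, 3), axis=(0, 1))      # next frame, shifted by (2, 3)
      print(three_step_search(ref, cur, 24, 24))   # (-2, -3), i.e. motion of (2, 3)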

  17. Optimal Search Strategy of Robotic Assembly Based on Neural Vibration Learning

    Directory of Open Access Journals (Sweden)

    Lejla Banjanovic-Mehmedovic

    2011-01-01

    Full Text Available This paper presents the implementation of an optimal search strategy (OSS) in verification of an assembly process based on neural vibration learning. The application problem is the complex robotic assembly of miniature parts, exemplified by mating the gears of a multistage planetary speed reducer. Assembly of the tube over the planetary gears was identified as the most difficult problem of the overall assembly. The favourable influence of vibration and rotation movement on compensation of tolerance was also observed. With the proposed neural-network-based learning algorithm, it is possible to find an extended scope of the vibration state parameter. Using an optimal search strategy based on the minimal-distance path between vibration parameter stage sets (amplitudes and frequencies of the robot gripper's vibration) and a recovery parameter algorithm, we can improve the robot assembly behaviour, that is, allow the fastest possible way of mating. We have verified by using simulation programs that the search strategy is suitable for situations of unexpected events due to uncertainties.

  18. Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments

    Science.gov (United States)

    Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Storrs, Katherine R.; Mur, Marieke

    2017-01-01

    Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs’ performance compares to that of non-computational “conceptual” models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., “eye”) and category labels (e.g., “animal”) for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. Models derived from visual features

  19. Filling Predictable and Unpredictable Gaps, with and without Similarity-Based Interference: Evidence for LIFG Effects of Dependency Processing.

    Science.gov (United States)

    Leiken, Kimberly; McElree, Brian; Pylkkänen, Liina

    2015-01-01

    One of the most replicated findings in neurolinguistic literature on syntax is the increase of hemodynamic activity in the left inferior frontal gyrus (LIFG) in response to object relative (OR) clauses compared to subject relative clauses. However, behavioral studies have shown that ORs are primarily only costly when similarity-based interference is involved and recently, Leiken and Pylkkänen (2014) showed with magnetoencephalography (MEG) that an LIFG increase at an OR gap is also dependent on such interference. However, since ORs always involve a cue indicating an upcoming dependency formation, OR dependencies could be processed already prior to the gap-site and thus show no sheer dependency effects at the gap itself. To investigate the role of gap predictability in LIFG dependency effects, this MEG study compared ORs to verb phrase ellipsis (VPE), which was used as an example of a non-predictable dependency. Additionally, we explored LIFG sensitivity to filler-gap order by including right node raising structures, in which the order of filler and gap is reverse to that of ORs and VPE. Half of the stimuli invoked similarity-based interference and half did not. Our results demonstrate that LIFG effects of dependency can be elicited regardless of whether the dependency is predictable, the stimulus materials evoke similarity-based interference, or the filler precedes the gap. Thus, contrary to our own prior data, the current findings suggest a highly general role for the LIFG in dependency interpretation that is not limited to environments involving similarity-based interference. Additionally, the millisecond time-resolution of MEG allowed for a detailed characterization of the temporal profiles of LIFG dependency effects across our three constructions, revealing that the timing of these effects is somewhat construction-specific.

  20. Efficient Similarity Retrieval in Music Databases

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Jensen, Christian Søndergaard

    2006-01-01

    Audio music is increasingly becoming available in digital form, and the digital music collections of individuals continue to grow. Addressing the need for effective means of retrieving music from such collections, this paper proposes new techniques for content-based similarity search. Each music...

  1. Composition-based statistics and translated nucleotide searches: Improving the TBLASTN module of BLAST

    Directory of Open Access Journals (Sweden)

    Schäffer Alejandro A

    2006-12-01

    Full Text Available Abstract Background TBLASTN is a mode of operation for BLAST that aligns protein sequences to a nucleotide database translated in all six frames. We present the first description of the modern implementation of TBLASTN, focusing on new techniques that were used to implement composition-based statistics for translated nucleotide searches. Composition-based statistics use the composition of the sequences being aligned to generate more accurate E-values, which allows for a more accurate distinction between true and false matches. Until recently, composition-based statistics were available only for protein-protein searches. They are now available as a command line option for recent versions of TBLASTN and as an option for TBLASTN on the NCBI BLAST web server. Results We evaluate the statistical and retrieval accuracy of the E-values reported by a baseline version of TBLASTN and by two variants that use different types of composition-based statistics. To test the statistical accuracy of TBLASTN, we ran 1000 searches using scrambled proteins from the mouse genome and a database of human chromosomes. To test retrieval accuracy, we modernize and adapt to translated searches a test set previously used to evaluate the retrieval accuracy of protein-protein searches. We show that composition-based statistics greatly improve the statistical accuracy of TBLASTN, at a small cost to the retrieval accuracy. Conclusion TBLASTN is widely used, as it is common to wish to compare proteins to chromosomes or to libraries of mRNAs. Composition-based statistics improve the statistical accuracy, and therefore the reliability, of TBLASTN results. The algorithms used by TBLASTN are not widely known, and some of the most important are reported here. The data used to test TBLASTN are available for download and may be useful in other studies of translated search algorithms.

  2. Identification of Fuzzy Inference Systems by Means of a Multiobjective Opposition-Based Space Search Algorithm

    Directory of Open Access Journals (Sweden)

    Wei Huang

    2013-01-01

    Full Text Available We introduce a new category of fuzzy inference systems with the aid of a multiobjective opposition-based space search algorithm (MOSSA). The proposed MOSSA is essentially a multiobjective space search algorithm improved by using opposition-based learning, which employs a so-called opposite-numbers mechanism to speed up the convergence of the optimization algorithm. In the identification of fuzzy inference systems, the MOSSA is exploited to carry out the parametric identification of the fuzzy model as well as to realize its structural identification. Experimental results demonstrate the effectiveness of the proposed fuzzy models.

  3. Efficient data retrieval method for similar plasma waveforms in EAST

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Ying, E-mail: liuying-ipp@szu.edu.cn [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Huang, Jianjun; Zhou, Huasheng; Wang, Fan [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Wang, Feng [Institute of Plasma Physics Chinese Academy of Sciences, Hefei 230031 (China)

    2016-11-15

    Highlights: • The proposed method is carried out by means of a bounding envelope and the angle distance. • It allows retrieval of whole similar waveforms of any time length. • In addition, the proposed method can also retrieve subsequences. - Abstract: Fusion research relies heavily on data analysis due to its massive database. In the present work, we propose an efficient method for searching and retrieving similar plasma waveforms in the Experimental Advanced Superconducting Tokamak (EAST). Based on Piecewise Linear Aggregate Approximation (PLAA) for extracting feature values, the searching process is accomplished in two steps. The first is coarse searching to narrow down the search space, carried out by means of the bounding envelope. The second is fine searching to retrieve similar waveforms, implemented with the angle distance. The proposed method is tested on EAST databases and turns out to have good performance in retrieving similar waveforms.
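
    To illustrate the two ingredients, the sketch below computes simple piecewise aggregate features (per-segment mean and slope, a simplified stand-in for the paper's PLAA descriptor) and ranks candidate waveforms by angle distance; the segment count and test signals are illustrative assumptions, and the envelope-based coarse search is omitted.

      import numpy as np

      def paa(x, n_segments):
          # Per-segment mean and fitted slope as a compact waveform descriptor.
          segs = np.array_split(np.asarray(x, dtype=float), n_segments)
          means = np.array([s.mean() for s in segs])
          slopes = np.array([np.polyfit(np.arange(len(s)), s, 1)[0] for s in segs])
          return np.concatenate([means, slopes])

      def angle_distance(u, v):
          # Angle between feature vectors, used here for fine-search ranking.
          cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
          return np.arccos(np.clip(cos, -1.0, 1.0))

      t = np.linspace(0, 1, 1000)
      query = np.sin(2 * np.pi * 3 * t)
      candidate = np.sin(2 * np.pi * 3 * t + 0.1)             # similar waveform
      unrelated = np.random.default_rng(2).normal(size=1000)  # dissimilar waveform
      f = lambda w: paa(w, 20)
      print(angle_distance(f(query), f(candidate))
            < angle_distance(f(query), f(unrelated)))         # True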

  4. Noise suppression for dual-energy CT via penalized weighted least-square optimization with similarity-based regularization

    Energy Technology Data Exchange (ETDEWEB)

    Harms, Joseph; Wang, Tonghe; Petrongolo, Michael; Zhu, Lei, E-mail: leizhu@gatech.edu [Nuclear and Radiological Engineering and Medical Physics Programs, The George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332 (United States); Niu, Tianye [Sir Run Run Shaw Hospital, Zhejiang University School of Medicine (China); Institute of Translational Medicine, Zhejiang University, Hangzhou, Zhejiang, 310016 (China)

    2016-05-15

    Purpose: Dual-energy CT (DECT) expands applications of CT imaging in its capability to decompose CT images into material images. However, decomposition via direct matrix inversion leads to large noise amplification and limits quantitative use of DECT. The authors' group has previously developed a noise suppression algorithm via penalized weighted least-square optimization with edge-preservation regularization (PWLS-EPR). In this paper, the authors improve method performance using the same framework of penalized weighted least-square optimization but with similarity-based regularization (PWLS-SBR), which substantially enhances the quality of decomposed images by retaining a more uniform noise power spectrum (NPS). Methods: The design of PWLS-SBR is based on the fact that averaging pixels of similar materials gives a low-noise image. For each pixel, the authors calculate the similarity to other pixels in its neighborhood by comparing CT values. Using an empirical Gaussian model, the authors assign a high/low similarity value to a neighboring pixel if its CT value is close/far to the CT value of the pixel of interest. These similarity values are organized in matrix form, such that multiplication of the similarity matrix with the image vector reduces image noise. The similarity matrices are calculated on both high- and low-energy CT images and averaged. In PWLS-SBR, the authors include a regularization term to minimize the L-2 norm of the difference between the images without and with noise suppression via similarity matrix multiplication. By using all pixel information of the initial CT images rather than just those lying on or near edges, PWLS-SBR is superior to the previously developed PWLS-EPR, as supported by comparison studies on phantoms and a head-and-neck patient. Results: On the line-pair slice of the Catphan©600 phantom, PWLS-SBR outperforms PWLS-EPR and retains spatial resolution of 8 lp/cm, comparable to the original CT images, even at 90% reduction in noise
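
    The core idea, averaging each pixel with neighbors of similar CT value, can be sketched as a small range-weighted filter in Python; the neighborhood radius and Gaussian width below are illustrative assumptions, and the actual PWLS-SBR method embeds the similarity matrix in a penalized weighted least-square objective rather than applying it directly.

      import numpy as np

      def similarity_smooth(img, radius=2, sigma=20.0):
          # Weight each neighbor by a Gaussian in the CT-value difference,
          # so only pixels of similar material contribute to the average.
          h, w = img.shape
          out = np.zeros_like(img, dtype=float)
          for i in range(h):
              for j in range(w):
                  y0, y1 = max(0, i - radius), min(h, i + radius + 1)
                  x0, x1 = max(0, j - radius), min(w, j + radius + 1)
                  patch = img[y0:y1, x0:x1].astype(float)
                  wgt = np.exp(-((patch - img[i, j]) ** 2) / (2 * sigma ** 2))
                  out[i, j] = (wgt * patch).sum() / wgt.sum()
          return out

      rng = np.random.default_rng(0)
      phantom = np.zeros((32, 32))
      phantom[8:24, 8:24] = 100.0                  # two 'materials'
      noisy = phantom + rng.normal(0, 10, phantom.shape)
      smooth = similarity_smooth(noisy)
      print(round(noisy[:8, :8].std(), 1), round(smooth[:8, :8].std(), 1))
      # background noise drops markedly while the 0/100 edge is preserved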

  5. GEPSI: A Gene Expression Profile Similarity-Based Identification Method of Bioactive Components in Traditional Chinese Medicine Formula.

    Science.gov (United States)

    Zhang, Baixia; He, Shuaibing; Lv, Chenyang; Zhang, Yanling; Wang, Yun

    2018-01-01

    The identification of bioactive components in traditional Chinese medicine (TCM) is an important part of the TCM material foundation research. Recently, molecular docking technology has been extensively used for the identification of TCM bioactive components. However, target proteins that are used in molecular docking may not be the actual TCM target. For this reason, the bioactive components would likely be omitted or incorrect. To address this problem, this study proposed the GEPSI method that identified the target proteins of TCM based on the similarity of gene expression profiles. The similarity of the gene expression profiles affected by TCM and small molecular drugs was calculated. The pharmacological action of TCM may be similar to that of small molecule drugs that have a high similarity score. Indeed, the target proteins of the small molecule drugs could be considered TCM targets. Thus, we identified the bioactive components of a TCM by molecular docking and verified the reliability of this method by a literature investigation. Using the target proteins that TCM actually affected as targets, the identification of the bioactive components was more accurate. This study provides a fast and effective method for the identification of TCM bioactive components.

  6. Power law-based local search in spider monkey optimisation for lower order system modelling

    Science.gov (United States)

    Sharma, Ajay; Sharma, Harish; Bhargava, Annapurna; Sharma, Nirmala

    2017-01-01

    Nature-inspired algorithms (NIAs) have shown efficiency in solving many complex real-world optimisation problems. The efficiency of NIAs is measured by their ability to find adequate results within a reasonable amount of time, rather than an ability to guarantee the optimal solution. This paper presents a solution for lower order system modelling using the spider monkey optimisation (SMO) algorithm, to obtain a better approximation for lower order systems that reflects almost all of the original higher order system's characteristics. Further, a local search strategy, namely power law-based local search, is incorporated with SMO. The proposed strategy is named power law-based local search in SMO (PLSMO). The efficiency, accuracy and reliability of the proposed algorithm are tested over 20 well-known benchmark functions. Then, the PLSMO algorithm is applied to solve the lower order system modelling problem.
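
    One plausible reading of the power law-based local search is sketched below: perturbation lengths are drawn from a heavy-tailed Pareto distribution, so most moves are small refinements with occasional long escapes; the distribution, its exponent, and the sphere test function are assumptions for illustration, not the paper's exact operator.

      import numpy as np

      def power_law_step(rng, scale=0.1, alpha=2.0, size=2):
          # Heavy-tailed step length times a uniformly random direction.
          length = scale * (rng.pareto(alpha) + 1.0)
          direction = rng.normal(size=size)
          return length * direction / np.linalg.norm(direction)

      def local_search(f, x, iters=500, seed=6):
          # Greedy improvement: accept a perturbed point only if it is better.
          rng = np.random.default_rng(seed)
          fx = f(x)
          for _ in range(iters):
              cand = x + power_law_step(rng)
              if f(cand) < fx:
                  x, fx = cand, f(cand)
          return x, fx

      sphere = lambda p: float(np.sum(p ** 2))
      print(round(local_search(sphere, np.array([3.0, -2.0]))[1], 4))  # close to 0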

  7. Adaptive Spatial Filter Based on Similarity Indices to Preserve the Neural Information on EEG Signals during On-Line Processing

    Directory of Open Access Journals (Sweden)

    Denis Delisle-Rodriguez

    2017-11-01

    Full Text Available This work presents a new on-line adaptive filter, which is based on a similarity analysis between standard electrode locations, in order to reduce artifacts and common interferences throughout electroencephalography (EEG) signals while preserving the useful information. The standard deviation and the Concordance Correlation Coefficient (CCC) between target electrodes and their corresponding neighbor electrodes are analyzed on sliding windows to select those neighbors that are highly correlated. Afterwards, a model based on CCC is applied to give higher weight to those correlated electrodes with lower similarity to the target electrode. The approach was applied to brain-computer interfaces (BCIs) based on Canonical Correlation Analysis (CCA) to recognize 40 targets of steady-state visual evoked potential (SSVEP), providing an accuracy (ACC) of 86.44 ± 2.81%. In addition, also using this approach, features of low frequency were selected in the pre-processing stage of another BCI to recognize gait planning. In this case, the recognition was significantly (p < 0.01) improved for most of the subjects (ACC ≥ 74.79%), when compared with other BCIs based on Common Spatial Pattern, Filter Bank-Common Spatial Pattern, and Riemannian Geometry.
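
    The neighbor-selection statistic is the Concordance Correlation Coefficient; a minimal Python version is shown below on synthetic channels (signal lengths and noise levels are arbitrary assumptions). Unlike Pearson correlation, CCC also penalizes differences in mean and scale, which makes it suited to testing agreement between electrode sites.

      import numpy as np

      def ccc(x, y):
          # Lin's Concordance Correlation Coefficient between two signals.
          x, y = np.asarray(x, float), np.asarray(y, float)
          sxy = np.cov(x, y, bias=True)[0, 1]
          return 2.0 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

      rng = np.random.default_rng(7)
      target = np.sin(np.linspace(0, 20, 500))
      near = target + rng.normal(0, 0.2, 500)   # highly concordant neighbor channel
      far = rng.normal(0, 1.0, 500)             # unrelated channel
      print(round(ccc(target, near), 2), round(ccc(target, far), 2))  # high vs. near 0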

  8. High-power Yb-fiber comb based on pre-chirped-management self-similar amplification

    Science.gov (United States)

    Luo, Daping; Liu, Yang; Gu, Chenglin; Wang, Chao; Zhu, Zhiwei; Zhang, Wenchao; Deng, Zejiang; Zhou, Lian; Li, Wenxue; Zeng, Heping

    2018-02-01

    We report a fiber self-similar-amplification (SSA) comb system that delivers a 250-MHz, 109-W, 42-fs pulse train with a 10-dB spectral width of 85 nm at 1056 nm. A pair of grisms is employed to compensate the group velocity dispersion and third-order dispersion of pre-amplified pulses for facilitating a self-similar evolution and a self-phase modulation (SPM). Moreover, we analyze the stabilities and noise characteristics of both the locked carrier envelope phase and the repetition rate, verifying the stability of the generated high-power comb. The demonstration of the SSA comb at such high power proves the feasibility of the SPM-based low-noise ultrashort comb.

  9. Efficient Multi-keyword Ranked Search over Outsourced Cloud Data based on Homomorphic Encryption

    Directory of Open Access Journals (Sweden)

    Nie Mengxi

    2016-01-01

    Full Text Available With the development of cloud computing, more and more data owners are motivated to outsource their data to cloud servers for great flexibility and reduced expenditure. Because the security of outsourced data must be guaranteed, encryption methods must be used, which makes traditional data utilization based on plaintext, e.g., keyword search, obsolete. Some schemes have been proposed to enable search over encrypted data, e.g., top-k single- or multiple-keyword retrieval, but their efficiency is not high enough to be practical in cloud computing. In this paper, we propose a new scheme based on homomorphic encryption to solve the challenging problem of privacy-preserving, efficient multi-keyword ranked search over outsourced cloud data. In our scheme, the inner product is adopted to measure relevance scores, and the technique of relevance feedback is used to reflect the search preferences of the data users. Security analysis shows that the proposed scheme can meet strict privacy requirements for such a secure cloud data utilization system. Performance evaluation demonstrates that the proposed scheme achieves low overhead in both computation and communication.
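
    Stripped of the encryption layer, the ranking core is an inner product between a query vector and per-document index vectors, as in the toy plaintext Python fragment below; in the actual scheme both sides would be protected by homomorphic encryption, and the vocabulary and vectors here are invented for illustration.

      import numpy as np

      vocab = ["cloud", "search", "privacy", "rank"]   # position i marks term vocab[i]
      docs = np.array([[1, 0, 1, 0],                   # term-presence vectors, one per doc
                       [0, 1, 1, 1],
                       [1, 1, 0, 0]], dtype=float)
      query = np.array([0, 1, 1, 0], dtype=float)      # query: "search privacy"
      scores = docs @ query                            # inner-product relevance scores
      print(scores.argsort()[::-1])                    # doc 1 ranks first (both terms match)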

  10. EARS: An Online Bibliographic Search and Retrieval System Based on Ordered Explosion.

    Science.gov (United States)

    Ramesh, R.; Drury, Colin G.

    1987-01-01

    Provides overview of Ergonomics Abstracts Retrieval System (EARS), an online bibliographic search and retrieval system in the area of human factors engineering. Other online systems are described, the design of EARS based on inverted file organization is explained, and system expansions including a thesaurus are discussed. (Author/LRW)

  11. Agent-oriented Architecture for Task-based Information Search System

    NARCIS (Netherlands)

    Aroyo, Lora; de Bra, Paul M.E.; De Bra, P.; Hardman, L.

    1999-01-01

    The topic of the reported research discusses an agent-oriented architecture of an educational information search system AIMS - a task-based learner support system. It is implemented within the context of 'Courseware Engineering' on-line course at the Faculty of Educational Science and Technology,

  12. COORDINATE-BASED META-ANALYTIC SEARCH FOR THE SPM NEUROIMAGING PIPELINE

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej; Szewczyk, Marcin; Rasmussen, Peter Mondrup

    2009-01-01

    . BredeQuery offers a direct link from SPM5 to the Brede Database coordinate-based search engine. BredeQuery is able to ‘grab’ brain location coordinates from the SPM windows and enter them as a query for the Brede Database. Moreover, results of the query can be displayed in an SPM window and/or exported...

  13. A novel approach towards skill-based search and services of Open Educational Resources

    NARCIS (Netherlands)

    Ha, Kyung-Hun; Niemann, Katja; Schwertel, Uta; Holtkamp, Philipp; Pirkkalainen, Henri; Börner, Dirk; Kalz, Marco; Pitsilis, Vassilis; Vidalis, Ares; Pappa, Dimitra; Bick, Markus; Pawlowski, Jan; Wolpers, Martin

    2011-01-01

    Ha, K.-H., Niemann, K., Schwertel, U., Holtkamp, P., Pirkkalainen, H., Börner, D. et al (2011). A novel approach towards skill-based search and services of Open Educational Resources. In E. Garcia-Barriocanal, A. Öztürk, & M. C. Okur (Eds.), Metadata and Semantics Research: 5th International

  14. Promoting evidence based medicine in preclinical medical students via a federated literature search tool.

    Science.gov (United States)

    Keim, Samuel Mark; Howse, David; Bracke, Paul; Mendoza, Kathryn

    2008-01-01

    Medical educators are increasingly faced with directives to teach Evidence Based Medicine (EBM) skills. Because of its nature, integrating fundamental EBM educational content is a challenge in the preclinical years. To analyse preclinical medical student user satisfaction and feedback regarding a clinical EBM search strategy. The authors introduced a custom EBM search option with a self-contained education structure to first-year medical students. The implementation took advantage of a major curricular change towards case-based instruction. Medical student views and experiences were studied regarding the tool's convenience, problems and the degree to which they used it to answer questions raised by case-based instruction. Surveys were completed by 70% of the available first-year students. Student satisfaction and experiences were strongly positive towards the EBM strategy, especially of the tool's convenience and utility for answering issues raised during case-based learning sessions. About 90% of the students responded that the tool was easy to use, productive and accessed for half or more of their search needs. This study provides evidence that the integration of an educational EBM search tool can be positively received by preclinical medical students.

  15. Web-Based Search and Plot System for Nuclear Reaction Data

    International Nuclear Information System (INIS)

    Otuka, N.; Nakagawa, T.; Fukahori, T.; Katakura, J.; Aikawa, M.; Suda, T.; Naito, K.; Korennov, S.; Arai, K.; Noto, H.; Ohnishi, A.; Kato, K.

    2005-01-01

    A web-based search and plot system for nuclear reaction data has been developed, covering experimental data in EXFOR format and evaluated data in ENDF format. The system is implemented for Linux OS, with Perl and MySQL used for CGI scripts and the database manager, respectively. Two prototypes for experimental and evaluated data are presented

  16. MotionExplorer: exploratory search in human motion capture data based on hierarchical aggregation.

    Science.gov (United States)

    Bernard, Jürgen; Wilhelm, Nils; Krüger, Björn; May, Thorsten; Schreck, Tobias; Kohlhammer, Jörn

    2013-12-01

    We present MotionExplorer, an exploratory search and analysis system for sequences of human motion in large motion capture data collections. This special type of multivariate time series data is relevant in many research fields including medicine, sports and animation. Key tasks in working with motion data include analysis of motion states and transitions, and synthesis of motion vectors by interpolation and combination. In the practice of research and application of human motion data, challenges exist in providing visual summaries and drill-down functionality for handling large motion data collections. We find that this domain can benefit from appropriate visual retrieval and analysis support to handle these tasks in the presence of large motion data. To address this need, we developed MotionExplorer together with domain experts as an exploratory search system based on interactive aggregation and visualization of motion states as a basis for data navigation, exploration, and search. Based on an overview-first type visualization, users are able to search for interesting sub-sequences of motion based on a query-by-example metaphor, and explore search results by details on demand. We developed MotionExplorer in close collaboration with the targeted users who are researchers working on human motion synthesis and analysis, including a summative field study. Additionally, we conducted a laboratory design study to substantially improve MotionExplorer towards an intuitive, usable and robust design. MotionExplorer enables search in human motion capture data with only a few mouse clicks. The researchers unanimously confirm that the system can efficiently support their work.

  17. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    Science.gov (United States)

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of fast and effective genome searches for RNA secondary structures have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum-valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to genome searches yields the same search accuracy as methods based on a covariance model, with a significant reduction in computation time. In particular, very accurate searches for tmRNAs in bacterial genomes and for telomerase RNAs in yeast genomes can be accomplished in days, as opposed to the months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at http://w.uga.edu/RNA-informatics/software/index.php.

  18. Multiscale sample entropy and cross-sample entropy based on symbolic representation and similarity of stock markets

    Science.gov (United States)

    Wu, Yue; Shang, Pengjian; Li, Yilong

    2018-03-01

    A modified multiscale sample entropy measure based on symbolic representation and similarity (MSEBSS) is proposed in this paper to research the complexity of stock markets. The modified algorithm reduces the probability of inducing undefined entropies and is confirmed to be robust to strong noise. Considering validity and accuracy, MSEBSS is more reliable than multiscale entropy (MSE) for time series mingled with much noise, like financial time series. We apply MSEBSS to financial markets, and results show that American stock markets have the lowest complexity compared with European and Asian markets. There are exceptions to the regularity that stock markets show decreasing complexity over the time scale, indicating a periodicity at certain scales. Based on MSEBSS, we introduce the modified multiscale cross-sample entropy measure based on symbolic representation and similarity (MCSEBSS) to consider the degree of asynchrony between distinct time series. Stock markets from the same area have higher synchrony than those from different areas. For stock markets having relatively high synchrony, the entropy values decrease with increasing scale factor, while for stock markets having high asynchrony, the entropy values do not decrease with increasing scale factor; sometimes they tend to increase. So both MSEBSS and MCSEBSS are able to distinguish stock markets of different areas, and they are more helpful if used together for studying other features of financial time series.
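
    The multiscale machinery that MSEBSS builds on can be sketched as standard coarse-graining plus sample entropy, as below; the symbolic-representation and similarity refinements of MSEBSS are not reproduced, and m, r, and the white-noise test series are conventional illustrative choices.

      import numpy as np

      def sample_entropy(x, m=2, r=0.2):
          # Plain sample entropy with an absolute tolerance r.
          x = np.asarray(x, float)
          def count(mm):
              t = np.lib.stride_tricks.sliding_window_view(x, mm)
              c = 0
              for i in range(len(t) - 1):
                  d = np.max(np.abs(t[i + 1:] - t[i]), axis=1)   # Chebyshev distance
                  c += np.sum(d <= r)
              return c
          b, a = count(m), count(m + 1)
          return -np.log(a / b) if a > 0 and b > 0 else np.inf

      def coarse_grain(x, scale):
          # Non-overlapping window means: the standard multiscale step.
          n = len(x) // scale
          return np.asarray(x[:n * scale], float).reshape(n, scale).mean(axis=1)

      rng = np.random.default_rng(3)
      noise = rng.normal(size=2000)
      tol = 0.2 * noise.std()   # fix r from the original series, as in multiscale entropy
      print([round(sample_entropy(coarse_grain(noise, s), r=tol), 2) for s in (1, 2, 4)])
      # entropy falls with scale for white noise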

  19. Theoretical and Empirical Analyses of an Improved Harmony Search Algorithm Based on Differential Mutation Operator

    Directory of Open Access Journals (Sweden)

    Longquan Yong

    2012-01-01

    Harmony search (HS) is an emerging metaheuristic optimization algorithm. In this paper, an improved harmony search method based on a differential mutation operator (IHSDE) is proposed to deal with optimization problems. Since population diversity plays an important role in the behavior of evolutionary algorithms, the aim of this paper is to calculate the expected population mean and variance of IHSDE from a theoretical viewpoint. Numerical results, compared with HSDE and NGHS, show that the IHSDE method has good convergence properties over a test suite of well-known benchmark functions.
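
    The abstract does not give IHSDE's exact improvisation rule, so the sketch below only illustrates the general idea: keep the harmony-search skeleton but replace pitch adjustment with a DE-style differential mutation over members of the harmony memory. All parameter values (memory size, hmcr, F) are illustrative.

```python
import numpy as np

def ihsde_sketch(f, dim=10, lo=-5.0, hi=5.0, hms=20, hmcr=0.9,
                 F=0.5, iters=5000, seed=0):
    """Harmony search with a DE-style mutation in place of pitch adjustment.

    With probability hmcr a component is drawn from the harmony memory and
    perturbed by a scaled difference of two random memory members
    (differential mutation); otherwise it is drawn uniformly at random.
    """
    rng = np.random.default_rng(seed)
    hm = rng.uniform(lo, hi, (hms, dim))          # harmony memory
    fit = np.apply_along_axis(f, 1, hm)
    for _ in range(iters):
        new = np.empty(dim)
        for j in range(dim):
            if rng.random() < hmcr:
                base = hm[rng.integers(hms), j]
                r1, r2 = rng.integers(hms, size=2)
                new[j] = base + F * (hm[r1, j] - hm[r2, j])  # differential mutation
            else:
                new[j] = rng.uniform(lo, hi)
        new = np.clip(new, lo, hi)
        worst = np.argmax(fit)
        if f(new) < fit[worst]:                   # replace worst harmony
            hm[worst], fit[worst] = new, f(new)
    best = np.argmin(fit)
    return hm[best], fit[best]

if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x * x))
    x, fx = ihsde_sketch(sphere)
    print(f"best value: {fx:.3e}")
```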

  20. Project GRACE A grid based search tool for the global digital library

    CERN Document Server

    Scholze, Frank; Vigen, Jens; Prazak, Petra; The Seventh International Conference on Electronic Theses and Dissertations

    2004-01-01

    The paper will report on the progress of an ongoing EU project called GRACE - Grid Search and Categorization Engine (http://www.grace-ist.org). The project participants are CERN, Sheffield Hallam University, Stockholm University, Stuttgart University, GL 2006 and Telecom Italia. The project started in 2002 and will finish in 2005, resulting in a Grid-based search engine that will search across a variety of content sources, including a number of electronic thesis and dissertation repositories. The Open Archives Initiative (OAI) is expanding and is clearly an interesting movement for a community advocating open access to ETDs. However, the OAI approach alone may not be sufficiently scalable to achieve a truly global ETD Digital Library. Many universities simply offer their collections to the world via their local web services without being part of any federated system for archiving, and even those dissertations that are provided with OAI-compliant metadata will not necessarily be picked up by a centralized OAI Ser...

  1. A hybrid firefly algorithm and pattern search technique for SSSC based power oscillation damping controller design

    Directory of Open Access Journals (Sweden)

    Srikanta Mahapatra

    2014-12-01

    In this paper, a novel hybrid Firefly Algorithm and Pattern Search (h-FAPS) technique is proposed for a Static Synchronous Series Compensator (SSSC)-based power oscillation damping controller design. The proposed h-FAPS technique takes advantage of the global search capability of FA and the local search facility of PS. In order to tackle the drawback of using a remote signal that may affect the reliability of the controller, a modified signal equivalent to the remote speed deviation signal is constructed from local measurements. The performance of the proposed controllers is evaluated in single-machine infinite-bus (SMIB) and multi-machine power systems subjected to various transient disturbances. To show the effectiveness and robustness of the proposed design approach, simulation results are presented and compared with some recently published approaches such as Differential Evolution (DE) and Particle Swarm Optimization (PSO). It is observed that the proposed approach yields superior damping performance compared to some recently reported approaches.
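
    The hybrid's division of labor is a global phase followed by local polishing. As a hedged illustration of the PS half, the compass-style pattern search below could refine the best solution returned by a firefly (or any other global) optimizer; the step, shrink factor and tolerance are illustrative, not values from the paper.

```python
import numpy as np

def pattern_search(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_iter=1000):
    """Compass-style pattern search: probe +/- step along each coordinate,
    move to any improving point, otherwise shrink the step.  This is the
    kind of local-refinement phase a hybrid like h-FAPS would run on the
    best solution found by the global (firefly) phase."""
    x, fx = np.asarray(x0, float), f(x0)
    for _ in range(max_iter):
        improved = False
        for j in range(len(x)):
            for sign in (+1.0, -1.0):
                trial = x.copy()
                trial[j] += sign * step
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= shrink          # refine the mesh
            if step < tol:
                break
    return x, fx

if __name__ == "__main__":
    rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    print(pattern_search(rosen, np.array([-1.2, 1.0])))
```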

  2. Cloud based automated framework for semantic rich ontology construction and similarity computation for E-health applications

    Directory of Open Access Journals (Sweden)

    T. MuthamilSelvan

    An ontology structure, a core of the semantic web, is an excellent tool for knowledge representation and semantic visualization. Moreover, knowledge reuse is made possible by estimating the similarity between two ontologies, estimating a threshold, and applying simple if-then rules to check relevancy and irrelevancy. Reduced semantic representations of an ontology provide reduced knowledge visualization, which is a critical problem especially for e-health data processing and analysis. This usually occurs due to the presence of implicit knowledge and polymorphic objects; the ontology can be made semantically rich during construction by resolving this implicit knowledge, which occurs in the form of non-dominant words and conditional dependence actions. This paper presents an automated framework that constructs semantically rich ontology structures and stores them in a repository. The construction uses dyadic deontic logic based Graph Derivation Representation in order to construct semantically rich ontologies. Moreover, in order to retrieve the set of documents relevant to a cloud user's document, the degree of similarity between two ontologies is estimated using the traditional cosine similarity measure, and simple if-then rules are used to determine the number of relevant documents and obtain their metadata for further processing. These modules will be extremely beneficial to authenticated cloud users for document retrieval, information extraction and domain dictionary construction, which are especially useful for e-health applications. The proposed framework is implemented using a diabetes dataset, and the effectiveness of the experimental results is high compared to other Graph Derivation Representation methods. The graphical results shown in the paper are an added visualization of the performance of the proposed framework. Keywords: Ontology, Implicit knowledge, Conditional dependence, Graph
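
    As a rough illustration of the retrieval step, the sketch below combines a bag-of-words cosine similarity with a threshold-based if-then relevancy rule. The vectorization and the threshold value are assumptions for the example; the paper computes similarity between ontology structures rather than raw term vectors.

```python
import numpy as np
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    vocab = sorted(set(a) | set(b))
    va = np.array([a.get(t, 0) for t in vocab], float)
    vb = np.array([b.get(t, 0) for t in vocab], float)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom else 0.0

def relevant_documents(query_terms, corpus, threshold=0.35):
    """Simple if-then relevancy rule: a document is relevant iff its
    similarity to the query exceeds the threshold (value is illustrative)."""
    q = Counter(query_terms)
    hits = []
    for doc_id, terms in corpus.items():
        sim = cosine_similarity(q, Counter(terms))
        if sim >= threshold:              # the if-then rule
            hits.append((doc_id, sim))
    return sorted(hits, key=lambda t: -t[1])

if __name__ == "__main__":
    corpus = {
        "d1": "diabetes glucose insulin therapy".split(),
        "d2": "retinal imaging screening".split(),
    }
    print(relevant_documents("diabetes insulin dosage".split(), corpus))
```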

  3. A Fast Framework for Abrupt Change Detection Based on Binary Search Trees and Kolmogorov Statistic

    Science.gov (United States)

    Qi, Jin-Peng; Qi, Jie; Zhang, Qing

    2016-01-01

    Change-point (CP) detection has attracted considerable attention in the fields of data mining and statistics; it is very meaningful to discuss how to quickly and efficiently detect abrupt changes in large-scale bioelectric signals. Currently, most existing methods, like the Kolmogorov-Smirnov (KS) statistic and so forth, are time-consuming, especially for large-scale datasets. In this paper, we propose a fast framework for abrupt change detection based on binary search trees (BSTs) and a modified KS statistic, named BSTKS (binary search trees and Kolmogorov statistic). In this method, first, two binary search trees, termed BSTcA and BSTcD, are constructed by multilevel Haar Wavelet Transform (HWT); second, three search criteria are introduced in terms of the statistic and variance fluctuations in the diagnosed time series; last, an optimal search path is detected from the root to the leaf nodes of the two BSTs. Studies on both synthetic time series samples and real electroencephalograph (EEG) recordings indicate that the proposed BSTKS can detect abrupt changes more quickly and efficiently than the KS, t-statistic (t), and Singular-Spectrum Analysis (SSA) methods, with the shortest computation time, the highest hit rate, the smallest error, and the highest accuracy of the four methods. This study suggests that the proposed BSTKS is very helpful for inspecting all kinds of bioelectric time series signals. PMID:27413364

  4. A Fast Framework for Abrupt Change Detection Based on Binary Search Trees and Kolmogorov Statistic.

    Science.gov (United States)

    Qi, Jin-Peng; Qi, Jie; Zhang, Qing

    2016-01-01

    Change-point (CP) detection has attracted considerable attention in the fields of data mining and statistics; it is very meaningful to discuss how to quickly and efficiently detect abrupt changes in large-scale bioelectric signals. Currently, most existing methods, like the Kolmogorov-Smirnov (KS) statistic and so forth, are time-consuming, especially for large-scale datasets. In this paper, we propose a fast framework for abrupt change detection based on binary search trees (BSTs) and a modified KS statistic, named BSTKS (binary search trees and Kolmogorov statistic). In this method, first, two binary search trees, termed BSTcA and BSTcD, are constructed by multilevel Haar Wavelet Transform (HWT); second, three search criteria are introduced in terms of the statistic and variance fluctuations in the diagnosed time series; last, an optimal search path is detected from the root to the leaf nodes of the two BSTs. Studies on both synthetic time series samples and real electroencephalograph (EEG) recordings indicate that the proposed BSTKS can detect abrupt changes more quickly and efficiently than the KS, t-statistic (t), and Singular-Spectrum Analysis (SSA) methods, with the shortest computation time, the highest hit rate, the smallest error, and the highest accuracy of the four methods. This study suggests that the proposed BSTKS is very helpful for inspecting all kinds of bioelectric time series signals.
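
    Two ingredients of BSTKS can be illustrated compactly: the multilevel Haar wavelet transform that produces the approximation (cA) and detail (cD) coefficients stored in the two BSTs, and the two-sample KS statistic used as the change criterion. The sketch below shows both in isolation; the tree construction and the three search criteria are not reproduced.

```python
import numpy as np

def haar_levels(x, levels=3):
    """Multilevel Haar wavelet transform: at each level, approximation
    coefficients cA (pairwise means) and detail coefficients cD (pairwise
    differences) are produced; BSTKS organizes these in two binary search
    trees (BSTcA, BSTcD)."""
    x = np.asarray(x, float)
    out = []
    for _ in range(levels):
        n = len(x) // 2 * 2
        cA = (x[0:n:2] + x[1:n:2]) / np.sqrt(2)
        cD = (x[0:n:2] - x[1:n:2]) / np.sqrt(2)
        out.append((cA, cD))
        x = cA
    return out

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: maximum distance between
    the empirical CDFs, the criterion BSTKS modifies and evaluates."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0, 1, 512), rng.normal(2, 1, 512)])
    cA, _ = haar_levels(x, levels=4)[-1]   # coarse approximation series
    mid = len(cA) // 2
    print("KS on coarse cA halves:", ks_statistic(cA[:mid], cA[mid:]))
```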

  5. Neurally and ocularly informed graph-based models for searching 3D environments

    Science.gov (United States)

    Jangraw, David C.; Wang, Jun; Lance, Brent J.; Chang, Shih-Fu; Sajda, Paul

    2014-08-01

    Objective. As we move through an environment, we are constantly making assessments, judgments and decisions about the things we encounter. Some are acted upon immediately, but many more become mental notes or fleeting impressions—our implicit ‘labeling’ of the world. In this paper, we use physiological correlates of this labeling to construct a hybrid brain-computer interface (hBCI) system for efficient navigation of a 3D environment. Approach. First, we record electroencephalographic (EEG), saccadic and pupillary data from subjects as they move through a small part of a 3D virtual city under free-viewing conditions. Using machine learning, we integrate the neural and ocular signals evoked by the objects they encounter to infer which ones are of subjective interest to them. These inferred labels are propagated through a large computer vision graph of objects in the city, using semi-supervised learning to identify other, unseen objects that are visually similar to the labeled ones. Finally, the system plots an efficient route to help the subjects visit the ‘similar’ objects it identifies. Main results. We show that by exploiting the subjects’ implicit labeling to find objects of interest instead of exploring naively, the median search precision is increased from 25% to 97%, and the median subject need only travel 40% of the distance to see 84% of the objects of interest. We also find that the neural and ocular signals contribute in a complementary fashion to the classifiers’ inference of subjects’ implicit labeling. Significance. In summary, we show that neural and ocular signals reflecting subjective assessment of objects in a 3D environment can be used to inform a graph-based learning model of that environment, resulting in an hBCI system that improves navigation and information delivery specific to the user’s interests.
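
    The propagation step can be illustrated with a generic semi-supervised label-propagation iteration on a similarity graph: seed nodes carry the interest scores inferred from the neural/ocular classifier, and scores diffuse to visually similar objects. This is a minimal sketch, not the paper's exact model; alpha and the toy graph are illustrative.

```python
import numpy as np

def propagate_labels(W, seeds, alpha=0.85, iters=100):
    """Semi-supervised label propagation on a similarity graph.

    W     : (n, n) symmetric nonnegative visual-similarity weights
    seeds : dict {node: score} from the interest classifier
    alpha : how much a node trusts its neighbours vs. its seed value
    """
    n = W.shape[0]
    # Row-normalize to get a transition matrix.
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    y = np.zeros(n)
    for node, score in seeds.items():
        y[node] = score
    f = y.copy()
    for _ in range(iters):
        f = alpha * P @ f + (1 - alpha) * y   # diffuse, then pull to seeds
    return f

if __name__ == "__main__":
    # Toy graph: nodes 0-1-2 form a similar chain, node 3 is nearly isolated.
    W = np.array([[0, 1, .5, 0],
                  [1, 0, 1, 0],
                  [.5, 1, 0, .1],
                  [0, 0, .1, 0]], float)
    print(propagate_labels(W, seeds={0: 1.0}))
```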

  6. Female choice for male cuticular hydrocarbon profile in decorated crickets is not based on similarity to their own profile.

    Science.gov (United States)

    Steiger, S; Capodeanu-Nägler, A; Gershman, S N; Weddle, C B; Rapkin, J; Sakaluk, S K; Hunt, J

    2015-12-01

    Indirect genetic benefits derived from female mate choice comprise additive (good genes) and nonadditive genetic benefits (genetic compatibility). Although good genes can be revealed by condition-dependent display traits, the mechanism by which compatibility alleles are detected is unclear because evaluation of the genetic similarity of a prospective mate requires the female to assess the genotype of the male and compare it to her own. Cuticular hydrocarbons (CHCs), lipids coating the exoskeleton of most insects, influence female mate choice in a number of species and offer a way for females to assess genetic similarity of prospective mates. Here, we determine whether female mate choice in decorated crickets is based on male CHCs and whether it is influenced by females' own CHC profiles. We used multivariate selection analysis to estimate the strength and form of selection acting on male CHCs through female mate choice, and employed different measures of multivariate dissimilarity to determine whether a female's preference for male CHCs is based on similarity to her own CHC profile. Female mating preferences were significantly influenced by CHC profiles of males. Male CHC attractiveness was not, however, contingent on the CHC profile of the choosing female, as certain male CHC phenotypes were equally attractive to most females, evidenced by significant linear and stabilizing selection gradients. These results suggest that additive genetic benefits, rather than nonadditive genetic benefits, accrue to female mate choice, in support of earlier work showing that CHC expression of males, but not females, is condition dependent. © 2015 European Society For Evolutionary Biology.

  7. Similarity-based multi-model ensemble approach for 1-15-day advance prediction of monsoon rainfall over India

    Science.gov (United States)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati

    2018-04-01

    The southwest (SW) monsoon season (June, July, August and September) is the major period of rainfall over the Indian region. The present study focuses on the development of a new multi-model ensemble approach based on a similarity criterion (SMME) for the prediction of SW monsoon rainfall in the extended range. The approach is based on the assumption that training on similar conditions may provide better forecasts than the sequential training used in conventional MME approaches. Here, the training dataset is selected by matching the present-day conditions to the archived dataset; the days with the most similar conditions are identified and used for training the model. The coefficients thus generated are used for the rainfall prediction. The precipitation forecasts from four general circulation models (GCMs), viz. the European Centre for Medium-Range Weather Forecasts (ECMWF), the United Kingdom Meteorological Office (UKMO), the National Centre for Environment Prediction (NCEP) and the China Meteorological Administration (CMA), have been used to develop the SMME forecasts. Forecasts for 1-5, 6-10 and 11-15 days were generated using the newly developed approach for each pentad of June-September during the years 2008-2013, and the skill of the model was analysed using verification scores, viz. the equitable threat score (ETS), mean absolute error (MAE), Pearson's correlation coefficient and the Nash-Sutcliffe model efficiency index. Statistical analysis of the SMME forecasts shows superior forecast skill compared to the conventional MME and the individual models for all three forecast ranges, viz. 1-5, 6-10 and 11-15 days.
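
    The distinguishing step is analog selection: training days are chosen by similarity to the current conditions rather than taken sequentially. A minimal sketch, assuming condition vectors compared by Euclidean distance and member weights fit by least squares on the selected analog days (the actual matching criterion and regression in the paper may differ):

```python
import numpy as np

def smme_weights(cond_now, cond_hist, members_hist, obs_hist, k=20):
    """Similarity-based MME sketch: pick the k archived days whose condition
    vectors are closest to today's, then fit least-squares weights for the
    member-model forecasts on just those analog days.

    cond_hist    : (n_days, n_features) archived condition vectors
    members_hist : (n_days, n_models) member forecasts on archived days
    obs_hist     : (n_days,) observed rainfall
    """
    d = np.linalg.norm(cond_hist - cond_now, axis=1)
    analogs = np.argsort(d)[:k]                  # most similar days
    A, y = members_hist[analogs], obs_hist[analogs]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)    # trained coefficients
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n, m = 200, 4
    cond = rng.normal(size=(n, 3))
    members = rng.normal(size=(n, m))
    obs = members @ np.array([.4, .3, .2, .1]) + 0.1 * rng.normal(size=n)
    print("weights:", np.round(smme_weights(cond[0], cond, members, obs), 2))
```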

  8. A similarity score-based two-phase heuristic approach to solve the dynamic cellular facility layout for manufacturing systems

    Science.gov (United States)

    Kumar, Ravi; Singh, Surya Prakash

    2017-11-01

    The dynamic cellular facility layout problem (DCFLP) is a well-known NP-hard problem. It has been estimated that efficient design of the DCFLP reduces the manufacturing cost of products by maintaining the minimum material flow among all machines in all cells, as the material flow contributes around 10-30% of the total product cost. However, being NP-hard, the DCFLP is very difficult to solve optimally in reasonable time. Therefore, this article proposes a novel similarity score-based two-phase heuristic approach to solve the DCFLP, considering multiple products manufactured over multiple time periods in the manufacturing layout. In the first phase of the proposed heuristic, machine-cell clusters are created based on similarity scores between machines. This is provided as input to the second phase to minimize inter/intracell material handling costs and rearrangement costs over the entire planning period. The solution methodology of the proposed approach is demonstrated. To show the efficiency of the two-phase heuristic approach, 21 instances are generated and solved using the optimization software package LINGO. The results show that the proposed approach can solve the DCFLP optimally in reasonable time.
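
    Phase one can be illustrated with a common choice of machine similarity, the Jaccard index of the part sets each machine processes, followed by a greedy clustering pass. Both the scoring function and the threshold below are assumptions for the sketch; the paper's similarity score may be defined differently.

```python
def jaccard(a, b):
    """Similarity score between two machines = Jaccard index of the
    sets of parts they process (a common choice in cell formation)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def greedy_cells(machine_parts, threshold=0.5):
    """Phase-1 sketch: put a machine into the first existing cell whose
    members are all at least `threshold`-similar to it, else open a new
    cell.  The threshold value is illustrative."""
    cells = []
    for m, parts in machine_parts.items():
        for cell in cells:
            if all(jaccard(parts, machine_parts[o]) >= threshold for o in cell):
                cell.append(m)
                break
        else:
            cells.append([m])
    return cells

if __name__ == "__main__":
    machine_parts = {
        "M1": {"P1", "P2", "P3"},
        "M2": {"P1", "P2"},
        "M3": {"P7", "P8"},
        "M4": {"P7", "P8", "P9"},
    }
    print(greedy_cells(machine_parts))  # -> [['M1', 'M2'], ['M3', 'M4']]
```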

  9. RNA-Seq-based toxicogenomic assessment of fresh frozen and formalin-fixed tissues yields similar mechanistic insights.

    Science.gov (United States)

    Auerbach, Scott S; Phadke, Dhiral P; Mav, Deepak; Holmgren, Stephanie; Gao, Yuan; Xie, Bin; Shin, Joo Heon; Shah, Ruchir R; Merrick, B Alex; Tice, Raymond R

    2015-07-01

    Formalin-fixed, paraffin-embedded (FFPE) pathology specimens represent a potentially vast resource for transcriptomic-based biomarker discovery. We present here a comparison of results from a whole transcriptome RNA-Seq analysis of RNA extracted from fresh frozen and FFPE livers. The samples were derived from rats exposed to aflatoxin B1 (AFB1) and a corresponding set of control animals. Principal components analysis indicated that samples were separated in the two groups representing presence or absence of chemical exposure, both in fresh frozen and FFPE sample types. Sixty-five percent of the differentially expressed transcripts (AFB1 vs. controls) in fresh frozen samples were also differentially expressed in FFPE samples (overlap significance: P < 0.0001). Genomic signature and gene set analysis of AFB1 differentially expressed transcript lists indicated highly similar results between fresh frozen and FFPE at the level of chemogenomic signatures (i.e., single chemical/dose/duration elicited transcriptomic signatures), mechanistic and pathology signatures, biological processes, canonical pathways and transcription factor networks. Overall, our results suggest that similar hypotheses about the biological mechanism of toxicity would be formulated from fresh frozen and FFPE samples. These results indicate that phenotypically anchored archival specimens represent a potentially informative resource for signature-based biomarker discovery and mechanistic characterization of toxicity. Copyright © 2014 John Wiley & Sons, Ltd.

  10. Evaluation of diffusion-tensor imaging-based global search and tractography for tumor surgery close to the language system.

    Directory of Open Access Journals (Sweden)

    Mirco Richter

    Full Text Available Pre-operative planning and intra-operative guidance in neurosurgery require detailed information about the location of functional areas and their anatomo-functional connectivity. In particular, regarding the language system, post-operative deficits such as aphasia can be avoided. By combining functional magnetic resonance imaging and diffusion tensor imaging, the connectivity between functional areas can be reconstructed by tractography techniques that need to cope with limitations such as limited resolution and low anisotropic diffusion close to functional areas. Tumors pose particular challenges because of edema, displacement effects on brain tissue and infiltration of white matter. Under these conditions, standard fiber tracking methods reconstruct pathways of insufficient quality. Therefore, robust global or probabilistic approaches are required. In this study, two commonly used standard fiber tracking algorithms, streamline propagation and tensor deflection, were compared with a previously published global search, Gibbs tracking and a connection-oriented probabilistic tractography approach. All methods were applied to reconstruct neuronal pathways of the language system of patients undergoing brain tumor surgery, and control subjects. Connections between Broca and Wernicke areas via the arcuate fasciculus (AF and the inferior fronto-occipital fasciculus (IFOF were validated by a clinical expert to ensure anatomical feasibility, and compared using distance- and diffusion-based similarity metrics to evaluate their agreement on pathway locations. For both patients and controls, a strong agreement between all methods was observed regarding the location of the AF. In case of the IFOF however, standard fiber tracking and Gibbs tracking predominantly identified the inferior longitudinal fasciculus that plays a secondary role in semantic language processing. In contrast, global search resolved connections in almost every case via the IFOF which

  11. A Novel Quad Harmony Search Algorithm for Grid-Based Path Finding

    Directory of Open Access Journals (Sweden)

    Saso Koceski

    2014-09-01

    A novel approach to the problem of grid-based path finding is introduced. The method is a block-based search algorithm founded on two algorithms: the quad-tree algorithm, which offers a great opportunity to decrease the time needed to compute the solution, and the harmony search (HS) algorithm, a meta-heuristic algorithm used to obtain the optimal solution. This quad HS algorithm uses the quad-tree decomposition of free space in the grid to mark free areas and treat each as a single node, which greatly improves the execution. The results of the quad HS algorithm have been compared to other meta-heuristic algorithms, i.e., ant colony, genetic algorithm, particle swarm optimization and simulated annealing, and it obtained the best results in terms of time while producing the optimal path.

  12. Road Traffic Congestion Management Based on a Search-Allocation Approach

    Directory of Open Access Journals (Sweden)

    Raiyn Jamal

    2017-03-01

    This paper introduces a new scheme for road traffic management in smart cities, aimed at reducing road traffic congestion. The scheme is based on a combination of searching, updating, and allocation techniques (SUA). The SUA approach is proposed to reduce the processing time for forecasting the conditions of all road sections in real time, which is typically considerable and complex. It searches for the shortest route based on historical observations, then computes travel time forecasts based on vehicular location in real time. Using the updated information, which includes travel time forecasts and accident forecasts, the vehicle is allocated to the appropriate section. The novelty of the SUA scheme lies in its continual updating of vehicle allocations to reduce traffic congestion. Furthermore, the SUA approach supports autonomy and management by self-regulation, which recommends its use in smart cities that support internet of things (IoT) technologies.

  13. Hidden policy ciphertext-policy attribute-based encryption with keyword search against keyword guessing attack

    Institute of Scientific and Technical Information of China (English)

    Shuo QIU; Jiqiang LIU; Yanfeng SHI; Rui ZHANG

    2017-01-01

    Attribute-based encryption with keyword search (ABKS) enables data owners to grant their search capabilities to other users by enforcing an access control policy over the outsourced encrypted data. However, existing ABKS schemes cannot guarantee the privacy of the access structures, which may contain some sensitive private information. Furthermore, resulting from the exposure of the access structures, ABKS schemes are susceptible to an off-line keyword guessing attack if the keyword space has a polynomial size. To solve these problems, we propose a novel primitive named hidden policy ciphertext-policy attribute-based encryption with keyword search (HP-CPABKS). With our primitive, the data user is unable to search on encrypted data and learn any information about the access structure if his/her attribute credentials cannot satisfy the access control policy specified by the data owner. We present a rigorous selective security analysis of the proposed HP-CPABKS scheme, which simultaneously keeps the indistinguishability of the keywords and the access structures. Finally, the performance evaluation verifies that our proposed scheme is efficient and practical.

  14. Developing a distributed HTML5-based search engine for geospatial resource discovery

    Science.gov (United States)

    ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

    2013-12-01

    With the explosive growth of data, Geospatial Cyberinfrastructure (GCI) components are developed to manage geospatial resources, supporting tasks such as data discovery and data publishing. However, the efficiency of geospatial resource discovery remains challenging in that: (1) existing GCIs are usually developed for users of specific domains, so users may have to visit a number of GCIs to find appropriate resources; (2) the complexity of the decentralized network environment usually results in slow response and poor user experience; (3) users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed, HTML5-based search engine. Specifically, (1) the search engine adopts a brokering approach to retrieve geospatial metadata from various distributed GCIs; (2) the asynchronous record retrieval mode enhances search performance and user interactivity; (3) the search engine, based on HTML5, is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).

  15. Is Internet search better than structured instruction for web-based health education?

    Science.gov (United States)

    Finkelstein, Joseph; Bedra, McKenzie

    2013-01-01

    The Internet provides access to vast amounts of comprehensive information on any health-related subject, and patients increasingly use this information for health education, typically finding education materials through a search engine. An alternative approach to health education via the Internet is based on a verified web site that provides structured, interactive education guided by adult learning theories. A systematic comparison of these two approaches in older patients had not been performed. The aim of this study was to compare the efficacy of a web-based computer-assisted education (CO-ED) system versus searching the Internet for learning about hypertension. Sixty hypertensive older adults (age 45+) were randomized into control or intervention groups. The control patients spent 30 to 40 minutes searching the Internet using a search engine for information about hypertension. The intervention patients spent 30 to 40 minutes using the CO-ED system, which provided computer-assisted instruction about major hypertension topics. Analysis of pre- and post-knowledge scores indicated a significant improvement among CO-ED users (14.6%) as opposed to Internet users (2%). Additionally, patients using the CO-ED program rated their learning experience more positively than those using the Internet.

  16. Category Theory Approach to Solution Searching Based on Photoexcitation Transfer Dynamics

    Directory of Open Access Journals (Sweden)

    Makoto Naruse

    2017-07-01

    Solution searching that is accompanied by combinatorial explosion is one of the most important issues in the age of artificial intelligence. Natural intelligence, which exploits natural processes for intelligent functions, is expected to help resolve or alleviate the difficulties of conventional computing paradigms and technologies. In fact, we have shown that a single-celled organism such as an amoeba can solve constraint satisfaction problems and related optimization problems, and we have demonstrated experimental systems based on non-organic processes such as optical energy transfer involving near-field interactions. However, the fundamental mechanisms and limitations behind solution searching based on natural processes have not yet been understood. Herein, we present a theoretical background for solution searching based on optical excitation transfer from a category-theoretic standpoint. One important indication inspired by category theory is that the satisfaction of short exact sequences is critical for an adequate computational operation that determines the flow of time for the system, termed "short-exact-sequence-based time." In addition, the octahedral and braid structures known in triangulated categories provide a clear understanding of the underlying mechanisms, including a quantitative indication of the difficulty of obtaining solutions based on homology dimension. This study contributes a fundamental background for natural intelligence.

  17. Development of a Google-based search engine for data mining radiology reports.

    Science.gov (United States)

    Erinjeri, Joseph P; Picus, Daniel; Prior, Fred W; Rubin, David A; Koppel, Paul

    2009-08-01

    The aim of this study is to develop a secure, Google-based data-mining tool for radiology reports using free and open source technologies and to explore its use within an academic radiology department. A Health Insurance Portability and Accountability Act (HIPAA)-compliant data repository, search engine and user interface were created to facilitate treatment, operations, and reviews preparatory to research. The Institutional Review Board waived review of the project, and informed consent was not required. The 2.9 million text reports, comprising 7.9 GB of disk space, were downloaded from our radiology information system to a fileserver. Extensible markup language (XML) representations of the reports were indexed using the Google Desktop Enterprise search engine software. A hypertext markup language (HTML) form allowed users to submit queries to Google Desktop, and Google's XML response was interpreted by a practical extraction and report language (PERL) script, which presented ranked results in a web browser window. The query, reason for search, results, and documents visited were logged to maintain HIPAA compliance. Indexing averaged approximately 25,000 reports per hour. A keyword search of a common term like "pneumothorax" yielded the first ten most relevant of 705,550 total results in 1.36 s. A keyword search of a rare term like "hemangioendothelioma" yielded the first ten most relevant of 167 total results in 0.23 s; retrieval of all 167 results took 0.26 s. Data mining tools for radiology reports will improve the productivity of academic radiologists in clinical, educational, research, and administrative tasks. By leveraging existing knowledge of Google's interface, radiologists can quickly perform useful searches.

  18. SU-C-206-03: Metal Artifact Reduction in X-Ray Computed Tomography Based On Local Anatomical Similarity

    International Nuclear Information System (INIS)

    Dong, X; Yang, X; Rosenfield, J; Elder, E; Dhabaan, A

    2016-01-01

    Purpose: Metal implants such as orthopedic hardware and dental fillings cause severe bright and dark streaking in reconstructed CT images. These artifacts decrease image contrast and degrade HU accuracy, leading to inaccuracies in target delineation and dose calculation. Additionally, such artifacts negatively impact patient set-up in image guided radiation therapy (IGRT). In this work, we propose a novel method for metal artifact reduction which utilizes the anatomical similarity between neighboring CT slices. Methods: Neighboring CT slices show similar anatomy. Based on this anatomical similarity, the proposed method replaces corrupted CT pixels with pixels from adjacent, artifact-free slices. A gamma map, which is the weighted summation of relative HU error and distance error, is calculated for each pixel in the artifact-corrupted CT image. The minimum value in each pixel’s gamma map is used to identify a pixel from the adjacent CT slice to replace the corresponding artifact-corrupted pixel. This replacement only occurs if the minimum value in a particular pixel’s gamma map is larger than a threshold. The proposed method was evaluated with clinical images. Results: Highly attenuating dental fillings and hip implants cause severe streaking artifacts on CT images. The proposed method eliminates the dark and bright streaking and improves the implant delineation and visibility. In particular, the image non-uniformity in the central region of interest was reduced from 1.88 and 1.01 to 0.28 and 0.35, respectively. Further, the mean CT HU error was reduced from 328 HU and 460 HU to 60 HU and 36 HU, respectively. Conclusions: The proposed metal artifact reduction method replaces corrupted image pixels with pixels from neighboring slices that are free of metal artifacts. This method proved capable of suppressing streaking artifacts, improving HU accuracy and image detectability.
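
    A minimal sketch of the gamma-map replacement rule described above: for each pixel, search a window in the adjacent artifact-free slice, score candidates by a weighted sum of relative HU error and spatial distance, and replace the pixel with the best candidate only when the minimal gamma exceeds a threshold. The weights, window size and threshold here are illustrative, not the paper's values.

```python
import numpy as np

def gamma_replace(corrupt, neighbor, w_hu=1.0, w_dist=0.05,
                  search=5, gamma_thresh=0.6):
    """Slice-similarity metal-artifact reduction sketch.

    For each pixel of the corrupted slice, gamma = w_hu * relative HU error
    + w_dist * spatial distance is evaluated over a search window in the
    adjacent artifact-free slice; if even the minimal gamma is large, the
    pixel is judged artifact-corrupted and replaced."""
    out = corrupt.astype(float).copy()
    H, W = corrupt.shape
    for i in range(H):
        for j in range(W):
            best, best_val = None, np.inf
            for di in range(-search, search + 1):
                for dj in range(-search, search + 1):
                    u, v = i + di, j + dj
                    if 0 <= u < H and 0 <= v < W:
                        hu_err = abs(neighbor[u, v] - corrupt[i, j]) / 1000.0
                        g = w_hu * hu_err + w_dist * np.hypot(di, dj)
                        if g < best_val:
                            best_val, best = g, neighbor[u, v]
            if best_val > gamma_thresh:   # no good match -> corrupted pixel
                out[i, j] = best
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    clean = rng.normal(40, 10, (32, 32))
    bad = clean.copy()
    bad[10:14, :] += 800.0                # synthetic bright streak
    fixed = gamma_replace(bad, clean)
    print(float(np.abs(fixed - clean).mean()))
```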

  19. Large-scale structural and textual similarity-based mining of knowledge graph to predict drug-drug interactions

    KAUST Repository

    Abdelaziz, Ibrahim; Fokoue, Achille; Hassanzadeh, Oktie; Zhang, Ping; Sadoghi, Mohammad

    2017-01-01

    Drug-Drug Interactions (DDIs) are a major cause of preventable Adverse Drug Reactions (ADRs), causing a significant burden on the patients’ health and the healthcare system. It is widely known that clinical studies cannot sufficiently and accurately identify DDIs for new drugs before they are made available on the market. In addition, existing public and proprietary sources of DDI information are known to be incomplete and/or inaccurate and so not reliable. As a result, there is an emerging body of research on in-silico prediction of drug-drug interactions. In this paper, we present Tiresias, a large-scale similarity-based framework that predicts DDIs through link prediction. Tiresias takes in various sources of drug-related data and knowledge as inputs, and provides DDI predictions as outputs. The process starts with semantic integration of the input data that results in a knowledge graph describing drug attributes and relationships with various related entities such as enzymes, chemical structures, and pathways. The knowledge graph is then used to compute several similarity measures between all the drugs in a scalable and distributed framework. In particular, Tiresias utilizes two classes of features in a knowledge graph: local and global features. Local features are derived from the information directly associated to each drug (i.e., one hop away) while global features are learnt by minimizing a global loss function that considers the complete structure of the knowledge graph. The resulting similarity metrics are used to build features for a large-scale logistic regression model to predict potential DDIs. We highlight the novelty of our proposed Tiresias and perform thorough evaluation of the quality of the predictions. The results show the effectiveness of Tiresias in both predicting new interactions among existing drugs as well as newly developed drugs.

  20. Large-scale structural and textual similarity-based mining of knowledge graph to predict drug-drug interactions

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-06-12

    Drug-Drug Interactions (DDIs) are a major cause of preventable Adverse Drug Reactions (ADRs), causing a significant burden on the patients’ health and the healthcare system. It is widely known that clinical studies cannot sufficiently and accurately identify DDIs for new drugs before they are made available on the market. In addition, existing public and proprietary sources of DDI information are known to be incomplete and/or inaccurate and so not reliable. As a result, there is an emerging body of research on in-silico prediction of drug-drug interactions. In this paper, we present Tiresias, a large-scale similarity-based framework that predicts DDIs through link prediction. Tiresias takes in various sources of drug-related data and knowledge as inputs, and provides DDI predictions as outputs. The process starts with semantic integration of the input data that results in a knowledge graph describing drug attributes and relationships with various related entities such as enzymes, chemical structures, and pathways. The knowledge graph is then used to compute several similarity measures between all the drugs in a scalable and distributed framework. In particular, Tiresias utilizes two classes of features in a knowledge graph: local and global features. Local features are derived from the information directly associated to each drug (i.e., one hop away) while global features are learnt by minimizing a global loss function that considers the complete structure of the knowledge graph. The resulting similarity metrics are used to build features for a large-scale logistic regression model to predict potential DDIs. We highlight the novelty of our proposed Tiresias and perform thorough evaluation of the quality of the predictions. The results show the effectiveness of Tiresias in both predicting new interactions among existing drugs as well as newly developed drugs.
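
    The final stage reduces to supervised link prediction over similarity features. A minimal sketch, assuming each candidate drug pair is already represented by a vector of similarity scores (the knowledge-graph feature construction is not reproduced) and synthetic labels stand in for known DDIs:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each candidate drug pair is a vector of similarity measures (chemical,
# target, pathway, ... -- here random stand-ins); a logistic regression
# model predicts whether the pair interacts.
rng = np.random.default_rng(4)
n_pairs, n_sims = 1000, 6
X = rng.uniform(0, 1, (n_pairs, n_sims))          # similarity features in [0, 1]
true_w = np.array([2.0, 1.5, 0.0, 0.0, 3.0, -1.0])
p = 1 / (1 + np.exp(-(X @ true_w - 2.5)))
y = rng.random(n_pairs) < p                        # synthetic DDI labels

model = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
print("held-out accuracy:", model.score(X[800:], y[800:]))
```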

  1. SU-C-206-03: Metal Artifact Reduction in X-Ray Computed Tomography Based On Local Anatomical Similarity

    Energy Technology Data Exchange (ETDEWEB)

    Dong, X; Yang, X; Rosenfield, J; Elder, E; Dhabaan, A [Emory University, Winship Cancer Institute, Atlanta, GA (United States)

    2016-06-15

    Purpose: Metal implants such as orthopedic hardware and dental fillings cause severe bright and dark streaking in reconstructed CT images. These artifacts decrease image contrast and degrade HU accuracy, leading to inaccuracies in target delineation and dose calculation. Additionally, such artifacts negatively impact patient set-up in image guided radiation therapy (IGRT). In this work, we propose a novel method for metal artifact reduction which utilizes the anatomical similarity between neighboring CT slices. Methods: Neighboring CT slices show similar anatomy. Based on this anatomical similarity, the proposed method replaces corrupted CT pixels with pixels from adjacent, artifact-free slices. A gamma map, which is the weighted summation of relative HU error and distance error, is calculated for each pixel in the artifact-corrupted CT image. The minimum value in each pixel’s gamma map is used to identify a pixel from the adjacent CT slice to replace the corresponding artifact-corrupted pixel. This replacement only occurs if the minimum value in a particular pixel’s gamma map is larger than a threshold. The proposed method was evaluated with clinical images. Results: Highly attenuating dental fillings and hip implants cause severe streaking artifacts on CT images. The proposed method eliminates the dark and bright streaking and improves the implant delineation and visibility. In particular, the image non-uniformity in the central region of interest was reduced from 1.88 and 1.01 to 0.28 and 0.35, respectively. Further, the mean CT HU error was reduced from 328 HU and 460 HU to 60 HU and 36 HU, respectively. Conclusions: The proposed metal artifact reduction method replaces corrupted image pixels with pixels from neighboring slices that are free of metal artifacts. This method proved capable of suppressing streaking artifacts, improving HU accuracy and image detectability.

  2. Identification of polycystic ovary syndrome potential drug targets based on pathobiological similarity in the protein-protein interaction network

    Science.gov (United States)

    Li, Wan; Wei, Wenqing; Li, Yiran; Xie, Ruiqiang; Guo, Shanshan; Wang, Yahui; Jiang, Jing; Chen, Binbin; Lv, Junjie; Zhang, Nana; Chen, Lina; He, Weiming

    2016-01-01

    Polycystic ovary syndrome (PCOS) is one of the most common endocrinological disorders in women of reproductive age. PCOS and Type 2 Diabetes (T2D) are closely linked at multiple levels and possess high pathobiological similarity. Here, we put forward a new computational approach based on this pathobiological similarity to identify PCOS potential drug target modules (PPDT-Modules) and PCOS potential drug targets in the protein-protein interaction network (PPIN). From the systems level and biological background, 1 PPDT-Module and 22 PCOS potential drug targets were identified, 21 of which were verified by the literature to be associated with the pathogenesis of PCOS. Forty-two drugs targeting 13 of the PCOS potential drug targets have been investigated experimentally or clinically for PCOS. Evaluated on independent datasets, the whole PPDT-Module and the 22 PCOS potential drug targets could not only reveal the drug response but also distinguish between normal and disease states. The identified PPDT-Module and PCOS potential drug targets shed light on the treatment of PCOS, and our approach provides valuable insights for research on the pathogenesis and drug response of other diseases. PMID:27191267

  3. A novel method to remove GPR background noise based on the similarity of non-neighboring regions

    Science.gov (United States)

    Montiel-Zafra, V.; Canadas-Quesada, F. J.; Vera-Candeas, P.; Ruiz-Reyes, N.; Rey, J.; Martinez, J.

    2017-09-01

    Ground penetrating radar (GPR) is a non-destructive technique that has been widely used in many areas of research, such as landmine detection or subsurface anomaly studies, where targets embedded within a background medium must be located. One of the major challenges in GPR data research remains improving the image quality of stone materials through the detection of true anisotropies, since most errors are caused by incorrect interpretation by users. This is complicated by interference from horizontal background noise, e.g., the air-ground interface, which reduces the high-resolution quality of radargrams; weak or deep anisotropies are thus often masked by this type of noise. In order to remove the background noise in GPR data, this work proposes a novel background removal method that assumes the horizontal noise appears as repetitive two-dimensional regions along the movement of the GPR antenna. Specifically, the proposed method, based on the non-local similarity of regions over distance, computes similarities between different regions at the same depth in order to identify the most repetitive regions, using a criterion that avoids nearby regions. Evaluations are performed using a set of synthetic and real GPR data. Experimental results show that the proposed method obtains promising results compared to classic background removal techniques and the most recently published background removal methods.
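
    A hedged sketch of the core idea: treat the B-scan as a matrix (rows = depth samples, columns = traces), compare each block of traces only with blocks far enough away along the antenna path, and subtract the average of the most similar ones as the background estimate. Block size, gap and k are illustrative parameters, not the paper's values.

```python
import numpy as np

def remove_background(bscan, block=8, min_gap=4, k=3):
    """Non-local background removal for a GPR B-scan (rows = depth samples,
    columns = traces): each block of traces is compared with all blocks at
    least `min_gap` blocks away; the k most similar (lowest MSE) ones are
    averaged into a background estimate that is subtracted."""
    rows, cols = bscan.shape
    nblocks = cols // block
    blocks = [bscan[:, b * block:(b + 1) * block] for b in range(nblocks)]
    out = bscan.astype(float).copy()
    for b in range(nblocks):
        # Candidates: non-neighboring blocks only.
        cands = [c for c in range(nblocks) if abs(c - b) >= min_gap]
        if not cands:
            continue
        errs = [np.mean((blocks[b] - blocks[c]) ** 2) for c in cands]
        best = [cands[i] for i in np.argsort(errs)[:k]]
        bg = np.mean([blocks[c] for c in best], axis=0)
        out[:, b * block:(b + 1) * block] -= bg
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    banding = np.outer(np.exp(-np.arange(64) / 10), np.ones(64))  # flat noise
    target = np.zeros((64, 64))
    target[30:34, 28:36] = 1.0                                    # anomaly
    cleaned = remove_background(banding + target + 0.01 * rng.normal(size=(64, 64)))
    # The anomaly should survive while the horizontal banding is suppressed.
    print(np.abs(cleaned[30:34, 28:36]).mean() > np.abs(cleaned[:10]).mean())
```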

  4. Gender similarities and differences in brain activation strategies: Voxel-based meta-analysis on fMRI studies.

    Science.gov (United States)

    AlRyalat, Saif Aldeen

    2017-01-01

    Gender similarities and differences have long been a matter of debate in almost all human research, especially in discussions of brain function. This large-scale meta-analysis was performed on functional MRI studies. It included more than 700 active brain foci from more than 70 different experiments to study gender-related similarities and differences in brain activation strategies for three of the main brain functions: visual-spatial cognition, memory, and emotion. Areas that are significantly activated by both genders (i.e. core areas) for the tested brain function are reported, whereas areas significantly activated exclusively in one gender are the gender-specific areas. During visual-spatial cognition tasks, in addition to the core areas, males significantly activated their left superior frontal gyrus, compared with the left superior parietal lobule in females. For memory tasks, several different brain areas were activated by each gender, and females significantly activated two areas of the limbic system during memory retrieval tasks. For emotional tasks, males tended to recruit their bilateral prefrontal regions, whereas females tended to recruit their bilateral amygdalae. This meta-analysis provides an overview, based on functional MRI studies, of how males and females use their brains.

  5. Pairing symmetries of several iron-based superconductor families and some similarities with cuprates and heavy-fermions

    Directory of Open Access Journals (Sweden)

    Das Tanmoy

    2012-03-01

    We show that, by using the unit-cell transformation between 1 Fe per unit cell and 2 Fe per unit cell, one can qualitatively understand the pairing symmetry of several families of iron-based superconductors. In iron pnictides and iron chalcogenides, the nodeless s±-pairing and the resulting magnetic resonance mode transform nicely between the two unit cells, while retaining all physical properties unchanged. However, when the electron pocket disappears from the Fermi surface with complete doping in KFe2As2, we find that the unit-cell invariance requirement prohibits the occurrence of s±-pairing symmetry (caused by inter-hole-pocket nesting). However, the intra-pocket nesting is compatible here, which leads to a nodal d-wave pairing. The corresponding Fermi surface topology and pairing symmetry are similar to those of Ce-based heavy-fermion superconductors. Furthermore, when the Fermi surface hosts only electron pockets in KyFe2-xSe2, the inter-electron-pocket nesting induces a nodeless and isotropic d-wave pairing. This situation is analogous to the electron-doped cuprates, where the strong antiferromagnetic order creates similar disconnected electron-pocket Fermi surfaces, and hence nodeless d-wave pairing appears. The unit-cell transformation in KyFe2-xSe2 shows that the d-wave pairing breaks the translational symmetry of the 2 Fe unit cell, and thus cannot be realized unless a vacancy ordering forms to compensate for it. These results are consistent with the coexistence picture of a competing order and nodeless d-wave superconductivity in both cuprates and KyFe1.6Se2.

  6. Tuning glass formation and brittle behaviors by similar solvent element substitution in (Mn,Fe)-based bulk metallic glasses

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Tao [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China); Li, Ran, E-mail: liran@buaa.edu.cn [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China); Xiao, Ruijuan [Institute of Physics, Chinese Academy of Sciences, Beijing 100190 (China); Liu, Gang [State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049 (China); Wang, Jianfeng [School of Materials Science and Engineering, Zhengzhou University, Zhengzhou 450001 (China); Zhang, Tao, E-mail: zhangtao@buaa.edu.cn [Key Laboratory of Aerospace Materials and Performance (Ministry of Education), School of Materials Science and Engineering, Beihang University, Beijing 100191 (China)

    2015-02-25

    A family of Mn-rich bulk metallic glasses (BMGs) was developed through the similar solvent element (SSE) substitution of Mn for Fe in (MnxFe80−x)P10B7C3 alloys. The effects of the SSE substitution on glass formation, thermal stability, elastic constants, mechanical properties, fracture morphology, Weibull modulus and indentation fracture toughness are discussed. A thermodynamic analysis provided by Battezzati et al. (L. Battezzati, E. Garrone, Z. Metallkd. 75 (1984) 305-310) was adopted to explain the compositional dependence of the glass-forming ability (GFA). The elastic moduli follow roughly linear correlations with the substitution concentration of Mn in (MnxFe80−x)P10B7C3 BMGs. The introduction of Mn to replace Fe significantly decreases the plasticity of the resulting BMGs and the Weibull modulus of the fracture strength. A super-brittle Mn-based BMG, (Mn55Fe25)P10B7C3, was found, with an indentation fracture toughness (Kc) of 1.91±0.04 MPa m^(1/2), the lowest value among all kinds of BMGs reported so far. The atomic and electronic structures of the selected BMGs were simulated by first-principles molecular dynamics calculations based on density functional theory, which provided a possible understanding of the brittleness caused by the similar chemical element replacement of Mn for Fe.

  7. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo

    2018-04-04

    The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight in the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.

  8. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo; Chen, Lisi; Wei, Zhewei; Jensen, Christian S.; Zheng, Kai; Kalnis, Panos

    2018-01-01

    The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight in the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
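
    The framework's key property, independent per-trajectory searches merged at constant cost, can be sketched directly. The toy similarity below (mean closest-point distance mapped to (0, 1]) stands in for the paper's spatiotemporal measure, no pruning bounds are implemented, and threads stand in for the parallel workers.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def traj_sim(p, q):
    """Toy symmetric trajectory similarity: mean closest-point distance,
    mapped to (0, 1].  A stand-in for the paper's spatiotemporal measure."""
    d_pq = np.mean([np.min(np.linalg.norm(q - pt, axis=1)) for pt in p])
    d_qp = np.mean([np.min(np.linalg.norm(p - qt, axis=1)) for qt in q])
    return 1.0 / (1.0 + 0.5 * (d_pq + d_qp))

def search_one(i, P, Q, theta):
    """Per-trajectory search: all pairs of P[i] with trajectories in Q whose
    similarity exceeds the threshold.  Independent across i, hence parallel."""
    return [(i, j, s) for j, q in enumerate(Q)
            if (s := traj_sim(P[i], q)) >= theta]

def tb_ts_join(P, Q, theta=0.5, workers=4):
    """Threshold-based join sketch: run the per-trajectory searches in
    parallel, then merge (concatenate) the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        parts = ex.map(partial(search_one, P=P, Q=Q, theta=theta), range(len(P)))
    return [pair for part in parts for pair in part]

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    P = [rng.uniform(0, 10, (20, 2)) for _ in range(5)]
    Q = [p + rng.normal(0, 0.1, p.shape) for p in P]  # near-duplicates of P
    print(tb_ts_join(P, Q, theta=0.8))
```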

  9. FPS-RAM: Fast Prefix Search RAM-Based Hardware for Forwarding Engine

    Science.gov (United States)

    Zaitsu, Kazuya; Yamamoto, Koji; Kuroda, Yasuto; Inoue, Kazunari; Ata, Shingo; Oka, Ikuo

    Ternary content addressable memory (TCAM) is becoming very popular for designing high-throughput forwarding engines on routers. However, TCAM has potential problems in terms of hardware and power costs, which limit the amount of capacity that can be deployed in IP routers. In this paper, we propose a new hardware architecture for fast forwarding engines, called fast prefix search RAM-based hardware (FPS-RAM). We designed the FPS-RAM hardware with the intent of maintaining the same search performance and physical user interface as TCAM because our objective is to replace the TCAM in the market. Our RAM-based hardware architecture is completely different from that of TCAM and dramatically reduces cost and power consumption to 62% and 52%, respectively. We implemented FPS-RAM on an FPGA to examine its lookup operation.

  10. Spatial planning via extremal optimization enhanced by cell-based local search

    International Nuclear Information System (INIS)

    Sidiropoulos, Epaminondas

    2014-01-01

    A new treatment is presented for land use planning problems by means of extremal optimization in conjunction with cell-based neighborhood local search. Extremal optimization, inspired by self-organized critical models of evolution, has been applied mainly to the solution of classical combinatorial optimization problems. Cell-based local search has been employed by the author elsewhere for problems of spatial resource allocation, in combination with genetic algorithms and simulated annealing. In this paper it complements extremal optimization in order to enhance its capacity for a spatial optimization problem. The hybrid method thus formed is compared to methods from the literature on a specific characteristic problem. It yields better results both in terms of objective function values and in terms of compactness, the latter being an important quantity for spatial planning. The present treatment yields significant compactness values as emergent results.
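
    A hedged sketch of how the two ingredients might fit together on a toy land-use grid: tau-EO picks a low-fitness cell with power-law probability over the fitness ranking, and the cell-based local move reassigns that cell the locally best land use. The fitness function (neighborhood agreement as a compactness proxy) and all parameters are illustrative, not the paper's formulation.

```python
import numpy as np

def cell_fitness(grid, i, j):
    """Local fitness of a cell = fraction of its 4-neighbours sharing its
    land use; a crude compactness proxy standing in for the real objective."""
    H, W = grid.shape
    nb = [(u, v) for u, v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
          if 0 <= u < H and 0 <= v < W]
    return sum(grid[u, v] == grid[i, j] for u, v in nb) / len(nb)

def eo_with_local_search(H=12, W=12, uses=3, tau=1.4, steps=2000, seed=7):
    """tau-EO sketch: rank cells by local fitness, pick one with power-law
    probability over rank (worst ranks most likely), then apply a cell-based
    local move: reassign it the land use that maximizes its local fitness."""
    rng = np.random.default_rng(seed)
    grid = rng.integers(uses, size=(H, W))
    cells = [(i, j) for i in range(H) for j in range(W)]
    ranks = np.arange(1, len(cells) + 1, dtype=float)
    probs = ranks ** -tau
    probs /= probs.sum()
    for _ in range(steps):
        fit = np.array([cell_fitness(grid, i, j) for i, j in cells])
        order = np.argsort(fit)                    # worst cells first
        i, j = cells[order[rng.choice(len(cells), p=probs)]]
        best_use, best_fit = grid[i, j], -1.0
        for u in range(uses):                      # cell-based local move
            grid[i, j] = u
            f = cell_fitness(grid, i, j)
            if f > best_fit:
                best_use, best_fit = u, f
        grid[i, j] = best_use
    return grid

if __name__ == "__main__":
    g = eo_with_local_search()
    print("mean compactness:",
          np.mean([cell_fitness(g, i, j) for i in range(12) for j in range(12)]))
```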

  11. Using fuzzy rule-based knowledge model for optimum plating conditions search

    Science.gov (United States)

    Solovjev, D. S.; Solovjeva, I. A.; Litovka, Yu V.; Arzamastsev, A. A.; Glazkov, V. P.; L’vov, A. A.

    2018-03-01

    The paper discusses existing approaches to modeling the plating process with the aim of reducing the unevenness of the coating thickness distribution. However, these approaches do not take into account the experience, knowledge, and intuition of decision-makers when searching for the optimal conditions of the electroplating process. An original approach to the search for optimal electroplating conditions is proposed, which uses a rule-based knowledge model and allows one to reduce the unevenness of the product thickness distribution. Block diagrams of a conventional control system for a galvanic process, as well as of a system based on the production model of knowledge, are considered. It is shown that a fuzzy production model of knowledge in the control system makes it possible to obtain galvanic coatings of a given thickness unevenness with a high degree of adequacy to the experimental data. The described experimental results confirm the theoretical conclusions.

  12. Hybrid fuzzy charged system search algorithm based state estimation in distribution networks

    Directory of Open Access Journals (Sweden)

    Sachidananda Prasad

    2017-06-01

    This paper proposes a new hybrid charged system search (CSS) algorithm for state estimation in radial distribution networks in a fuzzy framework. The objective of the optimization problem is to minimize the weighted square of the difference between the measured and the estimated quantities. The proposed method of state estimation considers bus voltage magnitude and phase angle as state variables, along with some equality and inequality constraints for state estimation in distribution networks. A rule-based fuzzy inference system has been designed to control the parameters of the CSS algorithm to achieve a better balance between the exploration and exploitation capabilities of the algorithm. The efficiency of the proposed fuzzy adaptive charged system search (FACSS) algorithm has been tested on the standard IEEE 33-bus system and an Indian 85-bus practical radial distribution system. The obtained results have been compared with those of the conventional CSS algorithm, the weighted least squares (WLS) algorithm and particle swarm optimization (PSO) to verify the feasibility of the algorithm.

  13. A Minimal Path Searching Approach for Active Shape Model (ASM)-based Segmentation of the Lung.

    Science.gov (United States)

    Guo, Shengwen; Fei, Baowei

    2009-03-27

    We are developing a minimal path searching method for active shape model (ASM)-based segmentation for detection of lung boundaries on digital radiographs. With the conventional ASM method, the position and shape parameters of the model points are iteratively refined and the target points are updated by the least Mahalanobis distance criterion. We propose an improved searching strategy that extends the searching points in a fan-shape region instead of along the normal direction. A minimal path (MP) deformable model is applied to drive the searching procedure. A statistical shape prior model is incorporated into the segmentation. In order to keep the smoothness of the shape, a smooth constraint is employed to the deformable model. To quantitatively assess the ASM-MP segmentation, we compare the automatic segmentation with manual segmentation for 72 lung digitized radiographs. The distance error between the ASM-MP and manual segmentation is 1.75 ± 0.33 pixels, while the error is 1.99 ± 0.45 pixels for the ASM. Our results demonstrate that our ASM-MP method can accurately segment the lung on digital radiographs.

  14. A minimal path searching approach for active shape model (ASM)-based segmentation of the lung

    Science.gov (United States)

    Guo, Shengwen; Fei, Baowei

    2009-02-01

    We are developing a minimal path searching method for active shape model (ASM)-based segmentation for detection of lung boundaries on digital radiographs. With the conventional ASM method, the position and shape parameters of the model points are iteratively refined and the target points are updated by the least Mahalanobis distance criterion. We propose an improved searching strategy that extends the searching points in a fan-shape region instead of along the normal direction. A minimal path (MP) deformable model is applied to drive the searching procedure. A statistical shape prior model is incorporated into the segmentation. In order to keep the smoothness of the shape, a smooth constraint is employed to the deformable model. To quantitatively assess the ASM-MP segmentation, we compare the automatic segmentation with manual segmentation for 72 lung digitized radiographs. The distance error between the ASM-MP and manual segmentation is 1.75 ± 0.33 pixels, while the error is 1.99 ± 0.45 pixels for the ASM. Our results demonstrate that our ASM-MP method can accurately segment the lung on digital radiographs.
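
    To make the "fan-shape region" search in the two records above concrete, here is a small sketch that generates candidate points in a fan around a model point instead of only along its normal; the angles, radii, and cost function are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def fan_candidates(point, normal, half_angle_deg=30.0,
                   n_angles=7, n_radii=5, max_r=10.0):
    """Candidate search points in a fan of +/- half_angle around the
    normal direction at a model point (instead of the normal line only)."""
    base = np.arctan2(normal[1], normal[0])
    angles = base + np.deg2rad(np.linspace(-half_angle_deg, half_angle_deg,
                                           n_angles))
    radii = np.linspace(1.0, max_r, n_radii)
    pts = [point + r * np.array([np.cos(a), np.sin(a)])
           for a in angles for r in radii]
    return np.array(pts)

# Usage: pick the candidate minimizing a profile cost; the real method uses
# the least Mahalanobis distance, stubbed here with a dummy Euclidean cost.
point, normal = np.array([64.0, 80.0]), np.array([0.0, 1.0])
cands = fan_candidates(point, normal)
cost = np.linalg.norm(cands - np.array([66.0, 85.0]), axis=1)  # dummy cost
best = cands[np.argmin(cost)]
```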

  15. A BLE-Based Pedestrian Navigation System for Car Searching in Indoor Parking Garages.

    Science.gov (United States)

    Wang, Sheng-Shih

    2018-05-05

    The continuous global increase in the number of cars has led to an increase in parking issues, particularly with respect to searching for available parking spaces and finding one's car. In this paper, we propose a navigation system for car owners to find their cars in indoor parking garages. The proposed system comprises a car-searching mobile app and a positioning-assisting subsystem. The app guides car owners to their cars based on a “turn-by-turn” navigation strategy, and has the ability to correct the user’s heading orientation. The subsystem uses beacon technology for indoor positioning, supporting self-guidance of the car-searching mobile app. This study also designed a local coordinate system to support the identification of the locations of parking spaces and beacon devices. We used Android as the platform to implement the proposed car-searching mobile app, and used Bytereal HiBeacon devices to implement the proposed positioning-assisting subsystem. We also deployed the system in a parking lot on our campus for testing. The experimental results verified that the proposed system not only works well, but also provides the car owner with correct route guidance information.
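
    Beacon-based indoor positioning commonly estimates distance from received signal strength with a log-distance path-loss model; a minimal sketch follows. The calibration constants (expected RSSI at 1 m, path-loss exponent) are assumed values, and this is a generic technique, not necessarily the exact method used in the paper.

```python
import math

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, n=2.0):
    """Log-distance path-loss model: d = 10 ** ((P_1m - RSSI) / (10 * n)),
    where tx_power_dbm is the expected RSSI at 1 m and n the loss exponent."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * n))

# Rank nearby beacons by estimated distance to localize the user
readings = {"beacon-A1": -63, "beacon-B2": -71, "beacon-C3": -58}
nearest = min(readings, key=lambda b: rssi_to_distance(readings[b]))
print(nearest, {b: round(rssi_to_distance(r), 1) for b, r in readings.items()})
```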

  16. The possibilities of searching for new materials based on isocationic analogs of ZnBVI

    Science.gov (United States)

    Kirovskaya, I. A.; Mironova, E. V.; Kosarev, B. A.; Yureva, A. V.; Ekkert, R. V.

    2017-08-01

    The acid-base properties of chalcogenides that are analogs of ZnBVI were investigated in detail by modern techniques. The regularities of their composition-dependent changes were established, and these regularities correlate with the corresponding "bulk physicochemical property - composition" dependencies. The main reason for such correlations was identified, which facilitates the search for new materials for the corresponding sensors, in this case sensors for basic gas impurities.

  17. A peak value searching method of the MCA based on digital logic devices

    International Nuclear Information System (INIS)

    Sang Ziru; Huang Shanshan; Chen Lian; Jin Ge

    2010-01-01

    Digital multi-channel analyzers play an increasingly important role in multi-channel pulse-height analysis. The move to digital processing is characterized by powerful pulse processing ability, high throughput, and improved stability and flexibility. This paper introduces a method of searching for the peak value of a waveform based on digital logic in an FPGA. The method reduces the dead time, and offline data correction can then improve the non-linearity of the MCA. The α energy spectrum of 241Am is given. (authors)
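
    The sketch below mimics, in Python, the kind of streaming peak-hold logic such a digital MCA might implement in FPGA fabric: track the running maximum of each pulse above a threshold and emit it when the pulse falls back below the threshold. The threshold and sample values are invented.

```python
def peak_search(samples, threshold=100):
    """Streaming peak-hold: emit the maximum sample of each pulse
    (a contiguous run above threshold), as FPGA comparator logic would."""
    peaks, current = [], None
    for s in samples:
        if s > threshold:
            current = s if current is None else max(current, s)
        elif current is not None:   # pulse just ended: latch its peak
            peaks.append(current)
            current = None
    return peaks

# Toy waveform with two pulses riding on baseline noise
wave = [5, 8, 120, 340, 510, 400, 90, 7, 6, 200, 650, 630, 50, 4]
print(peak_search(wave))  # -> [510, 650]
```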

  18. An Efficient Energy Constraint Based UAV Path Planning for Search and Coverage

    OpenAIRE

    Gramajo, German; Shankar, Praveen

    2017-01-01

    A path planning strategy for a search and coverage mission for a small UAV that maximizes the area covered based on stored energy and maneuverability constraints is presented. The proposed formulation has a high level of autonomy, without requiring an exact choice of optimization parameters, and is appropriate for real-time implementation. The computed trajectory maximizes spatial coverage while closely satisfying terminal constraints on the position of the vehicle and minimizing the time of ...

  19. Similar effects of feature-based attention on motion perception and pursuit eye movements at different levels of awareness.

    Science.gov (United States)

    Spering, Miriam; Carrasco, Marisa

    2012-05-30

    Feature-based attention enhances visual processing and improves perception, even for visual features that we are not aware of. Does feature-based attention also modulate motor behavior in response to visual information that does or does not reach awareness? Here we compare the effect of feature-based attention on motion perception and smooth-pursuit eye movements in response to moving dichoptic plaids--stimuli composed of two orthogonally drifting gratings, presented separately to each eye--in human observers. Monocular adaptation to one grating before the presentation of both gratings renders the adapted grating perceptually weaker than the unadapted grating and decreases the level of awareness. Feature-based attention was directed to either the adapted or the unadapted grating's motion direction or to both (neutral condition). We show that observers were better at detecting a speed change in the attended than the unattended motion direction, indicating that they had successfully attended to one grating. Speed change detection was also better when the change occurred in the unadapted than the adapted grating, indicating that the adapted grating was perceptually weaker. In neutral conditions, perception and pursuit in response to plaid motion were dissociated: While perception followed one grating's motion direction almost exclusively (component motion), the eyes tracked the average of both gratings (pattern motion). In attention conditions, perception and pursuit were shifted toward the attended component. These results suggest that attention affects perception and pursuit similarly even though only the former reflects awareness. The eyes can track an attended feature even if observers do not perceive it.

  20. Search Advertising

    OpenAIRE

    Cornière (de), Alexandre

    2016-01-01

    Search engines enable advertisers to target consumers based on the query they have entered. In a framework with horizontal product differentiation, imperfect product information and in which consumers incur search costs, I study a game in which advertisers have to choose a price and a set of relevant keywords. The targeting mechanism brings about three kinds of efficiency gains, namely lower search costs, better matching, and more intense product market price-competition. A monopolistic searc...

  1. Expectation violations in sensorimotor sequences: shifting from LTM-based attentional selection to visual search.

    Science.gov (United States)

    Foerster, Rebecca M; Schneider, Werner X

    2015-03-01

    Long-term memory (LTM) delivers important control signals for attentional selection. LTM expectations have an important role in guiding the task-driven sequence of covert attention and gaze shifts, especially in well-practiced multistep sensorimotor actions. What happens when LTM expectations are disconfirmed? Does a sensory-based visual-search mode of attentional selection replace the LTM-based mode? What happens when prior LTM expectations become valid again? We investigated these questions in a computerized version of the number-connection test. Participants clicked on spatially distributed numbered shapes in ascending order while gaze was recorded. Sixty trials were performed with a constant spatial arrangement. In 20 consecutive trials, either numbers, shapes, both, or no features switched position. In 20 reversion trials, participants worked on the original arrangement. Only the sequence-affecting number switches elicited slower clicking, visual search-like scanning, and lower eye-hand synchrony. The effects were neither limited to the exchanged numbers nor to the corresponding actions. Thus, expectation violations in a well-learned sensorimotor sequence cause a regression from LTM-based attentional selection to visual search beyond deviant-related actions and locations. Effects lasted for several trials and reappeared during reversion. © 2015 New York Academy of Sciences.

  2. Pulmonary parenchyma segmentation in thin CT image sequences with spectral clustering and geodesic active contour model based on similarity

    Science.gov (United States)

    He, Nana; Zhang, Xiaolong; Zhao, Juanjuan; Zhao, Huilan; Qiang, Yan

    2017-07-01

    While the popular thin-layer scanning technology of spiral CT has helped to improve the diagnosis of lung diseases, the large volumes of scan images produced by the technology also dramatically increase the workload of physicians in lesion detection. Computer-aided diagnosis techniques such as lesion segmentation in thin CT sequences have been developed to address this issue, but it remains a challenge to achieve high segmentation efficiency and accuracy without much manual intervention. In this paper, we present our research on automated segmentation of lung parenchyma with an improved geodesic active contour model, namely a geodesic active contour model based on similarity (GACBS). Combining a spectral clustering algorithm based on Nyström (SCN) with GACBS, this algorithm first extracts key image slices, then uses these slices to generate initial contours of the pulmonary parenchyma in un-segmented slices with an interpolation algorithm, and finally segments the lung parenchyma of the un-segmented slices. Experimental results show that the segmentation results generated by our method are close to those of manual segmentation, with an average volume overlap ratio of 91.48%.

  3. HSTLBO: A hybrid algorithm based on Harmony Search and Teaching-Learning-Based Optimization for complex high-dimensional optimization problems.

    Directory of Open Access Journals (Sweden)

    Shouheng Tuo

    Harmony Search (HS) and Teaching-Learning-Based Optimization (TLBO), as new swarm intelligence optimization algorithms, have received much attention in recent years. Both have shown outstanding performance in solving NP-hard optimization problems, but they also suffer dramatic performance degradation on some complex high-dimensional optimization problems. Through extensive experiments, we find that HS and TLBO are strongly complementary: HS has strong global exploration power but a low convergence speed, while TLBO converges much faster but is easily trapped in local search. In this work, we propose a hybrid search algorithm named HSTLBO that merges the two algorithms for synergistically solving complex optimization problems using a self-adaptive selection strategy. In HSTLBO, both HS and TLBO are modified with the aim of balancing the global exploration and exploitation abilities: HS aims mainly to explore unknown regions, while TLBO aims to rapidly exploit high-precision solutions in known regions. Our experimental results demonstrate better performance and faster speed than five state-of-the-art HS variants, and better exploration power than five good TLBO variants at similar run time, which illustrates that our method is promising for complex high-dimensional optimization problems. An experiment on portfolio optimization problems also demonstrates that HSTLBO is effective in solving complex real-world applications.
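
    The record does not spell out the self-adaptive selection rule, so the sketch below shows one generic way to alternate two operators by their recent success rates; the operator stubs and probabilities are illustrative assumptions rather than the published HSTLBO update.

```python
import random

def improve_hs(x):   # stand-in for a Harmony Search move
    return [xi + random.uniform(-0.5, 0.5) for xi in x]

def improve_tlbo(x): # stand-in for a TLBO teaching/learning move
    return [0.9 * xi for xi in x]

def sphere(x):       # toy objective to minimize
    return sum(xi * xi for xi in x)

x = [random.uniform(-5, 5) for _ in range(10)]
score = {"hs": 1.0, "tlbo": 1.0}          # smoothed success credit
for _ in range(2000):
    p_hs = score["hs"] / (score["hs"] + score["tlbo"])
    op = "hs" if random.random() < p_hs else "tlbo"
    cand = improve_hs(x) if op == "hs" else improve_tlbo(x)
    improved = sphere(cand) < sphere(x)
    if improved:
        x = cand
    # exponential forgetting keeps the selection adaptive over time
    score[op] = 0.95 * score[op] + (1.0 if improved else 0.0)
print(sphere(x))
```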

  4. Incidental Learning: A Brief, Valid Measure of Memory Based on the WAIS-IV Vocabulary and Similarities Subtests.

    Science.gov (United States)

    Spencer, Robert J; Reckow, Jaclyn; Drag, Lauren L; Bieliauskas, Linas A

    2016-12-01

    We assessed the validity of a brief incidental learning measure based on the Similarities and Vocabulary subtests of the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV). Most neuropsychological assessments for memory require intentional learning, but incidental learning occurs without explicit instruction. Incidental memory tests such as the WAIS-III Symbol Digit Coding subtest have existed for many years, but few memory studies have used a semantically processed incidental learning model. We conducted a retrospective analysis of 37 veterans with traumatic brain injury, referred for outpatient neuropsychological testing at a Veterans Affairs hospital. As part of their evaluation, the participants completed the incidental learning tasks. We compared their incidental learning performance to their performance on traditional memory measures. Incidental learning scores correlated strongly with scores on the California Verbal Learning Test-Second Edition (CVLT-II) and Brief Visuospatial Memory Test-Revised (BVMT-R). After we conducted a partial correlation that controlled for the effects of age, incidental learning correlated significantly with the CVLT-II Immediate Free Recall, CVLT-II Short-Delay Recall, CVLT-II Long-Delay Recall, and CVLT-II Yes/No Recognition Hits, and with the BVMT-R Delayed Recall and BVMT-R Recognition Discrimination Index. Our incidental learning procedures derived from subtests of the WAIS-IV are an efficient and valid way of measuring memory. These tasks add minimally to testing time and capitalize on the semantic encoding that is inherent in completing the Similarities and Vocabulary subtests.

  5. Evidence for similar patterns of neural activity elicited by picture- and word-based representations of natural scenes.

    Science.gov (United States)

    Kumar, Manoj; Federmeier, Kara D; Fei-Fei, Li; Beck, Diane M

    2017-07-15

    A long-standing core question in cognitive science is whether different modalities and representation types (pictures, words, sounds, etc.) access a common store of semantic information. Although different input types have been shown to activate a shared network of brain regions, this does not necessitate that there is a common representation, as the neurons in these regions could still differentially process the different modalities. However, multi-voxel pattern analysis can be used to assess whether, e.g., pictures and words evoke a similar pattern of activity, such that the patterns that separate categories in one modality transfer to the other. Prior work using this method has found support for a common code, but has two limitations: they have either only examined disparate categories (e.g. animals vs. tools) that are known to activate different brain regions, raising the possibility that the pattern separation and inferred similarity reflects only large scale differences between the categories or they have been limited to individual object representations. By using natural scene categories, we not only extend the current literature on cross-modal representations beyond objects, but also, because natural scene categories activate a common set of brain regions, we identify a more fine-grained (i.e. higher spatial resolution) common representation. Specifically, we studied picture- and word-based representations of natural scene stimuli from four different categories: beaches, cities, highways, and mountains. Participants passively viewed blocks of either phrases (e.g. "sandy beach") describing scenes or photographs from those same scene categories. To determine whether the phrases and pictures evoke a common code, we asked whether a classifier trained on one stimulus type (e.g. phrase stimuli) would transfer (i.e. cross-decode) to the other stimulus type (e.g. picture stimuli). The analysis revealed cross-decoding in the occipitotemporal, posterior parietal and

  6. An Improved Harmony Search Based on Teaching-Learning Strategy for Unconstrained Optimization Problems

    Directory of Open Access Journals (Sweden)

    Shouheng Tuo

    2013-01-01

    Harmony search (HS) is an emerging population-based metaheuristic algorithm inspired by the music improvisation process. The HS method has been developed rapidly and applied widely during the past decade. In this paper, an improved global harmony search algorithm, named harmony search based on teaching-learning (HSTL), is presented for high-dimensional complex optimization problems. In the HSTL algorithm, four strategies (harmony memory consideration, teaching-learning strategy, local pitch adjusting, and random mutation) are employed to maintain a proper balance between convergence and population diversity, and a dynamic strategy is adopted to change the parameters. The proposed HSTL algorithm is investigated and compared with three other state-of-the-art HS optimization algorithms. Furthermore, to demonstrate robustness and convergence, the success rate and convergence behavior are also studied. The experimental results on 31 complex benchmark functions demonstrate that the HSTL method has strong convergence and robustness and a better balance between space exploration and local exploitation on high-dimensional complex optimization problems.
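
    For readers new to harmony search, the sketch below shows the canonical improvisation step (harmony-memory consideration, pitch adjustment, random selection) that HSTL builds on; the parameter values are typical textbook choices, not those tuned in the paper.

```python
import random

HMCR, PAR, BW = 0.9, 0.3, 0.05   # memory rate, pitch-adjust rate, bandwidth
LOW, HIGH, DIM, HMS = -5.0, 5.0, 10, 20

def improvise(memory):
    """Build one new harmony from harmony memory, per canonical HS rules."""
    new = []
    for d in range(DIM):
        if random.random() < HMCR:                  # take value from memory...
            v = random.choice(memory)[d]
            if random.random() < PAR:               # ...optionally pitch-adjust
                v += random.uniform(-BW, BW)
        else:                                       # or pick a fresh random value
            v = random.uniform(LOW, HIGH)
        new.append(min(max(v, LOW), HIGH))
    return new

def f(x):  # toy objective
    return sum(xi * xi for xi in x)

memory = [[random.uniform(LOW, HIGH) for _ in range(DIM)] for _ in range(HMS)]
for _ in range(5000):
    cand = improvise(memory)
    worst = max(range(HMS), key=lambda i: f(memory[i]))
    if f(cand) < f(memory[worst]):                  # replace worst harmony
        memory[worst] = cand
print(min(f(h) for h in memory))
```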

  7. Simulation Optimization of Search and Rescue in Disaster Relief Based on Distributed Auction Mechanism

    Directory of Open Access Journals (Sweden)

    Jian Tang

    2017-11-01

    In this paper, we optimize search and rescue (SAR) in disaster relief through agent-based simulation. We simulate rescue teams’ search behaviors with improved truncated Lévy walks. We then propose a cooperative rescue plan based on a distributed auction mechanism, and illustrate it with the case of landslide disaster relief. The simulation is conducted in three scenarios: “fatal”, “serious” and “normal”. Compared with a non-cooperative rescue plan, the proposed rescue plan increases victims’ relative survival probability by 7–15%, increases the ratio of survivors getting rescued by 5.3–12.9%, and decreases the average elapsed time for one site to get rescued by 16.6–21.6%. The robustness analysis shows that the search radius affects rescue efficiency significantly, while the scope of cooperation does not. The sensitivity analysis shows that two parameters, the time limit for completing rescue operations at one buried site and the maximum turning angle for the next step, both have a great influence on rescue efficiency, and that for each of them there exists an optimal value in view of rescue efficiency.
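
    A minimal sketch of a one-shot distributed auction for assigning rescue teams to sites follows: each free team bids its travel cost to each discovered site, and each site is won by the lowest bidder. The bidding rule and coordinates are simplified assumptions, not the paper's protocol.

```python
import math

teams = {"T1": (0.0, 0.0), "T2": (4.0, 1.0), "T3": (1.0, 5.0)}   # team positions
sites = {"S1": (3.0, 0.5), "S2": (0.5, 4.0)}                     # buried sites

def bid(team_pos, site_pos):
    """A team's bid = its travel cost (Euclidean distance) to the site."""
    return math.dist(team_pos, site_pos)

assignment, busy = {}, set()
for site, spos in sites.items():
    # Each free team broadcasts a bid; the lowest bid wins the site
    bids = {t: bid(tpos, spos) for t, tpos in teams.items() if t not in busy}
    winner = min(bids, key=bids.get)
    assignment[site] = winner
    busy.add(winner)

print(assignment)   # -> {'S1': 'T2', 'S2': 'T3'}
```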

  8. A Rule-Based Local Search Algorithm for General Shift Design Problems in Airport Ground Handling

    DEFF Research Database (Denmark)

    Clausen, Tommy

    We consider a generalized version of the shift design problem where shifts are created to cover a multiskilled demand and fit the parameters of the workforce. We present a collection of constraints and objectives for the generalized shift design problem. A local search solution framework with multiple neighborhoods and a loosely coupled rule engine based on simulated annealing is presented. Computational experiments on real-life data from various airport ground handling organizations show the performance and flexibility of the proposed algorithm.

  9. A Privacy-Preserving Intelligent Medical Diagnosis System Based on Oblivious Keyword Search

    Directory of Open Access Journals (Sweden)

    Zhaowen Lin

    2017-01-01

    One of the concerns people have is how to obtain a diagnosis online without their privacy being jeopardized. In this paper, we propose a privacy-preserving intelligent medical diagnosis system (IMDS) that can efficiently solve this problem. In IMDS, users submit their health examination parameters to the server in a protected form; this submission process is based on the Paillier cryptosystem and does not reveal any information about their data. The server then retrieves the most likely disease (or diseases) from the database and returns it to the users. In the above search process, we use oblivious keyword search (OKS) as a basic framework, which lets the server keep its computational ability while preventing it from learning any personal information from the users' data. Besides, this paper also provides a preprocessing method for the data stored on the server, to make our protocol more efficient.
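
    The Paillier cryptosystem mentioned above is additively homomorphic, which is what lets a server aggregate protected health parameters without reading them. A small demonstration using the third-party python-paillier (`phe`) package follows; the package choice and the toy readings are my assumptions, not artifacts of the paper.

```python
# pip install phe  (python-paillier, a third-party library)
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Client side: encrypt health parameters before submission
readings = [118, 124, 121]                     # toy systolic readings (invented)
ciphertexts = [public_key.encrypt(r) for r in readings]

# Server side: can add ciphertexts without ever decrypting them
# -- the additive homomorphism of Paillier
encrypted_sum = ciphertexts[0] + ciphertexts[1] + ciphertexts[2]

# Client side: only the private-key holder can read the result
print(private_key.decrypt(encrypted_sum) / len(readings))  # -> 121.0
```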

  10. Transmission network expansion planning based on hybridization model of neural networks and harmony search algorithm

    Directory of Open Access Journals (Sweden)

    Mohammad Taghi Ameli

    2012-01-01

    Transmission Network Expansion Planning (TNEP) is a basic part of power network planning that determines where, when and how many new transmission lines should be added to the network. The TNEP is thus an optimization problem in which the expansion objectives are optimized. Artificial Intelligence (AI) tools such as Genetic Algorithms (GA), Simulated Annealing (SA), Tabu Search (TS) and Artificial Neural Networks (ANNs) are methods used for solving the TNEP problem. Today, by using hybridization models of AI tools, we can solve the TNEP problem for large-scale systems, which shows the effectiveness of utilizing such models. In this paper, a new approach hybridizing Probabilistic Neural Networks (PNNs) and the Harmony Search Algorithm (HSA) was used to solve the TNEP problem. Finally, considering the uncertain role of the load by means of a scenario technique, the proposed model was tested on Garver’s 6-bus network.

  11. Search-free license plate localization based on saliency and local variance estimation

    Science.gov (United States)

    Safaei, Amin; Tang, H. L.; Sanei, S.

    2015-02-01

    In recent years, the performance and accuracy of automatic license plate number recognition (ALPR) systems have greatly improved; however, the increasing number of applications for such systems has made ALPR research more challenging than ever. The inherent computational complexity of search-dependent algorithms remains a major problem for current ALPR systems. This paper proposes a novel search-free method of localization based on the estimation of saliency and local variance. Gabor functions are then used to validate the choice of candidate license plates. The algorithm was applied to three image datasets with different levels of complexity and the results compared with a number of benchmark methods, particularly in terms of speed. The proposed method outperforms state-of-the-art methods and can be used for real-time applications.
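
    A local-variance map of the kind the record describes can be computed in a search-free, sliding-window fashion from two box filters, since Var[x] = E[x^2] - E[x]^2. The sketch below uses SciPy's uniform filter; the window size and the synthetic image are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance(img, size=15):
    """Var = E[x^2] - E[x]^2 over a size x size sliding window.
    High values flag busy, high-contrast regions such as plate text."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img * img, size)
    return mean_sq - mean * mean

# Synthetic test: a noisy "plate-like" patch on a flat background
rng = np.random.default_rng(0)
img = np.full((120, 200), 90.0)
img[40:70, 60:150] += rng.normal(0, 40, (30, 90))   # textured region
vmap = local_variance(img)
ys, xs = np.where(vmap > 0.5 * vmap.max())          # candidate region
print(xs.min(), xs.max(), ys.min(), ys.max())
```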

  12. The multi-copy simultaneous search methodology: a fundamental tool for structure-based drug design.

    Science.gov (United States)

    Schubert, Christian R; Stultz, Collin M

    2009-08-01

    Fragment-based ligand design approaches, such as the multi-copy simultaneous search (MCSS) methodology, have proven to be useful tools in the search for novel therapeutic compounds that bind pre-specified targets of known structure. MCSS offers a variety of advantages over more traditional high-throughput screening methods, and has been applied successfully to challenging targets. The methodology is quite general and can be used to construct functionality maps for proteins, DNA, and RNA. In this review, we describe the main aspects of the MCSS method and outline the general use of the methodology as a fundamental tool to guide the design of de novo lead compounds. We focus our discussion on the evaluation of MCSS results and the incorporation of protein flexibility into the methodology. In addition, we demonstrate on several specific examples how the information arising from the MCSS functionality maps has been successfully used to predict ligand binding to protein targets and RNA.

  13. Neural Based Tabu Search method for solving unit commitment problem with cooling-banking constraints

    Directory of Open Access Journals (Sweden)

    Rajan Asir Christober Gnanakkan Charles

    2009-01-01

    This paper presents a new approach to solving the short-term unit commitment problem (UCP) using a Neural Based Tabu Search (NBTS) with cooling and banking constraints. The objective is to find a generation schedule that minimizes the total operating cost subject to a variety of constraints; in other words, to find the optimal unit commitment in the power system for the next H hours. A 7-unit utility power system in India demonstrates the effectiveness of the proposed approach; extensive studies have also been performed on different IEEE test systems consisting of 10, 26 and 34 units. Numerical results compare the cost solutions obtained with those of the Tabu Search (TS), Dynamic Programming (DP) and Lagrangian Relaxation (LR) methods, showing the superiority of the proposed method in reaching a proper unit commitment.
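
    The record describes tabu search for unit commitment at a high level; below is a bare-bones tabu move loop on a toy commitment bit-vector (which units are on in a single hour), with an invented cost that charges fuel plus a shortage penalty. It illustrates the tabu mechanism only, not the paper's cooling/banking constraints.

```python
CAP = [100, 80, 60, 40, 30, 20, 10]     # unit capacities (toy 7-unit system)
COST = [8, 9, 10, 12, 14, 16, 20]       # $/MW when committed (invented)
DEMAND = 180

def cost(u):
    """Fuel cost of committed units plus a heavy penalty for unmet demand."""
    supply = sum(c for c, on in zip(CAP, u) if on)
    fuel = sum(c * k for c, k, on in zip(CAP, COST, u) if on)
    return fuel + 1000 * max(DEMAND - supply, 0)

u = [1] * 7                              # start with every unit committed
best, tabu = list(u), {}
for it in range(200):
    # Best non-tabu single-bit flip; aspiration: allow if it beats the best
    moves = []
    for i in range(7):
        v = list(u); v[i] ^= 1
        if tabu.get(i, -1) < it or cost(v) < cost(best):
            moves.append((cost(v), i, v))
    _, i, u = min(moves)
    tabu[i] = it + 3                     # flipping unit i is tabu for 3 iters
    if cost(u) < cost(best):
        best = list(u)
print(best, cost(best))
```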

  14. Power-Management Techniques for Wireless Sensor Networks and Similar Low-Power Communication Devices Based on Nonrechargeable Batteries

    Directory of Open Access Journals (Sweden)

    Agnelo Silva

    2012-01-01

    Despite the well-known advantages of communication solutions based on energy harvesting, there are scenarios where the absence of batteries (supercapacitor only) or the use of rechargeable batteries is not a realistic option. The alternative is therefore to extend as much as possible the lifetime of primary cells (non-rechargeable batteries). Assuming low duty-cycle applications, three power-management techniques are combined in a novel way to provide an efficient energy solution for wireless sensor network nodes or similar communication devices powered by primary cells. Accordingly, a customized node is designed and long-term experiments in the laboratory and outdoors are carried out. Simulated and empirical results show that the battery lifetime can be drastically extended; however, two trade-offs are identified: a significant increase of both data latency and hardware/software complexity. Unattended nodes deployed outdoors under extreme temperatures, buried sensors (underground communication), and nodes embedded in the structures of buildings, bridges, and roads are some of the target scenarios for this work. Part of the provided guidelines can be used to extend the battery lifetime of communication devices in general.
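
    The battery-lifetime leverage of low duty cycles is simple arithmetic: the average current is the duty-weighted mix of active and sleep currents. The sketch below runs the numbers for invented but typical values; none of the figures come from the paper.

```python
def lifetime_years(capacity_mah, i_active_ma, i_sleep_ma, duty):
    """Average current = d*I_active + (1-d)*I_sleep; lifetime = C / I_avg."""
    i_avg = duty * i_active_ma + (1.0 - duty) * i_sleep_ma
    return capacity_mah / i_avg / (24 * 365)

# Hypothetical node: 2600 mAh primary cell, 20 mA awake, 5 uA asleep
# (at very low duty cycles, cell shelf life will dominate in practice)
for duty in (0.01, 0.001, 0.0001):
    print(f"duty {duty:.2%}: {lifetime_years(2600, 20.0, 0.005, duty):.1f} years")
```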

  15. Annealing-free P3HT:PCBM-based organic solar cells via two halohydrocarbons additives with similar boiling points

    International Nuclear Information System (INIS)

    Bao, Xichang; Wang, Ting; Yang, Ailing; Yang, Chunpeng; Dou, Xiaowei; Chen, Weichao; Wang, Ning; Yang, Renqiang

    2014-01-01

    Highlights: • Two halohydrocarbons were selected as additives for polymer solar cells. • The additives can improve the photocurrent of photovoltaic devices. • Extensive characterization of the blends was done to explore the mechanism. -- Abstract: Efficient annealing-free inverted bulk heterojunction (BHJ) organic solar cells based on poly(3-hexylthiophene) (P3HT) and [6,6]-phenyl-C61-butyric acid methyl ester (PCBM) (1:1, w/w) have been obtained using two easily accessible halohydrocarbons (1,6-dibromohexane (DBH) and 1-bromodecane (BD)) with similar boiling points as solvent additives. Treating the devices with 2.5 wt% additives removed the grain boundaries of the large PCBM-rich phase, resulting in a more uniform film morphology on the nanoscale, which greatly improved the short-circuit current density of the devices. Finally, the PCEs of the devices processed with DBH and BD reached 3.81% and 3.68%, respectively. Both additives, having almost the same boiling points, had a similar impact on device performance despite their different chemical structures, polarities and other physical properties.

  16. Direct Patlak Reconstruction From Dynamic PET Data Using the Kernel Method With MRI Information Based on Structural Similarity.

    Science.gov (United States)

    Gong, Kuang; Cheng-Liao, Jinxiu; Wang, Guobao; Chen, Kevin T; Catana, Ciprian; Qi, Jinyi

    2018-04-01

    Positron emission tomography (PET) is a functional imaging modality widely used in oncology, cardiology, and neuroscience. It is highly sensitive, but suffers from relatively poor spatial resolution, as compared with anatomical imaging modalities, such as magnetic resonance imaging (MRI). With the recent development of combined PET/MR systems, we can improve the PET image quality by incorporating MR information into image reconstruction. Previously, kernel learning has been successfully embedded into static and dynamic PET image reconstruction using either PET temporal or MRI information. Here, we combine both PET temporal and MRI information adaptively to improve the quality of direct Patlak reconstruction. We examined different approaches to combine the PET and MRI information in kernel learning to address the issue of potential mismatches between MRI and PET signals. Computer simulations and hybrid real-patient data acquired on a simultaneous PET/MR scanner were used to evaluate the proposed methods. Results show that the method that combines PET temporal information and MRI spatial information adaptively based on the structural similarity index has the best performance in terms of noise reduction and resolution improvement.
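
    Kernel methods of this family represent the unknown image as x = K @ alpha, where K is built from prior (e.g., MRI) feature vectors; a minimal sketch of building a sparse-style k-NN Gaussian kernel matrix follows. The feature choice, k, and sigma are assumptions, not the paper's adaptive PET/MRI combination.

```python
import numpy as np

def knn_kernel(features, k=8, sigma=1.0):
    """Row-normalized k-NN Gaussian kernel:
    K[i, j] = exp(-||f_i - f_j||^2 / (2 sigma^2)) for the k nearest
    neighbors j of voxel i, else 0. Image model: x = K @ alpha."""
    n = len(features)
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    K = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]               # includes i itself
        K[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma**2))
        K[i] /= K[i].sum()                         # row normalization
    return K

# Toy: 100 "voxels" with 3-dimensional MRI patch features
feats = np.random.default_rng(1).normal(size=(100, 3))
K = knn_kernel(feats)
alpha = np.random.default_rng(2).normal(size=100)
x = K @ alpha                                      # kernelized image estimate
```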

  17. Stretchable human-machine interface based on skin-conformal sEMG electrodes with self-similar geometry

    Science.gov (United States)

    Dong, Wentao; Zhu, Chen; Hu, Wei; Xiao, Lin; Huang, Yong'an

    2018-01-01

    Stretchable surface electrodes have attracted increasing attention owing to their potential applications in biological signal monitoring, wearable human-machine interfaces (HMIs) and the Internet of Things. This paper proposes a stretchable HMI based on a surface electromyography (sEMG) electrode with a self-similar serpentine configuration. The sEMG electrode is transfer-printed conformally onto the skin surface to monitor biological signals, followed by signal classification and control of a mobile robot. Such electrodes can bear rather large deformation (e.g., >30%) under an appropriate areal coverage. The sEMG electrodes have been used to record electrophysiological signals from parts of the body with sharp curvature, such as the index finger, the back of the neck and the face, and they exhibit great potential for HMIs in robotics and healthcare. Electrodes placed on the two wrists generate two different signals as the fist is clenched or relaxed; combining the gestures of the two wrists therefore gives four kinds of signals, that is, four control modes. Experiments demonstrated that the electrodes were successfully used as an HMI to control the motion of a mobile robot remotely. Project supported by the National Natural Science Foundation of China (Nos. 51635007, 91323303).

  18. Group search optimiser-based optimal bidding strategies with no Karush-Kuhn-Tucker optimality conditions

    Science.gov (United States)

    Yadav, Naresh Kumar; Kumar, Mukesh; Gupta, S. K.

    2017-03-01

    The general strategic bidding procedure has been formulated in the literature as a bi-level search problem, in which the offer curve tends to minimise the market clearing function and to maximise the profit. Computationally, this is complex, and hence researchers have adopted Karush-Kuhn-Tucker (KKT) optimality conditions to transform the model into a single-level maximisation problem. However, the profit maximisation problem with KKT optimality conditions poses a great challenge to classical optimisation algorithms, and the problem has become more complex after the inclusion of transmission constraints. This paper simplifies the profit maximisation problem to a minimisation function in which the transmission constraints, the operating limits and the ISO market clearing functions are considered without KKT optimality conditions. The derived function is solved using the group search optimiser (GSO), a robust population-based optimisation algorithm. Experimental investigation is carried out on the IEEE 14- and IEEE 30-bus systems, and the performance is compared against differential evolution-based, genetic algorithm-based and particle swarm optimisation-based strategic bidding methods. The simulation results demonstrate that the profit obtained through GSO-based bidding strategies is higher than with the other three methods.

  19. A Self-Organizing Map-Based Approach to Generating Reduced-Size, Statistically Similar Climate Datasets

    Science.gov (United States)

    Cabell, R.; Delle Monache, L.; Alessandrini, S.; Rodriguez, L.

    2015-12-01

    Climate-based studies require large amounts of data in order to produce accurate and reliable results. Many of these studies have used 30-plus-year data sets in order to produce stable and high-quality results, and as a result, many such data sets are available, generally in the form of global reanalyses. While the analysis of these data leads to high-fidelity results, their processing can be very computationally expensive. This computational burden prevents the utilization of these data sets for certain applications, e.g., when rapid response is needed in crisis management and disaster planning scenarios resulting from a release of toxic material into the atmosphere. We have developed a methodology to reduce large climate datasets to more manageable sizes while retaining statistically similar results when used to produce ensembles of possible outcomes. We do this by employing a Self-Organizing Map (SOM) algorithm to analyze general patterns of meteorological fields over a regional domain of interest and produce a small set of "typical days" with which to generate the model ensemble. The SOM algorithm takes as input a set of vectors and generates a 2D map of representative vectors deemed most similar to the input set and to each other. Input predictors are selected that are correlated with the model output, which in our case is an Atmospheric Transport and Dispersion (T&D) model that is highly dependent on surface winds and boundary layer depth. To choose a subset of "typical days," each input day is assigned to its closest SOM map node vector and then ranked by distance. Each node vector is treated as a distribution and days are sampled from them by percentile. Using a 30-node SOM, with sampling every 20th percentile, we have been able to reduce 30 years of the Climate Forecast System Reanalysis (CFSR) data for the month of October to 150 "typical days." To estimate the skill of this approach, the "Measure of Effectiveness" (MOE) metric is used to compare area and overlap
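
    A toy self-organizing map in NumPy is sketched below to show the core update the record relies on: find each input's best-matching node and pull nearby nodes toward it. The 30-node map size matches the record; the data, learning rate and neighborhood schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, DIM = 5, 6, 8                 # 30-node map, as in the record; DIM assumed
nodes = rng.normal(size=(H, W, DIM))
grid = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"), -1)

def train(data, epochs=20, lr0=0.5, sigma0=2.0):
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                 # decaying learning rate
        sigma = max(sigma0 * (1 - t / epochs), 0.5) # shrinking neighborhood
        for x in data:
            d = np.linalg.norm(nodes - x, axis=2)   # distance to every node
            bmu = np.unravel_index(np.argmin(d), (H, W))
            g = np.exp(-((grid - bmu) ** 2).sum(-1) / (2 * sigma ** 2))
            nodes[:] += lr * g[..., None] * (x - nodes)  # pull toward input

days = rng.normal(size=(900, DIM))   # stand-in for years of daily fields
train(days)
# Assign each day to its best-matching node; per-node sampling by distance
# percentile would then pick the reduced set of "typical days"
bmus = [np.unravel_index(np.argmin(np.linalg.norm(nodes - x, axis=2)), (H, W))
        for x in days]
```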

  20. Extended-Search, Bézier Curve-Based Lane Detection and Reconstruction System for an Intelligent Vehicle

    Directory of Open Access Journals (Sweden)

    Xiaoyun Huang

    2015-09-01

    To improve the real-time performance and detection rate of a Lane Detection and Reconstruction (LDR) system, an extended-search-based lane detection method and a Bézier curve-based lane reconstruction algorithm are proposed in this paper. The extended-search-based lane detection method is designed to search boundary blocks from the initial position, upwards and along the lane, with small search areas, using continuous search, discontinuous search and bending search to detect different lane boundaries. The Bézier curve-based lane reconstruction algorithm is employed to describe a wide range of lane boundary forms with comparatively simple expressions; in addition, two Bézier curves are adopted to reconstruct outer lane boundaries with large curvature variation. The lane detection and reconstruction algorithm, including determination of the initial blocks, extended search, binarization processing and lane boundary fitting in different scenarios, is verified in road tests. The results show that the algorithm is robust against different shadows and illumination variations, with an average processing time per frame of 13 ms. Significantly, it achieves a high detection rate of 88.6% on curved lanes with large or variable curvatures, where the accident rate is higher than on straight lanes.
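
    For reference, a cubic Bézier curve with control points P0..P3 is B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3; the sketch below evaluates it via de Casteljau's algorithm, which is numerically stable. The control points are made-up lane-boundary anchors, not values from the paper.

```python
import numpy as np

def bezier(points, t):
    """Evaluate a Bezier curve at parameter t via de Casteljau's algorithm."""
    pts = np.asarray(points, dtype=float)
    while len(pts) > 1:                     # repeated linear interpolation
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# Made-up control points for one lane boundary in image coordinates (x, y)
ctrl = [(320, 480), (335, 360), (370, 240), (430, 120)]
curve = np.array([bezier(ctrl, t) for t in np.linspace(0.0, 1.0, 50)])
print(curve[0], curve[-1])   # endpoints coincide with P0 and P3
```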