WorldWideScience

Sample records for accurate similarity search

  1. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

Abstract Background Protein structural data have increased exponentially, such that fast and accurate tools are necessary to perform structure similarity searches. To improve search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, accuracy is usually sacrificed and the speed still cannot match that of sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Classical sequence similarity search methods can then be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever-increasing number of published protein structures in this post-genomic era.
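
    The key idea is reducing a 3D backbone to a 1D string. As a rough illustration only (SARST's actual alphabet comes from nearest-neighbor clustering of the Ramachandran map, not a uniform grid), the sketch below bins each residue's (phi, psi) dihedral pair on a 5x5 grid and emits one letter per residue, after which standard sequence aligners apply:

    ```python
    # Minimal sketch: encode backbone (phi, psi) angles as a text string by
    # binning the Ramachandran plane into a coarse grid. The uniform 5x5 grid
    # and letter assignment are illustrative assumptions, not SARST's alphabet.
    import string

    def ramachandran_code(phi_psi_pairs, bins=5):
        """Map (phi, psi) dihedral angles in degrees to one letter per residue."""
        letters = string.ascii_uppercase          # 25 grid cells -> 25 letters
        code = []
        for phi, psi in phi_psi_pairs:
            i = min(int((phi + 180.0) / 360.0 * bins), bins - 1)
            j = min(int((psi + 180.0) / 360.0 * bins), bins - 1)
            code.append(letters[i * bins + j])
        return "".join(code)

    # Example: an alpha-helical stretch maps to a run of identical letters,
    # which ordinary sequence-alignment tools can then match.
    helix = [(-60.0, -45.0)] * 5
    print(ramachandran_code(helix))   # "GGGGG"
    ```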

  2. Semantically enabled image similarity search

    Science.gov (United States)

    Casterline, May V.; Emerick, Timothy; Sadeghi, Kolia; Gosse, C. A.; Bartlett, Brent; Casey, Jason

    2015-05-01

Georeferenced data of various modalities are increasingly available for intelligence and commercial use; however, effectively exploiting these sources demands a unified data space capable of capturing the unique contribution of each input. This work presents a suite of software tools for representing geospatial vector data and overhead imagery in a shared high-dimensional vector, or "embedding", space that supports fused learning and similarity search across dissimilar modalities. While the approach is suitable for fusing arbitrary input types, including free text, the present work exploits the obvious but computationally difficult relationship between GIS and overhead imagery. GIS provides temporally smoothed but information-limited content, while overhead imagery provides an information-rich but temporally limited perspective. This processing framework includes some important extensions of concepts in the literature but, more critically, presents a means to accomplish them as a unified framework at scale on commodity cloud architectures.

3. Similarity search processing. Parallelization and indexing technologies.

    Directory of Open Access Journals (Sweden)

    Eder Dos Santos

    2015-08-01

This scientific-technical report addresses similarity search and the implementation of metric structures in parallel environments. It also presents the state of the art in similarity search over metric structures and in parallelism technologies. Comparative analyses are proposed, seeking to characterize the behavior of a set of metric spaces and metric structures on multicore-based and GPU-based processing platforms.

  4. Effective semantic search using thematic similarity

    Directory of Open Access Journals (Sweden)

    Sharifullah Khan

    2014-07-01

Most existing semantic search systems expand search keywords using domain ontology to deal with semantic heterogeneity. They focus on matching the semantic similarity of individual keywords in a multiple-keyword query; however, they ignore the semantic relationships that exist among the keywords of the query themselves, and so return less relevant answers for these types of queries. More relevant documents for a multiple-keyword query can be retrieved if the system knows the relationships that exist among the multiple keywords in the query. The proposed search methodology matches patterns of keywords to capture the context of keywords, and the relevant documents are then ranked according to their pattern relevance score. A prototype system has been implemented to validate the proposed search methodology and has been compared with existing systems for evaluation. The results demonstrate improvement in the precision and recall of search.

  5. Molecular fingerprint similarity search in virtual screening.

    Science.gov (United States)

    Cereto-Massagué, Adrià; Ojeda, María José; Valls, Cristina; Mulero, Miquel; Garcia-Vallvé, Santiago; Pujadas, Gerard

    2015-01-01

Molecular fingerprints have been used for a long time now in drug discovery and virtual screening. Their ease of use (requiring little to no configuration) and the speed at which substructure and similarity searches can be performed with them - paired with a virtual screening performance similar to that of other, more complex methods - are the reasons for their popularity. However, there are many types of fingerprints, each representing a different aspect of the molecule, which can greatly affect search performance. This review focuses on commonly used fingerprint algorithms, their usage in virtual screening, and the software packages and online tools that provide these algorithms.
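
    Fingerprint similarity searching ultimately reduces to comparing bit vectors, almost always with the Tanimoto coefficient. A self-contained sketch on toy 8-bit fingerprints (real fingerprints are typically 1024-2048 bits and are generated by a cheminformatics toolkit such as RDKit; requires Python 3.10+ for int.bit_count):

    ```python
    def tanimoto(fp_a: int, fp_b: int) -> float:
        """Tanimoto coefficient of two bit-string fingerprints packed into ints:
        |A & B| / |A | B|. Returns 0.0 for two empty fingerprints."""
        union = fp_a | fp_b
        if union == 0:
            return 0.0
        return (fp_a & fp_b).bit_count() / union.bit_count()

    # Toy 8-bit fingerprints; each set bit marks the presence of some feature.
    query, candidate = 0b10110010, 0b10100110
    print(tanimoto(query, candidate))   # 0.6 -> 3 shared bits of 5 set overall
    ```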

  6. Ultra accurate collaborative information filtering via directed user similarity

    OpenAIRE

    Guo, Qiang; Song, Wen-Jun; Liu, Jian-Guo

    2014-01-01

A key challenge of collaborative filtering (CF) information filtering is how to obtain reliable and accurate results with the help of peers' recommendations. Since the similarities from small-degree users to large-degree users would be larger than the ones in the opposite direction, the large-degree users' selections are recommended extensively by the traditional second-order CF algorithms. By considering the users' similarity direction and the second-order correlations to depress the influen...

7. Gene functional similarity search tool (GFSST)

    Directory of Open Access Journals (Sweden)

    Russo James J

    2006-03-01

Abstract Background With the completion of the genome sequences of human, mouse, and other species and the advent of high-throughput functional genomic research technologies such as biomicroarray chips, more and more genes and their products have been discovered and their functions have begun to be understood. Increasing amounts of data about genes, gene products and their functions have been stored in databases. To facilitate selection of candidate genes for gene-disease research, genetic association studies, biomarker and drug target selection, and animal models of human diseases, it is essential to have search engines that can retrieve genes by their functions from proteome databases. In recent years, the development of the Gene Ontology (GO) has established structured, controlled vocabularies describing gene functions, which makes it possible to develop novel tools to search for genes by functional similarity. Results By using a statistical model to measure the functional similarity of genes based on the Gene Ontology directed acyclic graph, we developed a novel Gene Functional Similarity Search Tool (GFSST) to identify genes with related functions from annotated proteome databases. This search engine lets users design their search targets by gene functions. Conclusion An implementation of GFSST which works on UniProt (Universal Protein Resource) for the human and mouse proteomes is available at the GFSST Web Server. GFSST provides functions not only for similar gene retrieval but also for gene search by one or more GO terms. This represents a powerful new approach for selecting similar genes and gene products from proteome databases according to their functions.

  8. Web Search Results Summarization Using Similarity Assessment

    Directory of Open Access Journals (Sweden)

    Sawant V.V.

    2014-06-01

Nowadays the Internet has become part of our lives; the WWW is its most important service because it allows the presentation of information such as documents and images. The WWW grows rapidly and caters to diversified levels and categories of users. Web search results are extracted for user-specified queries. With millions of pieces of information pouring online, users have no time to surf the contents completely; moreover, much of the available information is repeated or duplicated. This creates the need to restructure search results so that they can be summarized. The proposed approach comprises the extraction of different features of web pages. Web page visual similarity assessment has been employed to address problems in different fields including phishing, web archiving, and web search engines. In this approach, the search results returned for a user query are first stored. The Earth Mover's Distance (EMD) is used to assess web page visual similarity: each web page is treated as a low-resolution image, a signature of that image is created from color and coordinate features, and the distance between web pages is calculated by applying the EMD method. A layout similarity value is computed using tag comparison and template comparison algorithms. Textual similarity is computed using cosine similarity, and hyperlink analysis is performed to compare outward links. The final similarity value is calculated by fusing the layout, text, hyperlink and EMD values. Once the similarity matrix is found, clustering is employed with the help of connected components. Finally, groups of similar web pages, i.e., summarized results, are displayed to the user. Experiments were conducted to demonstrate the effectiveness of the four methods in generating summarized results for different web pages and user queries.
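
    Two of the component scores are easy to make concrete. The sketch below shows the textual component (cosine similarity over bag-of-words counts) and a weighted fusion step; the equal fusion weights are an illustrative assumption, not values from the paper:

    ```python
    import math
    from collections import Counter

    def cosine_similarity(text_a: str, text_b: str) -> float:
        """Cosine similarity of two documents under a bag-of-words model."""
        a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def fused_similarity(layout, text, hyperlink, emd,
                         weights=(0.25, 0.25, 0.25, 0.25)):
        """Weighted fusion of the four component scores; equal weights are an
        illustrative assumption."""
        return sum(w * s for w, s in zip(weights, (layout, text, hyperlink, emd)))

    print(cosine_similarity("web search results", "summarized web search pages"))
    ```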

  9. SEAL: Spatio-Textual Similarity Search

    CERN Document Server

    Fan, Ju; Zhou, Lizhu; Chen, Shanshan; Hu, Jun

    2012-01-01

    Location-based services (LBS) have become more and more ubiquitous recently. Existing methods focus on finding relevant points-of-interest (POIs) based on users' locations and query keywords. Nowadays, modern LBS applications generate a new kind of spatio-textual data, regions-of-interest (ROIs), containing region-based spatial information and textual description, e.g., mobile user profiles with active regions and interest tags. To satisfy search requirements on ROIs, we study a new research problem, called spatio-textual similarity search: Given a set of ROIs and a query ROI, we find the similar ROIs by considering spatial overlap and textual similarity. Spatio-textual similarity search has many important applications, e.g., social marketing in location-aware social networks. It calls for an efficient search method to support large scales of spatio-textual data in LBS systems. To this end, we introduce a filter-and-verification framework to compute the answers. In the filter step, we generate signatures for ...

  10. Similarity searching in large combinatorial chemistry spaces

    Science.gov (United States)

    Rarey, Matthias; Stahl, Martin

    2001-06-01

We present a novel algorithm, called Ftrees-FS, for similarity searching in large chemistry spaces based on dynamic programming. Given a query compound, the algorithm generates sets of compounds from a given chemistry space that are similar to the query. The similarity search is based on the feature tree similarity measure representing molecules by tree structures. This descriptor allows handling combinatorial chemistry spaces as a whole instead of looking at subsets of enumerated compounds. Within a few minutes of computing time, the algorithm is able to find the most similar compound in very large spaces as well as sets of compounds at an arbitrary similarity level. In addition, the diversity among the generated compounds can be controlled. A set of 17 000 fragments of known drugs, generated by the RECAP procedure from the World Drug Index, was used as the search chemistry space. These fragments can be combined to more than 10^18 compounds of reasonable size. For validation, known antagonists/inhibitors of several targets including dopamine D4, histamine H1, and COX2 are used as queries. Comparison of the compounds created by Ftrees-FS to other known actives demonstrates the ability of the method to jump between structurally unrelated molecule classes.

  11. Predicting the performance of fingerprint similarity searching.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
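
    The central quantity here is the difference between the feature distributions of reference compounds and of the screening library, measured by the Kullback-Leibler divergence. A minimal numpy sketch follows; the frequency profiles are invented, and collapsing them into one normalized distribution simplifies the paper's per-feature treatment:

    ```python
    import numpy as np

    def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
        """Kullback-Leibler divergence D(p || q) between two discrete
        distributions, with a small epsilon guarding against zeros."""
        p = p / p.sum()
        q = q / q.sum()
        return float(np.sum(p * np.log((p + eps) / (q + eps))))

    # Illustrative (made-up) feature-frequency profiles: how often fingerprint
    # features occur among reference actives (p) versus the screening database
    # (q). A large divergence suggests actives stand out from the background
    # and should be easier to retrieve.
    p = np.array([0.8, 0.1, 0.6, 0.3])
    q = np.array([0.4, 0.4, 0.5, 0.3])
    print(round(kl_divergence(p, q), 4))
    ```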

  12. New similarity search based glioma grading

    Energy Technology Data Exchange (ETDEWEB)

    Haegler, Katrin; Brueckmann, Hartmut; Linn, Jennifer [Ludwig-Maximilians-University of Munich, Department of Neuroradiology, Munich (Germany); Wiesmann, Martin; Freiherr, Jessica [RWTH Aachen University, Department of Neuroradiology, Aachen (Germany); Boehm, Christian [Ludwig-Maximilians-University of Munich, Department of Computer Science, Munich (Germany); Schnell, Oliver; Tonn, Joerg-Christian [Ludwig-Maximilians-University of Munich, Department of Neurosurgery, Munich (Germany)

    2012-08-15

    MR-based differentiation between low- and high-grade gliomas is predominately based on contrast-enhanced T1-weighted images (CE-T1w). However, functional MR sequences as perfusion- and diffusion-weighted sequences can provide additional information on tumor grade. Here, we tested the potential of a recently developed similarity search based method that integrates information of CE-T1w and perfusion maps for non-invasive MR-based glioma grading. We prospectively included 37 untreated glioma patients (23 grade I/II, 14 grade III gliomas), in whom 3T MRI with FLAIR, pre- and post-contrast T1-weighted, and perfusion sequences was performed. Cerebral blood volume, cerebral blood flow, and mean transit time maps as well as CE-T1w images were used as input for the similarity search. Data sets were preprocessed and converted to four-dimensional Gaussian Mixture Models that considered correlations between the different MR sequences. For each patient, a so-called tumor feature vector (= probability-based classifier) was defined and used for grading. Biopsy was used as gold standard, and similarity based grading was compared to grading solely based on CE-T1w. Accuracy, sensitivity, and specificity of pure CE-T1w based glioma grading were 64.9%, 78.6%, and 56.5%, respectively. Similarity search based tumor grading allowed differentiation between low-grade (I or II) and high-grade (III) gliomas with an accuracy, sensitivity, and specificity of 83.8%, 78.6%, and 87.0%. Our findings indicate that integration of perfusion parameters and CE-T1w information in a semi-automatic similarity search based analysis improves the potential of MR-based glioma grading compared to CE-T1w data alone. (orig.)

  13. Efficient Video Similarity Measurement and Search

    Energy Technology Data Exchange (ETDEWEB)

    Cheung, S-C S

    2002-12-19

The amount of information on the world wide web has grown enormously since its creation in 1990. Duplication of content is inevitable because there is no central management on the web. Studies have shown that many similar versions of the same text documents can be found throughout the web. This redundancy problem is more severe for multimedia content such as web video sequences, as they are often stored in multiple locations and different formats to facilitate downloading and streaming. Similar versions of the same video can also be found, unknown to content creators, when web users modify and republish original content using video editing tools. Identifying similar content can benefit many web applications and content owners. For example, it will reduce the number of similar answers to a web search and identify inappropriate use of copyrighted content. In this dissertation, we present a system architecture and corresponding algorithms to efficiently measure, search, and organize similar video sequences found in any large database such as the web.

  14. Ultra-accurate collaborative information filtering via directed user similarity

    Science.gov (United States)

    Guo, Q.; Song, W.-J.; Liu, J.-G.

    2014-07-01

A key challenge of collaborative filtering (CF) information filtering is how to obtain reliable and accurate results with the help of peers' recommendations. Since the similarities from small-degree users to large-degree users would be larger than the ones in the opposite direction, the large-degree users' selections are recommended extensively by the traditional second-order CF algorithms. By considering the users' similarity direction and the second-order correlations to depress the influence of mainstream preferences, we present the directed second-order CF (HDCF) algorithm specifically to address the challenge of accuracy and diversity of the CF algorithm. The numerical results for two benchmark data sets, MovieLens and Netflix, show that the accuracy of the new algorithm outperforms the state-of-the-art CF algorithms. Compared with the CF algorithm based on random walks proposed by Liu et al. (Int. J. Mod. Phys. C, 20 (2009) 285), the average ranking score reaches 0.0767 and 0.0402, an enhancement of 27.3% and 19.1% for MovieLens and Netflix, respectively. In addition, the diversity, precision and recall are also enhanced greatly. Without relying on any context-specific information, tuning the similarity direction of CF algorithms can yield accurate and diverse recommendations. This work suggests that the user similarity direction is an important factor in improving personalized recommendation performance.

  15. Outsourced similarity search on metric data assets

    KAUST Repository

    Yiu, Man Lung

    2012-02-01

This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying it to the service provider for similarity queries on the transformed data. Our techniques provide interesting trade-offs between query cost and accuracy. They are then further extended to offer an intuitive privacy guarantee. Empirical studies with real data demonstrate that the techniques are capable of offering privacy while enabling efficient and accurate processing of similarity queries.

  16. Earthquake detection through computationally efficient similarity search

    Science.gov (United States)

    Yoon, Clara E.; O’Reilly, Ossian; Bergen, Karianne J.; Beroza, Gregory C.

    2015-01-01

    Seismology is experiencing rapid growth in the quantity of data, which has outpaced the development of processing algorithms. Earthquake detection—identification of seismic events in continuous data—is a fundamental operation for observational seismology. We developed an efficient method to detect earthquakes using waveform similarity that overcomes the disadvantages of existing detection methods. Our method, called Fingerprint And Similarity Thresholding (FAST), can analyze a week of continuous seismic waveform data in less than 2 hours, or 140 times faster than autocorrelation. FAST adapts a data mining algorithm, originally designed to identify similar audio clips within large databases; it first creates compact “fingerprints” of waveforms by extracting key discriminative features, then groups similar fingerprints together within a database to facilitate fast, scalable search for similar fingerprint pairs, and finally generates a list of earthquake detections. FAST detected most (21 of 24) cataloged earthquakes and 68 uncataloged earthquakes in 1 week of continuous data from a station located near the Calaveras Fault in central California, achieving detection performance comparable to that of autocorrelation, with some additional false detections. FAST is expected to realize its full potential when applied to extremely long duration data sets over a distributed network of seismic stations. The widespread application of FAST has the potential to aid in the discovery of unexpected seismic signals, improve seismic monitoring, and promote a greater understanding of a variety of earthquake processes. PMID:26665176
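
    FAST groups similar binary fingerprints with locality-sensitive hashing rather than comparing all pairs. A minimal MinHash sketch of that grouping idea is below; the waveform feature extraction is omitted and the feature sets are invented, so this illustrates only the hashing step, not FAST itself:

    ```python
    import random

    def minhash_signature(feature_set, num_hashes=64, seed=0):
        """MinHash signature of a set of (hashable) fingerprint features.
        Two sets' signatures agree per position with probability equal to
        their Jaccard similarity, enabling fast grouping into hash buckets."""
        rng = random.Random(seed)
        salts = [rng.getrandbits(64) for _ in range(num_hashes)]
        return tuple(min(hash((salt, f)) for f in feature_set) for salt in salts)

    # Invented "fingerprints" standing in for extracted waveform features:
    a = {"f1", "f2", "f3", "f4"}
    b = {"f1", "f2", "f3", "f9"}        # similar to a
    sig_a, sig_b = minhash_signature(a), minhash_signature(b)
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    print(matches / len(sig_a))          # approximates Jaccard(a, b) = 0.6
    ```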

  17. Subvoxel accurate graph search using non-Euclidean graph space.

    Directory of Open Access Journals (Sweden)

    Michael D Abràmoff

Graph search is attractive for the quantitative analysis of volumetric medical images, and especially for layered tissues, because it allows globally optimal solutions in low-order polynomial time. However, because nodes of graphs typically encode evenly distributed voxels of the volume with arcs connecting orthogonally sampled voxels in Euclidean space, segmentation cannot achieve greater precision than a single unit, i.e., the distance between two adjoining nodes, and partial volume effects are ignored. We generalize the graph to non-Euclidean space by allowing non-equidistant spacing between nodes, so that subvoxel-accurate segmentation is achievable. Because the number of nodes and edges in the graph remains the same, running time and memory use are similar, while all the advantages of graph search, including global optimality and computational efficiency, are retained. A deformation field calculated from the volume data adaptively changes regional node density so that node density varies with the inverse of the expected cost. We validated our approach using optical coherence tomography (OCT) images of the retina and 3-D MR images of the arterial wall, and achieved statistically significant increases in accuracy. Our approach allows improved accuracy in volume data acquired with the same hardware, and also preserved accuracy with lower-resolution, more cost-effective image acquisition equipment. The method is not limited to any specific imaging modality and is readily extensible to higher dimensions.

  18. Outsourced Similarity Search on Metric Data Assets

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Assent, Ira; Jensen, Christian S.

    2012-01-01

    This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example...

  19. A Similarity Search Using Molecular Topological Graphs

    Directory of Open Access Journals (Sweden)

    Yoshifumi Fukunishi

    2009-01-01

A molecular similarity measure has been developed using molecular topological graphs and atomic partial charges. Two kinds of topological graphs were used. One is the ordinary adjacency matrix and the other is a matrix which represents the minimum path length between two atoms of the molecule. The ordinary adjacency matrix is suitable for comparing the local structures of molecules such as functional groups, and the other matrix is suitable for comparing the global structures of molecules. The combination of these two matrices gave a similarity measure. This method was applied to in silico drug screening, and the results showed that it was effective as a similarity measure.
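
    A rough sketch of the two matrix views is below; the shortest-path computation uses SciPy, and the similarity score shown compares distance matrices only, leaving out the atomic partial charges the paper also weights:

    ```python
    import numpy as np
    from scipy.sparse.csgraph import shortest_path

    def path_length_matrix(adjacency: np.ndarray) -> np.ndarray:
        """Minimum path length (in bonds) between every pair of atoms."""
        return shortest_path(adjacency, method="D", unweighted=True)

    def matrix_similarity(m1: np.ndarray, m2: np.ndarray) -> float:
        """Crude similarity between two equal-sized matrices: 1 at identity,
        decreasing as entries diverge. An illustrative score only."""
        return 1.0 / (1.0 + np.abs(m1 - m2).mean())

    # Toy 4-atom molecules: a chain a-b-c-d versus a branched a-b(-c)(-d).
    chain = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]])
    branch = np.array([[0,1,0,0],[1,0,1,1],[0,1,0,0],[0,1,0,0]])
    print(matrix_similarity(path_length_matrix(chain), path_length_matrix(branch)))
    ```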

  20. Learning Style Similarity for Searching Infographics

    OpenAIRE

    Saleh, Babak; Dontcheva, Mira; Hertzmann, Aaron; Liu, Zhicheng

    2015-01-01

    Infographics are complex graphic designs integrating text, images, charts and sketches. Despite the increasing popularity of infographics and the rapid growth of online design portfolios, little research investigates how we can take advantage of these design resources. In this paper we present a method for measuring the style similarity between infographics. Based on human perception data collected from crowdsourced experiments, we use computer vision and machine learning algorithms to learn ...

  1. Distributed Efficient Similarity Search Mechanism in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Khandakar Ahmed

    2015-03-01

The Wireless Sensor Network similarity search problem has received considerable research attention due to sensor hardware imprecision and environmental parameter variations. Most of the state-of-the-art distributed data-centric storage (DCS) schemes lack optimization for similarity queries of events. In this paper, a DCS scheme with metric-based similarity searching (DCSMSS) is proposed. DCSMSS takes motivation from a vector distance index, called iDistance, in order to transform the issue of similarity searching into the problem of an interval search in one dimension. In addition, a sector-based distance routing algorithm is used to efficiently route messages. Extensive simulation results reveal that DCSMSS is highly efficient and significantly outperforms previous approaches in processing similarity search queries.
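
    The iDistance reduction that DCSMSS borrows is compact enough to sketch: assign each point to its nearest reference point and key it by partition index plus distance. The reference points and the constant c below are illustrative assumptions:

    ```python
    import math

    def idistance_key(point, reference_points, c=1000.0):
        """Map a multi-dimensional point to a one-dimensional key:
        key = i * c + dist(point, O_i), where O_i is the nearest reference
        point and c upper-bounds any distance so partitions don't overlap."""
        def dist(a, b):
            return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        i, d = min(enumerate(dist(point, o) for o in reference_points),
                   key=lambda pair: pair[1])
        return i * c + d

    refs = [(0.0, 0.0), (10.0, 10.0)]          # illustrative reference points
    print(idistance_key((1.0, 1.0), refs))      # partition 0: small key
    print(idistance_key((9.0, 10.0), refs))     # partition 1: key > 1000
    ```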

  2. Visual similarity is stronger than semantic similarity in guiding visual search for numbers.

    Science.gov (United States)

    Godwin, Hayward J; Hout, Michael C; Menneer, Tamaryn

    2014-06-01

Using a visual search task, we explored how behavior is influenced by both visual and semantic information. We recorded participants' eye movements as they searched for a single target number in a search array of single-digit numbers (0-9). We examined the probability of fixating the various distractors as a function of two key dimensions: the visual similarity between the target and each distractor, and the semantic similarity (i.e., the numerical distance) between the target and each distractor. Visual similarity estimates were obtained using multidimensional scaling based on independent observers' similarity ratings. A linear mixed-effects model demonstrated that both visual and semantic similarity influenced the probability that distractors would be fixated. However, the visual similarity effect was substantially larger than the semantic similarity effect. We close by discussing the potential value of using this novel methodological approach and the implications for both simple and complex visual search displays.

  3. How Google Web Search copes with very similar documents

    NARCIS (Netherlands)

    Mettrop, W.; Nieuwenhuysen, P.; Smulders, H.

    2006-01-01

    A significant portion of the computer files that carry documents, multimedia, programs etc. on the Web are identical or very similar to other files on the Web. How do search engines cope with this? Do they perform some kind of “deduplication”? How should users take into account that web search resul

  4. Fast and accurate protein substructure searching with simulated annealing and GPUs

    Directory of Open Access Journals (Sweden)

    Stivala Alex D

    2010-09-01

Abstract Background Searching a database of protein structures for matches to a query structure, or for occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching. Results We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing that is as fast as or faster than, and comparable in accuracy with, some widely used existing methods. Furthermore, we created a parallel implementation on a modern graphics processing unit (GPU). Conclusions The GPU implementation achieves up to 34 times speedup over the CPU implementation of tableau-based structure search with simulated annealing, making it one of the fastest available methods. To the best of our knowledge, this is the first application of a GPU to the protein structural search problem.

  5. Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

    CERN Document Server

    Yuan, Ye; Chen, Lei; Wang, Haixun

    2012-01-01

    Many studies have been conducted on seeking the efficient solution for subgraph similarity search over certain (deterministic) graphs due to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Different from previous works assuming that edges in an uncertain graph are independent of each other, we study the uncertain graphs where edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete, thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase,we develop tight lower and u...

  6. Exact score distribution computation for ontological similarity searches

    Directory of Open Access Journals (Sweden)

    Schulz Marcel H

    2011-11-01

Abstract Background Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score. Results In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling. Conclusions The new algorithm enables, for the first time, exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that makes use only of the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
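
    The similarity measure underlying these score distributions is Resnik's: the information content of the most informative common ancestor (MICA) of two terms. A minimal sketch on a toy ontology; the term probabilities are made up, whereas in practice they come from annotation frequencies:

    ```python
    import math

    def resnik_similarity(t1, t2, ancestors, term_probability):
        """Resnik similarity: information content (-log p) of the most
        informative common ancestor of two ontology terms. `ancestors`
        maps a term to the set containing itself and all its ancestors."""
        common = ancestors[t1] & ancestors[t2]
        if not common:
            return 0.0
        return max(-math.log(term_probability[a]) for a in common)

    # Toy ontology: root -> {a, b}; a -> a1.
    ancestors = {"a1": {"a1", "a", "root"}, "a": {"a", "root"}, "b": {"b", "root"}}
    p = {"root": 1.0, "a": 0.4, "b": 0.5, "a1": 0.1}
    print(resnik_similarity("a1", "b", ancestors, p))   # MICA is root -> 0.0
    print(resnik_similarity("a1", "a", ancestors, p))   # MICA is a -> -ln 0.4
    ```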

  7. Search Profiles Based on User to Cluster Similarity

    Directory of Open Access Journals (Sweden)

    Saša Bošnjak

    2009-06-01

Privacy of web users' query search logs has, since the AOL dataset release a few years ago, been treated as one of the central issues concerning privacy on the Internet. Therefore, the question of privacy preservation has also raised a lot of attention in the different communities surrounding search engines. The usage of clustering methods for providing low-level contextual search while retaining a high privacy-utility tradeoff is examined in this paper. By using only the user's cluster membership, the search query terms need no longer be retained, thus reducing privacy concerns for both users and companies. The paper presents a lightweight framework for combining query words, user similarities and clustering in order to provide a meaningful way of mining user searches while protecting their privacy. This differs from previous privacy-preservation attempts, which anonymized the queries instead of the users.

  8. SEARCH PROFILES BASED ON USER TO CLUSTER SIMILARITY

    Directory of Open Access Journals (Sweden)

    Ilija Subasic

    2007-12-01

Privacy of web users' query search logs has, since last year's AOL dataset release, been treated as one of the central issues concerning privacy on the Internet. Therefore, the question of privacy preservation has also raised a lot of attention in the different communities surrounding search engines. The usage of clustering methods for providing low-level contextual search while retaining high privacy/utility is examined in this paper. By using only the user's cluster membership, the search query terms need no longer be retained, thus reducing privacy concerns for both users and companies. The paper presents a lightweight framework for combining query words, user similarities and clustering in order to provide a meaningful way of mining user searches while protecting their privacy. This differs from previous privacy-preservation attempts, which anonymized the queries instead of the users.

  9. RAPSearch: a fast protein similarity search tool for short reads

    Directory of Open Access Journals (Sweden)

    Choi Jeong-Hyeon

    2011-05-01

Abstract Background Next Generation Sequencing (NGS) is producing enormous corpora of short DNA reads, affecting emerging fields like metagenomics. Protein similarity search - a key step to achieve annotation of protein-coding genes in these short reads, and identification of their biological functions - faces daunting challenges because of the very sizes of the short read datasets. Results We developed a fast protein similarity search tool, RAPSearch, that utilizes a reduced amino acid alphabet and a suffix array to detect seeds of flexible length. For the short reads (translated in 6 frames) we tested, RAPSearch achieved a ~20-90 times speedup as compared to BLASTX. RAPSearch missed only a small fraction (~1.3-3.2%) of BLASTX similarity hits, but it also discovered additional homologous proteins (~0.3-2.1%) that BLASTX missed. By contrast, BLAT, a tool that is even slightly faster than RAPSearch, had a significant loss of sensitivity as compared to RAPSearch and BLAST. Conclusions RAPSearch is implemented as open-source software and is accessible at http://omics.informatics.indiana.edu/mg/RAPSearch. It enables faster protein similarity search. The application of RAPSearch in metagenomics has also been demonstrated.
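
    A reduced amino acid alphabet lets seed matches tolerate conservative substitutions. A minimal sketch is below; the particular grouping is a common illustrative one, not necessarily the alphabet RAPSearch uses:

    ```python
    # Illustrative reduced amino acid alphabet: each group of physico-chemically
    # similar residues collapses to one symbol, so seeds can match across
    # conservative substitutions. This grouping is an assumption for the sketch.
    GROUPS = ["AGST", "C", "DENQ", "FWY", "HKR", "ILMV", "P"]
    REDUCE = {aa: group[0] for group in GROUPS for aa in group}

    def reduce_sequence(protein: str) -> str:
        """Map a protein sequence onto the reduced alphabet ('X' for unknowns)."""
        return "".join(REDUCE.get(aa, "X") for aa in protein.upper())

    print(reduce_sequence("MKVLAA"))   # -> "IHIIAA"
    ```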

  10. Robust hashing with local models for approximate similarity search.

    Science.gov (United States)

    Song, Jingkuan; Yang, Yi; Li, Xuelong; Huang, Zi; Yang, Yang

    2014-07-01

Similarity search plays an important role in many applications involving high-dimensional data. Due to the known dimensionality curse, the performance of most existing indexing structures degrades quickly as the feature dimensionality increases. Hashing methods, such as locality sensitive hashing (LSH) and its variants, have been widely used to achieve fast approximate similarity search by trading search quality for efficiency. However, most existing hashing methods make use of randomized algorithms to generate hash codes without considering the specific structural information in the data. In this paper, we propose a novel hashing method, namely, robust hashing with local models (RHLM), which learns a set of robust hash functions to map the high-dimensional data points into binary hash codes by effectively utilizing local structural information. In RHLM, for each individual data point in the training dataset, a local hashing model is learned and used to predict the hash codes of its neighboring data points. The local models from all the data points are globally aligned so that an optimal hash code can be assigned to each data point. After obtaining the hash codes of all the training data points, we design a robust method by employing l2,1-norm minimization on the loss function to learn effective hash functions, which are then used to map each database point into its hash code. Given a query data point, the search process first maps it into the query hash code by the hash functions and then explores the buckets that have hash codes similar to the query hash code. Extensive experimental results conducted on real-life datasets show that the proposed RHLM outperforms the state-of-the-art methods in terms of search quality and efficiency.

  11. Online multiple kernel similarity learning for visual search.

    Science.gov (United States)

    Xia, Hao; Hoi, Steven C H; Jin, Rong; Zhao, Peilin

    2014-03-01

    Recent years have witnessed a number of studies on distance metric learning to improve visual similarity search in content-based image retrieval (CBIR). Despite their successes, most existing methods on distance metric learning are limited in two aspects. First, they usually assume the target proximity function follows the family of Mahalanobis distances, which limits their capacity of measuring similarity of complex patterns in real applications. Second, they often cannot effectively handle the similarity measure of multimodal data that may originate from multiple resources. To overcome these limitations, this paper investigates an online kernel similarity learning framework for learning kernel-based proximity functions which goes beyond the conventional linear distance metric learning approaches. Based on the framework, we propose a novel online multiple kernel similarity (OMKS) learning method which learns a flexible nonlinear proximity function with multiple kernels to improve visual similarity search in CBIR. We evaluate the proposed technique for CBIR on a variety of image data sets in which encouraging results show that OMKS outperforms the state-of-the-art techniques significantly.

  12. Similarity Search and Locality Sensitive Hashing using TCAMs

    CERN Document Server

    Shinde, Rajendra; Gupta, Pankaj; Dutta, Debojyoti

    2010-01-01

    Similarity search methods are widely used as kernels in various machine learning applications. Nearest neighbor search (NNS) algorithms are often used to retrieve similar entries, given a query. While there exist efficient techniques for exact query lookup using hashing, similarity search using exact nearest neighbors is known to be a hard problem and in high dimensions, best known solutions offer little improvement over a linear scan. Fast solutions to the approximate NNS problem include Locality Sensitive Hashing (LSH) based techniques, which need storage polynomial in $n$ with exponent greater than $1$, and query time sublinear, but still polynomial in $n$, where $n$ is the size of the database. In this work we present a new technique of solving the approximate NNS problem in Euclidean space using a Ternary Content Addressable Memory (TCAM), which needs near linear space and has O(1) query time. In fact, this method also works around the best known lower bounds in the cell probe model for the query time us...
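
    A minimal example of the kind of LSH family such systems store and probe: the random-hyperplane (sign-of-projection) family for angular similarity. The paper targets Euclidean space, for which p-stable families of the form floor((a.v + b)/w) are standard (see the sketch at the end of this section), and the TCAM lookup machinery is out of scope here:

    ```python
    import numpy as np

    def random_hyperplane_hash(vectors, num_bits=16, seed=0):
        """Sign-of-projection LSH: each bit records which side of a random
        hyperplane a vector falls on, so vectors at a small angle tend to
        share hash codes and land in the same bucket."""
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((num_bits, vectors.shape[1]))
        bits = (vectors @ planes.T) >= 0
        return [int("".join("1" if b else "0" for b in row), 2) for row in bits]

    data = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, 0.3]])
    print(random_hyperplane_hash(data))  # first two codes match or nearly match
    ```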

  13. Semantic similarity measure in biomedical domain leverage web search engine.

    Science.gov (United States)

    Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

    2010-01-01

Semantic similarity measures play an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous research in semantic-web-related applications has deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenging task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated by adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provided by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.

  14. Computing Semantic Similarity Measure Between Words Using Web Search Engine

    Directory of Open Access Journals (Sweden)

    Pushpa C N

    2013-05-01

Semantic similarity measures between words play an important role in information retrieval, natural language processing and various tasks on the web. In this paper, we propose a Modified Pattern Extraction Algorithm to compute the supervised semantic similarity measure between words by combining both the page count method and the web snippets method. Four association measures are used to find semantic similarity between words in the page count method using web search engines. We use Sequential Minimal Optimization (SMO) support vector machines (SVM) to find the optimal combination of page-count-based similarity scores and top-ranking patterns from the web snippets method. The SVM is trained to classify synonymous word-pairs and non-synonymous word-pairs. The proposed Modified Pattern Extraction Algorithm outperforms existing methods, achieving a correlation value of 89.8 percent.
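
    One of the page-count association measures is easy to make concrete. The WebJaccard variant below follows the common definition from the page-count similarity literature; the hit counts and the noise threshold are hypothetical:

    ```python
    def web_jaccard(count_p: int, count_q: int, count_pq: int,
                    threshold: int = 5) -> float:
        """Page-count-based similarity of two query terms P and Q:
        hits(P AND Q) / (hits(P) + hits(Q) - hits(P AND Q)).
        Co-occurrence counts below `threshold` are treated as noise."""
        if count_pq < threshold:
            return 0.0
        return count_pq / (count_p + count_q - count_pq)

    # Hypothetical hit counts as returned by a web search engine:
    print(web_jaccard(count_p=12_000, count_q=9_000, count_pq=4_000))  # ~0.235
    ```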

  15. CLIP: similarity searching of 3D databases using clique detection.

    Science.gov (United States)

    Rhodes, Nicholas; Willett, Peter; Calvet, Alain; Dunbar, James B; Humblet, Christine

    2003-01-01

This paper describes a program for 3D similarity searching, called CLIP (for Candidate Ligand Identification Program), that uses the Bron-Kerbosch clique detection algorithm to find those structures in a file that have large substructures in common with a target structure. Structures are characterized by the geometric arrangement of pharmacophore points, and the similarity between two structures is calculated using modifications of the Simpson and Tanimoto association coefficients. This modification takes into account the fact that a distance tolerance is required to ensure that pairs of interatomic distances can be regarded as equivalent during the clique-construction stage of the matching algorithm. Experiments with HIV assay data demonstrate the effectiveness and the efficiency of this approach to virtual screening.
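
    At CLIP's core is Bron-Kerbosch enumeration of maximal cliques in a correspondence graph. A minimal version (without the pivoting refinement) on an invented toy graph:

    ```python
    def bron_kerbosch(r, p, x, adj, cliques):
        """Classic Bron-Kerbosch enumeration of maximal cliques.
        r: growing clique; p: candidate vertices; x: already-processed."""
        if not p and not x:
            cliques.append(set(r))
            return
        for v in list(p):
            bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, cliques)
            p.remove(v)
            x.add(v)

    # Toy correspondence graph: vertices are candidate atom pairings, edges
    # join pairings whose interatomic distances agree within tolerance.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    print(cliques)   # [{0, 1, 2}, {2, 3}] -> largest common substructure {0, 1, 2}
    ```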

  16. Fast and accurate database searches with MS-GF+Percolator

    Energy Technology Data Exchange (ETDEWEB)

    Granholm, Viktor; Kim, Sangtae; Navarro, Jose' C.; Sjolund, Erik; Smith, Richard D.; Kall, Lukas

    2014-02-28

To identify peptides and proteins from the large number of fragmentation spectra in mass-spectrometry-based proteomics, researchers commonly employ so-called database search engines. Additionally, post-processors like Percolator have been used on the results from such search engines to assess confidence, infer peptides, and generally increase the number of identifications. A recent search engine, MS-GF+, has previously been shown to outperform these classical search engines in terms of the number of identified spectra. However, MS-GF+ generates only limited statistical estimates of the results, hampering the biological interpretation. Here, we enabled Percolator processing for MS-GF+ output and observed an increased number of identified peptides for a wide variety of datasets. In addition, Percolator directly reports false discovery rate estimates, such as q values and posterior error probabilities, as well as p values, for peptide-spectrum matches, peptides and proteins, functions useful for the whole proteomics community.

  17. SHOP: scaffold hopping by GRID-based similarity searches

    DEFF Research Database (Denmark)

    Bergmann, Rikke; Linusson, Anna; Zamora, Ismael

    2007-01-01

    A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known...... ligands of three different protein targets relevant for drug discovery using a rational approach based on statistical experimental design. Five out of eight and seven out of eight thrombin scaffolds and all seven HIV protease scaffolds were recovered within the top 10 and 31 out of 31 neuraminidase...... scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures...

  18. Fast and accurate database searches with MS-GF+Percolator.

    Science.gov (United States)

    Granholm, Viktor; Kim, Sangtae; Navarro, José C F; Sjölund, Erik; Smith, Richard D; Käll, Lukas

    2014-02-07

    One can interpret fragmentation spectra stemming from peptides in mass-spectrometry-based proteomics experiments using so-called database search engines. Frequently, one also runs post-processors such as Percolator to assess the confidence, infer unique peptides, and increase the number of identifications. A recent search engine, MS-GF+, has shown promising results, due to a new and efficient scoring algorithm. However, MS-GF+ provides few statistical estimates about the peptide-spectrum matches, hence limiting the biological interpretation. Here, we enabled Percolator processing for MS-GF+ output and observed an increased number of identified peptides for a wide variety of data sets. In addition, Percolator directly reports p values and false discovery rate estimates, such as q values and posterior error probabilities, for peptide-spectrum matches, peptides, and proteins, functions that are useful for the whole proteomics community.
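
    The q values Percolator reports stem from target-decoy analysis; a minimal sketch of target-decoy q-value computation is below. Percolator's actual estimation, with SVM re-scoring and posterior error probability modeling, is considerably more sophisticated:

    ```python
    def q_values(scores, is_decoy):
        """Target-decoy q-values: for each PSM, the minimum FDR over all score
        thresholds that would still accept it. FDR at a threshold is estimated
        as (#decoys / #targets) among PSMs scoring at or above it."""
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        targets = decoys = 0
        fdrs = {}
        for i in order:                      # sweep from best to worst score
            if is_decoy[i]:
                decoys += 1
            else:
                targets += 1
            fdrs[i] = decoys / max(targets, 1)
        # Enforce monotonicity: q-value = min FDR at this threshold or looser.
        running_min, qvals = 1.0, [0.0] * len(scores)
        for i in reversed(order):
            running_min = min(running_min, fdrs[i])
            qvals[i] = running_min
        return qvals

    scores = [9.1, 8.7, 7.9, 7.5, 6.2]       # illustrative PSM scores
    decoy  = [False, False, True, False, True]
    print(q_values(scores, decoy))
    ```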

  19. Keyword Search over Data Service Integration for Accurate Results

    CERN Document Server

Zemleris, Vidmantas; Gwadera, Robert

    2013-01-01

    Virtual data integration provides a coherent interface for querying heterogeneous data sources (e.g., web services, proprietary systems) with minimum upfront effort. Still, this requires its users to learn the query language and to get acquainted with data organization, which may pose problems even to proficient users. We present a keyword search system, which proposes a ranked list of structured queries along with their explanations. It operates mainly on the metadata, such as the constraints on inputs accepted by services. It was developed as an integral part of the CMS data discovery service, and is currently available as open source.

  20. Efficient searching and annotation of metabolic networks using chemical similarity

    OpenAIRE

Pertusi, Dante A.; Stine, Andrew E.; Broadbelt, Linda J.; Tyo, Keith E. J.

    2014-01-01

    Motivation: The urgent need for efficient and sustainable biological production of fuels and high-value chemicals has elicited a wave of in silico techniques for identifying promising novel pathways to these compounds in large putative metabolic networks. To date, these approaches have primarily used general graph search algorithms, which are prohibitively slow as putative metabolic networks may exceed 1 million compounds. To alleviate this limitation, we report two methods—SimIndex (SI) and ...

  1. An accurate algorithm to calculate the Hurst exponent of self-similar processes

    Energy Technology Data Exchange (ETDEWEB)

    Fernández-Martínez, M., E-mail: fmm124@ual.es [Department of Mathematics, Faculty of Science, Universidad de Almería, 04120 Almería (Spain); Sánchez-Granero, M.A., E-mail: misanche@ual.es [Department of Mathematics, Faculty of Science, Universidad de Almería, 04120 Almería (Spain); Trinidad Segovia, J.E., E-mail: jetrini@ual.es [Department of Accounting and Finance, Faculty of Economics and Business, Universidad de Almería, 04120 Almería (Spain); Román-Sánchez, I.M., E-mail: iroman@ual.es [Department of Accounting and Finance, Faculty of Economics and Business, Universidad de Almería, 04120 Almería (Spain)

    2014-06-27

In this paper, we introduce a new approach which generalizes the GM2 algorithm (introduced in Sánchez-Granero et al. (2008) [52]) as well as the fractal dimension algorithms FD1, FD2 and FD3 (which first appeared in Sánchez-Granero et al. (2012) [51]), providing an accurate algorithm to calculate the Hurst exponent of self-similar processes. We prove that this algorithm performs properly in the case of short time series when fractional Brownian motions and Lévy stable motions are considered. We conclude the paper with a dynamic study of the Hurst exponent evolution in the S&P 500 index stocks. - Highlights: • We provide a new approach to properly calculate the Hurst exponent. • This generalizes FD algorithms and GM2, introduced previously by the authors. • This method (FD4) is especially appropriate for short time series. • FD4 may be used in both unifractal and multifractal contexts. • As an empirical application, we show that S&P 500 stocks improved their efficiency.
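
    For contrast with the paper's FD-style algorithms, the classical rescaled-range (R/S) estimator of the Hurst exponent is easy to sketch. Note this is the textbook method applied to an increment series, not the FD4 algorithm introduced here, and it is known to be biased on short series:

    ```python
    import numpy as np

    def hurst_rs(increments, min_chunk=8):
        """Classical rescaled-range (R/S) estimate of the Hurst exponent of a
        self-similar process, applied to its increment series."""
        x = np.asarray(increments, dtype=float)
        n = len(x)
        sizes = [s for s in (n // (2 ** k) for k in range(20)) if s >= min_chunk]
        log_s, log_rs = [], []
        for size in sizes:
            rs_vals = []
            for start in range(0, n - size + 1, size):
                chunk = x[start:start + size]
                dev = np.cumsum(chunk - chunk.mean())
                r, s = dev.max() - dev.min(), chunk.std()
                if s > 0:
                    rs_vals.append(r / s)
            if rs_vals:
                log_s.append(np.log(size))
                log_rs.append(np.log(np.mean(rs_vals)))
        return np.polyfit(log_s, log_rs, 1)[0]   # slope ~ Hurst exponent

    rng = np.random.default_rng(1)
    # White-noise increments: expect H near 0.5 (R/S has a small upward bias).
    print(round(hurst_rs(rng.standard_normal(4096)), 2))
    ```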

  2. Accurate estimation of influenza epidemics using Google search data via ARGO.

    Science.gov (United States)

    Yang, Shihao; Santillana, Mauricio; Kou, S C

    2015-11-24

    Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly available online search data. In addition to having a rigorous statistical foundation, ARGO outperforms all previously available Google-search-based tracking models, including the latest version of Google Flu Trends, even though it uses only low-quality search data as input from publicly available Google Trends and Google Correlate websites. ARGO not only incorporates the seasonality in influenza epidemics but also captures changes in people's online search behavior over time. ARGO is also flexible, self-correcting, robust, and scalable, making it a potentially powerful tool that can be used for real-time tracking of other social events at multiple temporal and spatial resolutions.
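
    A rough sketch of the ARGO idea, under loudly stated assumptions (52 autoregressive lags, a fixed Lasso penalty, no sliding-window refitting, synthetic stand-in data): regress current flu activity on its own history plus search-volume covariates with L1 regularization.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso

    def argo_style_fit(flu_history, search_volumes, ar_lags=52, alpha=0.1):
        """Sketch of an ARGO-style model: regress this week's flu activity on
        the previous `ar_lags` weeks of activity plus current search-term
        volumes. Lag count and alpha are assumptions; ARGO also re-fits on a
        sliding window, omitted here for brevity."""
        n = len(flu_history)
        rows, targets = [], []
        for t in range(ar_lags, n):
            rows.append(np.concatenate([flu_history[t - ar_lags:t],
                                        search_volumes[t]]))
            targets.append(flu_history[t])
        model = Lasso(alpha=alpha, max_iter=10_000)
        model.fit(np.asarray(rows), np.asarray(targets))
        return model

    # Synthetic data standing in for %ILI and 10 search-query volume series:
    rng = np.random.default_rng(0)
    ili = np.abs(rng.standard_normal(200)).cumsum() / 100
    queries = rng.random((200, 10))
    model = argo_style_fit(ili, queries)
    print(model.coef_.shape)   # (62,) = 52 AR lags + 10 query terms
    ```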

  3. Density-based similarity measures for content based search

    Energy Technology Data Exchange (ETDEWEB)

    Hush, Don R [Los Alamos National Laboratory; Porter, Reid B [Los Alamos National Laboratory; Ruggiero, Christy E [Los Alamos National Laboratory

    2009-01-01

We consider the query by multiple example problem where the goal is to identify database samples whose content is similar to a collection of query samples. To assess the similarity we use a relative content density which quantifies the relative concentration of the query distribution to the database distribution. If the database distribution is a mixture of the query distribution and a background distribution, then it can be shown that database samples whose relative content density is greater than a particular threshold ρ are more likely to have been generated by the query distribution than the background distribution. We describe an algorithm for predicting samples with relative content density greater than ρ that is computationally efficient and possesses strong performance guarantees. We also show empirical results for applications in computer network monitoring and image segmentation.
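
    A sketch of the relative-content-density idea using kernel density estimates; KDE is an illustrative estimator choice here, not the abstract's algorithm, which comes with computational and performance guarantees this sketch lacks:

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    def relative_content_density(query_samples, database_samples, points):
        """Estimate the ratio of query density to database density at `points`.
        Under the mixture view in the abstract, points where this ratio exceeds
        a threshold rho are more likely query-like than background."""
        q = gaussian_kde(query_samples.T)     # gaussian_kde expects (dims, n)
        d = gaussian_kde(database_samples.T)
        return q(points.T) / np.maximum(d(points.T), 1e-12)

    rng = np.random.default_rng(0)
    query = rng.normal(0.0, 1.0, size=(200, 2))     # query mass centered at 0
    db = np.vstack([query, rng.normal(4.0, 1.0, size=(800, 2))])
    test = np.array([[0.0, 0.0], [4.0, 4.0]])
    print(relative_content_density(query, db, test))  # high ratio, then low
    ```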

  4. Content-Based Search on a Database of Geometric Models: Identifying Objects of Similar Shape

    Energy Technology Data Exchange (ETDEWEB)

    XAVIER, PATRICK G.; HENRY, TYSON R.; LAFARGE, ROBERT A.; MEIRANS, LILITA; RAY, LAWRENCE P.

    2001-11-01

The Geometric Search Engine is a software system for storing and searching a database of geometric models. The database may be searched for modeled objects similar in shape to a target model supplied by the user. The database models are generally derived from CAD models, while the target model may be either a CAD model or a model generated from range data collected from a physical object. This document describes key generation, database layout, and search of the database.

  5. Perceptual Grouping in Haptic Search: The Influence of Proximity, Similarity, and Good Continuation

    Science.gov (United States)

    Overvliet, Krista E.; Krampe, Ralf Th.; Wagemans, Johan

    2012-01-01

    We conducted a haptic search experiment to investigate the influence of the Gestalt principles of proximity, similarity, and good continuation. We expected faster search when the distractors could be grouped. We chose edges at different orientations as stimuli because they are processed similarly in the haptic and visual modality. We therefore…

  6. Searching the protein structure database for ligand-binding site similarities using CPASS v.2

    Directory of Open Access Journals (Sweden)

    Caprez Adam

    2011-01-01

Abstract Background A recent analysis of protein sequences deposited in the NCBI RefSeq database indicates that ~8.5 million protein sequences are encoded in prokaryotic and eukaryotic genomes, where ~30% are explicitly annotated as "hypothetical" or "uncharacterized" protein. Our Comparison of Protein Active-Site Structures (CPASS) v.2 database and software compares the sequence and structural characteristics of experimentally determined ligand binding sites to infer a functional relationship in the absence of global sequence or structure similarity. CPASS is an important component of our Functional Annotation Screening Technology by NMR (FAST-NMR) protocol and has been successfully applied to aid the annotation of a number of proteins of unknown function. Findings We report a major upgrade to our CPASS software and database that significantly improves its broad utility. CPASS v.2 is designed with a layered architecture to increase flexibility and portability that also enables job distribution over the Open Science Grid (OSG) to increase speed. Similarly, the CPASS interface was enhanced to provide more user flexibility in submitting a CPASS query. CPASS v.2 now allows for both automatic and manual definition of ligand-binding sites and permits pair-wise, one versus all, one versus list, or list versus list comparisons. Solvent accessible surface area, ligand root-mean square difference, and Cβ distances have been incorporated into the CPASS similarity function to improve the quality of the results. The CPASS database has also been updated. Conclusions CPASS v.2 is more than an order of magnitude faster than the original implementation, and allows for multiple simultaneous job submissions. Similarly, the CPASS database of ligand-defined binding sites has increased in size by ~38%, dramatically increasing the likelihood of a positive search result. The modification to the CPASS similarity function is effective in reducing CPASS similarity scores

  7. δ-Similar Elimination to Enhance Search Performance of Multiobjective Evolutionary Algorithms

    Science.gov (United States)

    Aguirre, Hernán; Sato, Masahiko; Tanaka, Kiyoshi

    In this paper, we propose δ-similar elimination to improve the search performance of multiobjective evolutionary algorithms in combinatorial optimization problems. This method eliminates similar individuals in objective space to fairly distribute selection among the different regions of the instantaneous Pareto front. We investigate four elimination methods, analyzing their effects using NSGA-II. In addition, we compare the search performance of NSGA-II enhanced by our method and NSGA-II enhanced by controlled elitism.

  8. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2004-10-01

    Full Text Available Abstract Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper, a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST that provides a complementary solution for BLAST searches when the database is too large to fit into
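
    The query segmentation (QS-search) strategy described above can be sketched in a few lines of Python. The fragment below is illustrative only: it assumes the NCBI BLAST+ blastp executable and a local database named nr (both hypothetical here), and balances load by striping query records across workers:

        import subprocess
        from concurrent.futures import ProcessPoolExecutor

        def read_fasta(path):
            # Yield (header, sequence) records from a FASTA file.
            header, seq = None, []
            with open(path) as fh:
                for line in fh:
                    line = line.rstrip()
                    if line.startswith(">"):
                        if header is not None:
                            yield header, "".join(seq)
                        header, seq = line, []
                    else:
                        seq.append(line)
            if header is not None:
                yield header, "".join(seq)

        def search_chunk(args):
            chunk_path, out_path = args
            # One worker per query chunk; per-chunk outputs are concatenated afterwards.
            subprocess.run(["blastp", "-query", chunk_path, "-db", "nr",
                            "-out", out_path, "-outfmt", "6"], check=True)
            return out_path

        def qs_search(query_fasta, n_workers=4):
            records = list(read_fasta(query_fasta))
            # Striped chunking roughly balances the load across workers.
            chunks = [records[i::n_workers] for i in range(n_workers)]
            jobs = []
            for i, chunk in enumerate(chunks):
                path = f"chunk_{i}.fa"
                with open(path, "w") as fh:
                    fh.writelines(f"{h}\n{s}\n" for h, s in chunk)
                jobs.append((path, f"chunk_{i}.out"))
            with ProcessPoolExecutor(max_workers=n_workers) as pool:
                return list(pool.map(search_chunk, jobs))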

  9. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space

    KAUST Repository

    Tao, Yufei

    2010-07-01

    Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms an LSB-forest that has strong quality guarantees, but dramatically improves the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, which most of the existing solutions fail to achieve. We show that, using an LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, yet still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial
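
    The LSB-tree builds on the locality-sensitive hashing family for Euclidean distance. As background, the toy Python sketch below (not the paper's LSB-tree) shows the underlying LSH idea: quantized random projections send nearby points to the same buckets, and true distances re-rank the union of candidate buckets:

        import numpy as np

        class EuclideanLSH:
            def __init__(self, dim, n_tables=8, n_bits=12, w=4.0, seed=0):
                rng = np.random.default_rng(seed)
                self.a = rng.normal(size=(n_tables, n_bits, dim))  # random projections
                self.b = rng.uniform(0, w, size=(n_tables, n_bits))
                self.w = w
                self.tables = [dict() for _ in range(n_tables)]
                self.data = None

            def _keys(self, x):
                # One integer-tuple key per table: concatenated quantized projections.
                h = np.floor((self.a @ x + self.b) / self.w).astype(int)
                return [tuple(row) for row in h]

            def index(self, data):
                self.data = data
                for i, x in enumerate(data):
                    for table, key in zip(self.tables, self._keys(x)):
                        table.setdefault(key, []).append(i)

            def query(self, q, k=5):
                # Union the candidate buckets, then rank candidates by true distance.
                cand = set()
                for table, key in zip(self.tables, self._keys(q)):
                    cand.update(table.get(key, []))
                cand = np.fromiter(cand, dtype=int) if cand else np.arange(len(self.data))
                d = np.linalg.norm(self.data[cand] - q, axis=1)
                return cand[np.argsort(d)[:k]]

        rng = np.random.default_rng(1)
        points = rng.normal(size=(10000, 32))
        lsh = EuclideanLSH(dim=32)
        lsh.index(points)
        print(lsh.query(points[0]))  # the result should contain index 0 itself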

  10. Efficient Similarity Search Using the Earth Mover's Distance for Large Multimedia Databases

    DEFF Research Database (Denmark)

    Assent, Ira; Wichterich, Marc; Meisen, Tobias

    2008-01-01

    Multimedia similarity search in large databases requires efficient query processing. The Earth mover's distance, introduced in computer vision, is successfully used as a similarity model in a number of small-scale applications. Its computational complexity hindered its adoption in large multimedia...

  11. MEASURING THE PERFORMANCE OF SIMILARITY PROPAGATION IN A SEMANTIC SEARCH ENGINE

    Directory of Open Access Journals (Sweden)

    S. K. Jayanthi

    2013-10-01

    Full Text Available In the current scenario, web page result personalization plays a vital role. Nearly 80% of users expect the best results on the first page itself, without the persistence to browse further. This research work focuses on two main themes: semantic web search through online processing and domain-based search through offline processing. The first part is to find an effective method which allows grouping similar results together using a BookShelf data structure and organizing the various clusters. The second is focused on academic domain-based search through offline processing. This paper focuses on finding documents which are similar, and on how a vector space model can be used to solve this problem. More weight is therefore given to the principles and working methodology of similarity propagation. The Cosine similarity measure is used for finding the relevancy among the documents.
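
    For reference, the Cosine relevancy ranking the paper relies on can be reproduced in a few lines; the sketch below uses scikit-learn's TfidfVectorizer as a stand-in for the paper's own term weighting, on invented example documents:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        docs = ["semantic web search engines",
                "domain based offline document search",
                "vector space models for document clustering"]
        query = ["document search in the academic domain"]

        vec = TfidfVectorizer()
        doc_matrix = vec.fit_transform(docs)
        sims = cosine_similarity(vec.transform(query), doc_matrix)[0]
        ranked = sorted(zip(sims, docs), reverse=True)  # most similar first
        for score, doc in ranked:
            print(f"{score:.3f}  {doc}")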

  12. Efficient Retrieval of Images for Search Engine by Visual Similarity and Re Ranking

    Directory of Open Access Journals (Sweden)

    Viswa S S

    2013-06-01

    Full Text Available Nowadays, web-scale image search engines (e.g., Google Image Search, Microsoft Live Image Search) rely almost purely on surrounding text features. Users type keywords in hope of finding a certain type of images. The search engine returns thousands of images ranked by the text keywords extracted from the surrounding text. However, many of the returned images are noisy, disorganized, or irrelevant; even Google and Microsoft use no visual information for searching of images. The idea is to use visual information to re-rank and improve text-based image search results. This improves the precision of the text-based image search ranking by incorporating the information conveyed by the visual modality. The typical assumption that the top-ranked images in the text-based search result are equally relevant is relaxed by linking the relevance of the images to their initial rank positions. Then, a number of images from the initial search result are employed as prototypes that serve to visually represent the query and that are subsequently used to construct meta re-rankers, i.e., the most relevant images are found by visual similarity and the average scores are calculated. By applying different meta re-rankers to an image from the initial result, re-ranking scores are generated, which are then used to find the new rank position for an image in the re-ranked search result. Human supervision is introduced to learn the model weights offline, prior to the online re-ranking process. While model learning requires manual labelling of the results for a few queries, the resulting model is query independent and therefore applicable to any other query. The experimental results on a representative web image search dataset comprising 353 queries demonstrate that the proposed method outperforms the existing supervised and unsupervised re-ranking approaches. Moreover, it improves the performance over the text-based image search engine by more than 25.48%.

  13. Similarity

    Science.gov (United States)

    Apostol, Tom M. (Editor)

    1990-01-01

    In this 'Project Mathematics!' series, sponsored by the California Institute of Technology (CalTech), the mathematical concept of similarity is presented. The history of similarity and its real-life applications are discussed using actual film footage and computer animation. Terms used and various concepts of size, shape, ratio, area, and volume are demonstrated. The similarity of polygons, solids, congruent triangles, internal ratios, perimeters, and line segments is shown using the previously mentioned concepts.

  14. A comparison of field-based similarity searching methods: CatShape, FBSS, and ROCS.

    Science.gov (United States)

    Moffat, Kirstin; Gillet, Valerie J; Whittle, Martin; Bravi, Gianpaolo; Leach, Andrew R

    2008-04-01

    Three field-based similarity methods are compared in retrospective virtual screening experiments. The methods are the CatShape module of CATALYST, ROCS, and an in-house program developed at the University of Sheffield called FBSS. The programs are used in both rigid and flexible searches carried out in the MDL Drug Data Report. UNITY 2D fingerprints are also used to provide a comparison with a more traditional approach to similarity searching, and similarity based on simple whole-molecule properties is used to provide a baseline for the more sophisticated searches. Overall, UNITY 2D fingerprints and ROCS with the chemical force field option gave comparable performance and were superior to the shape-only 3D methods. When the flexible methods were compared with the rigid methods, it was generally found that the flexible methods gave slightly better results than their respective rigid methods; however, the increased performance did not justify the additional computational cost required.

  15. Molecular fingerprint recombination: generating hybrid fingerprints for similarity searching from different fingerprint types.

    Science.gov (United States)

    Nisius, Britta; Bajorath, Jürgen

    2009-11-01

    Molecular fingerprints have a long history in computational medicinal chemistry and continue to be popular tools for similarity searching. Over the years, a variety of fingerprint types have been introduced. We report an approach to identify preferred bit subsets in fingerprints of different design and "recombine" these bit segments into "hybrid fingerprints". These compound class-directed fingerprint representations are found to increase the similarity search performance of their parental fingerprints, which can be rationalized by the often complementary nature of distinct fingerprint features.

  16. A New Retrieval Model Based on TextTiling for Document Similarity Search

    Institute of Scientific and Technical Information of China (English)

    Xiao-Jun Wan; Yu-Xin Peng

    2005-01-01

    Document similarity search is to find documents similar to a given query document and return a ranked list of similar documents to users; it is widely used in many text and web systems, such as digital libraries and search engines. Traditional retrieval models, including Okapi's BM25 model and the Smart vector space model with length normalization, can handle this problem to some extent by taking the query document as a long query. In practice, the Cosine measure is considered the best model for document similarity search because of its good ability to measure similarity between two documents. In this paper, the quantitative performances of the above models are compared using experiments. Because the Cosine measure is not able to reflect the structural similarity between documents, a new retrieval model based on TextTiling is proposed in the paper. The proposed model takes into account the subtopic structures of documents. It first splits the documents into text segments with TextTiling and calculates the similarities for different pairs of text segments in the documents. Lastly, the overall similarity between the documents is returned by combining the similarities of different pairs of text segments with an optimal matching method. Experiments are performed and the results show: 1) the popular retrieval models (Okapi's BM25 model and the Smart vector space model with length normalization) do not perform well for document similarity search; 2) the proposed model based on TextTiling is effective and outperforms other models, including the Cosine measure; 3) the methods for the three components in the proposed model are validated to be appropriately employed.
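
    The final combination step, matching text segments between two documents and merging their similarities with optimal matching, can be sketched as follows. Paragraph splitting stands in for TextTiling here, and SciPy's linear_sum_assignment plays the role of the optimal matching method:

        import numpy as np
        from scipy.optimize import linear_sum_assignment
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def segment(doc):
            # Stand-in segmenter: blank-line-separated paragraphs instead of TextTiling.
            return [p for p in doc.split("\n\n") if p.strip()]

        def document_similarity(doc_a, doc_b):
            segs_a, segs_b = segment(doc_a), segment(doc_b)
            vec = TfidfVectorizer().fit(segs_a + segs_b)
            sim = cosine_similarity(vec.transform(segs_a), vec.transform(segs_b))
            # Optimal (maximum-weight) one-to-one matching of segment pairs.
            rows, cols = linear_sum_assignment(-sim)
            return sim[rows, cols].mean()

        a = "Protein structures grow quickly.\n\nFast search tools are needed."
        b = "Databases of structures expand fast.\n\nEfficient search methods matter."
        print(f"structural similarity: {document_similarity(a, b):.3f}")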

  17. Similarities and differences between Web search procedure and searching in the pre-web information retrieval systems

    Directory of Open Access Journals (Sweden)

    Yazdan Mansourian

    2004-08-01

    Full Text Available This paper presents an introductory discussion of the commonalities and dissimilarities between the Web searching procedure and the searching process in pre-web online information retrieval systems, including classic information retrieval systems and databases. The paper attempts to explain which factors make these two groups different, why investigating the search process in the Web environment is important, how much we know about this procedure, and what the main lines of research are for researchers in this area of study and practice. After presenting the major factors involved, the paper concludes that although the information seeking process on the Web is fairly similar to that of pre-web systems in some ways, there are notable differences between them as well. These differences may provide Web searchers and Web researchers with some opportunities and challenges.

  18. Web Image Search Re-ranking with Click-based Similarity and Typicality.

    Science.gov (United States)

    Yang, Xiaopeng; Mei, Tao; Zhang, Yong Dong; Liu, Jie; Satoh, Shin'ichi

    2016-07-20

    In image search re-ranking, besides the well-known semantic gap, the intent gap, which is the gap between the representation of users' query/demand and the real intent of the users, is becoming a major problem restricting the development of image retrieval. To reduce human effects, in this paper, we use image click-through data, which can be viewed as "implicit feedback" from users, to help overcome the intent gap and further improve image search performance. Generally, the hypothesis that visually similar images should be close in a ranking list and the strategy that images with higher relevance should be ranked higher than others are widely accepted. To obtain satisfying search results, image similarity and the level of relevance typicality are therefore the determining factors. However, when measuring image similarity and typicality, conventional re-ranking approaches only consider visual information and initial ranks of images, while overlooking the influence of click-through data. This paper presents a novel re-ranking approach, named spectral clustering re-ranking with click-based similarity and typicality (SCCST). First, to learn an appropriate similarity measurement, we propose a click-based multi-feature similarity learning algorithm (CMSL), which conducts metric learning based on click-based triplet selection, and integrates multiple features into a unified similarity space via multiple kernel learning. Then, based on the learnt click-based image similarity measure, we conduct spectral clustering to group visually and semantically similar images into the same clusters, and obtain the final re-ranked list by calculating click-based cluster typicality and within-cluster click-based image typicality in descending order. Our experiments conducted on two real-world query-image datasets with diverse representative queries show that our proposed re-ranking approach can significantly improve initial search results, and outperform several existing re-ranking approaches.

  19. Similarity-based search of model organism, disease and drug effect phenotypes

    KAUST Repository

    Hoehndorf, Robert

    2015-02-19

    Background: Semantic similarity measures over phenotype ontologies have been demonstrated to provide a powerful approach for the analysis of model organism phenotypes, the discovery of animal models of human disease, novel pathways, gene functions, druggable therapeutic targets, and determination of pathogenicity. Results: We have developed PhenomeNET 2, a system that enables similarity-based searches over a large repository of phenotypes in real-time. It can be used to identify strains of model organisms that are phenotypically similar to human patients, diseases that are phenotypically similar to model organism phenotypes, or drug effect profiles that are similar to the phenotypes observed in a patient or model organism. PhenomeNET 2 is available at http://aber-owl.net/phenomenet. Conclusions: Phenotype-similarity searches can provide a powerful tool for the discovery and investigation of molecular mechanisms underlying an observed phenotypic manifestation. PhenomeNET 2 facilitates user-defined similarity searches and allows researchers to analyze their data within a large repository of human, mouse and rat phenotypes.

  20. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    Directory of Open Access Journals (Sweden)

    Chen Ke

    2008-05-01

    Full Text Available Abstract Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds, and thus determining structural similarity without sequence similarity would be desirable for structure prediction. The folding type of a protein or its domain is defined as its structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for datasets in which the sequence identity of any pair of sequences belongs to the twilight zone. We propose the SCPRED method, which improves prediction accuracy for sequences that share twilight-zone pairwise similarity with the sequences used for prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on an extensive design that considers over 2300 index-, composition- and physicochemical-properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods based on support vector machines, logistic regression, and ensembles of classifiers. Conclusion SCPRED can accurately find similar structures for sequences that share low identity with the sequences used for prediction. The high predictive accuracy achieved by SCPRED is

  1. RScan: fast searching structural similarities for structured RNAs in large databases

    Directory of Open Access Journals (Sweden)

    Liu Guo-Ping

    2007-07-01

    Full Text Available Abstract Background Many RNAs have evolutionarily conserved secondary structures instead of primary sequences. Recently, an increasing number of methods have been developed with a focus on structural alignments for finding conserved secondary structures as well as common structural motifs in pair-wise or multiple sequences. A challenging task is to quickly search similar structures for structured RNA sequences in large genomic databases, since existing methods are too slow to be used in large databases. Results An implementation of a fast structural alignment algorithm, RScan, is proposed to fulfill the task. RScan is developed by leveraging the advantages of both hashing algorithms and local alignment algorithms. In our experiment, on average, the times for searching a tRNA and an rRNA in the randomized A. pernix genome are only 256 seconds and 832 seconds respectively using RScan, but 3,178 seconds and 8,951 seconds respectively using an existing method, RSEARCH. Remarkably, RScan can handle large database queries, taking less than 4 minutes to search for similar structures for a microRNA precursor in human chromosome 21. Conclusion These results indicate that RScan is a preferable choice for real-life applications of searching structural similarities for structured RNAs in large databases. RScan software is freely available at http://bioinfo.au.tsinghua.edu.cn/member/cxue/rscan/RScan.htm.

  2. Manifold Learning for Multivariate Variable-Length Sequences With an Application to Similarity Search.

    Science.gov (United States)

    Ho, Shen-Shyang; Dai, Peng; Rudzicz, Frank

    2016-06-01

    Multivariate variable-length sequence data are becoming ubiquitous with the technological advancement in mobile devices and sensor networks. Such data are difficult to compare, visualize, and analyze due to the nonmetric nature of data sequence similarity measures. In this paper, we propose a general manifold learning framework for arbitrary-length multivariate data sequences driven by similarity/distance (parameter) learning in both the original data sequence space and the learned manifold. Our proposed algorithm transforms the data sequences in a nonmetric data sequence space into feature vectors in a manifold that preserves the data sequence space structure. In particular, the feature vectors in the manifold representing similar data sequences remain close to one another and far from the feature points corresponding to dissimilar data sequences. To achieve this objective, we assume a semisupervised setting where we have knowledge about whether some of the data sequences are similar or dissimilar, called instance-level constraints. Using this information, one learns the similarity measure for the data sequence space and the distance measures for the manifold. Moreover, we describe an approach to handle the similarity search problem given user-defined instance-level constraints in the learned manifold using a consensus voting scheme. Experimental results on both synthetic data and real tropical cyclone sequence data are presented to demonstrate the feasibility of our manifold learning framework and the robustness of performing similarity search in the learned manifold.

  3. Software Suite for Gene and Protein Annotation Prediction and Similarity Search.

    Science.gov (United States)

    Chicco, Davide; Masseroli, Marco

    2015-01-01

    In the computational biology community, machine learning algorithms are key instruments for many applications, including the prediction of gene functions based upon the available biomolecular annotations. Additionally, they may also be employed to compute similarity between genes or proteins. Here, we describe and discuss a software suite we developed to implement and make publicly available some such prediction methods and a computational technique based upon Latent Semantic Indexing (LSI), which leverages both inferred and available annotations to search for semantically similar genes. The suite consists of three components. BioAnnotationPredictor is a computational software module to predict new gene functions based upon Singular Value Decomposition of available annotations. SimilBio is a Web module that leverages annotations available or predicted by BioAnnotationPredictor to discover similarities between genes via LSI. The suite also includes SemSim, a new Web service built upon these modules to allow accessing them programmatically. We integrated SemSim in the Bio Search Computing framework (http://www.bioinformatics.deib.polimi.it/bio-seco/seco/), where users can exploit the Search Computing technology to run multi-topic complex queries on multiple integrated Web services. Accordingly, researchers may obtain ranked answers involving the computation of the functional similarity between genes in support of biomedical knowledge discovery.

  4. A genetic similarity algorithm for searching the Gene Ontology terms and annotating anonymous protein sequences.

    Science.gov (United States)

    Othman, Razib M; Deris, Safaai; Illias, Rosli M

    2008-02-01

    A genetic similarity algorithm is introduced in this study to find a group of semantically similar Gene Ontology terms. The genetic similarity algorithm combines a semantic similarity measure algorithm with a parallel genetic algorithm. The semantic similarity measure algorithm is used to compute the similitude strength between the Gene Ontology terms. Then, the parallel genetic algorithm is employed to perform batch retrieval and to accelerate the search in the large search space of the Gene Ontology graph. The genetic similarity algorithm is implemented in the Gene Ontology browser named basic UTMGO to overcome the weaknesses of the existing Gene Ontology browsers, which use a conventional approach based on keyword matching. To show the applicability of the basic UTMGO, we extend its structure to develop a Gene Ontology-based protein sequence annotation tool named extended UTMGO. The objective of developing the extended UTMGO is to provide a simple and practical tool that is capable of producing better results and requires a reasonable amount of running time with low computing cost, specifically for offline usage. The computational results and comparison with other related tools are presented to show the effectiveness of the proposed algorithm and tools.

  5. Applying Statistical Models and Parametric Distance Measures for Music Similarity Search

    Science.gov (United States)

    Lukashevich, Hanna; Dittmar, Christian; Bastuck, Christoph

    Automatic derivation of similarity relations between music pieces is an inherent field of music information retrieval research. Due to the nearly unrestricted amount of musical data, real-world similarity search algorithms have to be highly efficient and scalable. A possible solution is to represent each music excerpt with a statistical model (e.g., a Gaussian mixture model) and thus to reduce the computational costs by applying parametric distance measures between the models. In this paper we discuss combinations of different parametric modelling techniques and distance measures and weigh the benefits of each against the others.
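
    As a minimal example of the parametric approach, the sketch below models each excerpt's feature frames with a single Gaussian (the one-component case of a GMM) and compares models with the closed-form symmetrized Kullback-Leibler divergence; the MFCC-like frames are synthetic stand-ins, and full GMMs would require approximations of the KL divergence:

        import numpy as np

        def fit_gaussian(frames):
            mu = frames.mean(axis=0)
            cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
            return mu, cov

        def kl_gauss(m0, c0, m1, c1):
            # Closed-form KL divergence between two multivariate Gaussians.
            d = m0.size
            c1_inv = np.linalg.inv(c1)
            diff = m1 - m0
            return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff - d
                          + np.log(np.linalg.det(c1) / np.linalg.det(c0)))

        def model_distance(frames_a, frames_b):
            ga, gb = fit_gaussian(frames_a), fit_gaussian(frames_b)
            return kl_gauss(*ga, *gb) + kl_gauss(*gb, *ga)  # symmetrize

        rng = np.random.default_rng(0)
        excerpt_a = rng.normal(0, 1, size=(400, 13))    # e.g., 13-dim MFCC-like frames
        excerpt_b = rng.normal(0.3, 1.1, size=(400, 13))
        print(f"symmetrized KL: {model_distance(excerpt_a, excerpt_b):.2f}")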

  6. WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTERN RETRIEVAL ALGORITHM

    Directory of Open Access Journals (Sweden)

    Pushpa C N

    2013-02-01

    Full Text Available Semantic similarity measures play an important role in information retrieval, natural language processing, and various tasks on the web such as relation extraction, community mining, document clustering, and automatic meta-data extraction. In this paper, we propose a Pattern Retrieval Algorithm (PRA) to compute the semantic similarity measure between words by combining both the page count method and the web snippets method. Four association measures are used to find semantic similarity between words in the page count method using web search engines. We use a Sequential Minimal Optimization (SMO) support vector machine (SVM) to find the optimal combination of page-count-based similarity scores and top-ranking patterns from the web snippets method. The SVM is trained to classify synonymous word-pairs and non-synonymous word-pairs. The proposed approach aims to improve the correlation values, precision, recall, and F-measures compared to the existing methods. The proposed algorithm achieves a correlation value of 89.8%, outperforming the existing methods.
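
    Page-count association measures of this kind are easy to state in code. The abstract does not name its four measures, so the sketch below assumes the four commonly used in this line of work (WebJaccard, WebOverlap, WebDice, WebPMI); the hit counts are hypothetical rather than fetched from a live engine:

        import math

        N = 1e10  # assumed number of pages indexed by the engine

        def web_jaccard(p, q, pq):
            return 0.0 if pq == 0 else pq / (p + q - pq)

        def web_overlap(p, q, pq):
            return 0.0 if pq == 0 else pq / min(p, q)

        def web_dice(p, q, pq):
            return 0.0 if pq == 0 else 2 * pq / (p + q)

        def web_pmi(p, q, pq):
            # Pointwise mutual information over estimated page probabilities.
            if pq == 0:
                return 0.0
            return math.log2((pq / N) / ((p / N) * (q / N)))

        # Hypothetical page counts for "car", "automobile", and "car AND automobile".
        p, q, pq = 2.2e8, 4.0e7, 1.1e7
        for f in (web_jaccard, web_overlap, web_dice, web_pmi):
            print(f.__name__, round(f(p, q, pq), 4))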

  7. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA.

    Science.gov (United States)

    Mrozek, Dariusz; Brożek, Miłosz; Małysiak-Mrozek, Bożena

    2014-02-01

    Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have in common is still a challenging task, especially when entire databases containing tens or even hundreds of thousands of protein structures must be scanned. Graphics processing units (GPUs) and general purpose graphics processing units (GPGPUs) can perform many time-consuming and computationally demanding processes much more quickly than a classical CPU can. In this paper, we describe the GPU-based implementation of the CASSERT algorithm for 3D protein structure similarity searching. This algorithm is based on the two-phase alignment of protein structures when matching fragments of the compared proteins. The GPU (GeForce GTX 560Ti: 384 cores, 2GB RAM) implementation of CASSERT ("GPU-CASSERT") parallelizes both alignment phases and yields an average 180-fold increase in speed over its CPU-based, single-core implementation on an Intel Xeon E5620 (2.40GHz, 4 cores). In this paper, we show that massive parallelization of the 3D structure similarity search process on many-core GPU devices can reduce the execution time of the process, allowing it to be performed in real time. GPU-CASSERT is available at: http://zti.polsl.pl/dmrozek/science/gpucassert/cassert.htm.

  8. Similarity searching and scaffold hopping in synthetically accessible combinatorial chemistry spaces.

    Science.gov (United States)

    Boehm, Markus; Wu, Tong-Ying; Claussen, Holger; Lemmen, Christian

    2008-04-24

    Large collections of combinatorial libraries are an integral element in today's pharmaceutical industry. It is of great interest to perform similarity searches against all virtual compounds that are synthetically accessible by any such library. Here we describe the successful application of a new software tool, CoLibri, on 358 combinatorial libraries based on validated reaction protocols to create a single chemistry space containing over 10^12 possible products. Similarity searching with FTrees-FS allows the systematic exploration of this space without the need to enumerate all product structures. The search result is a set of virtual hits which are synthetically accessible by one or more of the existing reaction protocols. Grouping these virtual hits by their synthetic protocols allows the rapid design and synthesis of multiple follow-up libraries. Such library ideas support hit-to-lead design efforts for tasks like follow-up from high-throughput screening hits or scaffold hopping from one hit to another attractive series.

  9. Prospective and retrospective ECG-gating for CT coronary angiography perform similarly accurate at low heart rates

    Energy Technology Data Exchange (ETDEWEB)

    Stolzmann, Paul, E-mail: paul.stolzmann@usz.ch [Institute of Diagnostic Radiology, University Hospital Zurich, Raemistrasse 100, 8091 Zurich (Switzerland); Goetti, Robert; Baumueller, Stephan [Institute of Diagnostic Radiology, University Hospital Zurich, Raemistrasse 100, 8091 Zurich (Switzerland); Plass, Andre; Falk, Volkmar [Clinic for Cardiovascular Surgery, University Hospital Zurich (Switzerland); Scheffel, Hans; Feuchtner, Gudrun; Marincek, Borut [Institute of Diagnostic Radiology, University Hospital Zurich, Raemistrasse 100, 8091 Zurich (Switzerland); Alkadhi, Hatem [Institute of Diagnostic Radiology, University Hospital Zurich, Raemistrasse 100, 8091 Zurich (Switzerland); Cardiac MR PET CT Program, Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States); Leschka, Sebastian [Institute of Diagnostic Radiology, University Hospital Zurich, Raemistrasse 100, 8091 Zurich (Switzerland)

    2011-07-15

    Objective: To compare, in patients with suspicion of coronary artery disease (CAD) and low heart rates, image quality, diagnostic performance, and radiation dose values of prospectively and retrospectively electrocardiography (ECG)-gated dual-source computed tomography coronary angiography (CTCA) for the diagnosis of significant coronary stenoses. Materials and methods: Two-hundred consecutive patients with heart rates ≤70 bpm were retrospectively enrolled; 100 patients undergoing prospectively ECG-gated CTCA (group 1) and 100 patients undergoing retrospectively-gated CTCA (group 2). Coronary artery segments were assessed for image quality and significant luminal diameter narrowing. Sensitivity, specificity, positive predictive values (PPV), negative predictive values (NPV), and accuracy of both CTCA groups were determined using conventional catheter angiography (CCA) as reference standard. Radiation dose values were calculated. Results: Both groups were comparable regarding gender, body weight, cardiovascular risk profile, severity of CAD, mean heart rate, heart rate variability, and Agatston score (all p > 0.05). There was no significant difference in the rate of non-assessable coronary segments between group 1 (1.6%, 24/1404) and group 2 (1.4%, 19/1385; p = 0.77); non-diagnostic image quality was significantly (p < 0.001) more often attributed to stair step artifacts in group 1. Segment-based sensitivity, specificity, PPV, NPV, and accuracy were 98%, 98%, 88%, 100%, and 100% among group 1; 96%, 99%, 90%, 100%, and 98% among group 2, respectively. Parameters of diagnostic performance were similar (all p > 0.05). Mean effective radiation dose of prospectively ECG-gated CTCA (2.2 ± 0.4 mSv) was significantly (p < 0.0001) smaller than that of retrospectively ECG-gated CTCA (8.1 ± 0.6 mSv). Conclusion: Prospectively ECG-gated CTCA yields similar image quality, performs as accurately as retrospectively ECG-gated CTCA in patients having heart rates ≤70 bpm

  10. Semantic similarity measures in the biomedical domain by leveraging a web search engine.

    Science.gov (United States)

    Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

    2013-07-01

    Various web-related semantic similarity measures have been investigated. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have the limitation that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, corpus-based methodologies can overcome the limitation, and the web is an enormous and continuously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. These similarity scores of different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validating against two datasets, dataset 1 provided by A. Hliaoutakis and dataset 2 provided by T. Pedersen, are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficients (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores compared with measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores showed the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to those of physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database.

  11. Protein structure alignment and fast similarity search using local shape signatures.

    Science.gov (United States)

    Can, Tolga; Wang, Yuan-Fang

    2004-03-01

    We present a new method for conducting protein structure similarity searches, which improves on the efficiency of some existing techniques. Our method is grounded in the theory of differential geometry on 3D space curve matching. We generate shape signatures for proteins that are invariant, localized, robust, compact, and biologically meaningful. The invariance of the shape signatures allows us to improve similarity searching efficiency by adopting a hierarchical coarse-to-fine strategy. We index the shape signatures using an efficient hashing-based technique. With the help of this technique we screen out unlikely candidates and perform detailed pairwise alignments only for the small number of candidates that survive the screening process. Contrary to other hashing-based techniques, our technique employs domain-specific information (not just geometric information) in constructing the hash key, and hence is more tuned to the domain of biology. Furthermore, the invariance, localization, and compactness of the shape signatures allow us to utilize a well-known local sequence alignment algorithm for aligning two protein structures. One measure of the efficacy of the proposed technique is that we were able to perform structure alignment queries 36 times faster (on average) than a well-known method while keeping the quality of the query results at an approximately similar level.

  12. SiMPSON: Efficient Similarity Search in Metric Spaces over P2P Structured Overlay Networks

    Science.gov (United States)

    Vu, Quang Hieu; Lupu, Mihai; Wu, Sai

    Similarity search in metric spaces over centralized systems has been studied extensively by the database research community. However, not much work has been done in the context of P2P networks. This paper introduces SiMPSON: a P2P system supporting similarity search in metric spaces. The aim is to answer queries faster and using fewer resources than existing systems. For this, each peer first clusters its own data using any off-the-shelf clustering algorithm. Then, the resulting clusters are mapped to one-dimensional values. Finally, these one-dimensional values are indexed into a structured P2P overlay. Our method slightly increases the indexing overhead, but allows us to greatly reduce the number of peers and messages involved in query processing: we trade a small amount of overhead in the data publishing process for a substantial reduction of costs in the querying phase. Based on this architecture, we propose algorithms for processing range and kNN queries. Extensive experimental results validate the claims of efficiency and effectiveness of SiMPSON.

  13. World Climate Classification and Search: Data Mining Approach Utilizing Dynamic Time Warping Similarity Function

    Science.gov (United States)

    Stepinski, T. F.; Netzel, P.; Jasiewicz, J.

    2014-12-01

    We have developed a novel method for classification and search of climate over the global land surface excluding Antarctica. Our method classifies climate on the basis of the outcome of time series segmentation and clustering. We use the WorldClim 30 arc sec. (approx. 1 km) resolution grid data, which is based on 50 years of climatic observations. Each cell in the grid is assigned a 12-month series consisting of 50-year monthly averages of mean, maximum, and minimum temperatures as well as the total precipitation. The presented method introduces several innovations in comparison with existing data-driven methods of world climate classification. First, it uses only climatic rather than bioclimatic data. Second, it employs an object-oriented methodology: the grid is first segmented before climatic segments are classified. Third, and most importantly, the similarity between climates in two given cells is computed using the dynamic time warping (DTW) measure instead of the Euclidean distance. DTW is known to be superior to the Euclidean distance for time series, but has not been utilized before in classification of global climate. To account for the computational expense of DTW we use the highly efficient GeoPAT software (http://sil.uc.edu/gitlist/) that, in the first step, segments the grid into local regions of uniform climate. In the second step, the segments are classified. We also introduce a climate search: a GeoWeb-based method for interactive presentation of global climate information in the form of query-and-retrieval. A user selects a geographical location and the system returns a global map indicating the level of similarity between local climates and the climate in the selected location. The results of the search for the location "University of Cincinnati, Main Campus" are presented on the accompanying map. We have compared the results of our method to the Koeppen classification scheme
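
    Dynamic time warping itself is compact; a plain O(nm) reference implementation (a generic sketch, not the GeoPAT code) applied to two hypothetical, phase-shifted 12-month temperature series illustrates why DTW tolerates seasonal offsets better than a point-wise distance:

        import numpy as np

        def dtw(s, t):
            # Classic dynamic-programming DTW with unit steps and |.| local cost.
            n, m = len(s), len(t)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = abs(s[i - 1] - t[j - 1])
                    # Extend the cheapest of the three allowed warping steps.
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        # Two hypothetical monthly mean-temperature series (deg C), phase-shifted by a month.
        site_a = np.array([-2, 0, 5, 11, 16, 20, 22, 21, 17, 11, 5, 0])
        site_b = np.array([0, -2, 0, 5, 11, 16, 20, 22, 21, 17, 11, 5])
        print(f"DTW: {dtw(site_a, site_b):.1f}   point-wise L1: {np.abs(site_a - site_b).sum():.1f}")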

  14. iSARST: an integrated SARST web server for rapid protein structural similarity searches.

    Science.gov (United States)

    Lo, Wei-Cheng; Lee, Che-Yu; Lee, Chi-Ching; Lyu, Ping-Chiang

    2009-07-01

    iSARST is a web server for efficient protein structural similarity searches. It is a multi-processor, batch-processing and integrated implementation of several structural comparison tools and two database searching methods: SARST for common structural homologs and CPSARST for homologs with circular permutations. iSARST allows users to submit multiple PDB/SCOP entry IDs or an archive file containing many structures. After scanning the target database using SARST/CPSARST, the ordering of hits is refined with conventional structure alignment tools such as FAST, TM-align and SAMO, which are run in a PC cluster. In this way, iSARST achieves a high running speed while preserving the high precision of the refinement engines. The final outputs include tables listing co-linear or circularly permuted homologs of the query proteins and a functional summary of the best hits. Superimposed structures can be examined through an interactive and informative visualization tool. iSARST provides the first batch-mode structural comparison web service for both co-linear homologs and circular permutants. It can serve as a rapid annotation system for functionally unknown or hypothetical proteins, which are increasing rapidly in this post-genomic era. The server can be accessed at http://sarst.life.nthu.edu.tw/iSARST/.

  15. An evaluation, comparison, and accurate benchmarking of several publicly available MS/MS search algorithms: Sensitivity and Specificity analysis.

    Energy Technology Data Exchange (ETDEWEB)

    Kapp, Eugene; Schutz, Frederick; Connolly, Lisa M.; Chakel, John A.; Meza, Jose E.; Miller, Christine A.; Fenyo, David; Eng, Jimmy K.; Adkins, Joshua N.; Omenn, Gilbert; Simpson, Richard

    2005-08-01

    MS/MS and associated database search algorithms are essential proteomic tools for identifying peptides. Due to their widespread use, it is now time to perform a systematic analysis of the various algorithms currently in use. Using blood specimens from the HUPO Plasma Proteome Project, we have evaluated five search algorithms with respect to their sensitivity and specificity, and have also accurately benchmarked them based on specified false-positive (FP) rates. Spectrum Mill and SEQUEST performed well in terms of sensitivity, but were inferior to MASCOT, X-Tandem, and Sonar in terms of specificity. Overall, MASCOT, a probabilistic search algorithm, correctly identified most peptides based on a specified FP rate. The rescoring algorithm, Peptide Prophet, enhanced the overall performance of the SEQUEST algorithm, as well as provided predictable FP error rates. Ideally, score thresholds should be calculated for each peptide spectrum or, minimally, derived from a reversed-sequence search as demonstrated in this study based on a validated data set. The availability of open-source search algorithms, such as X-Tandem, makes it feasible to further improve the validation process (manual or automatic) on the basis of "consensus scoring", i.e., the use of multiple (at least two) search algorithms to reduce the number of FPs.
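
    Two of the validation ideas in this study, deriving a score threshold from a reversed-sequence (decoy) search at a specified FP rate and requiring agreement between two engines ("consensus scoring"), can be sketched with toy scores and invented example peptides:

        import numpy as np

        def threshold_at_fp_rate(target_scores, decoy_scores, fp_rate=0.05):
            # Smallest threshold at which the decoy (false) pass rate drops to fp_rate.
            for t in np.sort(np.concatenate([target_scores, decoy_scores])):
                if (decoy_scores >= t).mean() <= fp_rate:
                    return t
            return np.inf

        rng = np.random.default_rng(0)
        target = rng.normal(3.0, 1.0, 1000)   # forward-database search scores (toy)
        decoy = rng.normal(1.0, 1.0, 1000)    # reversed-database search scores (toy)
        t = threshold_at_fp_rate(target, decoy)
        print(f"threshold={t:.2f}, targets kept={np.mean(target >= t):.1%}")

        # Consensus scoring: accept a peptide only if both engines identify it.
        engine_a = {"LSSPATLNSR", "VATVSLPR", "GDSLAYGLR"}
        engine_b = {"LSSPATLNSR", "GDSLAYGLR", "FQSEEQQQTEDELQDK"}
        print("consensus IDs:", engine_a & engine_b)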

  16. Gene network homology in prokaryotes using a similarity search approach: queries of quorum sensing signal transduction.

    Directory of Open Access Journals (Sweden)

    David N Quan

    Full Text Available Bacterial cell-cell communication is mediated by small signaling molecules known as autoinducers. Importantly, autoinducer-2 (AI-2) is synthesized via the enzyme LuxS in over 80 species, some of which mediate their pathogenicity by recognizing and transducing this signal in a cell density dependent manner. AI-2 mediated phenotypes are not well understood, however, as the means for signal transduction appears varied among species, while AI-2 synthesis processes appear conserved. Approaches to reveal the recognition pathways of AI-2 will shed light on pathogenicity, as we believe recognition of the signal is likely as important as, if not more important than, the signal synthesis. LMNAST (Local Modular Network Alignment Similarity Tool) uses a local similarity search heuristic to study gene order, generating homology hits for the genomic arrangement of a query gene sequence. We develop and apply this tool for the E. coli lac and LuxS regulated (Lsr) systems. Lsr is of great interest as it mediates AI-2 uptake and processing. Both test searches generated results that were subsequently analyzed through a number of different lenses, each with its own level of granularity, from a binary phylogenetic representation down to trackback plots that preserve genomic organizational information. Through a survey of these results, we demonstrate the identification of orthologs, paralogs, hitchhiking genes, gene loss, gene rearrangement within an operon context, and also horizontal gene transfer (HGT). We found a variety of operon structures that are consistent with our hypothesis that the signal can be perceived and transduced by homologous protein complexes, while their regulation may be key to defining subsequent phenotypic behavior.

  17. HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    Science.gov (United States)

    O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

    2015-04-01

    The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost-prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed, but many exhibit scalability limitations and are incapable of effectively processing "Big Data", the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets, but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and a well-balanced computational workload, while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap, memory-constrained hardware has significant implications for in-field clinical diagnostic testing, enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples.

  18. PHOG-BLAST – a new generation tool for fast similarity search of protein families

    Directory of Open Access Journals (Sweden)

    Mironov Andrey A

    2006-06-01

    Full Text Available Abstract Background The need to compare protein profiles frequently arises in various protein research areas: comparison of protein families, domain searches, and resolution of orthology and paralogy. The existing fast algorithms can only compare a protein sequence with a protein sequence, or a profile with a sequence. Algorithms that compare profiles use dynamic programming and complex scoring functions. Results We developed a new algorithm called PHOG-BLAST for fast similarity search of profiles. This algorithm uses profile discretization to convert a profile to a finite alphabet and utilizes hashing for fast search. To determine the optimal alphabet, we analyzed columns in reliable multiple alignments and obtained column clusters in the 20-dimensional profile space by applying a special clustering procedure. We show that the clustering procedure works best if its parameters are chosen so that 20 profile clusters are obtained, which can be interpreted as ancestral amino acid residues. With these clusters, less than 2% of columns in multiple alignments fall outside the clusters. We tested the performance of PHOG-BLAST vs. PSI-BLAST on three well-known databases of multiple alignments: COG, PFAM and BALIBASE. On the COG database both algorithms showed the same performance; on PFAM and BALIBASE, PHOG-BLAST was much superior to PSI-BLAST. PHOG-BLAST required 10–20 times less computer memory and computation time than PSI-BLAST. Conclusion Since PHOG-BLAST can compare multiple alignments of protein families, it can be used in different areas of comparative proteomics and protein evolution. For example, PHOG-BLAST helped to build the PHOG database of phylogenetic orthologous groups. An essential step in building this database was comparing protein complements of different species and orthologous groups of different taxons on a personal computer in reasonable time. When it is applied to detect weak similarity between protein families, PHOG-BLAST is less
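
    The discretization step can be illustrated as follows; the 20 cluster centers below are random stand-ins for the ancestral-residue clusters learned in the paper, and the k-mer hash index sketches the fast-search side of the scheme:

        import numpy as np

        ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

        rng = np.random.default_rng(0)
        centers = rng.dirichlet(np.ones(20), size=20)      # 20 stand-in cluster centers

        def discretize(profile):
            # profile: (length, 20) position-specific residue frequencies.
            # Map each column to the letter of its nearest cluster center.
            d = np.linalg.norm(profile[:, None, :] - centers[None, :, :], axis=2)
            return "".join(ALPHABET[i] for i in d.argmin(axis=1))

        def kmer_index(s, k=4):
            # Hash k-mers of the discretized profile for fast candidate lookup.
            idx = {}
            for i in range(len(s) - k + 1):
                idx.setdefault(s[i:i + k], []).append(i)
            return idx

        profile = rng.dirichlet(np.ones(20), size=60)      # toy 60-column profile
        text = discretize(profile)
        print(text[:30], "...")
        print(len(kmer_index(text)), "distinct 4-mers indexed")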

  19. PSimScan: algorithm and utility for fast protein similarity search.

    Directory of Open Access Journals (Sweden)

    Anna Kaznadzey

    Full Text Available In the era of metagenomics and diagnostics sequencing, the importance of protein comparison methods of boosted performance cannot be overstated. Here we present PSimScan (Protein Similarity Scanner), a flexible open source protein similarity search tool which provides a significant gain in speed compared to BLASTP at the price of controlled sensitivity loss. The PSimScan algorithm introduces a number of novel performance optimization methods that can be further used by the community to improve the speed and lower hardware requirements of bioinformatics software. The optimization starts at the lookup table construction; then the initial lookup table-based hits are passed through a pipeline of filtering and aggregation routines of increasing computational complexity. The first step in this pipeline is a novel algorithm that builds and selects 'similarity zones' aggregated from neighboring matches on small arrays of adjacent diagonals. PSimScan performs 5 to 100 times faster than the standard NCBI BLASTP, depending on the chosen parameters, and runs on commodity hardware. Its sensitivity and selectivity at the slowest settings are comparable to the NCBI BLASTP's and decrease with the increase of speed, yet stay at levels reasonable for many tasks. PSimScan is most advantageous when used on large collections of query sequences. Comparing the entire proteome of Streptococcus pneumoniae (2,042 proteins) to the NCBI's non-redundant protein database of 16,971,855 records takes 6.5 hours on a moderately powerful PC, while the same task with the NCBI BLASTP takes over 66 hours. We describe the innovations in the PSimScan algorithm in considerable detail to encourage bioinformaticians to improve on the tool and to use the innovations in their own software development.

  20. Unbounded Binary Search for a Fast and Accurate Maximum Power Point Tracking

    Science.gov (United States)

    Kim, Yong Sin; Winston, Roland

    2011-12-01

    This paper presents a technique for maximum power point tracking (MPPT) of a concentrating photovoltaic system using cell-level power optimization. Perturb and observe (P&O) has been a standard for MPPT, but it introduces a tradeoff between the tracking speed and the accuracy of the maximum power delivered. The P&O algorithm is not suitable for rapid environmental condition changes caused by partial shading and self-shading, because its tracking time is linear in the length of the voltage range. Some research has been done on fast tracking, but the resulting methods come with internal ad hoc parameters. In this paper, by using the proposed unbounded binary search algorithm for the MPPT, the tracking time becomes a logarithmic function of the voltage search range, without ad hoc parameters.
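
    One plausible reading of an unbounded binary search for MPPT (a sketch under assumptions, not the authors' exact algorithm) is a galloping phase that brackets the peak without a preset voltage range, followed by binary decisions on the slope sign; the P(V) curve below is a hypothetical stand-in for a real module:

        def measure_power(v):
            # Stand-in P(V) curve with a single maximum near 17.5 V.
            i = 5.0 * (1.0 - (v / 21.0) ** 12)        # toy diode-like I(V)
            return max(v * i, 0.0)

        def track_mpp(v0=0.5, tol=1e-3):
            # Phase 1: unbounded (galloping) search to bracket the maximum.
            lo, hi, step = v0, v0, 0.5
            while measure_power(hi + step) > measure_power(hi):
                lo, hi = hi, hi + step
                step *= 2.0                            # exponential expansion
            hi = hi + step
            # Phase 2: binary search on the sign of the slope within the bracket.
            while hi - lo > tol:
                mid = (lo + hi) / 2.0
                if measure_power(mid + tol) > measure_power(mid):
                    lo = mid                           # still on the rising side of the peak
                else:
                    hi = mid
            return (lo + hi) / 2.0

        v_mpp = track_mpp()
        print(f"V_mpp ~ {v_mpp:.2f} V, P_mpp ~ {measure_power(v_mpp):.1f} W")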

  1. Application of 3D Zernike descriptors to shape-based ligand similarity searching

    Directory of Open Access Journals (Sweden)

    Venkatraman Vishwesh

    2009-12-01

    Full Text Available Abstract Background The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results In this study, we have examined the efficacy of the alignment-independent three-dimensional Zernike descriptor (3DZD) for fast shape-based similarity searching. Performance of this approach was compared with several other methods, including the statistical-moments-based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under a situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparably to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape-based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. Conclusion The 3DZD has a unique ability for fast comparison of the three-dimensional shape of compounds. The examples analyzed illustrate the advantages of, and the room for improvement in, the 3DZD.
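
    Of the methods compared, USR is simple enough to sketch directly (the 3DZD itself requires Zernike moment machinery); the fragment below computes the standard 12-moment USR descriptor and its scaled-L1 similarity on toy coordinates:

        import numpy as np

        def usr_descriptor(coords):
            def moments(dists):
                mu = dists.mean()
                sigma = dists.std()
                skew = np.cbrt(((dists - mu) ** 3).mean())
                return [mu, sigma, skew]
            ctd = coords.mean(axis=0)                         # molecular centroid
            d_ctd = np.linalg.norm(coords - ctd, axis=1)
            cst = coords[d_ctd.argmin()]                      # closest atom to ctd
            fct = coords[d_ctd.argmax()]                      # farthest atom from ctd
            d_fct = np.linalg.norm(coords - fct, axis=1)
            ftf = coords[d_fct.argmax()]                      # farthest atom from fct
            desc = []
            for ref in (ctd, cst, fct, ftf):
                desc += moments(np.linalg.norm(coords - ref, axis=1))
            return np.array(desc)

        def usr_similarity(a, b):
            # Maps descriptor distance into (0, 1]; 1 means identical shape signature.
            return 1.0 / (1.0 + np.abs(a - b).mean())

        rng = np.random.default_rng(0)
        mol_a = rng.normal(size=(30, 3))                      # toy conformer coordinates
        mol_b = mol_a + rng.normal(scale=0.05, size=(30, 3))  # slightly perturbed copy
        print(f"USR similarity: {usr_similarity(usr_descriptor(mol_a), usr_descriptor(mol_b)):.3f}")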

  2. Target-distractor similarity has a larger impact on visual search in school-age children than spacing.

    Science.gov (United States)

    Huurneman, Bianca; Boonstra, F Nienke

    2015-01-22

    In typically developing children, crowding decreases with increasing age. The influence of target-distractor similarity with respect to orientation and element spacing on visual search performance was investigated in 29 school-age children with normal vision (4- to 6-year-olds [N = 16], 7- to 8-year-olds [N = 13]). Children were instructed to search for a target E among distractor Es (feature search: all flanking Es pointing right; conjunction search: flankers in three orientations). Orientation of the target was manipulated in four directions: right (target absent), left (inversed), up, and down (vertical). Spacing was varied in four steps: 0.04°, 0.5°, 1°, and 2°. During feature search, high target-distractor similarity had a stronger impact on performance than spacing: Orientation affected accuracy until spacing was 1°, and spacing only influenced accuracy for identifying inversed targets. Spatial analyses showed that orientation affected oculomotor strategy: Children made more fixations in the "inversed" target area (4.6) than the vertical target areas (1.8 and 1.9). Furthermore, age groups differed in fixation duration: 4- to 6-year-old children showed longer fixation durations than 7- to 8-year-olds at the two largest element spacings (p = 0.039 and p = 0.027). Conjunction search performance was unaffected by spacing. Four conclusions can be drawn from this study: (a) Target-distractor similarity governs visual search performance in school-age children, (b) children make more fixations in target areas when target-distractor similarity is high, (c) 4- to 6-year-olds show longer fixation durations than 7- to 8-year-olds at 1° and 2° element spacing, and (d) spacing affects feature but not conjunction search, a finding that might indicate top-down control ameliorates crowding in children.

  3. Efficient Retrieval of Images for Search Engine by Visual Similarity and Re Ranking

    Directory of Open Access Journals (Sweden)

    Viswa S S

    2013-06-01

    Full Text Available Nowadays, web-scale image search engines (e.g., Google Image Search, Microsoft Live Image Search) rely almost purely on surrounding text features. Users type keywords in the hope of finding a certain type of image. The search engine returns thousands of images ranked by the text keywords extracted from the surrounding text. However, many of the returned images are noisy, disorganized, or irrelevant. Even Google and Microsoft use no visual information for searching images. The idea is to use visual information to re-rank and improve text-based image search results. This improves the precision of the text-based image search ranking by incorporating the information conveyed by the visual modality. The typical assumption that the top images in the text-based search result are equally relevant is relaxed by linking the relevance of the images to their initial rank positions. Then, a number of images from the initial search result are employed as prototypes that serve to visually represent the query and that are subsequently used to construct meta re-rankers; i.e., the most relevant images are found by visual similarity and the average scores are calculated. By applying different meta re-rankers to an image from the initial result, re-ranking scores are generated, which are then used to find the new rank position for an image in the re-ranked search result. Human supervision is introduced to learn the model weights offline, prior to the online re-ranking process. While model learning requires manual labelling of the results for a few queries, the resulting model is query independent and therefore applicable to any other query. The experimental results on a representative web image search dataset comprising 353 queries demonstrate that the proposed method outperforms the existing supervised and unsupervised re-ranking approaches. Moreover, it improves the performance over the text-based image search engine by more than 25.48%.

  4. Accurate protein structure annotation through competitive diffusion of enzymatic functions over a network of local evolutionary similarities.

    Science.gov (United States)

    Venner, Eric; Lisewski, Andreas Martin; Erdin, Serkan; Ward, R Matthew; Amin, Shivas R; Lichtarge, Olivier

    2010-12-13

    High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks.
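
    The diffusion step lends itself to a compact sketch. The code below is a generic label-propagation loop in the spirit of the competitive diffusion described above, not the authors' exact ETA formulation; the damping factor `alpha` and the z-score winner-takes-all step are illustrative assumptions.

```python
import numpy as np

def competitive_diffusion(adjacency, labels, n_labels, alpha=0.85, iters=50):
    """Diffuse competing function labels over a similarity network.

    adjacency: (n, n) matrix of ETA-style similarity weights.
    labels[i]: known label index for protein i, or -1 if unannotated.
    Returns the winning label index per node.
    """
    n = adjacency.shape[0]
    # Row-normalize so each node redistributes its scores to its neighbors.
    row_sums = adjacency.sum(axis=1, keepdims=True)
    W = adjacency / np.where(row_sums == 0, 1, row_sums)
    Y = np.zeros((n, n_labels))                 # seed matrix of known labels
    for i, lab in enumerate(labels):
        if lab >= 0:
            Y[i, lab] = 1.0
    F = Y.copy()
    for _ in range(iters):                      # link-by-link propagation
        F = alpha * W @ F + (1 - alpha) * Y
    # Z-score each label column; the most significant label at each node wins.
    Z = (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-12)
    return Z.argmax(axis=1)
```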

  5. Accurate protein structure annotation through competitive diffusion of enzymatic functions over a network of local evolutionary similarities.

    Directory of Open Access Journals (Sweden)

    Eric Venner

    Full Text Available High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks.

  6. Accurate Image Search using Local Descriptors into a Compact Image Representation

    Directory of Open Access Journals (Sweden)

    Soumia Benkrama

    2013-01-01

    Full Text Available Despite progress in image retrieval using low-level features such as color, texture, and shape, performance is still unsatisfactory because of the gap between low-level features and high-level semantic concepts. In this work, we present an improved implementation of the bag-of-visual-words approach. We propose an image retrieval system based on the bag-of-features (BoF) model using the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF). In the literature, SIFT and SURF give good results. Based on this observation, we decided to use a bag-of-features approach over quaternion Zernike moments (QZM). We compare the results of SIFT and SURF with those of QZM. We propose an indexing method for the content-based search task that aims to retrieve a collection of images and return a ranked list of objects in response to a query image. Experimental results with the COIL-100 and Corel-1000 image databases demonstrate that QZM produces better performance than the known representations (SIFT and SURF).

  7. Target-distractor similarity has a larger impact on visual search in school-age children than spacing

    NARCIS (Netherlands)

    Huurneman, B.; Boonstra, F.N.

    2015-01-01

    In typically developing children, crowding decreases with increasing age. The influence of target-distractor similarity with respect to orientation and element spacing on visual search performance was investigated in 29 school-age children with normal vision (4- to 6-year-olds [N = 16], 7- to 8-year-olds [N = 13]).

  8. Breast cancer stories on the internet : improving search facilities to help patients find stories of similar others

    NARCIS (Netherlands)

    Overberg, Regina Ingrid

    2013-01-01

    The primary aim of this thesis is to gain insight into which search facilities for spontaneously published stories facilitate breast cancer patients in finding stories by other patients in a similar situation. According to the narrative approach, social comparison theory, and social cognitive theory

  9. Finding and Reusing Learning Materials with Multimedia Similarity Search and Social Networks

    Science.gov (United States)

    Little, Suzanne; Ferguson, Rebecca; Ruger, Stefan

    2012-01-01

    The authors describe how content-based multimedia search technologies can be used to help learners find new materials and learning pathways by identifying semantic relationships between educational resources in a social learning network. This helps users--both learners and educators--to explore and find material to support their learning aims.…

  10. Efficient EMD-based Similarity Search in Multimedia Databases via Flexible Dimensionality Reduction

    DEFF Research Database (Denmark)

    Wichterich, Marc; Assent, Ira; Philipp, Kranen

    2008-01-01

    The Earth Mover's Distance (EMD) was developed in computer vision as a flexible similarity model that utilizes similarities in feature space to define a high quality similarity measure in feature representation space. It has been successfully adopted in a multitude of applications with low to med...

  11. Application of belief theory to similarity data fusion for use in analog searching and lead hopping.

    Science.gov (United States)

    Muchmore, Steven W; Debe, Derek A; Metz, James T; Brown, Scott P; Martin, Yvonne C; Hajduk, Philip J

    2008-05-01

    A wide variety of computational algorithms have been developed that strive to capture the chemical similarity between two compounds for use in virtual screening and lead discovery. One limitation of such approaches is that, while a returned similarity value reflects the perceived degree of relatedness between any two compounds, there is no direct correlation between this value and the expectation or confidence that any two molecules will in fact be equally active. A lack of a common framework for interpretation of similarity measures also confounds the reliable fusion of information from different algorithms. Here, we present a probabilistic framework for interpreting similarity measures that directly correlates the similarity value to a quantitative expectation that two molecules will in fact be equipotent. The approach is based on extensive benchmarking of 10 different similarity methods (MACCS keys, Daylight fingerprints, maximum common subgraphs, rapid overlay of chemical structures (ROCS) shape similarity, and six connectivity-based fingerprints) against a database of more than 150,000 compounds with activity data against 23 protein targets. Given this unified and probabilistic framework for interpreting chemical similarity, principles derived from decision theory can then be applied to combine the evidence from different similarity measures in such a way that both capitalizes on the strengths of the individual approaches and maintains a quantitative estimate of the likelihood that any two molecules will exhibit similar biological activity.
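
    To make the framework concrete, the sketch below shows one way (an assumption for illustration, not the paper's exact procedure) to calibrate raw similarity values into activity probabilities from benchmark data, and then to fuse the evidence from several methods with a naive-Bayes-style odds product.

```python
import numpy as np

def calibrate(sim_values, is_active, bins=20):
    """Build a lookup P(active | similarity) from benchmark similarity data."""
    sim_values = np.asarray(sim_values, dtype=float)
    is_active = np.asarray(is_active, dtype=float)
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(sim_values, edges) - 1, 0, bins - 1)
    prob = np.full(bins, 0.5)                    # uninformative default per bin
    for b in range(bins):
        mask = idx == b
        if mask.any():
            prob[b] = is_active[mask].mean()     # empirical fraction of actives
    return lambda s: prob[min(bins - 1, max(0, int(s * bins)))]

def fuse(probabilities, prior=0.01):
    """Naive-Bayes-style fusion of per-method activity probabilities."""
    prior_odds = prior / (1 - prior)
    odds = prior_odds
    for p in probabilities:
        p = min(max(p, 1e-6), 1 - 1e-6)
        odds *= (p / (1 - p)) / prior_odds       # per-method likelihood ratio
    return odds / (1 + odds)
```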

  12. SHOP: receptor-based scaffold hopping by GRID-based similarity searches

    DEFF Research Database (Denmark)

    Bergmann, Rikke; Liljefors, Tommy; Sørensen, Morten D

    2009-01-01

    A new field-derived 3D method for receptor-based scaffold hopping, implemented in the software SHOP, is presented. Information from a protein-ligand complex is utilized to substitute a fragment of the ligand with another fragment from a database of synthetically accessible scaffolds. A GRID-based interaction profile of the receptor and geometrical descriptions of a ligand scaffold are used to obtain new scaffolds with different structural features that are able to replace the original scaffold in the protein-ligand complex. An enrichment study was successfully performed verifying the ability of SHOP to find known active CDK2 scaffolds in a database. Additionally, SHOP was used for suggesting new inhibitors of p38 MAP kinase. Four p38 complexes were used to perform six scaffold searches. Several new scaffolds were suggested, and the resulting compounds were successfully docked into the query proteins.

  13. Proposal for a Similar Question Search System on a Q&A Site

    Directory of Open Access Journals (Sweden)

    Katsutoshi Kanamori

    2014-06-01

    Full Text Available There is a service to help Internet users obtain answers to specific questions when they visit a Q&A site. A Q&A site is very useful for the Internet user, but posted questions are often not answered immediately. This delay in answering occurs because in most cases another site user is answering the question manually. In this study, we propose a system that can present a question that is similar to a question posted by a user. An advantage of this system is that a user can refer to an answer to a similar question. This research measures the similarity of a candidate question based on word and dependency parsing. In an experiment, we examined the effectiveness of the proposed system for questions actually posted on the Q&A site. The result indicates that the system can show the questioner the answer to a similar question. However, the system still has a number of aspects that should be improved.

  14. Development of a fingerprint reduction approach for Bayesian similarity searching based on Kullback-Leibler divergence analysis.

    Science.gov (United States)

    Nisius, Britta; Vogt, Martin; Bajorath, Jürgen

    2009-06-01

    The contribution of individual fingerprint bit positions to similarity search performance is systematically evaluated. A method is introduced to determine bit significance on the basis of Kullback-Leibler divergence analysis of bit distributions in active and database compounds. Bit divergence analysis and Bayesian compound screening share a common methodological foundation. Hence, given the significance ranking of all individual bit positions comprising a fingerprint, subsets of bits are evaluated in the context of Bayesian screening, and minimal fingerprint representations are determined that meet or exceed the search performance of unmodified fingerprints. For fingerprints of different design evaluated on many compound activity classes, we consistently find that subsets of fingerprint bit positions are responsible for search performance. In part, these subsets are very small and contain in some cases only a few fingerprint bit positions. Structural or pharmacophore patterns captured by preferred bit positions can often be directly associated with characteristic features of active compounds. In some cases, reduced fingerprint representations clearly exceed the search performance of the original fingerprints. Thus, fingerprint reduction likely represents a promising approach for practical applications.
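
    The bit-ranking step reduces to the KL divergence between two Bernoulli distributions per bit position. A minimal sketch, assuming binary (compounds x bits) fingerprint matrices for the active and database compounds:

```python
import numpy as np

def bit_significance(active_fps, database_fps):
    """Rank fingerprint bit positions by Kullback-Leibler divergence.

    Each bit is treated as a Bernoulli variable; its significance is the
    KL divergence between its frequency among actives (p) and among
    database compounds (q). Returns bit indices, most divergent first.
    """
    eps = 1e-6
    p = np.clip(np.asarray(active_fps).mean(axis=0), eps, 1 - eps)
    q = np.clip(np.asarray(database_fps).mean(axis=0), eps, 1 - eps)
    kl = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
    return np.argsort(kl)[::-1]

# A reduced fingerprint keeps only the top-k most significant bits:
# reduced = fps[:, bit_significance(actives, database)[:k]]
```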

  15. SIMILARITY SEARCH FOR TRAJECTORIES OF RFID TAGS IN SUPPLY CHAIN TRAFFIC

    Directory of Open Access Journals (Sweden)

    Sabu Augustine

    2016-06-01

    Full Text Available In this fast-developing period, the use of RFID has become more significant in many application domains due to the drastic cut in the price of RFID tags. This technology is evolving as a means of tracking objects and inventory items. One such diversified application domain is Supply Chain Management, where RFID is applied because manufacturers and distributors need to analyse product and logistic information in order to get the right quantity of products arriving at the right time to the right locations. Usually the RFID tag information collected from RFID readers is stored in a remote database, and the RFID data are analyzed by querying this database based on a path encoding method that uses the properties of prime numbers. In this paper we propose an improved encoding scheme that encodes the flows of objects in RFID tag movement. A trajectory of moving RFID tags consists of a sequence of tag readings that changes over time. With the integration of wireless communications and positioning technologies, the concept of a Trajectory Database has become increasingly important, and has posed great challenges to the data mining community. The support of efficient trajectory similarity techniques is indisputably very important for the quality of data analysis tasks in supply chain traffic, which will enable the discovery of similar product movements.
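
    Prime-number path encoding, which this work builds on, is easy to illustrate. The sketch below is a deliberately simplified scheme (the location names and the use of sympy's prime generator are illustrative choices, and the paper's improved flow encoding is not reproduced): each location gets a distinct prime, a path is the product of the primes it visits, and "was this location visited?" becomes a divisibility test.

```python
from sympy import prime  # any prime generator works; sympy offers one

def encode_path(path, location_primes):
    """Encode a tag's path as the product of the primes of visited locations."""
    code = 1
    for loc in path:
        code *= location_primes[loc]
    return code

# Assign primes to hypothetical supply-chain locations.
locations = ["factory", "warehouse", "distributor", "store"]
location_primes = {loc: prime(i + 1) for i, loc in enumerate(locations)}

code = encode_path(["factory", "warehouse", "store"], location_primes)
visited_warehouse = code % location_primes["warehouse"] == 0    # True
visited_distributor = code % location_primes["distributor"] == 0  # False
```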

  16. SymDex: increasing the efficiency of chemical fingerprint similarity searches for comparing large chemical libraries by using query set indexing.

    Science.gov (United States)

    Tai, David; Fang, Jianwen

    2012-08-27

    The large sizes of today's chemical databases require efficient algorithms to perform similarity searches. It can be very time consuming to compare two large chemical databases. This paper seeks to build upon existing research efforts by describing a novel strategy for accelerating existing search algorithms for comparing large chemical collections. The quest for efficiency has focused on developing better indexing algorithms by creating heuristics for searching an individual chemical against a chemical library by detecting and eliminating needless similarity calculations. For comparing two chemical collections, these algorithms simply execute searches for each chemical in the query set sequentially. The strategy presented in this paper achieves a speedup over these algorithms by indexing the set of all query chemicals so that redundant calculations that arise in the case of sequential searches are eliminated. We implement this novel algorithm in a similarity search program called Symmetric inDexing, or SymDex. SymDex shows over a 232% maximum speedup compared to the state-of-the-art single-query search algorithm over real data for various fingerprint lengths. Considerable speedup is seen even for batch searches where query set sizes are relatively small compared to typical database sizes. To the best of our knowledge, SymDex is the first search algorithm designed specifically for comparing chemical libraries. It can be adapted to most, if not all, existing indexing algorithms and shows potential for accelerating future similarity search algorithms for comparing chemical databases.

  17. Web Similarity

    NARCIS (Netherlands)

    Cohen, A.R.; Vitányi, P.M.B.

    2015-01-01

    Normalized web distance (NWD) is a similarity or normalized semantic distance based on the World Wide Web or any other large electronic database, for instance Wikipedia, and a search engine that returns reliable aggregate page counts. For sets of search terms the NWD gives a similarity on a scale from 0 to 1.
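
    The NWD itself is a one-line formula over page counts. A minimal sketch with made-up counts (the formula is the standard NWD/NGD expression; the numbers are purely hypothetical):

```python
import math

def nwd(f_x, f_y, f_xy, n_pages):
    """Normalized web distance from aggregate page counts.

    f_x, f_y: hit counts for terms x and y alone; f_xy: count of pages
    containing both; n_pages: an estimate of the total number of pages
    indexed by the search engine.
    """
    lx, ly, lxy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(lx, ly) - lxy) / (math.log(n_pages) - min(lx, ly))

# Terms that co-occur often get a small distance (similar); terms that
# rarely co-occur get a distance near (or above) 1.
print(nwd(f_x=5_000_000, f_y=3_000_000, f_xy=2_000_000, n_pages=50e9))
```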

  18. Identification of aggregation breakers for bevacizumab (Avastin®) self-association through similarity searching and interaction studies.

    Science.gov (United States)

    Westermaier, Y; Veurink, M; Riis-Johannessen, T; Guinchard, S; Gurny, R; Scapozza, L

    2013-11-01

    Aggregation is a common challenge in the optimization of therapeutic antibody formulations. Since initial self-association of two monomers is typically a reversible process, the aim of this study is to identify different excipients that are able to shift this equilibrium to the monomeric state. The hypothesis is that a specific interaction between excipient and antibody may hinder two monomers from approaching each other, based on previous work in which dexamethasone phosphate showed the ability to partially reverse formed aggregates of the monoclonal IgG1 antibody bevacizumab back into monomers. The current study focuses on the selection of therapeutically inactive compounds with similar properties. Adenosine monophosphate, adenosine triphosphate, sucrose-6-phosphate and guanosine monophosphate were selected in silico through similarity searching and docking. All four compounds were predicted to bind to a protein-protein interaction hotspot on the Fc region of bevacizumab and thereby breaking dimer formation. The predictions were supported in vitro: An interaction between AMP and bevacizumab with a dissociation constant of 9.59±0.15 mM was observed by microscale thermophoresis. The stability of the antibody at elevated temperature (40 °C) in a 51 mM phosphate buffer pH 7 was investigated in presence and absence of the excipients. Quantification of the different aggregation species by asymmetrical flow field-flow fractionation and size exclusion chromatography demonstrates that all four excipients are able to partially overcome the initial self-association of bevacizumab monomers.

  19. Efficient generation, storage, and manipulation of fully flexible pharmacophore multiplets and their use in 3-D similarity searching.

    Science.gov (United States)

    Abrahamian, Edmond; Fox, Peter C; Naerum, Lars; Christensen, Inge Thøger; Thøgersen, Henning; Clark, Robert D

    2003-01-01

    Pharmacophore triplets and quartets have been used by many groups in recent years, primarily as a tool for molecular diversity analysis. In most cases, slow processing speeds and the very large size of the bitsets generated have forced researchers to compromise in terms of how such multiplets were stored, manipulated, and compared, e.g., by using simple unions to represent multiplets for sets of molecules. Here we report using bitmaps in place of bitsets to reduce storage demands and to improve processing speed. Here, a bitset is taken to mean a fully enumerated string of zeros and ones, from which a compressed bitmap is obtained by replacing uniform blocks ("runs") of digits in the bitset with a pair of values identifying the content and length of the block (run-length encoding compression). High-resolution multiplets involving four features are enabled by using 64-bit executables to create and manipulate bitmaps, which "connect" to the 32-bit executables used for database access and feature identification via an extensible mark-up language (XML) data stream. The encoding system used supports simple pairs, triplets, and quartets; multiplets in which a privileged substructure is used as an anchor point; and augmented multiplets in which an additional vertex is added to represent a contingent feature such as a hydrogen bond extension point linked to a complementary feature (e.g., a donor or an acceptor atom) in a base pair or triplet. It can readily be extended to larger, more complex multiplets as well. Database searching is one particular potential application for this technology. Consensus bitmaps built up from active ligands identified in preliminary screening can be used to generate hypothesis bitmaps, a process which includes allowance for differential weighting to allow greater emphasis to be placed on bits arising from multiplets expected to be particularly discriminating. Such hypothesis bitmaps are shown to be useful queries for database searching.
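
    The bitset-to-bitmap step is plain run-length encoding, which a short sketch makes concrete (this illustrates only the compression idea, not the paper's full multiplet encoding system):

```python
from itertools import groupby

def compress(bitset):
    """Run-length encode a fully enumerated bitset as (bit, run length) pairs.

    Pharmacophore multiplet bitsets are sparse, so long runs of zeros
    collapse into single pairs, which is where the storage win comes from.
    """
    return [(bit, sum(1 for _ in run)) for bit, run in groupby(bitset)]

def decompress(bitmap):
    """Expand (bit, length) pairs back into the original bitset."""
    return [bit for bit, length in bitmap for _ in range(length)]

bits = [0] * 1000 + [1] * 3 + [0] * 500 + [1]
bitmap = compress(bits)            # [(0, 1000), (1, 3), (0, 500), (1, 1)]
assert decompress(bitmap) == bits
```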

  20. Retrieval of very large numbers of items in the Web of Science: an exercise to develop accurate search strategies

    CERN Document Server

    Arencibia-Jorge, Ricardo; Chinchilla-Rodriguez, Zaida; Rousseau, Ronald; Paris, Soren W

    2009-01-01

    The current communication presents a simple exercise with the aim of solving a singular problem: the retrieval of extremely large numbers of items through the Web of Science interface. As is known, the Web of Science interface allows a user to obtain at most 100,000 items from a single query. But what about queries whose results exceed 100,000 items? The exercise developed one possible way to achieve this objective. The case study is the retrieval of the entire scientific production of the United States in a specific year. Different sections of items were retrieved using the Source field of the database. Then, a simple Boolean statement was created with the aim of eliminating overlap and improving the accuracy of the search strategy. The importance of team work in the development of advanced search strategies was noted.

  1. Introduction of the conditional correlated Bernoulli model of similarity value distributions and its application to the prospective prediction of fingerprint search performance.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-10-24

    A statistical approach named the conditional correlated Bernoulli model is introduced for modeling of similarity scores and predicting the potential of fingerprint search calculations to identify active compounds. Fingerprint features are rationalized as dependent Bernoulli variables and conditional distributions of Tanimoto similarity values of database compounds given a reference molecule are assessed. The conditional correlated Bernoulli model is utilized in the context of virtual screening to estimate the position of a compound obtaining a certain similarity value in a database ranking. Through the generation of receiver operating characteristic curves from cumulative distribution functions of conditional similarity values for known active and random database compounds, one can predict how successful a fingerprint search might be. The comparison of curves for different fingerprints makes it possible to identify fingerprints that are most likely to identify new active molecules in a database search given a set of known reference molecules.

  2. Design and implementation of a search engine technology for accurate decision making in the hospital computer system

    Institute of Scientific and Technical Information of China (English)

    赵立川

    2016-01-01

    With the progress and development of hospital decision-making systems, most hospitals have built data display platforms based on visual analysis functions such as indicators, reports, and OLAP. In this paper, a search engine technology for accurate decision making in the hospital computer system is designed and implemented. The proposed search engine builds an index over structured data, matches the semantic keywords entered by the user through Chinese word segmentation, aggregates data with high similarity, and presents the results to the user as a visual search page to support accurate decision making in the hospital.

  3. Visual search for real world targets under conditions of high target-background similarity: Exploring training and transfer in younger and older adults.

    Science.gov (United States)

    Neider, Mark B; Boot, Walter R; Kramer, Arthur F

    2010-05-01

    Real world visual search tasks often require observers to locate a target that blends in with its surrounding environment. However, studies of the effect of target-background similarity on search processes have been relatively rare and have ignored potential age-related differences. We trained younger and older adults to search displays comprised of real world objects on either homogenous backgrounds or backgrounds that camouflaged the target. Training was followed by a transfer session in which participants searched for novel camouflaged objects. Although older adults were slower to locate the target compared to younger adults, all participants improved substantially with training. Surprisingly, camouflage-trained younger and older adults showed no performance decrements when transferred to novel camouflage displays, suggesting that observers learned age-invariant, generalizable skills relevant for searching under conditions of high target-background similarity. Camouflage training benefits at transfer for older adults appeared to be related to improvements in attentional guidance and target recognition rather than a more efficient search strategy.

  4. Novel DOCK clique driven 3D similarity database search tools for molecule shape matching and beyond: adding flexibility to the search for ligand kin.

    Science.gov (United States)

    Good, Andrew C

    2007-10-01

    With readily available CPU power and copious disk storage, it is now possible to undertake rapid comparison of 3D properties derived from explicit ligand overlay experiments. With this in mind, shape software tools originally devised in the 1990s are revisited, modified and applied to the problem of ligand database shape comparison. The utility of Connolly surface data is highlighted using the program MAKESITE, which leverages surface normal data to a create ligand shape cast. This cast is applied directly within DOCK, allowing the program to be used unmodified as a shape searching tool. In addition, DOCK has undergone multiple modifications to create a dedicated ligand shape comparison tool KIN. Scoring has been altered to incorporate the original incarnation of Gaussian function derived shape description based on STO-3G atomic electron density. In addition, a tabu-like search refinement has been added to increase search speed by removing redundant starting orientations produced during clique matching. The ability to use exclusion regions, again based on Gaussian shape overlap, has also been integrated into the scoring function. The use of both DOCK with MAKESITE and KIN in database screening mode is illustrated using a published ligand shape virtual screening template. The advantages of using a clique-driven search paradigm are highlighted, including shape optimization within a pharmacophore constrained framework, and easy incorporation of additional scoring function modifications. The potential for further development of such methods is also discussed.

  5. Developing Molecular Interaction Database and Searching for Similar Pathways (MOLECULAR BIOLOGY AND INFORMATION-Biological Information Science)

    OpenAIRE

    Kawashima, Shuichi; Katayama, Toshiaki; Kanehisa, Minoru

    1998-01-01

    We have developed a database named BRITE, which contains knowledge of interacting molecules and/or genes concerning the cell cycle and early development. Here, we report an overview of the database and the method of automatic search for functionally common sub-pathways between two biological pathways in BRITE.

  6. Using argumentation to retrieve articles with similar citations: an inquiry into improving related articles search in the MEDLINE digital library.

    Science.gov (United States)

    Tbahriti, Imad; Chichester, Christine; Lisacek, Frédérique; Ruch, Patrick

    2006-06-01

    The aim of this study is to investigate the relationships between citations and the scientific argumentation found in abstracts. We design a related-article search task and observe how the argumentation can affect the search results. We extracted citation lists from a set of 3200 full-text papers originating from a narrow domain. In parallel, we recovered the corresponding MEDLINE records for analysis of the argumentative moves. Our argumentative model is founded on four classes: PURPOSE, METHODS, RESULTS and CONCLUSION. A Bayesian classifier trained on explicitly structured MEDLINE abstracts generates these argumentative categories. The categories are used to generate four different argumentative indexes. A fifth index contains the complete abstract, together with the title and the list of Medical Subject Headings (MeSH) terms. To appraise the relationship of the moves to the citations, the citation lists were used as the criteria for determining relatedness of articles, establishing a benchmark: two articles are considered "related" if they share a significant set of co-citations. Our results show that the average precision of queries with the PURPOSE and CONCLUSION features is the highest, while the precision of the RESULTS and METHODS features was relatively low. A linear weighting combination of the moves is proposed, which significantly improves retrieval of related articles.

  7. eF-seek: prediction of the functional sites of proteins by searching for similar electrostatic potential and molecular surface shape

    Science.gov (United States)

    Kinoshita, Kengo; Murakami, Yoichi; Nakamura, Haruki

    2007-01-01

    We have developed a method to predict ligand-binding sites in a new protein structure by searching for similar binding sites in the Protein Data Bank (PDB). The similarities are measured according to the shapes of the molecular surfaces and their electrostatic potentials. A new web server, eF-seek, provides an interface to our search method. It simply requires a coordinate file in the PDB format, and generates a prediction result as a virtual complex structure, with the putative ligands in a PDB format file as the output. In addition, the predicted interacting interface is displayed to facilitate the examination of the virtual complex structure on our own applet viewer with the web browser (URL: http://eF-site.hgc.jp/eF-seek). PMID:17567616

  8. PhenoMeter: a metabolome database search tool using statistical similarity matching of metabolic phenotypes for high-confidence detection of functional links

    Directory of Open Access Journals (Sweden)

    Adam James Carroll

    2015-07-01

    Full Text Available This article describes PhenoMeter, a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse, for the purposes of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best performing statistic, the PhenoMeter Score (PM Score), was a function of both Pearson correlation and Fisher's Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher's Exact Test used alone. To demonstrate general applicability, we show that PhenoMeter reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies were met with varying success, and possible reasons for this are discussed. Overall, our results suggest that integration of pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes, analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PhenoMeter is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php).
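
    The abstract names the two ingredients of the PM Score but not their exact combination, so the sketch below is an illustrative guess rather than the published formula: the Pearson correlation of two response vectors, weighted by the significance of a Fisher's Exact Test on the agreement of response directions.

```python
import numpy as np
from scipy.stats import pearsonr, fisher_exact

def pm_score_sketch(query, reference):
    """Hypothetical PM-Score-like statistic for two signed response vectors.

    Combines Pearson correlation with Fisher's Exact Test of directional
    overlap; the combination rule (r weighted by -log10 p) is assumed.
    """
    query, reference = np.asarray(query), np.asarray(reference)
    r, _ = pearsonr(query, reference)
    up_q, up_r = query > 0, reference > 0
    # 2x2 table of agreement/disagreement between response directions.
    table = [[np.sum(up_q & up_r),  np.sum(up_q & ~up_r)],
             [np.sum(~up_q & up_r), np.sum(~up_q & ~up_r)]]
    _, p = fisher_exact(table, alternative="greater")
    return r * -np.log10(max(p, 1e-300))
```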

  9. Similarity-potency trees: a method to search for SAR information in compound data sets and derive SAR rules.

    Science.gov (United States)

    Wawer, Mathias; Bajorath, Jürgen

    2010-08-23

    An intuitive and generally applicable analysis method, termed similarity-potency tree (SPT), is introduced to mine structure-activity relationship (SAR) information in compound data sets of any source. Only compound potency values and nearest-neighbor similarity relationships are considered. Rather than analyzing a data set as a whole, in part overlapping compound neighborhoods are systematically generated and represented as SPTs. This local analysis scheme simplifies the evaluation of SAR information and SPTs of high SAR information content are easily identified. By inspecting only a limited number of compound neighborhoods, it is also straightforward to determine whether data sets contain only little or no interpretable SAR information. Interactive analysis of SPTs is facilitated by reading the trees in two directions, which makes it possible to extract SAR rules, if available, in a consistent manner. The simplicity and interpretability of the data structure and the ease of calculation are characteristic features of this approach. We apply the methodology to high-throughput screening and lead optimization data sets, compare the approach to standard clustering techniques, illustrate how SAR rules are derived, and provide some practical guidance how to best utilize the methodology. The SPT program is made freely available to the scientific community.

  10. A relook on using the Earth Similarity Index for searching habitable zones around solar and extrasolar planets

    Science.gov (United States)

    Biswas, S.; Shome, A.; Raha, B.; Bhattacharya, A. B.

    2017-01-01

    To study the distribution of Earth-like planets and to locate the habitable zones around extrasolar planets and their known satellites, we have emphasized in this paper the use of the Earth Similarity Index (ESI) as a multi-parameter quick assessment of Earth-likeness with a value between zero and one. Weight exponent values for four planetary properties have been taken into account to determine the ESI. A plot of surface ESI against interior ESI exhibits some interesting results, which provide further information when confirmed planets are examined. From the analysis of the available catalog and existing theory, none of the solar planets achieves an ESI value greater than 0.8. Though the planet Mercury has a value of 0.6, Mars exhibits a value between 0.6 and 0.8 and the planet Venus shows a value near 0.5. Finally, the locations of the habitable zones around different types of stars are critically examined and discussed.
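
    The ESI computation is compact enough to show directly. The formula below is the published weighted-product form; the Mars values and weight exponents in the example are approximate, commonly quoted numbers rather than figures taken from this paper.

```python
import numpy as np

def esi(values, reference, weights):
    """Earth Similarity Index as a weighted product of property ratios.

    For each property x with Earth reference x0 and weight exponent w,
    take (1 - |x - x0| / (x + x0)) ** (w / n) and multiply over the
    n properties; the result lies between 0 and 1.
    """
    values, reference, weights = map(np.asarray, (values, reference, weights))
    n = len(values)
    terms = (1 - np.abs(values - reference) / (values + reference)) ** (weights / n)
    return terms.prod()

# Illustrative only: radius, density, escape velocity and surface
# temperature relative to Earth = 1, with commonly quoted weights.
mars = [0.53, 0.71, 0.45, 0.72]      # approximate Earth-relative values
earth = [1.0, 1.0, 1.0, 1.0]
weights = [0.57, 1.07, 0.70, 5.58]
print(esi(mars, earth, weights))     # Mars scores roughly 0.6-0.7
```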

  11. A spin transfer torque magnetoresistance random access memory-based high-density and ultralow-power associative memory for fully data-adaptive nearest neighbor search with current-mode similarity evaluation and time-domain minimum searching

    Science.gov (United States)

    Ma, Yitao; Miura, Sadahiko; Honjo, Hiroaki; Ikeda, Shoji; Hanyu, Takahiro; Ohno, Hideo; Endoh, Tetsuo

    2017-04-01

    A high-density nonvolatile associative memory (NV-AM) based on spin transfer torque magnetoresistive random access memory (STT-MRAM), which achieves highly concurrent and ultralow-power nearest neighbor search with full adaptivity of the template data format, has been proposed and fabricated using the 90 nm CMOS/70 nm perpendicular-magnetic-tunnel-junction hybrid process. A truly compact current-mode circuitry is developed to realize flexibly controllable and highly parallel similarity evaluation, which makes the NV-AM adaptable to any dimensionality and component bit-width of template data. A compact dual-stage time-domain minimum searching circuit is also developed, which can freely extend the system for more template data by connecting multiple NV-AM cores without additional circuits for integrated processing. Both the embedded STT-MRAM module and the computing circuit modules in this NV-AM chip are synchronously power-gated to completely eliminate standby power and maximally reduce operation power by activating only the currently accessed circuit blocks. The operation of a prototype chip at 40 MHz is demonstrated by measurement. The average operation power is only 130 µW, and the circuit density is less than 11 µm²/bit. Compared with the latest conventional works in both volatile and nonvolatile approaches, circuit area reductions of more than 31.3% and power improvements of 99.2% are achieved, respectively. Further power performance analyses are discussed, which verify the particular superiority of the proposed NV-AM in low-power and large-memory-based VLSIs.

  12. An Accurate FOA and TOA Estimation Algorithm for the Galileo Search and Rescue Signal

    Institute of Scientific and Technical Information of China (English)

    王堃; 吴嗣亮; 韩月涛

    2011-01-01

    According to the high-precision requirements for Frequency of Arrival (FOA) and Time of Arrival (TOA) estimation in the Galileo search and rescue (SAR) system, and considering that the message bit width is unknown in actually received beacons, a new FOA and TOA estimation algorithm is proposed that combines multi-dimensional joint maximum likelihood estimation with a barycenter calculation algorithm. The principle of the algorithm is derived after the signal model is introduced, and the concrete realization of the estimation algorithm is given. Monte Carlo simulation and measurement results show that at the processing threshold CNR of 34.8 dBHz, the root-mean-square errors of the FOA and TOA estimates are within 0.03 Hz and 9.5 μs respectively, better than the system requirements of 0.05 Hz and 11 μs. This algorithm has been applied to the Galileo Medium-altitude Earth Orbit Local User Terminal (MEOLUT station).

  13. Megraft: a software package to graft ribosomal small subunit (16S/18S) fragments onto full-length sequences for accurate species richness and sequencing depth analysis in pyrosequencing-length metagenomes and similar environmental datasets.

    Science.gov (United States)

    Bengtsson, Johan; Hartmann, Martin; Unterseher, Martin; Vaishampayan, Parag; Abarenkov, Kessy; Durso, Lisa; Bik, Elisabeth M; Garey, James R; Eriksson, K Martin; Nilsson, R Henrik

    2012-07-01

    Metagenomic libraries represent subsamples of the total DNA found at a study site and offer unprecedented opportunities to study ecological and functional aspects of microbial communities. To examine the depth of a community sequencing effort, rarefaction analysis of the ribosomal small subunit (SSU/16S/18S) gene in the metagenome is usually performed. The fragmentary, non-overlapping nature of SSU sequences in metagenomic libraries poses a problem for this analysis, however. We introduce a software package - Megraft - that grafts SSU fragments onto full-length SSU sequences, accounting for observed and unobserved variability, for accurate assessment of species richness and sequencing depth in metagenomics endeavors.

  14. Combination of 2D/3D Ligand-Based Similarity Search in Rapid Virtual Screening from Multimillion Compound Repositories. Selection and Biological Evaluation of Potential PDE4 and PDE5 Inhibitors

    Directory of Open Access Journals (Sweden)

    Krisztina Dobi

    2014-05-01

    Full Text Available Rapid in silico selection of target-focused libraries from commercial repositories is an attractive and cost-effective approach. If structures of active compounds are available, rapid 2D similarity searches can be performed on multimillion-compound databases, but the generated library requires further focusing by various 2D/3D chemoinformatics tools. We report here a combination of the 2D approach with a ligand-based 3D method (Screen3D), which applies flexible matching to align reference and target compounds in a dynamic manner and thus to assess their structural and conformational similarity. In the first case study we compared the 2D and 3D similarity scores on an existing dataset derived from the biological evaluation of a PDE5-focused library. Based on the obtained similarity metrics, a fusion score was proposed. The fusion score was applied to refine the 2D similarity search in a second case study, where we aimed at selecting and evaluating a PDE4B-focused library. The application of this fused 2D/3D similarity measure led to an increase of the hit rate from 8.5% (1st round, 47% inhibition at 10 µM) to 28.5% (2nd round, 50% inhibition at 10 µM), and the best two hits had 53 nM inhibitory activities.

  15. Compression-based Similarity

    CERN Document Server

    Vitanyi, Paul M B

    2011-01-01

    First we consider pair-wise distances for literal objects consisting of finite binary files. These files are taken to contain all of their meaning, like genomes or books. The distances are based on compression of the objects concerned, normalized, and can be viewed as similarity distances. Second, we consider pair-wise distances between names of objects, like "red" or "christianity." In this case the distances are based on searches of the Internet. Such a search can be performed by any search engine that returns aggregate page counts. We can extract a code length from the numbers returned, use the same formula as before, and derive a similarity or relative semantics between names for objects. The theory is based on Kolmogorov complexity. We test both similarities extensively experimentally.
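
    In practice the compression-based distance is approximated by the normalized compression distance (NCD) with a real compressor. A minimal sketch using zlib:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two literal objects.

    The standard computable approximation of the Kolmogorov-complexity
    similarity described above:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed length under a real compressor.
    """
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Similar texts compress well together and score near 0;
# unrelated byte strings score near 1.
a = b"the quick brown fox jumps over the lazy dog" * 20
b_ = b"the quick brown fox leaps over the lazy cat" * 20
c = bytes(range(256)) * 4
print(ncd(a, b_), ncd(a, c))
```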

  16. Personalized Search

    CERN Document Server

    AUTHOR|(SzGeCERN)749939

    2015-01-01

    As the volume of electronically available information grows, relevant items become harder to find. This work presents an approach to personalizing search results in scientific publication databases. This work focuses on re-ranking search results from existing search engines like Solr or ElasticSearch. This work also includes the development of Obelix, a new recommendation system used to re-rank search results. The project was proposed and performed at CERN, using the scientific publications available on the CERN Document Server (CDS). This work experiments with re-ranking using offline and online evaluation of users and documents in CDS. The experiments conclude that the personalized search results outperform both latest-first and word-similarity ranking in terms of click position in the search results for global search in CDS.

  17. Active browsing using similarity pyramids

    Science.gov (United States)

    Chen, Jau-Yuen; Bouman, Charles A.; Dalton, John C.

    1998-12-01

    In this paper, we describe a new approach to managing large image databases, which we call active browsing. Active browsing integrates relevance feedback into the browsing environment, so that users can modify the database's organization to suit the desired task. Our method is based on a similarity pyramid data structure, which hierarchically organizes the database, so that it can be efficiently browsed. At coarse levels, the similarity pyramid allows users to view the database as large clusters of similar images. Alternatively, users can 'zoom into' finer levels to view individual images. We discuss relevance feedback for the browsing process, and argue that it is fundamentally different from relevance feedback for more traditional search-by-query tasks. We propose two fundamental operations for active browsing: pruning and reorganization. Both of these operations depend on a user-defined relevance set, which represents the image or set of images desired by the user. We present statistical methods for accurately pruning the database, and we propose a new 'worm hole' distance metric for reorganizing the database, so that members of the relevance set are grouped together.

  18. Similarity Scaling

    Science.gov (United States)

    Schnack, Dalton D.

    In Lecture 10, we introduced a non-dimensional parameter called the Lundquist number, denoted by S. This is just one of many non-dimensional parameters that can appear in the formulations of both hydrodynamics and MHD. These generally express the ratio of the time scale associated with some dissipative process to the time scale associated with either wave propagation or transport by flow. These are important because they define regions in parameter space that separate flows with different physical characteristics. All flows that have the same non-dimensional parameters behave in the same way. This property is called similarity scaling.

  19. Proteomic analysis of cellular soluble proteins from human bronchial smooth muscle cells by combining nondenaturing micro 2DE and quantitative LC-MS/MS. 2. Similarity search between protein maps for the analysis of protein complexes.

    Science.gov (United States)

    Jin, Ya; Yuan, Qi; Zhang, Jun; Manabe, Takashi; Tan, Wen

    2015-09-01

    Human bronchial smooth muscle cell soluble proteins were analyzed by a combined method of nondenaturing micro 2DE, grid gel-cutting, and quantitative LC-MS/MS and a native protein map was prepared for each of the identified 4323 proteins [1]. A method to evaluate the degree of similarity between the protein maps was developed since we expected the proteins comprising a protein complex would be separated together under nondenaturing conditions. The following procedure was employed using Excel macros; (i) maps that have three or more squares with protein quantity data were selected (2328 maps), (ii) within each map, the quantity values of the squares were normalized setting the highest value to be 1.0, (iii) in comparing a map with another map, the smaller normalized quantity in two corresponding squares was taken and summed throughout the map to give an "overlap score," (iv) each map was compared against all the 2328 maps and the largest overlap score, obtained when a map was compared with itself, was set to be 1.0 thus providing 2328 "overlap factors," (v) step (iv) was repeated for all maps providing 2328 × 2328 matrix of overlap factors. From the matrix, protein pairs that showed overlap factors above 0.65 from both protein sides were selected (431 protein pairs). Each protein pair was searched in a database (UniProtKB) on complex formation and 301 protein pairs, which comprise 35 protein complexes, were found to be documented. These results demonstrated that native protein maps and their similarity search would enable simultaneous analysis of multiple protein complexes in cells.
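
    Steps (ii)-(v) translate almost directly into code. A minimal sketch, assuming a dense (proteins x squares) quantity matrix for the maps that passed the step (i) filter:

```python
import numpy as np

def overlap_factors(maps):
    """Pairwise overlap factors between native protein maps.

    maps: array of shape (n_proteins, n_squares) with per-square
    protein quantities (only maps with three or more nonzero squares,
    per step (i)). Returns the (n, n) matrix of overlap factors.
    """
    maps = np.asarray(maps, dtype=float)
    # (ii) normalize each map so its largest square equals 1.0
    norm = maps / maps.max(axis=1, keepdims=True)
    n = len(norm)
    score = np.zeros((n, n))
    for i in range(n):
        # (iii) overlap score: sum of the smaller quantity per square
        score[i] = np.minimum(norm, norm[i]).sum(axis=1)
    # (iv)-(v) scale each row so the self-comparison equals 1.0
    return score / np.diag(score)[:, None]

# Candidate complex partners: pairs above 0.65 from both protein sides.
# F = overlap_factors(maps)
# pairs = [(i, j) for i in range(len(F)) for j in range(i)
#          if F[i, j] > 0.65 and F[j, i] > 0.65]
```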

  20. Applying ligands profiling using multiple extended electron distribution based field templates and feature trees similarity searching in the discovery of new generation of urea-based antineoplastic kinase inhibitors.

    Directory of Open Access Journals (Sweden)

    Eman M Dokla

    Full Text Available This study provides a comprehensive computational procedure for the discovery of novel urea-based antineoplastic kinase inhibitors while focusing on diversification of both chemotype and selectivity pattern. It presents a systematic structural analysis of the different binding motifs of urea-based kinase inhibitors and the corresponding configurations of the kinase enzymes. The computational model depends on simultaneous application of two protocols. The first protocol applies multiple consecutive validated virtual screening filters, including SMARTS, a support vector machine model (ROC = 0.98), a Bayesian model (ROC = 0.86) and structure-based pharmacophore filters based on urea-based kinase inhibitor complexes retrieved from the literature. This is followed by profiling the hits against different extended electron distribution (XED) based field templates representing different kinase targets. The second protocol enables verification of cancericidal activity by using the feature trees (FTrees) similarity-searching algorithm against the NCI database. Being a proof-of-concept study, this combined procedure was experimentally validated by its utilization in developing a novel series of urea-based derivatives with strong anticancer activity. This new series is based on the 3-benzylbenzo[d]thiazol-2(3H)-one scaffold, which has interesting chemical feasibility and wide diversification capability. The antineoplastic activity of this series was assayed in vitro against the NCI 60 tumor cell lines, showing very strong inhibition with GI50 as low as 0.9 µM. Additionally, its mechanism was unveiled using the KINEX™ protein kinase microarray-based small-molecule inhibitor profiling platform and cell cycle analysis, showing a peculiar selectivity pattern against Zap70, c-src, Mink1, csk and MeKK2 kinases. Interestingly, it showed activity on syk kinase, confirming recent findings of the high activity of diphenylurea-containing compounds against this kinase. Overall, the new series

  1. Custom Search Engines: Tools & Tips

    Science.gov (United States)

    Notess, Greg R.

    2008-01-01

    Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…

  2. Different predictors of multiple-target search accuracy between nonprofessional and professional visual searchers.

    Science.gov (United States)

    Biggs, Adam T; Mitroff, Stephen R

    2014-01-01

    Visual search, locating target items among distractors, underlies daily activities ranging from critical tasks (e.g., looking for dangerous objects during security screening) to commonplace ones (e.g., finding your friends in a crowded bar). Both professional and nonprofessional individuals conduct visual searches, and the present investigation is aimed at understanding how they perform similarly and differently. We administered a multiple-target visual search task to both professional (airport security officers) and nonprofessional participants (members of the Duke University community) to determine how search abilities differ between these populations and what factors might predict accuracy. There were minimal overall accuracy differences, although the professionals were generally slower to respond. However, the factors that predicted accuracy varied drastically between groups; variability in search consistency (how similarly an individual searched from trial to trial in terms of speed) best explained accuracy for professional searchers (more consistent professionals were more accurate), whereas search speed (how long an individual took to complete a search when no targets were present) best explained accuracy for nonprofessional searchers (slower nonprofessionals were more accurate). These findings suggest that professional searchers may utilize different search strategies from those of nonprofessionals, and that search consistency, in particular, may provide a valuable tool for enhancing professional search accuracy.

  3. Accurate backgrounds to Higgs production at the LHC

    CERN Document Server

    Kauer, N

    2007-01-01

    Corrections of 10-30% for backgrounds to the H --> WW --> l+ l- + missing-pT search in vector boson and gluon fusion at the LHC are reviewed to make the case for precise and accurate theoretical background predictions.

  4. Efficient protein structure search using indexing methods.

    Science.gov (United States)

    Kim, Sungchul; Sael, Lee; Yu, Hwanjo

    2013-01-01

    Understanding the functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by the 3D-Zernike Descriptor (3DZD), which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, thus it is hard to efficiently process many simultaneous requests for structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of the 3DZDs of protein structures. To retrieve the top-k similar structures, the top-10 × k similar structures are first found using the reduced index, and the top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points within distance θ of the query point. The results show that both iDistance and iKernel significantly enhance the search speed. In top-k nearest neighbor search, the search time is reduced by 69.6%, 77%, 77.4% and 87.9%, respectively, using iDistance, iKernel, the extended iDistance, and the extended iKernel. In θ-based nearest neighbor search, the search time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
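
    The two-stage reduced-index search is straightforward to sketch. The code below substitutes a brute-force scan for the iDistance/iKernel internals, so only the shortlist-then-rerank structure described in the abstract is illustrated; the parameter names are assumptions.

```python
import numpy as np

def reduced_index_search(db, query, k, prefix_dims=12, oversample=10):
    """Two-stage top-k search over 3DZD-style vectors.

    db: (n, d) array of descriptors; query: (d,) descriptor.
    Stage 1 shortlists oversample*k candidates using only the first
    prefix_dims attributes (standing in for the reduced index);
    stage 2 reranks the shortlist with full-length distances.
    """
    # Stage 1: cheap distances on the truncated descriptors.
    d_prefix = np.linalg.norm(db[:, :prefix_dims] - query[:prefix_dims], axis=1)
    shortlist = np.argsort(d_prefix)[: oversample * k]
    # Stage 2: exact distances on the shortlist only.
    d_full = np.linalg.norm(db[shortlist] - query, axis=1)
    return shortlist[np.argsort(d_full)[:k]]
```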

  5. Speaking Fluently And Accurately

    Institute of Scientific and Technical Information of China (English)

    JosephDeVeto

    2004-01-01

    Even after many years of study,students make frequent mistakes in English. In addition, many students still need a long time to think of what they want to say. For some reason, in spite of all the studying, students are still not quite fluent.When I teach, I use one technique that helps students not only speak more accurately, but also more fluently. That technique is dictations.

  6. Contextual Bandits with Similarity Information

    CERN Document Server

    Slivkins, Aleksandrs

    2009-01-01

    In a multi-armed bandit (MAB) problem, an online algorithm makes a sequence of choices. In each round it chooses from a time-invariant set of alternatives and receives the payoff associated with this alternative. While the case of small strategy sets is by now well-understood, a lot of recent work has focused on MAB problems with exponentially or infinitely large strategy sets, where one needs to assume extra structure in order to make the problem tractable. In particular, recent literature considered information on similarity between arms. We consider similarity information in the setting of "contextual bandits", a natural extension of the basic MAB problem where before each round an algorithm is given the "context" -- a hint about the payoffs in this round. Contextual bandits are directly motivated by placing advertisements on webpages, one of the crucial problems in sponsored search. A particularly simple way to represent similarity information in the contextual bandit setting is via a "similarity distance...

  7. Professional Microsoft search fast search, Sharepoint search, and search server

    CERN Document Server

    Bennett, Mark; Kehoe, Miles; Voskresenskaya, Natalya

    2010-01-01

    Use Microsoft's latest search-based technology, FAST search, to plan, customize, and deploy your search solution. FAST is Microsoft's latest intelligent search-based technology that boasts robustness and an ability to integrate business intelligence with search. This in-depth guide provides you with advanced coverage of FAST search and shows you how to use it to plan, customize, and deploy your search solution, with an emphasis on SharePoint 2010 and Internet-based search solutions. With a particular appeal for anyone responsible for implementing and managing enterprise search, this book presents t

  8. Distance learning for similarity estimation.

    Science.gov (United States)

    Yu, Jie; Amores, Jaume; Sebe, Nicu; Radeva, Petia; Tian, Qi

    2008-03-01

    In this paper, we present a general guideline for finding a better distance measure for similarity estimation based on a statistical analysis of distribution models and distance functions. A new set of distance measures is derived from the harmonic distance, the geometric distance, and their generalized variants according to Maximum Likelihood theory. These measures can provide a more accurate feature model than the classical Euclidean and Manhattan distances. We also find that the feature elements are often from heterogeneous sources that may have different influence on similarity estimation. Therefore, the assumption of a single isotropic distribution model is often inappropriate. To alleviate this problem, we use a boosted distance measure framework that finds multiple distance measures which best fit the distribution of selected feature elements for accurate similarity estimation. The new distance measures for similarity estimation are tested on two applications: stereo matching and motion tracking in video sequences. The performance of the boosted distance measure is further evaluated on several benchmark data sets from the UCI repository and two image retrieval applications. In all the experiments, robust results are obtained based on the proposed methods.
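
    As a rough illustration of why the choice of distance function matters, the sketch below compares Euclidean and Manhattan distances with a harmonic-mean-based variant. The harmonic form shown is an assumed, simplified stand-in; the paper's Maximum-Likelihood-derived measures are not reproduced here.

```python
import numpy as np

def euclidean(x, y):
    return np.sqrt(np.sum((x - y) ** 2))

def manhattan(x, y):
    return np.sum(np.abs(x - y))

def harmonic(x, y, eps=1e-12):
    # Harmonic mean of the per-dimension absolute differences; eps avoids
    # division by zero when two coordinates coincide.
    diffs = np.abs(x - y) + eps
    return len(x) / np.sum(1.0 / diffs)

x = np.array([0.2, 0.4, 0.9])
y = np.array([0.1, 0.5, 0.3])
for name, fn in [("euclidean", euclidean), ("manhattan", manhattan),
                 ("harmonic", harmonic)]:
    print(f"{name:9s} {fn(x, y):.4f}")
```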

  9. CADC Advanced Search

    Science.gov (United States)

    Jenkins, D. N.

    2012-09-01

    The Canadian Astronomy Data Centre's (CADC) Advanced Search web application is a modern search tool that provides access to data across the CADC archives. It allows searching in different units and handles wildcard characters and numeric operations well. Search results are displayed in a sortable and filterable manner, allowing quick and accurate access to downloadable data. The Advanced Search interface makes extremely good use of the Astronomical Data Query Language (ADQL) to scour the Common Archive Observation Model (CAOM) Table Access Protocol (TAP) query service and the vast CADC Archive Data (AD) storage system. A new tabular view of the query form and the results data makes it easy to view the query, then return to the query form to make further changes, or, alternatively, filter the data from the paginated table. Results are displayed using a rich, open-source, JavaScript-based VOTable viewer called voview.

  10. Search Cloud

    Science.gov (United States)

  11. Niche Genetic Algorithm with Accurate Optimization Performance

    Institute of Scientific and Technical Information of China (English)

    LIU Jian-hua; YAN De-kun

    2005-01-01

    Based on a crowding mechanism, a novel niche genetic algorithm is proposed that can dynamically record the evolutionary direction during evolution. After evolution, the solutions' precision can be greatly improved by local searching along the recorded direction. Simulation shows that this algorithm can not only keep population diversity but also find accurate solutions. Although this method takes more time than the standard GA, it is well worth applying in cases that demand high solution precision.

  12. Search Patterns

    CERN Document Server

    Morville, Peter

    2010-01-01

    What people are saying about Search Patterns "Search Patterns is a delight to read -- very thoughtful and thought provoking. It's the most comprehensive survey of designing effective search experiences I've seen." --Irene Au, Director of User Experience, Google "I love this book! Thanks to Peter and Jeffery, I now know that search (yes, boring old yucky who cares search) is one of the coolest ways around of looking at the world." --Dan Roam, author, The Back of the Napkin (Portfolio Hardcover) "Search Patterns is a playful guide to the practical concerns of search interface design. It cont

  13. Similarity transformations of MAPs

    Directory of Open Access Journals (Sweden)

    Andersen Allan T.

    1999-01-01

    Full Text Available We introduce the notion of similar Markovian Arrival Processes (MAPs) and show that the event-stationary point processes related to two similar MAPs are stochastically equivalent. This holds true for the time-stationary point processes too. We show that several well-known stochastic equivalences, e.g. that between the H2 renewal process and the Interrupted Poisson Process (IPP), can be expressed by similarity transformations of MAPs. In the appendix, the valid region of similarity transformations for two-state MAPs is characterized.

  14. New Similarity Functions

    DEFF Research Database (Denmark)

    Yazdani, Hossein; Ortiz-Arroyo, Daniel; Kwasnicka, Halina

    2016-01-01

    In data science, there are important parameters that affect the accuracy of the algorithms used. Some of these parameters are: the type of data objects, the membership assignments, and distance or similarity functions. This paper discusses similarity functions as fundamental elements in membership...

  15. Clustering by Pattern Similarity

    Institute of Scientific and Technical Information of China (English)

    Hai-xun Wang; Jian Pei

    2008-01-01

    The task of clustering is to identify classes of similar objects among a set of objects. The definition of similarity varies from one clustering model to another. However, in most of these models the concept of similarity is often based on such metrics as Manhattan distance, Euclidean distance or other Lp distances. In other words, similar objects must have close values in at least a set of dimensions. In this paper, we explore a more general type of similarity. Under the pCluster model we proposed, two objects are similar if they exhibit a coherent pattern on a subset of dimensions. The new similarity concept models a wide range of applications. For instance, in DNA microarray analysis, the expression levels of two genes may rise and fall synchronously in response to a set of environmental stimuli. Although the magnitude of their expression levels may not be close, the patterns they exhibit can be very much alike. Discovery of such clusters of genes is essential in revealing significant connections in gene regulatory networks. E-commerce applications, such as collaborative filtering, can also benefit from the new model, because it is able to capture not only the closeness of values of certain leading indicators but also the closeness of (purchasing, browsing, etc.) patterns exhibited by the customers. In addition to the novel similarity model, this paper also introduces an effective and efficient algorithm to detect such clusters, and we perform tests on several real and synthetic data sets to show its performance.
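
    The pCluster coherence test can be sketched compactly: two objects are pattern-similar on a set of dimensions if, for every pair of those dimensions, the difference of their differences (the pScore of the 2 × 2 submatrix) stays within a threshold δ. The code below is a simplified reading of that definition; the names and the threshold value are illustrative.

```python
from itertools import combinations

def p_score(a, b, i, j):
    # pScore of the 2x2 submatrix formed by objects a, b on dimensions i, j.
    return abs((a[i] - a[j]) - (b[i] - b[j]))

def is_p_cluster(a, b, dims, delta):
    return all(p_score(a, b, i, j) <= delta for i, j in combinations(dims, 2))

# Two "genes" whose expression rises and falls synchronously over four
# conditions, even though their magnitudes are far apart.
g1 = [10, 50, 30, 70]
g2 = [110, 150, 130, 170]
print(is_p_cluster(g1, g2, dims=range(4), delta=1e-9))  # True
```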

  16. Judgments of brand similarity

    NARCIS (Netherlands)

    Bijmolt, THA; Wedel, M; Pieters, RGM; DeSarbo, WS

    1998-01-01

    This paper provides empirical insight into the way consumers make pairwise similarity judgments between brands, and how familiarity with the brands, serial position of the pair in a sequence, and the presentation format affect these judgments. Within the similarity judgment process both the formatio

  17. New Similarity Functions

    DEFF Research Database (Denmark)

    Yazdani, Hossein; Ortiz-Arroyo, Daniel; Kwasnicka, Halina

    2016-01-01

    In data science, there are important parameters that affect the accuracy of the algorithms used. Some of these parameters are: the type of data objects, the membership assignments, and distance or similarity functions. This paper discusses similarity functions as fundamental elements in membership assignments. The paper introduces Weighted Feature Distance (WFD) and Prioritized Weighted Feature Distance (PWFD), two new distance functions that take into account the diversity in feature spaces. WFD functions perform better in supervised and unsupervised methods by comparing data objects on their feature spaces, in addition to their similarity in the vector space. Prioritized Weighted Feature Distance (PWFD) works similarly to WFD, but provides the ability to give priorities to desirable features. The accuracy of the proposed functions is compared with other similarity functions on several data sets...

  18. Measuring Personalization of Web Search

    DEFF Research Database (Denmark)

    Hannak, Aniko; Sapiezynski, Piotr; Kakhki, Arash Molavi

    2013-01-01

    Web search is an integral part of our daily lives. Recently, there has been a trend of personalization in Web search, where different users receive different results for the same search query. The increasing personalization is leading to concerns about Filter Bubble effects, where certain users are simply unable to access information that the search engine's algorithm decides is irrelevant. Despite these concerns, there has been little quantification of the extent of personalization in Web search today, or of the user attributes that cause it. In light of this situation, we make three contributions. First, we develop a methodology for measuring personalization in Web search results. While conceptually simple, there are numerous details that our methodology must handle in order to accurately attribute differences in search results to personalization. Second, we apply our methodology to 200 users...

  19. Cluster Tree Based Hybrid Document Similarity Measure

    Directory of Open Access Journals (Sweden)

    M. Varshana Devi

    2015-10-01

    Full Text Available A hybrid similarity measure based on a cluster tree is established. In a cluster tree, the hybrid similarity can be calculated even for random data that may not co-occur, generating different views. The different views of the tree can be combined, choosing the one that is most significant in cost. A method is proposed to combine the multiple views, in which views produced by different distance measures are merged into a single cluster. Compared with traditional statistical methods, the cluster-tree-based hybrid similarity gives better feasibility for intelligent search, and it helps improve dimensionality reduction and semantic analysis.

  20. Similar component analysis

    Institute of Scientific and Technical Information of China (English)

    ZHANG Hong; WANG Xin; LI Junwei; CAO Xianguang

    2006-01-01

    A new unsupervised feature extraction method called similar component analysis (SCA) is proposed in this paper. SCA has a self-aggregation property: in theory, the data objects move towards each other to form clusters through SCA, which can reveal the inherent pattern of similarity hidden in the dataset. The inputs of SCA are just the pairwise similarities of the dataset, which makes it well suited to time series analysis given the variable lengths of time series. Our experimental results on many problems have verified the effectiveness of SCA in several engineering applications.

  1. Search Combinators

    CERN Document Server

    Schrijvers, Tom; Wuille, Pieter; Samulowitz, Horst; Stuckey, Peter J

    2012-01-01

    The ability to model search in a constraint solver can be an essential asset for solving combinatorial problems. However, existing infrastructure for defining search heuristics is often inadequate. Either modeling capabilities are extremely limited or users are faced with a general-purpose programming language whose features are not tailored towards writing search heuristics. As a result, major improvements in performance may remain unexplored. This article introduces search combinators, a lightweight and solver-independent method that bridges the gap between a conceptually simple modeling language for search (high-level, functional and naturally compositional) and an efficient implementation (low-level, imperative and highly non-modular). By allowing the user to define application-tailored search strategies from a small set of primitives, search combinators effectively provide a rich domain-specific language (DSL) for modeling search to the user. Remarkably, this DSL comes at a low implementation cost to the...
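
    The combinator idea can be illustrated with a toy (this is not the authors' solver-level implementation): small, independent primitives for variable selection and value ordering are composed into a complete backtracking strategy.

```python
def first_fail(domains):
    """Pick the unassigned variable with the smallest remaining domain."""
    open_vars = [v for v, d in domains.items() if len(d) > 1]
    return min(open_vars, key=lambda v: len(domains[v])) if open_vars else None

def ascending(values):
    return sorted(values)

def make_search(select_var, order_vals):
    """Combinator: build a full search strategy from two primitives."""
    def search(domains, consistent):
        var = select_var(domains)
        if var is None:  # every variable fixed: a solution
            return {v: d[0] for v, d in domains.items()}
        for val in order_vals(domains[var]):
            child = dict(domains, **{var: [val]})
            if consistent(child):
                sol = search(child, consistent)
                if sol is not None:
                    return sol
        return None
    return search

# Example: three pairwise-distinct variables, solved with one strategy mix.
def all_different(d):
    fixed = [vals[0] for vals in d.values() if len(vals) == 1]
    return len(fixed) == len(set(fixed))

solve = make_search(first_fail, ascending)
print(solve({"x": [1, 2], "y": [1, 2, 3], "z": [1, 2, 3]}, all_different))
```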

  2. Gender similarities and differences.

    Science.gov (United States)

    Hyde, Janet Shibley

    2014-01-01

    Whether men and women are fundamentally different or similar has been debated for more than a century. This review summarizes major theories designed to explain gender differences: evolutionary theories, cognitive social learning theory, sociocultural theory, and expectancy-value theory. The gender similarities hypothesis raises the possibility of theorizing gender similarities. Statistical methods for the analysis of gender differences and similarities are reviewed, including effect sizes, meta-analysis, taxometric analysis, and equivalence testing. Then, relying mainly on evidence from meta-analyses, gender differences are reviewed in cognitive performance (e.g., math performance), personality and social behaviors (e.g., temperament, emotions, aggression, and leadership), and psychological well-being. The evidence on gender differences in variance is summarized. The final sections explore applications of intersectionality and directions for future research.

  3. Music Retrieval based on Melodic Similarity

    NARCIS (Netherlands)

    Typke, R.

    2007-01-01

    This thesis introduces a method for measuring melodic similarity for notated music such as MIDI files. This music search algorithm views music as sets of notes that are represented as weighted points in the two-dimensional space of time and pitch. Two point sets can be compared by calculating how mu

  4. Visual search

    NARCIS (Netherlands)

    Toet, A.; Bijl, P.

    2003-01-01

    Visual search, with or without the aid of optical or electro-optical instruments, plays a significant role in various types of military and civilian operations (e.g., reconnaissance, surveillance, and search and rescue). Advance knowledge of human visual search and target acquisition performance is

  5. Information Extraction Using Distant Supervision and Semantic Similarities

    Directory of Open Access Journals (Sweden)

    PARK, Y.

    2016-02-01

    Full Text Available Information extraction is one of the main research tasks in natural language processing and text mining; it extracts useful information from unstructured sentences. Information extraction techniques include named entity recognition, relation extraction, and co-reference resolution. Among them, relation extraction refers to a task that extracts semantic relations between entities, such as personal and geographic names, in documents. This is an important research area, which is used in knowledge base construction and question answering systems. This study presents relation extraction using distant supervision, a semi-supervised learning technique that has attracted attention in recent years as a way to reduce the manual work and costs required for supervised learning. Specifically, this study improves distant supervision by applying a clustering method when creating the learning corpus and by adding semantic analysis for relations that are difficult to identify with existing distant supervision. Through comparison experiments of various semantic similarity measures, the similarity calculation methods that are useful for relation extraction with distant supervision are identified, and a large number of accurate relation triples can be extracted using the proposed structural advantages and semantic similarity comparison.

  6. The application of similar image retrieval in electronic commerce.

    Science.gov (United States)

    Hu, YuPing; Yin, Hua; Han, Dezhi; Yu, Fei

    2014-01-01

    Traditional online shopping platforms (OSPs), which search product information by keywords, face three problems: an indirect search mode, a large search space, and inaccuracy in search results. To address these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming to improve customers' experience and to provide merchants with accurate advertising, we design a reasonable and extensible electronic commerce application system, which includes three subsystems: an image search display subsystem, an image search subsystem, and a product information collecting subsystem. This system provides a seamless connection between an information platform and an OSP, on which consumers can automatically and directly search for similar images based on pictures from the information platform. At the same time, it can be used to provide accurate Internet marketing for enterprises. The experiments demonstrate the effectiveness of the constructed system.

  7. The Application of Similar Image Retrieval in Electronic Commerce

    Directory of Open Access Journals (Sweden)

    YuPing Hu

    2014-01-01

    Full Text Available Traditional online shopping platforms (OSPs), which search product information by keywords, face three problems: an indirect search mode, a large search space, and inaccuracy in search results. To address these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming to improve customers' experience and to provide merchants with accurate advertising, we design a reasonable and extensible electronic commerce application system, which includes three subsystems: an image search display subsystem, an image search subsystem, and a product information collecting subsystem. This system provides a seamless connection between an information platform and an OSP, on which consumers can automatically and directly search for similar images based on pictures from the information platform. At the same time, it can be used to provide accurate Internet marketing for enterprises. The experiments demonstrate the effectiveness of the constructed system.

  8. Segmentation Similarity and Agreement

    CERN Document Server

    Fournier, Chris

    2012-01-01

    We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance, essentially using edit distance as a penalty function and scaling penalties by segmentation size. We propose several adapted inter-annotator agreement coefficients which use S that are suitable for segmentation. We show that S is configurable enough to suit a wide variety of segmentation evaluations, and is an improvement upon the state of the art. We also propose using inter-annotator agreement coefficients to evaluate automatic segmenters in terms of human performance.
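
    A simplified reading of S can be sketched as follows: segmentations are coded as sets of boundary positions, near-miss boundaries within a small window count as cheap transpositions, unmatched boundaries count as full edits, and S is one minus the scaled edit penalty over the number of potential boundary positions. The published metric is more general; the window size and penalty weights below are assumptions.

```python
def segmentation_similarity(bounds_a, bounds_b, length, n=2):
    """bounds_*: boundary positions in (1, length); n: near-miss window."""
    a, b = set(bounds_a), set(bounds_b)
    only_a, only_b = sorted(a - b), b - a
    transpositions = 0
    for x in only_a:
        near = [y for y in only_b if abs(x - y) <= n]
        if near:  # near miss: count as one cheap transposition
            only_b.remove(min(near, key=lambda y: abs(x - y)))
            transpositions += 1
    full_edits = (len(a - b) - transpositions) + len(only_b)
    penalty = full_edits + 0.5 * transpositions  # transpositions cost less
    potential = length - 1                       # potential boundary slots
    return 1.0 - penalty / potential

print(segmentation_similarity({3, 7, 12}, {3, 8, 12}, length=20))  # near miss
print(segmentation_similarity({3, 7, 12}, {5, 15}, length=20))     # dissimilar
```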

  9. An efficient and accurate 3D displacements tracking strategy for digital volume correlation

    KAUST Repository

    Pan, Bing

    2014-07-01

    Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure the 3D IC-GN algorithm that converges accurately and rapidly and avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost. © 2014 Elsevier Ltd.
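
    The reliability-guided part of the strategy can be sketched independently of the correlation engine: points are processed in decreasing order of the correlation quality already achieved, and each processed point hands its displacement to its unprocessed neighbors as their initial guess. In the sketch below, register stands in for the sub-voxel 3D IC-GN refinement, which is not reproduced.

```python
import heapq

def reliability_guided_tracking(neighbors, register, seed):
    """neighbors: point -> list of neighboring points;
    register(point, guess) -> (correlation, displacement)."""
    corr, disp0 = register(seed, guess=(0.0, 0.0, 0.0))
    heap = [(-corr, seed, disp0)]        # max-heap on correlation quality
    displacements, done = {}, set()
    while heap:
        _, p, d = heapq.heappop(heap)
        if p in done:
            continue
        done.add(p)
        displacements[p] = d
        for q in neighbors[p]:
            if q not in done:
                c_q, d_q = register(q, guess=d)  # neighbor inherits p's result
                heapq.heappush(heap, (-c_q, q, d_q))
    return displacements

# Toy usage: a 1D chain of three points and a fake correlator.
nbrs = {0: [1], 1: [0, 2], 2: [1]}
fake = lambda p, guess: (1.0 - 0.1 * p, tuple(g + 0.01 for g in guess))
print(reliability_guided_tracking(nbrs, fake, seed=0))
```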

  10. Improved Search Techniques

    Science.gov (United States)

    Albornoz, Caleb Ronald

    2012-01-01

    Billions of documents are stored and updated daily on the World Wide Web, and most of this information is not organized efficiently enough to build knowledge from the stored data. Nowadays, search engines are mainly used by people who rely on their own skills to look for the information they need. This paper presents different techniques search engine users can apply in Google Search to improve the relevancy of search results. According to the Pew Research Center, the average person spends eight hours a month searching for the right information. For instance, a company that employs 1000 people wastes $2.5 million on looking for nonexistent or unfound information. The cost is very high because decisions are made based on the information that is readily available; whenever the information necessary to formulate an argument is not available or found, poor decisions are more likely to be made. The survey also indicates that only 56% of Google users feel confident in their current search skills, and that only 76% of the information available on the Internet is accurate.

  11. Quantum search by measurement

    CERN Document Server

    Childs, A M; Farhi, E; Goldstone, J; Gutmann, S; Landahl, A J; Childs, Andrew M.; Deotto, Enrico; Farhi, Edward; Goldstone, Jeffrey; Gutmann, Sam; Landahl, Andrew J.

    2002-01-01

    We propose a quantum algorithm for solving combinatorial search problems that uses only a sequence of measurements. The algorithm is similar in spirit to quantum computation by adiabatic evolution, in that the goal is to remain in the ground state of a time-varying Hamiltonian. Indeed, we show that the running times of the two algorithms are closely related. We also show how to achieve the quadratic speedup for Grover's unstructured search problem with only two measurements. Finally, we discuss some similarities and differences between the adiabatic and measurement algorithms.

  12. Faceted Search

    CERN Document Server

    Tunkelang, Daniel

    2009-01-01

    We live in an information age that requires us, more than ever, to represent, access, and use information. Over the last several decades, we have developed a modern science and technology for information retrieval, relentlessly pursuing the vision of a "memex" that Vannevar Bush proposed in his seminal article, "As We May Think." Faceted search plays a key role in this program. Faceted search addresses weaknesses of conventional search approaches and has emerged as a foundation for interactive information retrieval. User studies demonstrate that faceted search provides more

  13. Accurate guitar tuning by cochlear implant musicians.

    Directory of Open Access Journals (Sweden)

    Thomas Lu

    Full Text Available Modern cochlear implant (CI users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones to <0.5 Hz. One subject had normal contralateral hearing and produced more accurate tuning with CI than his normal ear. To understand these counterintuitive findings, we presented tones sequentially and found that tuning error was larger at ∼ 30 Hz for both subjects. A third subject, a non-musician CI user with normal contralateral hearing, showed similar trends in performance between CI and normal hearing ears but with less precision. This difference, along with electric analysis, showed that accurate tuning was achieved by listening to beats rather than discriminating pitch, effectively turning a spectral task into a temporal discrimination task.

  14. Search and Recommendation

    DEFF Research Database (Denmark)

    Bogers, Toine

    2014-01-01

    In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide-scale application by companies like Amazon, Facebook, and Netflix. But are search and recommendation really two different fields of research that address different problems with different sets of algorithms in papers published at distinct conferences? In my talk, I want to argue that search and recommendation are more similar than they have been treated in the past decade. By looking more closely at the tasks and problems that search and recommendation try to solve, at the algorithms used to solve these problems and at the way their performance is evaluated, I want to show that there is no clear black and white

  15. More Similar Than Different

    DEFF Research Database (Denmark)

    Pedersen, Mogens Jin

    2015-01-01

    What role do employee features play in the success of different personnel management practices aimed at high performance? Using data from a randomized survey experiment among 5,982 individuals of all ages, this article examines how gender conditions the compliance effects of different incentive treatments, each relating to the basic content of a distinct type of personnel management practice. The findings show that males and females are more similar than different in terms of the incentive treatments' effects: significant average effects are found for three out of five incentive...

  16. Similar dissection of sets

    CERN Document Server

    Akiyama, Shigeki; Okazaki, Ryotaro; Steiner, Wolfgang; Thuswaldner, Jörg

    2010-01-01

    In 1994, Martin Gardner stated a set of questions concerning the dissection of a square or an equilateral triangle in three similar parts. Meanwhile, Gardner's questions have been generalized and some of them are already solved. In the present paper, we solve more of his questions and treat them in a much more general context. Let $D \subset \mathbb{R}^d$ be a given set and let $f_1, \dots, f_k$ be injective continuous mappings. Does there exist a set $X$ such that $D = X \cup f_1(X) \cup \dots \cup f_k(X)$ is satisfied with a non-overlapping union? We prove that such a set $X$ exists for certain choices of $D$ and $\{f_1, \dots, f_k\}$. The solutions $X$ often turn out to be attractors of iterated function systems with condensation in the sense of Barnsley. Coming back to Gardner's setting, we use our theory to prove that an equilateral triangle can be dissected in three similar copies whose areas have ratio $1:1:a$ for $a \ge (3+\sqrt{5})/2$.

  17. SiRen: Leveraging Similar Regions for Efficient and Accurate Variant Calling

    Science.gov (United States)

    2015-05-30

  18. Similarity transformed semiclassical dynamics

    Science.gov (United States)

    Van Voorhis, Troy; Heller, Eric J.

    2003-12-01

    In this article, we employ a recently discovered criterion for selecting important contributions to the semiclassical coherent state propagator [T. Van Voorhis and E. J. Heller, Phys. Rev. A 66, 050501 (2002)] to study the dynamics of many-dimensional problems. We show that the dynamics are governed by a similarity-transformed version of the standard classical Hamiltonian. In this light, our selection criterion amounts to using trajectories generated with the untransformed Hamiltonian as approximate initial conditions for the transformed boundary value problem. We apply the new selection scheme to some multidimensional Henon-Heiles problems and compare our results to those obtained with the more sophisticated Herman-Kluk approach. We find that the present technique gives near-quantitative agreement with the standard results, but that the amount of computational effort is less than Herman-Kluk requires, even when sophisticated integral smoothing techniques are employed in the latter.

  19. Predicting user click behaviour in search engine advertisements

    Science.gov (United States)

    Daryaie Zanjani, Mohammad; Khadivi, Shahram

    2015-10-01

    According to the specific requirements and interests of users, search engines select and display advertisements that match user needs and have a higher probability of attracting users' attention based on their previous search history. New objects such as a user, an advertisement or a query cause a deterioration of precision in targeted advertising because they lack history. This article addresses this challenge. In the case of new objects, we first extract observed objects similar to the new object and then use their history as the history of the new object. Similarity between objects is measured based on correlation, which is a relation between a user and an advertisement that arises when the advertisement is displayed to the user. This method is used for all objects, which helps us accurately select relevant advertisements for users' queries. In our proposed model, we assume that similar users behave in a similar manner. We find that users with few queries are similar to new users. We will show that the correlation between users and advertisements' keywords is high; thus, users who pay attention to advertisements' keywords click similar advertisements. In addition, users who pay attention to specific brand names might have similar behaviours too.
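
    The fallback for new objects described above can be sketched simply: when a user has little history, borrow the histories of the most correlated existing users. The sketch uses plain Pearson correlation over user-advertisement interaction vectors; the article's exact correlation measure and pooling rule may differ.

```python
import numpy as np

def borrowed_history(new_vec, history_matrix, top=3):
    """new_vec: (m,) partial interaction vector; history_matrix: (u, m)."""
    sims = []
    for row in history_matrix:
        c = np.corrcoef(new_vec, row)[0, 1]
        sims.append(0.0 if np.isnan(c) else c)   # constant rows give NaN
    best = np.argsort(sims)[::-1][:top]          # most correlated users
    return history_matrix[best].mean(axis=0)     # pooled pseudo-history

rng = np.random.default_rng(1)
users = rng.integers(0, 2, size=(50, 8)).astype(float)
newcomer = users[7] * [1, 1, 1, 0, 0, 0, 0, 0]   # only 3 observed slots
print(borrowed_history(newcomer, users))
```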

  20. New generation of the multimedia search engines

    Science.gov (United States)

    Mijes Cruz, Mario Humberto; Soto Aldaco, Andrea; Maldonado Cano, Luis Alejandro; López Rodríguez, Mario; Rodríguez Vázqueza, Manuel Antonio; Amaya Reyes, Laura Mariel; Cano Martínez, Elizabeth; Pérez Rosas, Osvaldo Gerardo; Rodríguez Espejo, Luis; Flores Secundino, Jesús Abimelek; Rivera Martínez, José Luis; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Sánchez Valenzuela, Juan Carlos; Montoya Obeso, Abraham; Ramírez Acosta, Alejandro Álvaro

    2016-09-01

    Current search engines are based upon search methods that involve the combination of words (text-based search), which has been efficient until now. However, the Internet's growing demand indicates that there is more diversity on it with each passing day. Text-based searches are becoming limited, as most of the information on the Internet can be found in different types of content denominated multimedia content (images, audio files, video files). What needs to be improved in current search engines is search content and precision, as well as an accurate display of the search results the user expects. Any search can be made more precise by using more text parameters, but that does not improve the content or speed of the search itself. One solution is to improve search engines through the characterization of the content of multimedia files. In this article, an analysis of new-generation multimedia search engines is presented, focusing on the needs arising from new technologies. Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as knowing the real tasks they must perform. Through this analysis, it is shown that there are not many search engines that can perform content searches. The research area of new-generation multimedia search engines is a multidisciplinary area in constant growth, generating tools that satisfy the different needs of new-generation systems.

  1. Integrated Semantic Similarity Model Based on Ontology

    Institute of Scientific and Technical Information of China (English)

    LIU Ya-Jun; ZHAO Yun

    2004-01-01

    To solve the problem of inadequate semantic processing in intelligent question answering systems, an integrated semantic similarity model that calculates semantic similarity using geometric distance and information content is presented in this paper. With the help of the interrelationships between concepts, the information content of concepts, and the strength of the edges in the ontology network, we can calculate the semantic similarity between two concepts and provide information for the further calculation of the semantic similarity between a user's question and the answers in the knowledge base. The results of the experiments on the prototype have shown that the semantic problem in natural language processing can also be solved with the help of the knowledge and the abundant semantic information in the ontology. More than 90% accuracy with less than 50 ms average search time has been reached in the intelligent question answering prototype system based on ontology. This result is very satisfactory.
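
    One plausible way to combine a geometric-distance term with an information-content term, in the spirit of the model described, is sketched below. The toy ontology, the weights, and the exact combination rule are illustrative assumptions, not the paper's formula.

```python
import math

# Toy ontology: child -> parent, with corpus frequencies per concept.
parent = {"dog": "mammal", "cat": "mammal", "mammal": "animal", "animal": None}
freq = {"dog": 10, "cat": 12, "mammal": 30, "animal": 100}
total = sum(freq.values())

def ancestors(c):
    out = []
    while c is not None:
        out.append(c)
        c = parent.get(c)
    return out

def similarity(c1, c2, alpha=0.5):
    a1, a2 = ancestors(c1), ancestors(c2)
    lca = next(a for a in a1 if a in a2)   # lowest common ancestor
    path = a1.index(lca) + a2.index(lca)   # geometric (edge-count) distance
    ic = -math.log(freq[lca] / total)      # information content of the LCA
    ic_max = -math.log(min(freq.values()) / total)
    return alpha / (1 + path) + (1 - alpha) * ic / ic_max

print(f"dog~cat    {similarity('dog', 'cat'):.3f}")
print(f"dog~animal {similarity('dog', 'animal'):.3f}")
```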

  2. Accurate pose estimation for forensic identification

    Science.gov (United States)

    Merckx, Gert; Hermans, Jeroen; Vandermeulen, Dirk

    2010-04-01

    In forensic authentication, one aims to identify the perpetrator among a series of suspects or distractors. A fundamental problem in any recognition system that aims to identify subjects in a natural scene is the lack of constraints on viewing and imaging conditions. In forensic applications, identification proves even more challenging, since most surveillance footage is of abysmal quality. In this context, robust methods for pose estimation are paramount. In this paper we therefore present a new pose estimation strategy for very low quality footage. Our approach uses 3D-2D registration of a textured 3D face model with the surveillance image to obtain accurate far-field pose alignment. Starting from an inaccurate initial estimate, the technique uses novel similarity measures based on the monogenic signal to guide a pose optimization process. We illustrate the descriptive strength of the introduced similarity measures by using them directly as a recognition metric. Through validation, using both real and synthetic surveillance footage, our pose estimation method is shown to be accurate and robust to lighting changes and image degradation.

  3. Textual and chemical information processing: different domains but similar algorithms

    Directory of Open Access Journals (Sweden)

    Peter Willett

    2000-01-01

    Full Text Available This paper discusses the extent to which algorithms developed for the processing of textual databases are also applicable to the processing of chemical structure databases, and vice versa. Applications discussed include: an algorithm for distribution sorting that has been applied to the design of screening systems for rapid chemical substructure searching; the use of measures of inter-molecular structural similarity for the analysis of hypertext graphs; a genetic algorithm for calculating term weights for relevance feedback searching and for determining whether a molecule is likely to exhibit biological activity; and the use of data fusion to combine the results of different chemical similarity searches.

  4. A Novel Personalized Web Search Model

    Institute of Scientific and Technical Information of China (English)

    ZHU Zhengyu; XU Jingqiu; TIAN Yunyan; REN Xiang

    2007-01-01

    A novel personalized Web search model is proposed. The new system, a middleware between a user and a Web search engine, is set up on the client machine. It can learn a user's preference implicitly and then generate the user profile automatically. When the user inputs query keywords, the system can automatically generate a few personalized expansion words by computing the term-term associations according to the current user profile, and then these words, together with the query keywords, are submitted to a popular search engine such as Yahoo or Google. These expansion words help to express the user's search intention accurately. The new Web search model can make a common search engine personalized; that is, the search engine can return different search results to different users who input the same keywords. The experimental results show the feasibility and applicability of the presented work.
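
    The expansion step can be sketched as follows: score candidate words by their term-term association with the query keywords in the user profile and submit the top few alongside the query. The co-occurrence count used below is an assumed stand-in for the model's actual association statistic.

```python
from collections import Counter
from itertools import combinations

def build_profile(browsed_docs):
    # Count co-occurring term pairs across the user's browsed documents.
    cooc = Counter()
    for doc in browsed_docs:
        terms = set(doc.lower().split())
        cooc.update(frozenset(p) for p in combinations(sorted(terms), 2))
    return cooc

def expand(query_terms, cooc, k=2):
    # Score every term by its total co-occurrence with the query keywords.
    scores = Counter()
    for q in query_terms:
        for pair, n in cooc.items():
            if q in pair:
                (other,) = pair - {q}
                scores[other] += n
    for q in query_terms:
        scores.pop(q, None)
    return [w for w, _ in scores.most_common(k)]

docs = ["jaguar speed cat", "jaguar habitat cat rainforest",
        "car engine speed"]
profile = build_profile(docs)
print(expand({"jaguar"}, profile))  # e.g. ['cat', ...] for this profile
```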

  5. Efficient and accurate fragmentation methods.

    Science.gov (United States)

    Pruitt, Spencer R; Bertoni, Colleen; Brorsen, Kurt R; Gordon, Mark S

    2014-09-16

    Conspectus Three novel fragmentation methods that are available in the electronic structure program GAMESS (general atomic and molecular electronic structure system) are discussed in this Account. The fragment molecular orbital (FMO) method can be combined with any electronic structure method to perform accurate calculations on large molecular species with no reliance on capping atoms or empirical parameters. The FMO method is highly scalable and can take advantage of massively parallel computer systems. For example, the method has been shown to scale nearly linearly on up to 131 000 processor cores for calculations on large water clusters. There have been many applications of the FMO method to large molecular clusters, to biomolecules (e.g., proteins), and to materials that are used as heterogeneous catalysts. The effective fragment potential (EFP) method is a model potential approach that is fully derived from first principles and has no empirically fitted parameters. Consequently, an EFP can be generated for any molecule by a simple preparatory GAMESS calculation. The EFP method provides accurate descriptions of all types of intermolecular interactions, including Coulombic interactions, polarization/induction, exchange repulsion, dispersion, and charge transfer. The EFP method has been applied successfully to the study of liquid water, π-stacking in substituted benzenes and in DNA base pairs, solvent effects on positive and negative ions, electronic spectra and dynamics, non-adiabatic phenomena in electronic excited states, and nonlinear excited state properties. The effective fragment molecular orbital (EFMO) method is a merger of the FMO and EFP methods, in which interfragment interactions are described by the EFP potential, rather than the less accurate electrostatic potential. The use of EFP in this manner facilitates the use of a smaller value for the distance cut-off (Rcut). Rcut determines the distance at which EFP interactions replace fully quantum

  6. Accurate determination of antenna directivity

    DEFF Research Database (Denmark)

    Dich, Mikael

    1997-01-01

    The derivation of a formula for accurate estimation of the total radiated power from a transmitting antenna, for which the radiated power density is known in a finite number of points on the far-field sphere, is presented. The main application of the formula is determination of directivity from power-pattern measurements. The derivation is based on the theory of spherical wave expansion of electromagnetic fields, which also establishes a simple criterion for the required number of samples of the power density. An array antenna consisting of Hertzian dipoles is used to test the accuracy and rate of convergence...

  7. Capacity Planning for Vertical Search Engines

    CERN Document Server

    Badue, Claudine; Almeida, Virgilio; Baeza-Yates, Ricardo; Ribeiro-Neto, Berthier; Ziviani, Artur; Ziviani, Nivio

    2010-01-01

    Vertical search engines focus on specific slices of content, such as the Web of a single country or the document collection of a large corporation. Despite this, like general open web search engines, they are expensive to maintain, expensive to operate, and hard to design. Because of this, predicting the response time of a vertical search engine is usually done empirically through experimentation, requiring a costly setup. An alternative is to develop a model of the search engine for predicting performance. However, this alternative is of interest only if its predictions are accurate. In this paper we propose a methodology for analyzing the performance of vertical search engines. Applying the proposed methodology, we present a capacity planning model based on a queueing network for search engines with a scale typically suitable for the needs of large corporations. The model is simple and yet reasonably accurate and, in contrast to previous work, considers the imbalance in query service times among homogeneous...
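
    As a back-of-the-envelope illustration of the modeling approach, if a single index-serving node behaves roughly like an M/M/1 queue, its predicted response time is s / (1 - λs) for mean service time s and arrival rate λ. This textbook single-queue formula is far simpler than the paper's queueing network, and the numbers below are invented.

```python
def mm1_response_time(arrival_rate, mean_service_time):
    rho = arrival_rate * mean_service_time  # utilization
    if rho >= 1.0:
        raise ValueError("queue is unstable (utilization >= 1)")
    return mean_service_time / (1.0 - rho)

for qps in (10, 40, 70, 95):
    t = mm1_response_time(arrival_rate=qps, mean_service_time=0.01)
    print(f"{qps:3d} queries/s -> predicted response {t * 1000:.1f} ms")
```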

  8. Conjunctive Wildcard Search over Encrypted Data

    NARCIS (Netherlands)

    Bösch, Christoph; Brinkman, Richard; Hartel, Pieter; Jonker, Willem; Jonker, Willem; Petkovic, Milan

    2011-01-01

    Searchable encryption allows a party to search over encrypted data without decrypting it. Prior schemes in the symmetric setting deal only with exact or similar keyword matches. We describe a scheme for the problem of wildcard searches over encrypted data to make search queries more flexible, provid

  9. Practical fulltext search in medical records

    Directory of Open Access Journals (Sweden)

    Vít Volšička

    2015-09-01

    Full Text Available Performing a search through previously existing documents, including medical reports, is an integral part of acquiring new information and of educational processes. Unfortunately, finding relevant information is not always easy, since many documents are saved in free-text formats, thereby making them difficult to search. A full-text search is a viable solution for searching through documents. Full-text search makes it possible to search efficiently through large numbers of documents and to find those that contain specific search phrases in a short time. All leading database systems currently offer full-text search, but some do not support the complex morphology of the Czech language. Apache Solr provides full support options and several full-text libraries. This program provides good support for the Czech language in its basic installation, along with a wide range of settings and options for deployment over any platform. The library has been satisfactorily tested using real data from hospitals. Solr provided useful, fast, and accurate searches. However, there is still a need to make adjustments to obtain effective search results, particularly by correcting typographical errors made not only in the text but also when entering words in the search box, and by creating a list of frequently used abbreviations and synonyms for more accurate results.

  10. Face Search at Scale.

    Science.gov (United States)

    Wang, Dayong; Otto, Charles; Jain, Anil K

    2016-06-20

    ...persons of interest among the billions of shared photos on these websites. Despite significant progress in face recognition, searching a large collection of unconstrained face images remains a difficult problem. To address this challenge, we propose a face search system which combines a fast search procedure, coupled with a state-of-the-art commercial off-the-shelf (COTS) matcher, in a cascaded framework. Given a probe face, we first filter the large gallery of photos to find the top-k most similar faces using features learned by a convolutional neural network. The k retrieved candidates are re-ranked by combining similarities based on deep features and those output by the COTS matcher. We evaluate the proposed face search system on a gallery containing 80 million web-downloaded face images. Experimental results demonstrate that while the deep features perform worse than the COTS matcher on a mugshot dataset (93.7% vs. 98.6% TAR@FAR of 0.01%), fusing the deep features with the COTS matcher improves the overall performance (99.5% TAR@FAR of 0.01%). This shows that the learned deep features provide complementary information over representations used in state-of-the-art face matchers. On the unconstrained face image benchmarks, the performance of the learned deep features is competitive with reported accuracies. LFW database: 98.20% accuracy under the standard protocol and 88.03% TAR@FAR of 0.1% under the BLUFR protocol; IJB-A benchmark: 51.0% TAR@FAR of 0.1% (verification), rank 1 retrieval of 82.2% (closed-set search), 61.5% FNIR@FAR of 1% (open-set search). The proposed face search system offers an excellent trade-off between accuracy and scalability on galleries with millions of images. Additionally, in a face search experiment involving photos of the Tsarnaev brothers, convicted of the Boston Marathon bombing, the proposed cascade face search system could find the younger brother's (Dzhokhar Tsarnaev) photo at rank 1 in 1 second on a 5M gallery and at rank 8 in 7
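
    The cascade itself is easy to sketch: deep features shortlist the top-k faces from the gallery, and a slower, stronger matcher re-ranks only the shortlist with a fused score. In the sketch below, cots_score is a placeholder for the commercial matcher, and the fusion weight is an assumption.

```python
import numpy as np

def cascade_search(probe, gallery, cots_score, k=100, alpha=0.5):
    """probe: (d,) deep feature; gallery: (n, d), rows L2-normalized."""
    deep_sim = gallery @ probe                  # cosine similarity
    shortlist = np.argsort(deep_sim)[::-1][:k]  # fast top-k filter
    fused = [alpha * deep_sim[i] + (1 - alpha) * cots_score(i)
             for i in shortlist]                # slow matcher: k calls only
    return shortlist[np.argsort(fused)[::-1]]

rng = np.random.default_rng(3)
feats = rng.normal(size=(10_000, 128))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
fake_cots = lambda i: rng.uniform()             # placeholder matcher score
print(cascade_search(feats[123], feats, fake_cots, k=10)[:5])
```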

  11. Notions of similarity for computational biology models

    KAUST Repository

    Waltemath, Dagmar

    2016-03-21

    Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.

  12. Accurate upper body rehabilitation system using kinect.

    Science.gov (United States)

    Sinha, Sanjana; Bhowmick, Brojeshwar; Chakravarty, Kingshuk; Sinha, Aniruddha; Das, Abhijit

    2016-08-01

    The growing importance of Kinect as a tool for clinical assessment and rehabilitation is due to its portability, low cost and markerless system for human motion capture. However, the accuracy of Kinect in measuring three-dimensional body joint center locations often fails to meet clinical standards of accuracy when compared to marker-based motion capture systems such as Vicon. The length of the body segment connecting any two joints, measured as the distance between three-dimensional Kinect skeleton joint coordinates, has been observed to vary with time. The orientation of the line connecting adjoining Kinect skeletal coordinates has also been seen to differ from the actual orientation of the physical body segment. Hence we have proposed an optimization method that utilizes Kinect depth and RGB information to search for the joint center location that satisfies constraints on both body segment length and orientation. An experimental study has been carried out on ten healthy participants performing upper body range-of-motion exercises. The results report a 72% reduction in body segment length variance and a 2° improvement in range of motion (ROM) angle, enabling more accurate measurements for upper limb exercises.
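
    The segment-length constraint at the heart of the method can be sketched in isolation: snap a noisy child-joint estimate back onto the sphere of fixed bone length around its parent while moving it as little as possible. The full optimization also uses depth/RGB evidence and orientation terms, which are omitted here.

```python
import numpy as np

def enforce_bone_length(parent, child, bone_length):
    # Project the child joint onto the sphere of radius bone_length
    # centered at the parent joint (minimal displacement correction).
    v = child - parent
    dist = np.linalg.norm(v)
    if dist == 0.0:
        return child  # degenerate case; leave unchanged
    return parent + v * (bone_length / dist)

shoulder = np.array([0.0, 1.4, 2.0])
noisy_elbow = np.array([0.05, 1.08, 2.1])  # measured, wrong segment length
elbow = enforce_bone_length(shoulder, noisy_elbow, bone_length=0.30)
print(elbow, np.linalg.norm(elbow - shoulder))  # distance is now 0.30
```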

  13. Internet Search Engines

    OpenAIRE

    Fatmaa El Zahraa Mohamed Abdou

    2004-01-01

    A general study of internet search engines, covering 7 main points: the difference between search engines and search directories, the components of search engines, the percentage of sites covered by search engines, the cataloging of sites, the time needed for sites to appear in search engines, search capabilities, and the types of search engines.

  14. Internet Search Engines

    Directory of Open Access Journals (Sweden)

    Fatmaa El Zahraa Mohamed Abdou

    2004-09-01

    Full Text Available A general study of internet search engines, covering 7 main points: the difference between search engines and search directories, the components of search engines, the percentage of sites covered by search engines, the cataloging of sites, the time needed for sites to appear in search engines, search capabilities, and the types of search engines.

  15. Accurate Modeling of Advanced Reflectarrays

    DEFF Research Database (Denmark)

    Zhou, Min

    Analysis and optimization methods for the design of advanced printed reflectarrays have been investigated, and the study is focused on developing an accurate and efficient simulation tool. For the analysis, a good compromise between accuracy and efficiency can be obtained using the spectral domain... to the POT. The GDOT can optimize for the size as well as the orientation and position of arbitrarily shaped array elements. Both co- and cross-polar radiation can be optimized for multiple frequencies, dual polarization, and several feed illuminations. Several contoured beam reflectarrays have been designed... using the GDOT to demonstrate its capabilities. To verify the accuracy of the GDOT, two offset contoured beam reflectarrays that radiate a high-gain beam on a European coverage have been designed and manufactured, and subsequently measured at the DTU-ESA Spherical Near-Field Antenna Test Facility...

  16. The Accurate Particle Tracer Code

    CERN Document Server

    Wang, Yulei; Qin, Hong; Yu, Zhi

    2016-01-01

    The Accurate Particle Tracer (APT) code is designed for large-scale particle simulations on dynamical systems. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and non-linear problems. Under the well-designed integrated and modularized framework, APT serves as a universal platform for researchers from different fields, such as plasma physics, accelerator physics, space science, fusion energy research, computational mathematics, software engineering, and high-performance computation. The APT code consists of seven main modules, including the I/O module, the initialization module, the particle pusher module, the parallelization module, the field configuration module, the external force-field module, and the extendible module. The I/O module, supported by Lua and Hdf5 projects, provides a user-friendly interface for both numerical simulation and data analysis. A series of new geometric numerical methods...

  17. Accurate ab initio spin densities

    CERN Document Server

    Boguslawski, Katharina; Legeza, Örs; Reiher, Markus

    2012-01-01

    We present an approach for the calculation of spin density distributions for molecules that require very large active spaces for a qualitatively correct description of their electronic structure. Our approach is based on the density-matrix renormalization group (DMRG) algorithm to calculate the spin density matrix elements as basic quantity for the spatially resolved spin density distribution. The spin density matrix elements are directly determined from the second-quantized elementary operators optimized by the DMRG algorithm. As an analytic convergence criterion for the spin density distribution, we employ our recently developed sampling-reconstruction scheme [J. Chem. Phys. 2011, 134, 224101] to build an accurate complete-active-space configuration-interaction (CASCI) wave function from the optimized matrix product states. The spin density matrix elements can then also be determined as an expectation value employing the reconstructed wave function expansion. Furthermore, the explicit reconstruction of a CA...

  18. Accurate thickness measurement of graphene.

    Science.gov (United States)

    Shearer, Cameron J; Slattery, Ashley D; Stapleton, Andrew J; Shapter, Joseph G; Gibson, Christopher T

    2016-03-29

    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  19. Accurate thickness measurement of graphene

    Science.gov (United States)

    Shearer, Cameron J.; Slattery, Ashley D.; Stapleton, Andrew J.; Shapter, Joseph G.; Gibson, Christopher T.

    2016-03-01

    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  20. Estimating similarity of XML Schemas using path similarity measure

    Directory of Open Access Journals (Sweden)

    Veena Trivedi

    2012-07-01

    Full Text Available In this paper, an attempt has been made to develop an algorithm which estimates the similarity of XML Schemas using multiple similarity measures. For performing the task, the XML Schema element information has been represented in the form of strings, and four different similarity measure approaches have been employed. To further improve the similarity measure, an overall similarity measure has also been calculated. The approach used in this paper is distinctive in that it calculates the similarity between two XML Schemas using four approaches and gives an integrated value for the similarity measure.

  1. A More Accurate Fourier Transform

    CERN Document Server

    Courtney, Elya

    2015-01-01

    Fourier transform methods are used to analyze functions and data sets to provide frequencies, amplitudes, and phases of underlying oscillatory components. Fast Fourier transform (FFT) methods offer speed advantages over evaluation of explicit integrals (EI) that define Fourier transforms. This paper compares frequency, amplitude, and phase accuracy of the two methods for well resolved peaks over a wide array of data sets including cosine series with and without random noise and a variety of physical data sets, including atmospheric $\\mathrm{CO_2}$ concentrations, tides, temperatures, sound waveforms, and atomic spectra. The FFT uses MIT's FFTW3 library. The EI method uses the rectangle method to compute the areas under the curve via complex math. Results support the hypothesis that EI methods are more accurate than FFT methods. Errors range from 5 to 10 times higher when determining peak frequency by FFT, 1.4 to 60 times higher for peak amplitude, and 6 to 10 times higher for phase under a peak. The ability t...
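    To make the comparison concrete, the following minimal sketch (our own construction, not the paper's code: NumPy stands in for FFTW3, and the explicit integral uses the rectangle rule on a pure on-grid cosine) contrasts the two estimators:

        import numpy as np

        # Test signal: cosine with known frequency, amplitude and phase.
        fs, n = 100.0, 1000                        # sample rate (Hz), samples
        t = np.arange(n) / fs
        f0, a0, p0 = 7.3, 2.0, 0.5                 # true freq, amplitude, phase
        x = a0 * np.cos(2 * np.pi * f0 * t + p0)

        # FFT estimate: peak bin only, so frequency resolution is fs/n.
        spec = np.fft.rfft(x) / (n / 2)
        freqs = np.fft.rfftfreq(n, 1 / fs)
        k = np.argmax(np.abs(spec))
        print("FFT:", freqs[k], np.abs(spec[k]), np.angle(spec[k]))

        # Explicit integral (rectangle rule), evaluable at any frequency,
        # not just on the FFT bin grid.
        def fourier_ei(x, t, f):
            c = np.sum(x * np.exp(-2j * np.pi * f * t)) * 2 / len(x)
            return np.abs(c), np.angle(c)

        print("EI: ", f0, *fourier_ei(x, t, f0))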

  2. A new adaptive fast motion estimation algorithm based on local motion similarity degree (LMSD)

    Institute of Scientific and Technical Information of China (English)

    LIU Long; HAN Chongzhao; BAI Yan

    2005-01-01

    In the motion vector field adaptive search technique (MVFAST) and the predictive motion vector field adaptive search technique (PMVFAST), the size of the largest motion vector from the three adjacent blocks (left, top, top-right) is compared with a threshold to select among different search schemes. However, a suitable search center and search pattern will not be selected by this adaptive technique when the adjacent motion vectors are not coherent in the local region. This paper presents an efficient adaptive search algorithm. The motion vector variation degree (MVVD) is considered a reasonable factor for adaptive search selection. Using the relationship between the local motion similarity degree (LMSD) and the motion vector variation degree (MVVD), motion vectors are classified into three categories according to their LMSD, and a different proposed search scheme is then adopted for each category. The experimental results show that the proposed algorithm yields a significant computational speedup compared with the MVFAST and PMVFAST algorithms, and offers similar or even better performance.
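    The selection logic described above reduces to a threshold test on the neighbouring motion vectors. A minimal sketch (threshold values and scheme names are ours; the paper's LMSD-based classification refines this rule):

        # MVFAST-style adaptive scheme selection from the three causal
        # neighbours; thresholds t_low/t_high are illustrative only.
        def select_search_scheme(mv_left, mv_top, mv_topright,
                                 t_low=2, t_high=8):
            # Size of the largest neighbouring motion vector (L1 norm).
            largest = max(abs(vx) + abs(vy)
                          for vx, vy in (mv_left, mv_top, mv_topright))
            if largest < t_low:        # quasi-static local motion
                return "small_diamond_at_origin"
            elif largest < t_high:     # moderate, coherent motion
                return "large_diamond_at_origin"
            else:                      # large motion: start from predictor
                return "small_diamond_at_predicted_mv"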

  3. Search for $\\nu_\\mu \\rightarrow \\nu_e$ oscillations in the NOMAD experiment

    OpenAIRE

    Astier, P.; Autiero, D.; Baldisseri, A.; Baldo-Ceolin, M.; Banner, M.; Bassompierre, G.; Benslama, K.; Besson, N.; Bird, I.; Blumenfeld, B.; Bobisut, F.; J. Bouchez; Boyd, S.; A. Bueno; Bunyatov, S.

    2003-01-01

    We present the results of a search for nu_mu → nu_e oscillations in the NOMAD experiment at CERN. The experiment looked for the appearance of nu_e in a predominantly nu_mu wide-band neutrino beam at the CERN SPS. No evidence for oscillations was found. The 90% confidence limits obtained are Delta m^2 ~ 10 eV^2.

  4. Search for $\\nu_\\mu \\rightarrow \\nu_e$ oscillations in the NOMAD experiment

    CERN Document Server

    Astier, Pierre; Baldisseri, Alberto; Baldo-Ceolin, Massimilla; Banner, M; Bassompierre, Gabriel; Benslama, K; Besson, N; Bird, I; Blumenfeld, B; Bobisut, F; Bouchez, J; Boyd, S; Bueno, A G; Bunyatov, S A; Camilleri, L L; Cardini, A; Cattaneo, Paolo Walter; Cavasinni, V; Cervera-Villanueva, A; Challis, R C; Chukanov, A; Collazuol, G; Conforto, G; Conta, C; Contalbrigo, M; Cousins, R D; Daniels, D; De Santo, A; Degaudenzi, H M; Del Prete, T; Di Lella, L; Dignan, T; Dumarchez, J; Feldman, G J; Ferrari, A; Ferrari, R; Ferrère, D; Flaminio, Vincenzo; Fraternali, M; Gaillard, J M; Gangler, E; Geiser, A; Geppert, D; Gibin, D; Gninenko, S N; Godley, A; Gosset, J; Gouanère, M; Grant, A; Graziani, G; Guglielmi, A M; Gómez-Cadenas, J J; Gössling, C; Hagner, C; Hernando, J; Hong, T M; Hubbard, D B; Hurst, P; Hyett, N; Iacopini, E; Joseph, C L; Juget, F R; Kent, N; Kirsanov, M M; Klimov, O; Kokkonen, J; Kovzelev, A; Krasnoperov, A V; Kustov, D; La Rotonda, L; Lacaprara, S; Lachaud, C; Lakic, B; Lanza, A; Laveder, M; Letessier-Selvon, A A; Linssen, Lucie; Ljubicic, A; Long, J; Lupi, A; Lévy, J M; Marchionni, A; Martelli, F; Mendiburu, J P; Meyer, J P; Mezzetto, Mauro; Mishra, S R; Moorhead, G F; Méchain, X; Naumov, D V; Nefedov, Yu A; Nguyen-Mau, C; Nédélec, P; Orestano, D; Pastore, F; Peak, L S; Pennacchio, E; Pessard, H; Petti, R; Placci, A; Polesello, G; Pollmann, D; Polyarush, A Yu; Popov, B; Poulsen, C; Rebuffi, L; Renò, R; Rico, J; Riemann, P; Roda, C; Rubbia, André; Salvatore, F; Schahmaneche, K; Schmidt, B; Schmidt, T; Sconza, A; Sevior, M E; Shih, D; Sillou, D; Soler, F J P; Sozzi, G; Steele, D; Stiegler, U; Stipcevic, M; Stolarczyk, T; Tareb-Reyes, M; Taylor, G N; Tereshchenko, V V; Toropin, A N; Touchard, A M; Tovey, Stuart N; Tran, M T; Tsesmelis, E; Ulrichs, J; Vacavant, L; Valdata-Nappi, M; Valuev, V Yu; Vannucci, François; Varvell, K E; Veltri, M; Vercesi, V; Vidal-Sitjes, G; Vieira, J M; Vinogradova, T G; Weber, F V; Weisse, T; Wilson, F F; Winton, L J; Yabsley, B D; Zaccone, Henri; Zuber, K; Zuccon, P; do Couto e Silva, E

    2003-01-01

    We present the results of a search for nu_mu → nu_e oscillations in the NOMAD experiment at CERN. The experiment looked for the appearance of nu_e in a predominantly nu_mu wide-band neutrino beam at the CERN SPS. No evidence for oscillations was found. The 90% confidence limits obtained are Delta m^2 ~ 10 eV^2.

  5. Search for $\\nu_\\mu \\rightarrow \\nu_e$ oscillations in the NOMAD experiment

    CERN Document Server

    Astier, Pierre; Baldisseri, Alberto; Baldo-Ceolin, Massimilla; Banner, M; Bassompierre, Gabriel; Benslama, K; Besson, N; Bird, I; Blumenfeld, B; Bobisut, F; Bouchez, J; Boyd, S; Bueno, A G; Bunyatov, S; Camilleri, L L; Cardini, A; Cattaneo, Paolo Walter; Cavasinni, V; Cervera-Villanueva, A; Challis, R C; Chukanov, A; Collazuol, G; Conforto, G; Conta, C; Contalbrigo, M; Cousins, R; Daniels, D; Degaudenzi, H M; Del Prete, T; De Santo, A; Dignan, T; Di Lella, L; do Couto e Silva, E; Dumarchez, J; Ellis, M; Feldman, G J; Ferrari, R; Ferrère, D; Flaminio, Vincenzo; Fraternali, M; Gaillard, J M; Gangler, E; Geiser, A; Geppert, D; Gibin, D; Gninenko, S N; Godley, A; Gómez-Cadenas, J J; Gosset, J; Gössling, C; Gouanère, M; Grant, A; Graziani, G; Guglielmi, A M; Hagner, C; Hernando, J A; Hubbard, D B; Hurst, P; Hyett, N; Iacopini, E; Joseph, C L; Juget, F R; Kent, N; Kirsanov, M M; Klimov, O; Kokkonen, J; Kovzelev, A; Krasnoperov, A V; Kustov, D; Lacaprara, S; Lachaud, C; Lakic, B; Lanza, A; La Rotonda, L; Laveder, M; Letessier-Selvon, A A; Lévy, J M; Linssen, Lucie; Ljubicic, A; Long, J; Lupi, A; Marchionni, A; Martelli, F; Méchain, X; Mendiburu, J P; Meyer, J P; Mezzetto, Mauro; Mishra, S R; Moorhead, G F; Naumov, D V; Nédélec, P; Nefedov, Yu A; Nguyen-Mau, C; Orestano, D; Pastore, F; Peak, L S; Pennacchio, E; Pessard, H; Petti, R; Placci, A; Polesello, G; Pollmann, D; Polyarush, A Yu; Popov, B; Poulsen, C; Rebuffi, L; Renò, R; Rico, J; Riemann, P; Roda, C; Rubbia, André; Salvatore, F; Schahmaneche, K; Schmidt, B; Schmidt, T; Sconza, A; Sevior, M E; Sillou, D; Soler, F J P; Sozzi, G; Steele, D; Stiegler, U; Stipcevic, M; Stolarczyk, T; Tareb-Reyes, M; Taylor, G; Tereshchenko, V V; Toropin, A N; Touchard, A M; Tovey, Stuart N; Tran, M T; Tsesmelis, E; Ulrichs, J; Vacavant, L; Valdata-Nappi, M; Valuev, V Y; Vannucci, François; Varvell, K E; Veltri, M; Vercesi, V; Vidal-Sitjes, G; Vieira, J M; Vinogradova, T G; Weber, F V; Weisse, T; Wilson, F F; Winton, L J; Yabsley, B D; Zaccone, Henri; Zuber, K; Zuccon, P

    2001-01-01

    We present the results of a search for nu(mu)-->nu(e) oscillations in the NOMAD experiment at CERN. The experiment looked for the appearance of nu(e) in a predominantly nu(mu) wide-band neutrino beam at the CERN SPS. No evidence for oscillations was found. The 90% confidence limits obtained are delta m^2 ~ 10 eV^2.

  6. Search Engine Selection Based on Relevance Terms

    Institute of Scientific and Technical Information of China (English)

    欧洁

    2003-01-01

    Metasearch can effectively search distributed, immense electronic resources. It is built on top of several search engines, providing users with uniform access to those engines. Metasearch first passes a user's query to the underlying useful search engines, and then collects and reorganizes the results from the engines used. Selecting the underlying useful search engines is called search engine selection. In this paper, we present a statistical method based on relevance terms to estimate the usefulness of a search engine for any given query, which is suitable for both Boolean queries and vector queries. Experimental results indicate that the proposed estimation method is quite accurate, especially when the critical similarity between the query and the results is high.
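    The flavour of such an estimator can be sketched as follows (a hypothetical toy, not the paper's statistics: per-engine document-frequency tables and an idf-weighted coverage score are our assumptions):

        from math import log

        # Score an engine's usefulness for a query from its term statistics.
        def engine_usefulness(query_terms, engine_df, engine_size):
            score = 0.0
            for term in query_terms:
                df = engine_df.get(term, 0)        # term document frequency
                if df:
                    # coverage of the term, weighted by how discriminative it is
                    score += (df / engine_size) * log(engine_size / df)
            return score

        engines = {
            "A": ({"similarity": 120, "search": 300}, 1000),
            "B": ({"similarity": 5, "search": 800}, 1000),
        }
        ranked = sorted(engines, key=lambda e: -engine_usefulness(
            ["similarity", "search"], *engines[e]))
        print(ranked)   # engine A covers the query terms more usefully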

  7. Similarity measures for protein ensembles

    DEFF Research Database (Denmark)

    Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper

    2009-01-01

    Analyses of similarities and changes in protein conformation can provide important information regarding protein function and evolution. Many scores, including the commonly used root mean square deviation, have therefore been developed to quantify the similarities of different protein conformations...

  8. Pentaquark searches with ALICE

    CERN Document Server

    Bobulska, Dana

    2016-01-01

    In this report we present the results of the data analysis for searching for possible invariant mass signals from pentaquarks in the ALICE data. The analysis was based on filtered data from real p-Pb events at $\\sqrt{s_{NN}} = 5.02$ TeV collected in 2013. The motivation for this project was the recent discovery of pentaquark states by the LHCb collaboration (a $c\\bar{c}uud$ resonance $P_c^+$) [1]. The search for similar, not yet observed pentaquarks is an interesting research topic [2]. In this analysis we searched for an $s\\bar{s}uud$ pentaquark resonance $P_s^+$ and its possible decay channel to a $\\phi$ meson and a proton. The ALICE detector is well suited for the search for such candidates thanks to its low material budget and strong PID capabilities. Additionally, we might expect the production of such particles in ALICE, as in heavy-ion and proton-ion collisions the thermal models describe the particle yields and ratios well [3]. Therefore it is reasonable to expect other species of hadrons, including also possible pentaquarks, to be produced w...

  9. Functional Similarity and Interpersonal Attraction.

    Science.gov (United States)

    Neimeyer, Greg J.; Neimeyer, Robert A.

    1981-01-01

    Students participated in dyadic disclosure exercises over a five-week period. Results indicated members of high functional similarity dyads evidenced greater attraction to one another than did members of low functional similarity dyads. "Friendship" pairs of male undergraduates displayed greater functional similarity than did "nominal" pairs from…

  10. Perceived and actual similarities in biological and adoptive families: does perceived similarity bias genetic inferences?

    Science.gov (United States)

    Scarr, S; Scarf, E; Weinberg, R A

    1980-09-01

    Critics of the adoption method to estimate the relative effects of genetic and environmental differences on behavioral development claim that important biases are created by the knowledge of biological relatedness or adoptive status. Since the 1950s, agency policy has led to nearly all adopted children knowing that they are adopted. To test the hypothesis that knowledge of biological or adoptive status influences actual similarity, we correlated absolute differences in objective test scores with ratings of similarity by adolescents and their parents in adoptive and biological families. Although biological family members see themselves as more similar than adoptive family members, there are also important generational and gender differences in perceived similarity that cut across family type. There is moderate agreement among family members on the degree of perceived similarity, but there is no correlation between perceived and actual similarity in intelligence or temperament. However, family members are more accurate about shared social attitudes. Knowledge of adoptive or biological relatedness is related to the degree of perceived similarity, but perceptions of similarity are not related to objective similarities and thus do not constitute a bias in comparisons of measured differences in intelligence or temperament in adoptive and biological families.
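    The computation at the heart of this design is a simple correlation between rated and measured similarity. A toy sketch (the data below are invented placeholders purely to show the calculation, not the study's values):

        import numpy as np

        rated = np.array([4, 5, 2, 3, 5, 1, 4, 2])           # 1-5 similarity ratings
        abs_diff = np.array([12, 3, 20, 15, 5, 30, 8, 22])   # |test-score difference|

        # Perceived vs actual similarity: actual similarity is taken as the
        # negative of the absolute objective-score difference.
        r = np.corrcoef(rated, -abs_diff)[0, 1]
        print(f"perceived vs actual similarity: r = {r:.2f}")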

  11. 38 CFR 4.46 - Accurate measurement.

    Science.gov (United States)

    2010-07-01

    38 Pensions, Bonuses, and Veterans' Relief (2010-07-01); RATING DISABILITIES; Disability Ratings; The Musculoskeletal System; § 4.46 Accurate measurement. Accurate measurement of the length of stumps, excursion of joints, dimensions and location of scars with respect...

  12. Autonomous Search

    CERN Document Server

    Hamadi, Youssef; Saubion, Frédéric

    2012-01-01

    Decades of innovations in combinatorial problem solving have produced better and more complex algorithms. These new methods are better since they can solve larger problems and address new application domains. They are also more complex which means that they are hard to reproduce and often harder to fine-tune to the peculiarities of a given problem. This last point has created a paradox where efficient tools are out of reach of practitioners. Autonomous search (AS) represents a new research field defined to precisely address the above challenge. Its major strength and originality consist in the

  13. Search in

    OpenAIRE

    Gaona Román, Alejandro

    2015-01-01

    "Search in" consiste en una instalación artística compuesta por escultura y video con un trasfondo conceptual sobre la identidad. Es una obra que invita al espectador a rodearla e introducirse en ella viéndose así como parte de la obra, al igual que el concepto de identidad puede vivirse desde la sensación del “yo” separado del mundo y a su vez desde el “yo” como parte de la sociedad. Nos hace viajar desde nuestros inicios como sociedad y seres conscientes hasta la actualidad, la era de las c...

  14. Fundamentals of database indexing and searching

    CERN Document Server

    Bhattacharya, Arnab

    2014-01-01

    Fundamentals of Database Indexing and Searching presents well-known database searching and indexing techniques. It focuses on similarity search queries, showing how to use distance functions to measure the notion of dissimilarity.After defining database queries and similarity search queries, the book organizes the most common and representative index structures according to their characteristics. The author first describes low-dimensional index structures, memory-based index structures, and hierarchical disk-based index structures. He then outlines useful distance measures and index structures

  15. Enhancing Divergent Search through Extinction Events

    DEFF Research Database (Denmark)

    Lehman, Joel; Miikkulainen, Risto

    2015-01-01

    A challenge in evolutionary computation is to create representations as evolvable as those in natural evolution. This paper hypothesizes that extinction events, i.e. mass extinctions, can significantly increase evolvability, but only when combined with a divergent search algorithm, i.e. a search...... for the capacity to evolve. This hypothesis is tested through experiments in two evolutionary robotics domains. The results show that combining extinction events with divergent search increases evolvability, while combining them with convergent search offers no similar benefit. The conclusion is that extinction...... events may provide a simple and effective mechanism to enhance performance of divergent search algorithms....

  16. A COMPARISON OF SEMANTIC SIMILARITY MODELS IN EVALUATING CONCEPT SIMILARITY

    Directory of Open Access Journals (Sweden)

    Q. X. Xu

    2012-08-01

    Full Text Available The semantic similarities are important in concept definition, recognition, categorization, interpretation, and integration. Many semantic similarity models have been established to evaluate the semantic similarities of objects or/and concepts. To find out the suitability and performance of different models in evaluating concept similarities, we compare four main types of models in this paper: the geometric model, the feature model, the network model, and the transformational model. Fundamental principles and main characteristics of these models are first introduced and compared. Land use and land cover concepts of NLCD92 are employed as examples in the case study. The results demonstrate that the correlations between these models are very high, probably because all these models are designed to simulate the similarity judgement of the human mind.

  17. a Comparison of Semantic Similarity Models in Evaluating Concept Similarity

    Science.gov (United States)

    Xu, Q. X.; Shi, W. Z.

    2012-08-01

    The semantic similarities are important in concept definition, recognition, categorization, interpretation, and integration. Many semantic similarity models have been established to evaluate the semantic similarities of objects or/and concepts. To find out the suitability and performance of different models in evaluating concept similarities, we compare four main types of models in this paper: the geometric model, the feature model, the network model, and the transformational model. Fundamental principles and main characteristics of these models are first introduced and compared. Land use and land cover concepts of NLCD92 are employed as examples in the case study. The results demonstrate that the correlations between these models are very high, probably because all these models are designed to simulate the similarity judgement of the human mind.

  18. A study of Consistency in the Selection of Search Terms and Search Concepts: A Case Study in National Taiwan University

    Directory of Open Access Journals (Sweden)

    Mu-hsuan Huang

    2001-12-01

    Full Text Available This article analyzes the consistency in the selection of search terms and search concepts of college and graduate students at National Taiwan University when using the PsycLIT CD-ROM database. 31 students conducted pre-assigned searches, performing 59 searches that generated 609 search terms. The study finds that the consistency in the selection of first-level search terms is 22.14% and of second-level terms 35%. These results are similar to those of other studies. Regarding consistency in search concepts, both the overlap of retrieved articles and the overlap of articles judged relevant are lower than in other studies. [Article content in Chinese]

  19. Renewing the Respect for Similarity

    Directory of Open Access Journals (Sweden)

    Shimon eEdelman

    2012-07-01

    Full Text Available In psychology, the concept of similarity has traditionally evoked a mixture of respect, stemming from its ubiquity and intuitive appeal, and concern, due to its dependence on the framing of the problem at hand and on its context. We argue for a renewed focus on similarity as an explanatory concept, by surveying established results and new developments in the theory and methods of similarity-preserving associative lookup and dimensionality reduction, critical components of many cognitive functions, as well as of intelligent data management in computer vision. We focus in particular on the growing family of algorithms that support associative memory by performing hashing that respects local similarity, and on the uses of similarity in representing structured objects and scenes. Insofar as these similarity-based ideas and methods are useful in cognitive modeling and in AI applications, they should be included in the core conceptual toolkit of computational neuroscience.
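    One widely used member of that family is random-hyperplane locality-sensitive hashing, sketched below (a generic illustration of similarity-respecting hashing, not a method taken from the article):

        import numpy as np

        rng = np.random.default_rng(0)

        # Hash a vector to the sign bits of random projections: vectors with
        # high cosine similarity agree on most bits with high probability.
        def lsh_signature(x, planes):
            return (planes @ x > 0).astype(int)

        d, n_bits = 64, 16
        planes = rng.normal(size=(n_bits, d))

        x = rng.normal(size=d)
        y = x + 0.05 * rng.normal(size=d)    # a close neighbour of x
        z = rng.normal(size=d)               # an unrelated vector

        print(np.sum(lsh_signature(x, planes) == lsh_signature(y, planes)))  # ~16
        print(np.sum(lsh_signature(x, planes) == lsh_signature(z, planes)))  # ~8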

  20. Web Search Engines: Search Syntax and Features.

    Science.gov (United States)

    Ojala, Marydee

    2002-01-01

    Presents a chart that explains the search syntax, features, and commands used by the 12 most widely used general Web search engines. Discusses Web standardization, expanded types of content searched, size of databases, and search engines that include both simple and advanced versions. (LRW)

  1. Similarity Learning of Manifold Data.

    Science.gov (United States)

    Chen, Si-Bao; Ding, Chris H Q; Luo, Bin

    2015-09-01

    Without constructing adjacency graph for neighborhood, we propose a method to learn similarity among sample points of manifold in Laplacian embedding (LE) based on adding constraints of linear reconstruction and least absolute shrinkage and selection operator type minimization. Two algorithms and corresponding analyses are presented to learn similarity for mix-signed and nonnegative data respectively. The similarity learning method is further extended to kernel spaces. The experiments on both synthetic and real world benchmark data sets demonstrate that the proposed LE with new similarity has better visualization and achieves higher accuracy in classification.

  2. Comparing NEO Search Telescopes

    CERN Document Server

    Myhrvold, Nathan

    2015-01-01

    Multiple terrestrial and space-based telescopes have been proposed for detecting and tracking near-Earth objects (NEOs). Detailed simulations of the search performance of these systems have used complex computer codes that are not widely available, which hinders accurate cross-comparison of the proposals and obscures whether they have consistent assumptions. Moreover, some proposed instruments would survey infrared (IR) bands, whereas others would operate in the visible band, and differences among asteroid thermal and visible-light models used in the simulations further complicate like-to-like comparisons. I use simple physical principles to estimate basic performance metrics for the ground-based Large Synoptic Survey Telescope and three space-based instruments - Sentinel, NEOCam, and a Cubesat constellation. The performance is measured against two different NEO distributions, the Bottke et al. distribution of general NEOs and the Veres et al. distribution of Earth-impacting NEOs. The results of the comparis...

  3. Dynamic similarity in erosional processes

    Science.gov (United States)

    Scheidegger, A.E.

    1963-01-01

    A study is made of the dynamic similarity conditions obtaining in a variety of erosional processes. The pertinent equations for each type of process are written in dimensionless form; the similarity conditions can then easily be deduced. The processes treated are: raindrop action, slope evolution and river erosion. © 1963 Istituto Geofisico Italiano.

  4. Multiple-Goal Heuristic Search

    CERN Document Server

    Davidov, D; 10.1613/jair.1940

    2011-01-01

    This paper presents a new framework for anytime heuristic search where the task is to achieve as many goals as possible within the allocated resources. We show the inadequacy of traditional distance-estimation heuristics for tasks of this type and present alternative heuristics that are more appropriate for multiple-goal search. In particular, we introduce the marginal-utility heuristic, which estimates the cost and the benefit of exploring a subtree below a search node. We developed two methods for online learning of the marginal-utility heuristic. One is based on local similarity of the partial marginal utility of sibling nodes, and the other generalizes marginal-utility over the state feature space. We apply our adaptive and non-adaptive multiple-goal search algorithms to several problems, including focused crawling, and show their superiority over existing methods.

  5. Attacks on Local Searching Tools

    CERN Document Server

    Nielson, Seth James; Wallach, Dan S

    2011-01-01

    The Google Desktop Search is an indexing tool, currently in beta testing, designed to allow users fast, intuitive searching for local files. The principal interface is provided through a local web server which supports an interface similar to Google.com's normal web page. Indexing of local files occurs when the system is idle, and the tool understands a number of common file types. An optional feature is that Google Desktop can integrate a short summary of local search results with Google.com web searches. This summary includes 30-40 character snippets of local files. We have uncovered a vulnerability that would release private local data to an unauthorized remote entity. Using two different attacks, we expose the small snippets of private local data to a remote third party.

  6. The Search for Directed Intelligence

    CERN Document Server

    Lubin, Philip

    2016-01-01

    We propose a search for sources of directed energy systems such as those now becoming technologically feasible on Earth. Recent advances in our own abilities allow us to foresee a capability that will radically change our ability to broadcast our presence. We show that systems of this type can be detected at vast distances, and indeed can be detected across the entire horizon. This profoundly changes the possibilities for searches for extra-terrestrial, technologically advanced civilizations. We show that even modest searches can be extremely effective at detecting or limiting many civilization classes. We propose a search strategy that will observe more than 10^12 stellar and planetary systems, with possible extensions to more than 10^20 systems, allowing us to test the hypothesis that other similarly or more advanced civilizations with this same capability exist and are broadcasting.

  7. Contextual Factors for Finding Similar Experts

    DEFF Research Database (Denmark)

    Hofmann, Katja; Balog, Krisztian; Bogers, Toine;

    2010-01-01

    Expertise-seeking research studies how people search for expertise and choose whom to contact in the context of a specific task. An important outcome is models that identify factors that influence expert finding. Expertise retrieval addresses the same problem, expert finding, but from a system......-seeking models, are rarely taken into account. In this article, we extend content-based expert-finding approaches with contextual factors that have been found to influence human expert finding. We focus on a task of science communicators in a knowledge-intensive environment, the task of finding similar experts......, given an example expert. Our approach combines expertise-seeking and retrieval research. First, we conduct a user study to identify contextual factors that may play a role in the studied task and environment. Then, we design expert retrieval models to capture these factors. We combine these with content...

  8. Similarity of samples and trimming

    CERN Document Server

    Álvarez-Esteban, Pedro C; Cuesta-Albertos, Juan A; Matrán, Carlos; 10.3150/11-BEJ351

    2012-01-01

    We say that two probabilities are similar at level $\\alpha$ if they are contaminated versions (up to an $\\alpha$ fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect in the sense that trimming beyond the similarity level results in trimmed samples that are closer than expected to each other. We show how this can be combined with a bootstrap approach to assess similarity from two data samples.

  9. Improved Scatter Search Using Cuckoo Search

    Directory of Open Access Journals (Sweden)

    Ahmed T.Sadiq Al-Obaidi

    2013-02-01

    Full Text Available The Scatter Search (SS) is a deterministic strategy that has been applied successfully to some combinatorial and continuous optimization problems. Cuckoo Search (CS) is a heuristic search algorithm inspired by the reproduction strategy of cuckoos. This paper presents an enhanced scatter search algorithm that uses the CS algorithm. The improvement provides Scatter Search with random exploration of the problem's search space and with more diversity and intensification for promising solutions. The original and improved Scatter Search have been tested on the Traveling Salesman Problem. A computational experiment with benchmark instances is reported. The results demonstrate that the improved Scatter Search algorithm produces better performance than the original Scatter Search algorithm. The improvement in the value of average fitness is 23.2% compared with the original SS. The developed algorithm has been compared with other algorithms for the same problem, and the result was competitive with some algorithms and weaker than others.

  10. Self-similar aftershock rates

    CERN Document Server

    Davidsen, Jörn

    2016-01-01

    In many important systems exhibiting crackling noise --- intermittent avalanche-like relaxation response with power-law and, thus, self-similar distributed event sizes --- the "laws" for the rate of activity after large events are not consistent with the overall self-similar behavior expected on theoretical grounds. This is in particular true for the case of seismicity and a satisfying solution to this paradox has remained outstanding. Here, we propose a generalized description of the aftershock rates which is both self-similar and consistent with all other known self-similar features. Comparing our theoretical predictions with high resolution earthquake data from Southern California we find excellent agreement, providing in particular clear evidence for a unified description of aftershocks and foreshocks. This may offer an improved way of time-dependent seismic hazard assessment and earthquake forecasting.

  11. Self-similar aftershock rates

    Science.gov (United States)

    Davidsen, Jörn; Baiesi, Marco

    2016-08-01

    In many important systems exhibiting crackling noise—an intermittent avalanchelike relaxation response with power-law and, thus, self-similar distributed event sizes—the "laws" for the rate of activity after large events are not consistent with the overall self-similar behavior expected on theoretical grounds. This is particularly true for the case of seismicity, and a satisfying solution to this paradox has remained outstanding. Here, we propose a generalized description of the aftershock rates which is both self-similar and consistent with all other known self-similar features. Comparing our theoretical predictions with high-resolution earthquake data from Southern California we find excellent agreement, providing particularly clear evidence for a unified description of aftershocks and foreshocks. This may offer an improved framework for time-dependent seismic hazard assessment and earthquake forecasting.

  12. Similarity measures for protein ensembles

    DEFF Research Database (Denmark)

    Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper

    2009-01-01

    Analyses of similarities and changes in protein conformation can provide important information regarding protein function and evolution. Many scores, including the commonly used root mean square deviation, have therefore been developed to quantify the similarities of different protein conformations...... a synthetic example from molecular dynamics simulations. We then apply the algorithms to revisit the problem of ensemble averaging during structure determination of proteins, and find that an ensemble refinement method is able to recover the correct distribution of conformations better than standard single...

  13. Community Detection by Neighborhood Similarity

    Institute of Scientific and Technical Information of China (English)

    LIU Xu; XIE Zheng; YI Dong-Yun

    2012-01-01

    Detection of the community structure in a network is important for understanding the structure and dynamics of the network. By exploring the neighborhood of vertices, a local similarity metric is proposed, which can be quickly computed. The resulting similarity matrix retains the same support as the adjacency matrix. Based on local similarity, an agglomerative hierarchical clustering algorithm is proposed for community detection. The algorithm is implemented by an efficient max-heap data structure and runs in nearly linear time, thus is capable of dealing with large sparse networks with tens of thousands of nodes. Experiments on synthesized and real-world networks demonstrate that our method is efficient to detect community structures, and the proposed metric is the most suitable one among all the tested similarity indices.
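    A neighbourhood-overlap similarity of this kind can be sketched in a few lines (a Jaccard overlap of adjacency neighbourhoods, our stand-in for the paper's metric; note that it keeps the same support as the adjacency matrix):

        import numpy as np

        def neighborhood_similarity(adj):
            n = adj.shape[0]
            # Closed neighbourhood of each vertex (includes the vertex itself).
            nbrs = [set(np.flatnonzero(adj[i])) | {i} for i in range(n)]
            sim = np.zeros_like(adj, dtype=float)
            for i in range(n):
                for j in np.flatnonzero(adj[i]):   # only where adj is nonzero
                    sim[i, j] = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
            return sim

        adj = np.array([[0, 1, 1, 0],
                        [1, 0, 1, 0],
                        [1, 1, 0, 1],
                        [0, 0, 1, 0]])
        print(neighborhood_similarity(adj))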

  14. Similarity of atoms in molecules

    Energy Technology Data Exchange (ETDEWEB)

    Cioslowski, J.; Nanayakkara, A. (Florida State Univ., Tallahassee, FL (United States))

    1993-12-01

    Similarity of atoms in molecules is quantitatively assessed with a measure that employs electron densities within respective atomic basins. This atomic similarity measure does not rely on arbitrary assumptions concerning basis functions or 'atomic orbitals', is relatively inexpensive to compute, and has a straightforward interpretation. Inspection of similarities between pairs of carbon, hydrogen, and fluorine atoms in the CH$_4$, CH$_3$F, CH$_2$F$_2$, CHF$_3$, CF$_4$, C$_2$H$_2$, C$_2$H$_4$, and C$_2$H$_6$ molecules, calculated at the MP2/6-311G** level of theory, reveals that the atomic similarity is greatly reduced by a change in the number or the character of ligands (i.e. the atoms with nuclei linked through bond paths to the nucleus of the atom in question). On the other hand, atoms with formally identical ligands (i.e. having the same nuclei and numbers of ligands) resemble each other to a large degree, with similarity indices greater than 0.95 for hydrogens and 0.99 for non-hydrogens. 19 refs., 6 tabs.

  15. Searching chemical space with the Bayesian Idea Generator.

    Science.gov (United States)

    van Hoorn, Willem P; Bell, Andrew S

    2009-10-01

    The Pfizer Global Virtual Library (PGVL) is defined as a set of compounds that could be synthesized using validated protocols and monomers. However, it is too large (10^12 compounds) to search by brute-force methods for close analogues of a given input structure. In this paper the Bayesian Idea Generator is described, which is based on a novel application of Bayesian statistics to narrow down the search space to a prioritized set of existing library arrays (the default is 16). For each of these libraries the 6 closest neighbors are retrieved from the existing compound file, resulting in a screenable hypothesis of 96 compounds. Using the Bayesian models for library space, the Pfizer file of singleton compounds has been mapped to library space and is optionally searched as well. The method is >99% accurate in retrieving known library provenance from an independent test set. The compounds retrieved strike a balance between similarity and diversity, resulting in frequent scaffold hops. Four examples of how the Bayesian Idea Generator has been successfully used in drug discovery are provided. The methodology of the Bayesian Idea Generator can be used for any collection of compounds containing distinct clusters, and an example using compound vendor catalogues has been included.

  16. Reading and visual search: a developmental study in normal children.

    Directory of Open Access Journals (Sweden)

    Magali Seassau

    Full Text Available Studies dealing with developmental aspects of binocular eye movement behaviour during reading are scarce. In this study we have explored binocular strategies during reading and during visual search tasks in a large population of normal young readers. Binocular eye movements were recorded using an infrared video-oculography system in sixty-nine children (aged 6 to 15) and in a group of 10 adults (aged 24 to 39). The main findings are: (i) in both tasks the number of progressive saccades (to the right) and regressive saccades (to the left) decreases with age; (ii) the amplitude of progressive saccades increases with age in the reading task only; (iii) in both tasks, the duration of fixations as well as the total duration of the task decreases with age; (iv) in both tasks, the amplitude of disconjugacy recorded during and after the saccades decreases with age; (v) children are significantly more accurate in reading than in visual search after 10 years of age. The data reported here confirm and expand previous studies on children's reading. The new finding is that younger children show poorer coordination than adults, both while reading and while performing a visual search task. Both reading skills and binocular saccade coordination improve with age, and children reach a level similar to adults after the age of 10. This finding is most likely related to the fact that the learning mechanisms responsible for saccade yoking develop during childhood until adolescence.

  17. A Survey of Binary Similarity and Distance Measures

    Directory of Open Access Journals (Sweden)

    Seung-Seok Choi

    2010-02-01

    Full Text Available The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering and classification. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Notwithstanding, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and reveal their correlations through the hierarchical clustering technique.
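    A few of the collected measures, written from the standard 2x2 contingency counts (a basic sketch; the survey's full list runs to 76 measures):

        import numpy as np

        # a: 1-1 matches, b: 1-0 mismatches, c: 0-1 mismatches, d: 0-0 matches.
        def contingency(x, y):
            x, y = np.asarray(x, bool), np.asarray(y, bool)
            return (np.sum(x & y), np.sum(x & ~y),
                    np.sum(~x & y), np.sum(~x & ~y))

        def jaccard(x, y):
            a, b, c, _ = contingency(x, y)
            return a / (a + b + c)

        def dice(x, y):
            a, b, c, _ = contingency(x, y)
            return 2 * a / (2 * a + b + c)

        def sokal_michener(x, y):          # counts 0-0 matches as agreement
            a, b, c, d = contingency(x, y)
            return (a + d) / (a + b + c + d)

        x = [1, 0, 1, 1, 0, 0, 1]
        y = [1, 1, 1, 0, 0, 0, 1]
        print(jaccard(x, y), dice(x, y), sokal_michener(x, y))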

  18. Generating personalized web search using semantic context.

    Science.gov (United States)

    Xu, Zheng; Chen, Hai-Yan; Yu, Jie

    2015-01-01

    The "one size fits the all" criticism of search engines is that when queries are submitted, the same results are returned to different users. In order to solve this problem, personalized search is proposed, since it can provide different search results based upon the preferences of users. However, existing methods concentrate more on the long-term and independent user profile, and thus reduce the effectiveness of personalized search. In this paper, the method captures the user context to provide accurate preferences of users for effectively personalized search. First, the short-term query context is generated to identify related concepts of the query. Second, the user context is generated based on the click through data of users. Finally, a forgetting factor is introduced to merge the independent user context in a user session, which maintains the evolution of user preferences. Experimental results fully confirm that our approach can successfully represent user context according to individual user information needs.

  19. Similarity measures for face recognition

    CERN Document Server

    Vezzetti, Enrico

    2015-01-01

    Face recognition has several applications, including security (authentication and identification of device users and criminal suspects) and medicine (corrective surgery and diagnosis). Facial recognition programs rely on algorithms that can compare and compute the similarity between two sets of images. This eBook explains some of the similarity measures used in facial recognition systems in a single volume. Readers will learn about various measures, including Minkowski distances, Mahalanobis distances, Hausdorff distances, and cosine-based distances, among other methods. The book also summarizes errors that may occur in face recognition methods. Computer scientists "facing face" and looking to select and test different methods of computing similarities will benefit from this book. The book is also a useful tool for students undertaking computer vision courses.
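    The main distance families the book covers can be stated compactly (a sketch assuming plain feature vectors and a given covariance matrix for the Mahalanobis case):

        import numpy as np

        def minkowski(x, y, p):
            return np.sum(np.abs(x - y) ** p) ** (1 / p)

        def cosine_distance(x, y):
            return 1 - x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

        def mahalanobis(x, y, cov):
            d = x - y
            return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

        x, y = np.array([1.0, 2.0, 3.0]), np.array([2.0, 2.0, 1.0])
        print(minkowski(x, y, 1), minkowski(x, y, 2))   # Manhattan, Euclidean
        print(cosine_distance(x, y))
        print(mahalanobis(x, y, np.eye(3)))             # identity cov: Euclidean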

  20. An intelligent method for geographic Web search

    Science.gov (United States)

    Mei, Kun; Yuan, Ying

    2008-10-01

    As the electronically available information on the World-Wide Web grows explosively, the difficulty of finding relevant information also increases for search engine users. In this paper we discuss how to constrain web queries geographically. A number of search queries are associated with geographical locations, either explicitly or implicitly. Accurately and effectively detecting the locations that search queries are truly about has huge potential impact on increasing search relevance, bringing better targeted search results, and improving search user satisfaction. Our approach is novel both in the way geographic information is extracted from the web and, as far as we can tell, in the way it is integrated into query processing. This paper gives an overview of a spatially aware search engine for semantic querying of web documents. It also illustrates algorithms for extracting locations from web documents and query requests, using location ontologies to encode and reason about the formal semantics of geographic web search. Based on a real-world scenario of tourism guide search, the application of our approach shows that geographic information retrieval can be efficiently supported.

  1. Landscape similarity, retrieval, and machine mapping of physiographic units

    Science.gov (United States)

    Jasiewicz, Jaroslaw; Netzel, Pawel; Stepinski, Tomasz F.

    2014-09-01

    We introduce landscape similarity - a numerical measure that assesses affinity between two landscapes on the basis of similarity between the patterns of their constituent landform elements. Such a similarity function provides core technology for a landscape search engine - an algorithm that parses the topography of a study area and finds all places with landscapes broadly similar to a landscape template. A landscape search can yield answers to a query in real time, enabling a highly effective means to explore large topographic datasets. In turn, a landscape search facilitates auto-mapping of physiographic units within a study area. The country of Poland serves as a test bed for these novel concepts. The topography of Poland is given by a 30 m resolution DEM. The geomorphons method is applied to this DEM to classify the topography into ten common types of landform elements. A local landscape is represented by a square tile cut out of a map of landform elements. A histogram of cell-pair features is used to succinctly encode the composition and texture of a pattern within a local landscape. The affinity between two local landscapes is assessed using the Wave-Hedges similarity function applied to the two corresponding histograms. For a landscape search the study area is organized into a lattice of local landscapes. During the search the algorithm calculates the similarity between each local landscape and a given query. Our landscape search for Poland is implemented as a GeoWeb application called TerraEx-Pl and is available at http://sil.uc.edu/. Given a sample, or a number of samples, from a target physiographic unit the landscape search delineates this unit using the principles of supervised machine learning. Repeating this procedure for all units yields a complete physiographic map. The application of this methodology to topographic data of Poland results in the delineation of nine physiographic units. The resultant map bears a close resemblance to a conventional
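    The histogram comparison at the core of the search can be sketched as follows (one common form of the Wave-Hedges similarity; skipping jointly empty bins is our assumption):

        import numpy as np

        # Per-bin min/max ratios averaged over the occupied bins; equals 1.0
        # for identical histograms and tends to 0 for disjoint ones.
        def wave_hedges_similarity(h1, h2):
            h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
            hi, lo = np.maximum(h1, h2), np.minimum(h1, h2)
            mask = hi > 0
            return float(np.mean(lo[mask] / hi[mask]))

        a = np.array([0.2, 0.5, 0.3, 0.0])   # cell-pair feature histograms
        b = np.array([0.1, 0.6, 0.3, 0.0])
        print(wave_hedges_similarity(a, b))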

  2. Haptic Influence on Visual Search

    Directory of Open Access Journals (Sweden)

    Marcia Grabowecky

    2011-10-01

    Full Text Available Information from different sensory modalities influences perception and attention in tasks such as visual search. We have previously reported identity-based crossmodal influences of audition on visual search (Iordanescu, Guzman-Martinez, Grabowecky, & Suzuki, 2008; Iordanescu, Grabowecky, Franconeri, Theeuwes, & Suzuki, 2010; Iordanescu, Grabowecky, & Suzuki, 2011). Here, we extend those results and demonstrate a novel crossmodal interaction between haptic shape information and visual attention. Manually-explored, but unseen, shapes facilitated visual search for similarly-shaped objects. This effect manifests as a reduction in both overall search times and initial saccade latencies when the haptic shape (e.g., a sphere) is consistent with a visual target (e.g., an orange) compared to when it is inconsistent with a visual target (e.g., a hockey puck). This haptic-visual interaction occurred even though the manually held shapes were not predictive of the visual target's shape or location, suggesting that the interaction occurs automatically. Furthermore, when the haptic shape was consistent with a distracter in the visual search array (instead of with the target), initial saccades toward the target were disrupted. Together, these results demonstrate a robust shape-specific haptic influence on visual search.

  3. Slowed Search in the Context of Unimpaired Grouping in Autism: Evidence from Multiple Conjunction Search.

    Science.gov (United States)

    Keehn, Brandon; Joseph, Robert M

    2016-03-01

    In multiple conjunction search, the target is not known in advance but is defined only with respect to the distractors in a given search array, thus reducing the contributions of bottom-up and top-down attentional and perceptual processes during search. This study investigated whether the superior visual search skills typically demonstrated by individuals with autism spectrum disorder (ASD) would be evident in multiple conjunction search. Thirty-two children with ASD and 32 age- and nonverbal IQ-matched typically developing (TD) children were administered a multiple conjunction search task. Contrary to findings from the large majority of studies on visual search in ASD, response times of individuals with ASD were significantly slower than those of their TD peers. Evidence of slowed performance in ASD suggests that the mechanisms responsible for superior ASD performance in other visual search paradigms are not available in multiple conjunction search. Although the ASD group failed to exhibit superior performance, they showed efficient search and intertrial priming levels similar to the TD group. Efficient search indicates that ASD participants were able to group distractors into distinct subsets. In summary, while demonstrating grouping and priming effects comparable to those exhibited by their TD peers, children with ASD were slowed in their performance on a multiple conjunction search task, suggesting that their usual superior performance in visual search tasks is specifically dependent on top-down and/or bottom-up attentional and perceptual processes.

  4. Distance learning for similarity estimation

    NARCIS (Netherlands)

    Yu, J.; Amores, J.; Sebe, N.; Radeva, P.; Tian, Q.

    2008-01-01

    In this paper, we present a general guideline to find a better distance measure for similarity estimation based on statistical analysis of distribution models and distance functions. A new set of distance measures are derived from the harmonic distance, the geometric distance, and their generalized

  5. Comparison of hydrological similarity measures

    Science.gov (United States)

    Rianna, Maura; Ridolfi, Elena; Manciola, Piergiorgio; Napolitano, Francesco; Russo, Fabio

    2016-04-01

    The traditional at-site approach to the statistical characterization and simulation of spatio-temporal precipitation fields has a major recognized drawback: the estimation of rare events suffers from the uncertainty of at-site statistical inference, because of the limited length of records. To overcome the lack of at-site observations, the regional frequency approach uses the idea of substituting space for time to estimate design floods. Conventional regional frequency analysis estimates quantile values at a specific site from a multi-site analysis. The main idea is that homogeneous sites, once pooled together, have similar probability distribution curves of extremes, except for a scaling factor. The method for pooling groups of sites can be based on geographical or climatological considerations. In this work the region of influence (ROI) pooling method is compared with an entropy-based one. The ROI is a flexible pooling approach which defines for each site its own "region", formed by a unique set of similar stations. The similarity is found through the Euclidean distance metric in the attribute space. Here an alternative approach based on entropy is introduced to cluster homogeneous sites. The core idea is that homogeneous sites share a redundant (i.e. similar) amount of information. Homogeneous sites are pooled through a hierarchical selection based on the mutual information index (i.e. a measure of redundancy). The method is tested on precipitation data from the Central Italy area.
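    The entropy-based redundancy score can be sketched with a plain histogram estimator of mutual information (binning choices and the synthetic series below are ours):

        import numpy as np

        def mutual_information(x, y, bins=10):
            pxy, _, _ = np.histogram2d(x, y, bins=bins)
            pxy /= pxy.sum()
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            nz = pxy > 0                      # avoid log(0) terms
            return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

        rng = np.random.default_rng(1)
        a = rng.gamma(2.0, 10.0, 500)         # synthetic precipitation record
        b = a + rng.normal(0, 5, 500)         # a similar, "redundant" station
        c = rng.gamma(2.0, 10.0, 500)         # an unrelated station
        print(mutual_information(a, b), mutual_information(a, c))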

  6. Google Ajax Search API

    CERN Document Server

    Fitzgerald, Michael

    2007-01-01

    Use the Google Ajax Search API to integrate web search, image search, local search, and other types of search into your web site by embedding a simple, dynamic search box to display search results in your own web pages using a few lines of JavaScript. For those who do not want to write code, the search wizards and solutions built with the Google Ajax Search API generate code to accomplish common tasks like adding local search results to a Google Maps API mashup, adding video search thumbnails to your web site, or adding a news reel with the latest up-to-date stories to your blog. More advanced users can

  7. HOW DISSIMILARLY SIMILAR ARE BIOSIMILARS?

    Directory of Open Access Journals (Sweden)

    Ramshankar Vijayalakshmi

    2012-05-01

    Full Text Available Biopharmaceuticals are new chemotherapeutic agents called "biosimilars" or "follow-on protein products" by the European Medicines Agency (EMA) and the American regulatory agency (the Food and Drug Administration), respectively. Biosimilars are extremely similar to the reference molecule but not identical, however close their similarities may be. A regulatory framework is therefore in place to assess applications for marketing authorisation of biosimilars. When a biosimilar is similar to the reference biopharmaceutical in terms of safety, quality, and efficacy, it can be registered. It is important to document data from clinical trials with a view to establishing similar safety and efficacy. Where the development time for a generic medicine is around 3 years, a biosimilar takes about 6-9 years. Generic medicines need to demonstrate bioequivalence only, unlike biosimilars, which need to conduct Phase I and Phase III clinical trials. In this review, different biosimilars that are already being used successfully in the field of oncology are discussed, along with their similarities, their differences, and the guidelines to be followed before a clinically informed decision is taken. More importantly, the regulatory guidelines that are operational in India are discussed, with a workflow for making a biosimilar and the relevant dos and don'ts. For a large, populous country like India, where, with improved treatments in all sectors including oncology, our ageing population is increasing, we need more new, cheaper, and effective biosimilars on the market for the health care of this sector. It therefore becomes important to understand the regulatory guidelines and the steps needed to come up with more biosimilars for the existing population; more information is also essential for practicing clinicians to translate these effectively into clinical practice.

  8. AN FFT-BASED SELF-SIMILAR TRAFFIC GENERATOR

    Institute of Scientific and Technical Information of China (English)

    施建俊; 薛质; 诸鸿文

    2001-01-01

    The self-similarity of network traffic has a great influence on performance, but there are few analytical or even numerical solutions for such a model, so simulation becomes the most efficient research method. Fractional Gaussian noise (FGN) is the most widely used self-similar model. This paper presents an FGN generator based on the fast Fourier transform (FFT). The study indicates that this algorithm is fairly fast and accurate.
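    A bare-bones spectral-synthesis generator in the same spirit (an approximation under our assumptions; the paper's algorithm may differ, and exact methods such as Davies-Harte embed the true FGN autocovariance instead):

        import numpy as np

        # Shape a white Gaussian spectrum by the FGN power law S(f) ~ f^(1-2H),
        # then invert with an FFT; H is the Hurst exponent.
        def fgn_spectral(n, hurst, rng=None):
            rng = rng or np.random.default_rng()
            f = np.fft.rfftfreq(n)[1:]                 # positive frequencies
            amp = f ** ((1 - 2 * hurst) / 2)           # sqrt of the power law
            phase = rng.uniform(0, 2 * np.pi, len(f))
            spec = np.concatenate(([0], amp * np.exp(1j * phase)))
            x = np.fft.irfft(spec, n)
            return x / x.std()

        x = fgn_spectral(4096, hurst=0.8)
        print(x.mean(), x.std())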

  9. Rotational invariant similarity measurement for content-based image indexing

    Science.gov (United States)

    Ro, Yong M.; Yoo, Kiwon

    2000-04-01

    We propose a similarity matching technique for content-based image retrieval. The proposed technique is invariant to image rotation. Since image contents for indexing and retrieval may be arbitrarily extracted from a still image or a key frame of video, the rotation-invariant property of the image feature description is important for general application of content-based image indexing and retrieval. In this paper, we propose a rotation-invariant similarity measurement incorporating texture features based on the human visual system (HVS). To reduce computational complexity, we employ hierarchical similarity distance searching. To verify the method, experiments with the MPEG-7 data set are performed.

  10. Sparse Similarity-Based Fisherfaces

    DEFF Research Database (Denmark)

    Fagertun, Jens; Gomez, David Delgado; Hansen, Mads Fogtmann;

    2011-01-01

    In this work, the effect of introducing Sparse Principal Component Analysis within the Similarity-based Fisherfaces algorithm is examined. The technique aims at mimicking the human ability to discriminate faces by projecting the faces in a highly discriminative and easily interpretable way. Pixel...... obtain the same recognition results as the technique in a dense version using only a fraction of the input data. Furthermore, the presented results suggest that using SPCA in the technique offers robustness to occlusions....

  11. Self-similarity Driven Demosaicking

    Directory of Open Access Journals (Sweden)

    Antoni Buades

    2011-06-01

    Full Text Available Digital cameras record only one color component per pixel, red, green or blue. Demosaicking is the process by which one can infer a whole color matrix from such a matrix of values, thus interpolating the two missing color values per pixel. In this article we propose a demosaicking method based on the property of non-local self-similarity of images.
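
    For context, the baseline that self-similarity methods improve upon can be written in a few lines; this is plain bilinear interpolation of the green channel of an RGGB Bayer mosaic, not the paper's non-local method:

        import numpy as np
        from scipy.ndimage import convolve

        def bilinear_green(bayer):
            # Baseline green-channel interpolation for an RGGB Bayer mosaic:
            # green pixels keep their value, red/blue pixels get the average
            # of their four green neighbours.
            gmask = np.zeros(bayer.shape, dtype=bool)
            gmask[0::2, 1::2] = True    # green sites on red rows
            gmask[1::2, 0::2] = True    # green sites on blue rows
            k = np.array([[0, .25, 0], [.25, 1, .25], [0, .25, 0]])
            g = np.where(gmask, bayer, 0.0)
            return convolve(g, k) / convolve(gmask.astype(float), k)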

  12. Laboratory Building for Accurate Determination of Plutonium

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The accurate determination of plutonium is one of the most important assay techniques for nuclear fuel; it is also key to chemical measurement transfer and the basis of the nuclear material balance.

  13. Self-Similar Collisionless Shocks

    CERN Document Server

    Katz, Boaz; Keshet, Uri; Waxman, Eli

    2006-01-01

    Observations of gamma-ray burst afterglows suggest that the correlation length of magnetic field fluctuations downstream of relativistic non-magnetized collisionless shocks grows with distance from the shock to scales much larger than the plasma skin depth. We argue that this indicates that the plasma properties are described by a self-similar solution, and derive constraints on the scaling properties of the solution. For example, we find that the characteristic magnetic field amplitude scales with distance D from the shock as B \propto D^{s_B} with -1 \le s_B \le 0, and that field correlations scale as x^{2s_B} (for x >> D). We show that the plasma may be approximated as a combination of two self-similar components: a kinetic component of energetic particles and an MHD-like component representing "thermal" particles. We argue that the latter may be considered as infinitely conducting, in which case s_B = 0 and the scalings are completely determined (e.g. dn/dE \propto E^{-2} and B \propto D^0). Similar claims apply to non-relativistic shocks such a...

  14. Roget's Thesaurus and Semantic Similarity

    CERN Document Server

    Jarmasz, Mario

    2012-01-01

    We have implemented a system that measures semantic similarity using a computerized 1987 Roget's Thesaurus, and evaluated it by performing a few typical tests. We compare the results of these tests with those produced by WordNet-based similarity measures. One of the benchmarks is Miller and Charles' list of 30 noun pairs to which human judges had assigned similarity measures. We correlate these measures with those computed by several NLP systems. The 30 pairs can be traced back to Rubenstein and Goodenough's 65 pairs, which we have also studied. Our Roget's-based system gets correlations of .878 for the smaller and .818 for the larger list of noun pairs; this is quite close to the .885 that Resnik obtained when he employed humans to replicate the Miller and Charles experiment. We further evaluate our measure by using Roget's and WordNet to answer 80 TOEFL, 50 ESL and 300 Reader's Digest questions: the correct synonym must be selected amongst a group of four words. Our system gets 78.75%, 82.00% and 74.33% of ...

  15. Gait Recognition Using Image Self-Similarity

    Directory of Open Access Journals (Sweden)

    Cutler Ross G

    2004-01-01

    Full Text Available Gait is one of the few biometrics that can be measured at a distance, and is hence useful for passive surveillance as well as biometric applications. Gait recognition research is still in its infancy, however, and we have yet to solve the fundamental issue of finding gait features that at once have sufficient discriminative power and can be extracted robustly and accurately from low-resolution video. This paper describes a novel gait recognition technique based on the image self-similarity of a walking person. We contend that the similarity plot encodes a projection of gait dynamics. It is also correspondence-free, robust to segmentation noise, and works well with low-resolution video. The method is tested on multiple data sets of varying sizes and degrees of difficulty. Performance is best for fronto-parallel viewpoints, where a recognition rate of 98% is achieved for a data set of 6 people, and 70% for a data set of 54 people.
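
    The self-similarity plot itself is easy to sketch; the following assumes the frames have already been tracked, aligned, and size-normalized, as the paper requires:

        import numpy as np

        def self_similarity_plot(frames):
            # Entry (i, j) is the mean absolute difference between silhouette
            # frames i and j; gait periodicity shows up as a regular lattice
            # of minima in the resulting matrix.
            f = np.asarray(frames, dtype=float).reshape(len(frames), -1)
            n = len(f)
            s = np.zeros((n, n))
            for i in range(n):
                s[i] = np.abs(f - f[i]).mean(axis=1)
            return s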

  16. Understanding the Code: keeping accurate records.

    Science.gov (United States)

    Griffith, Richard

    2015-10-01

    In his continuing series looking at the legal and professional implications of the Nursing and Midwifery Council's revised Code of Conduct, Richard Griffith discusses the elements of accurate record keeping under Standard 10 of the Code. This article considers the importance of accurate record keeping for the safety of patients and protection of district nurses. The legal implications of records are explained along with how district nurses should write records to ensure these legal requirements are met.

  17. Simulated Milky Way analogues: implications for dark matter indirect searches

    CERN Document Server

    Calore, F; Lovell, M; Bertone, G; Schaller, M; Frenk, C S; Crain, R A; Schaye, J; Theuns, T; Trayford, J W

    2015-01-01

    We study high-resolution hydrodynamic simulations of Milky Way type galaxies obtained within the "Evolution and Assembly of GaLaxies and their Environments" (EAGLE) project, and identify those that best satisfy observational constraints on the Milky Way's total stellar mass, rotation curve, and galaxy shape. Contrary to mock galaxies selected on the basis of their total virial mass, the Milky Way analogues so identified consistently exhibit very similar dark matter profiles inside the solar circle, thereby enabling more accurate predictions for indirect dark matter searches. We find in particular that high-resolution simulated haloes satisfying observational constraints exhibit, within the inner few kiloparsecs, dark matter profiles shallower than those required to explain the so-called Fermi GeV excess via dark matter annihilation.

  18. Relativistic mergers of black hole binaries have large, similar masses, low spins and are circular

    Science.gov (United States)

    Amaro-Seoane, Pau; Chen, Xian

    2016-05-01

    Gravitational waves are a prediction of general relativity, and with ground-based detectors now running in their advanced configuration, we will soon be able to measure them directly for the first time. Binaries of stellar-mass black holes are among the most interesting sources for these detectors. Unfortunately, the many different parameters associated with the problem make it difficult to promptly produce a large set of waveforms for the search in the data stream. To reduce the number of templates to develop, one must restrict some of the physical parameters to a certain range of values predicted by either (electromagnetic) observations or theoretical modelling. In this work, we show that `hyperstellar' black holes (HSBs) with masses 30 ≲ MBH/M⊙ ≲ 100, i.e. black holes significantly larger than the nominal 10 M⊙, will have an associated low value for the spin, i.e. a < 0.5. We prove that this is true regardless of the formation channel, and that when two HSBs form a binary, each of the spin magnitudes is also low, and the binary members have similar masses. We also address the distribution of the eccentricities of HSB binaries in dense stellar systems using a large suite of three-body scattering experiments that include binary-single interactions and long-lived hierarchical systems with a highly accurate integrator, including relativistic corrections up to O(1/c^5). We find that most sources in the detector band will have nearly zero eccentricities. This correlation between large, similar masses, low spin and low eccentricity will help to accelerate the searches for gravitational-wave signals.

  19. Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2009-09-01

    Full Text Available Abstract Background The conservation of sequences between related genomes has long been recognised as an indication of functional significance, and recognition of sequence homology is one of the principal approaches used in the annotation of newly sequenced genomes. In the context of recent findings that the number of non-coding transcripts in higher organisms is likely to be much higher than previously imagined, discrimination between conserved coding and non-coding sequences is a topic of considerable interest. Additionally, it should be considered desirable to discriminate between coding and non-coding conserved sequences without recourse to sequence similarity searches of protein databases, as such approaches exclude the identification of novel conserved proteins without characterized homologs and may be influenced by the presence in databases of sequences which are erroneously annotated as coding. Results Here we present a machine learning-based approach for the discrimination of conserved coding sequences. Our method calculates various statistics related to the evolutionary dynamics of two aligned sequences. These features are considered by a Support Vector Machine, which designates the alignment as coding or non-coding with an associated probability score. Conclusion We show that our approach is both sensitive and accurate with respect to comparable methods and illustrate several situations in which it may be applied, including the identification of conserved coding regions in genome sequences and the discrimination of coding from non-coding cDNA sequences.
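
    A minimal sketch of the classification step using scikit-learn follows; the feature values here are random placeholders standing in for the paper's evolutionary-dynamics statistics, so only the shape of the workflow is meaningful:

        import numpy as np
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        # Stand-in feature vectors per aligned pair (e.g. substitution-rate
        # and indel statistics in the real system); 1 = coding, 0 = non-coding.
        X_train = rng.random((200, 5))
        y_train = rng.integers(0, 2, 200)

        clf = SVC(probability=True).fit(X_train, y_train)
        p_coding = clf.predict_proba(rng.random((3, 5)))[:, 1]
        print(p_coding)  # probability score that each alignment is coding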

  20. Web Search Engines

    OpenAIRE

    Rajashekar, TB

    1998-01-01

    The World Wide Web is emerging as an all-in-one information source. Tools for searching Web-based information include search engines, subject directories and meta search tools. We take a look at key features of these tools and suggest practical hints for effective Web searching.

  1. Sound Search Engine Concept

    DEFF Research Database (Denmark)

    2006-01-01

    Sound search is provided by the major search engines; however, indexing is text based, not sound based. We will establish a dedicated sound search service based on sound feature indexing. The current demo shows the concept of the sound search engine. The first engine will be released June...

  2. White Noise in Quantum Random Walk Search Algorithm

    Institute of Scientific and Technical Information of China (English)

    MA Lei; DU Jiang-Feng; LI Yun; LI Hui; KWEK L. C.; OH C. H.

    2006-01-01

    The quantum random walk is a possible approach to constructing new quantum search algorithms. It has been shown by Shenvi et al. [Phys. Rev. A 67 (2003) 52307] that an algorithm of this kind can perform an oracle search on a database of N items with O(√N) calls to the oracle, yielding a speedup similar to other quantum search algorithms.

  3. Dynamic Search and Working Memory in Social Recall

    Science.gov (United States)

    Hills, Thomas T.; Pachur, Thorsten

    2012-01-01

    What are the mechanisms underlying search in social memory (e.g., remembering the people one knows)? Do the search mechanisms involve dynamic local-to-global transitions similar to semantic search, and are these transitions governed by the general control of attention, associated with working memory span? To find out, we asked participants to…

  4. Fast and accurate determination of modularity and its effect size

    CERN Document Server

    Treviño, Santiago; Del Genio, Charo I; Bassler, Kevin E

    2014-01-01

    We present a fast spectral algorithm for community detection in complex networks. Our method searches for the partition with the maximum value of the modularity via the interplay of several refinement steps that include both agglomeration and division. We validate the accuracy of the algorithm by applying it to several real-world benchmark networks. On all of these, our algorithm performs as well as or better than any other known polynomial scheme. This allows us to extensively study the modularity distribution in ensembles of Erdős–Rényi networks, producing theoretical predictions for means and variances inclusive of finite-size corrections. Our work provides a way to accurately estimate the effect size of modularity, providing a z-score measure of it and enabling a more informative comparison of networks with different numbers of nodes and links.
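
    A rough sketch of the effect-size idea: score a network's modularity against an ensemble of Erdős–Rényi graphs with the same number of nodes and links. The greedy optimizer below is a stand-in for the paper's spectral algorithm, and the finite-size corrections it derives analytically are replaced here by brute-force sampling:

        import numpy as np
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities, modularity

        def modularity_zscore(g, trials=50, seed=0):
            # z-score of g's modularity against an Erdos-Renyi G(n, m) null
            q = lambda h: modularity(h, greedy_modularity_communities(h))
            n, m = g.number_of_nodes(), g.number_of_edges()
            rng = np.random.default_rng(seed)
            null = [q(nx.gnm_random_graph(n, m, seed=int(rng.integers(1 << 31))))
                    for _ in range(trials)]
            return (q(g) - np.mean(null)) / np.std(null)

        print(modularity_zscore(nx.karate_club_graph()))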

  5. Assessing protein kinase target similarity

    DEFF Research Database (Denmark)

    Gani, Osman A; Thakkar, Balmukund; Narayanan, Dilip

    2015-01-01

    : focussed chemical libraries, drug repurposing, polypharmacological design, to name a few. Protein kinase target similarity is easily quantified by sequence, and its relevance to ligand design includes broad classification by key binding sites, evaluation of resistance mutations, and the use of surrogate......" of sequence and crystal structure information, with statistical methods able to identify key correlates to activity but also here, "the devil is in the details." Examples from specific repurposing and polypharmacology applications illustrate these points. This article is part of a Special Issue entitled...

  6. Large Neighborhood Search

    DEFF Research Database (Denmark)

    Pisinger, David; Røpke, Stefan

    2010-01-01

    Heuristics based on large neighborhood search have recently shown outstanding results in solving various transportation and scheduling problems. Large neighborhood search methods explore a complex neighborhood by use of heuristics. Using large neighborhoods makes it possible to find better...... candidate solutions in each iteration and hence traverse a more promising search path. Starting from the large neighborhood search method, we give an overview of very large scale neighborhood search methods and discuss recent variants and extensions like variable depth search and adaptive large neighborhood...... search....

  7. Mechanisms for similarity based cooperation

    Science.gov (United States)

    Traulsen, A.

    2008-06-01

    Cooperation based on similarity has been discussed since Richard Dawkins introduced the term “green beard” effect. In these models, individuals cooperate based on an arbitrary signal (or tag) such as the famous green beard. Here, two different models for such tag-based cooperation are analysed. As neutral drift is important in both models, a finite population framework is applied. The first model, which we term “cooperative tags”, considers a situation in which groups of cooperators are formed by some joint signal. Defectors adopting the signal and exploiting the group can lead to a breakdown of cooperation. In this case, conditions are derived under which the average abundance of the more cooperative strategy exceeds 50%. The second model considers a situation in which individuals start defecting towards others that are not similar to them. This situation is termed “defective tags”. It is shown that in this case, individuals using tags to cooperate exclusively with their own kind dominate over unconditional cooperators.

  8. A Short Survey of Document Structure Similarity Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Buttler, D

    2004-02-27

    This paper provides a brief survey of document structural similarity algorithms, including the optimal Tree Edit Distance algorithm and various approximation algorithms. The approximation algorithms include the simple weighted tag similarity algorithm, Fourier transforms of the structure, and a new application of the shingle technique to structural similarity. We show three surprising results. First, the Fourier transform technique proves to be the least accurate of the approximation algorithms, while also being the slowest. Second, optimal Tree Edit Distance algorithms may not be the best technique for clustering pages from different sites. Third, the simplest approximation to structure may be the most effective and efficient mechanism for many applications.
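
    As an illustration of the shingle technique applied to structure, one reading is to shingle the start-tag sequence of a document and compare shingle sets with Jaccard similarity (a sketch; the paper's exact construction may differ):

        from html.parser import HTMLParser

        class TagSequence(HTMLParser):
            # Collect the sequence of start tags, ignoring text content
            def __init__(self):
                super().__init__()
                self.tags = []
            def handle_starttag(self, tag, attrs):
                self.tags.append(tag)

        def shingles(html, k=4):
            # Set of k-grams over the document's start-tag sequence
            p = TagSequence()
            p.feed(html)
            t = p.tags
            return {tuple(t[i:i + k]) for i in range(len(t) - k + 1)}

        def structural_similarity(html_a, html_b, k=4):
            # Jaccard similarity of tag-sequence shingles
            a, b = shingles(html_a, k), shingles(html_b, k)
            return len(a & b) / len(a | b) if a | b else 1.0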

  9. The Search Performance Evaluation and Prediction in Exploratory Search

    OpenAIRE

    2016-01-01

    The exploratory search for complex search tasks requires an effective search behavior model to evaluate and predict user search performance. Few studies have investigated the relationship between user search behavior and search performance in exploratory search. This research adopts a mixed approach combining search system development, user search experiment, search query log analysis, and multivariate regression analysis to resolve the knowledge gap. Through this study, it is shown that expl...

  10. MMDB and VAST+: tracking structural similarities between macromolecular complexes.

    Science.gov (United States)

    Madej, Thomas; Lanczycki, Christopher J; Zhang, Dachuan; Thiessen, Paul A; Geer, Renata C; Marchler-Bauer, Aron; Bryant, Stephen H

    2014-01-01

    The computational detection of similarities between protein 3D structures has become an indispensable tool for the detection of homologous relationships, the classification of protein families and functional inference. Consequently, numerous algorithms have been developed that facilitate structure comparison, including rapid searches against a steadily growing collection of protein structures. To this end, NCBI's Molecular Modeling Database (MMDB), which is based on the Protein Data Bank (PDB), maintains a comprehensive and up-to-date archive of protein structure similarities computed with the Vector Alignment Search Tool (VAST). These similarities have been recorded on the level of single proteins and protein domains, comprising in excess of 1.5 billion pairwise alignments. Here we present VAST+, an extension to the existing VAST service, which summarizes and presents structural similarity on the level of biological assemblies or macromolecular complexes. VAST+ simplifies structure neighboring results and shows, for macromolecular complexes tracked in MMDB, lists of similar complexes ranked by the extent of similarity. VAST+ replaces the previous VAST service as the default presentation of structure neighboring data in NCBI's Entrez query and retrieval system. MMDB and VAST+ can be accessed via http://www.ncbi.nlm.nih.gov/Structure.

  11. A similarity-based data warehousing environment for medical images.

    Science.gov (United States)

    Teixeira, Jefferson William; Annibal, Luana Peixoto; Felipe, Joaquim Cezar; Ciferri, Ricardo Rodrigues; Ciferri, Cristina Dutra de Aguiar

    2015-11-01

    A core issue of the decision-making process in the medical field is to support the execution of analytical (OLAP) similarity queries over images in data warehousing environments. In this paper, we focus on this issue. We propose imageDWE, a non-conventional data warehousing environment that enables the storage of intrinsic features taken from medical images in a data warehouse and supports OLAP similarity queries over them. To comply with this goal, we introduce the concept of perceptual layer, which is an abstraction used to represent an image dataset according to a given feature descriptor in order to enable similarity search. Based on this concept, we propose the imageDW, an extended data warehouse with dimension tables specifically designed to support one or more perceptual layers. We also detail how to build an imageDW and how to load image data into it. Furthermore, we show how to process OLAP similarity queries composed of a conventional predicate and a similarity search predicate that encompasses the specification of one or more perceptual layers. Moreover, we introduce an index technique to improve the OLAP query processing over images. We carried out performance tests over a data warehouse environment that consolidated medical images from exams of several modalities. The results demonstrated the feasibility and efficiency of our proposed imageDWE to manage images and to process OLAP similarity queries. The results also demonstrated that the use of the proposed index technique guaranteed a great improvement in query processing.

  12. Interneurons targeting similar layers receive synaptic inputs with similar kinetics.

    Science.gov (United States)

    Cossart, Rosa; Petanjek, Zdravko; Dumitriu, Dani; Hirsch, June C; Ben-Ari, Yehezkel; Esclapez, Monique; Bernard, Christophe

    2006-01-01

    GABAergic interneurons play diverse and important roles in controlling neuronal network dynamics. They are characterized by an extreme heterogeneity morphologically, neurochemically, and physiologically, but a functionally relevant classification is still lacking. Present taxonomy is essentially based on their postsynaptic targets, but a physiological counterpart to this classification has not yet been determined. Using a quantitative analysis based on multidimensional clustering of morphological and physiological variables, we now demonstrate a strong correlation between the kinetics of glutamate and GABA miniature synaptic currents received by CA1 hippocampal interneurons and the laminar distribution of their axons: neurons that project to the same layer(s) receive synaptic inputs with similar kinetics distributions. In contrast, the kinetics distributions of GABAergic and glutamatergic synaptic events received by a given interneuron do not depend upon its somatic location or dendritic arborization. Although the mechanisms responsible for this unexpected observation are still unclear, our results suggest that interneurons may be programmed to receive synaptic currents with specific temporal dynamics depending on their targets and the local networks in which they operate.

  13. Combination of visual and textual similarity retrieval from medical documents.

    Science.gov (United States)

    Eggel, Ivan; Müller, Henning

    2009-01-01

    Medical visual information retrieval has been an active research area over the past ten years as an increasing amount of images are produced digitally and have become available in patient records, scientific literature, and other medical documents. Most visual retrieval systems concentrate on images only, but it has become apparent that the retrieval of similar images alone is of limited interest; rather, the retrieval of similar documents is an important domain. Most medical institutions, as well as the World Health Organization (WHO), produce many complex documents. Searching them, including a visual search, can help find important information and also facilitates the reuse of document content and images. The work described in this paper is based on a proposal of the WHO, which produces large numbers of documents from studies and also for training. The majority of these documents are in complex formats such as PDF, Microsoft Word, Excel, or PowerPoint. The goal is to create an information retrieval system that allows easy addition of documents and search by keywords and visual content. For text retrieval, Lucene is used, and for image retrieval, the GNU Image Finding Tool (GIFT). A Web 2.0 interface allows for easy upload as well as simple searching.

  14. Performance Indexes: Similarities and Differences

    Directory of Open Access Journals (Sweden)

    André Machado Caldeira

    2013-06-01

    Full Text Available The investor of today is more rigorous in monitoring a financial asset portfolio. He no longer thinks only in terms of the expected return (one dimension), but in terms of risk and return (two dimensions). This new perception is more complex, since the measurement of risk can vary according to one's perspective; some use the standard deviation for that purpose, while others disagree with this measure and propose alternatives. In addition to this difficulty, there is the problem of how to consider these two dimensions together. The objective of this essay is to study the main performance indexes through an empirical study in order to verify the differences and similarities for a selection of assets. The performance index proposed in Caldeira (2005) is included in this analysis.

  15. SIMILARITIES AND DIFFERENCES BETWEEN COMPANIES

    Directory of Open Access Journals (Sweden)

    NAGY CRISTINA MIHAELA

    2015-05-01

    Full Text Available Similarities between the accounting of companies and the accounting of territorial administrative units include the following: the organization of double-entry accounting, and the accounting method in terms of both fundamental theoretical principles and specific practical tools. The differences between the accounting of companies and that of territorial administrative units refer to: the accounting of territorial administrative units includes, besides general (financial) accounting, budgetary accounting, and the accounts system of budgetary accounting is completely different from that of companies; financial statements of territorial administrative units whose leaders are not principal authorizing officers are submitted to the hierarchically superior body (not to the MPF); the accounts of territorial administrative units are opened at the treasury and financial institutions, accounts at commercial banks being prohibited; equity accounts in territorial administrative units are structured into groups of funds; long-term debts have a specific structure in territorial administrative units (internal local public debt and external local public debt).

  16. Features Based Text Similarity Detection

    CERN Document Server

    Kent, Chow Kok

    2010-01-01

    As the Internet helps us cross cultural borders by providing access to diverse information, plagiarism issues are bound to arise. As a result, plagiarism detection becomes more important in overcoming this issue. Different plagiarism detection tools have been developed based on various detection techniques. Nowadays, the fingerprint matching technique plays an important role in those detection tools. However, in handling large articles, the fingerprint matching technique has some weaknesses, especially in space and time consumption. In this paper, we propose a new approach to detect plagiarism which integrates the fingerprint matching technique with four key features to assist the detection process. The proposed features are capable of choosing the main points or key sentences in the articles to be compared. The selected sentences then undergo the fingerprint matching process in order to detect the similarity between them. Hence, time and space usage for the comparison process is r...

  17. Accurate source location from waves scattered by surface topography

    Science.gov (United States)

    Wang, Nian; Shen, Yang; Flinders, Ashton; Zhang, Wei

    2016-06-01

    Accurate source locations of earthquakes and other seismic events are fundamental in seismology. The location accuracy is limited by several factors, including velocity models, which are often poorly known. In contrast, surface topography, the largest velocity contrast in the Earth, is often precisely mapped at the seismic wavelength (>100 m). In this study, we explore the use of P coda waves generated by scattering at surface topography to obtain high-resolution locations of near-surface seismic events. The Pacific Northwest region is chosen as an example to provide realistic topography. A grid search algorithm is combined with the 3-D strain Green's tensor database to improve search efficiency as well as the quality of hypocenter solutions. The strain Green's tensor is calculated using a 3-D collocated-grid finite difference method on curvilinear grids. Solutions in the search volume are obtained based on the least squares misfit between the "observed" and predicted P and P coda waves. The 95% confidence interval of the solution is provided as an a posteriori error estimation. For shallow events tested in the study, scattering is mainly due to topography in comparison with stochastic lateral velocity heterogeneity. The incorporation of P coda significantly improves solution accuracy and reduces solution uncertainty. The solution remains robust with wide ranges of random noises in data, unmodeled random velocity heterogeneities, and uncertainties in moment tensors. The method can be extended to locate pairs of sources in close proximity by differential waveforms using source-receiver reciprocity, further reducing errors caused by unmodeled velocity structures.
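
    The location step reduces to a compact loop; in the sketch below, predict(x) stands in for waveform synthesis from the precomputed strain Green's tensor database, which is the expensive part the paper's database approach amortizes:

        import numpy as np

        def grid_search_location(obs, predict, grid):
            # Evaluate the least-squares misfit between observed and predicted
            # P / P-coda waveforms at every candidate hypocenter; the minimum
            # gives the solution, and the misfit surface supports confidence
            # estimation around it.
            misfit = np.array([np.sum((obs - predict(x)) ** 2) for x in grid])
            best = int(np.argmin(misfit))
            return grid[best], misfit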

  18. Accurate source location from P waves scattered by surface topography

    Science.gov (United States)

    Wang, N.; Shen, Y.

    2015-12-01

    Accurate source locations of earthquakes and other seismic events are fundamental in seismology. The location accuracy is limited by several factors, including velocity models, which are often poorly known. In contrast, surface topography, the largest velocity contrast in the Earth, is often precisely mapped at the seismic wavelength (> 100 m). In this study, we explore the use of P-coda waves generated by scattering at surface topography to obtain high-resolution locations of near-surface seismic events. The Pacific Northwest region is chosen as an example. The grid search method is combined with the 3D strain Green's tensor database method to improve the search efficiency as well as the quality of the hypocenter solution. The strain Green's tensor is calculated by the 3D collocated-grid finite difference method on curvilinear grids. Solutions in the search volume are then obtained based on the least-squares misfit between the 'observed' and predicted P and P-coda waves. A 95% confidence interval of the solution is also provided as an a posteriori error estimate. We find that the scattered waves are mainly due to topography in comparison with random velocity heterogeneity characterized by the von Kármán-type power spectral density function. When only P wave data are used, the 'best' solution is offset from the real source location, mostly in the vertical direction. The incorporation of P coda significantly improves solution accuracy and reduces its uncertainty. The solution remains robust with a range of random noises in data, un-modeled random velocity heterogeneities, and uncertainties in moment tensors that we tested.

  19. Analysis of a librarian-mediated literature search service.

    Science.gov (United States)

    Friesen, Carol; Lê, Mê-Linh; Cooke, Carol; Raynard, Melissa

    2015-01-01

    Librarian-mediated literature searching is a key service provided at medical libraries. This analysis outlines ten years of data on 19,248 literature searches and describes information on the volume and frequency of search requests, time spent per search, databases used, and professional designations of the patron requestors. Combined with information on best practices for expert searching and evaluations of similar services, these findings were used to form recommendations on the improvement and standardization of a literature search service at a large health library system.

  20. The Search for Another Earth

    Indian Academy of Sciences (India)

    2016-07-01

    Is there life anywhere else in the vast cosmos? Are there planets similar to the Earth? For centuries, these questions baffled curious minds. Either a positive or negative answer, if found one day, would carry a deep philosophical significance for our very existence in the universe. Although the search for extra-terrestrial intelligence was initiated decades ago, a systematic scientific and global quest towards achieving a convincing answer began in 1995 with the discovery of the first confirmed planet orbiting around the solar-type star 51 Pegasi. Since then, astronomers have discovered many exoplanets using two main techniques, radial velocity and transit measurements. In the first part of this article, we shall describe the different astronomical methods through which the extrasolar planets of various kinds are discovered. In the second part of the article we shall discuss the various kinds of exoplanets, in particular the habitable planets discovered till date and the present status of our search for a habitable planet similar to the Earth.

  1. A New Generalized Similarity-Based Topic Distillation Algorithm

    Institute of Scientific and Technical Information of China (English)

    ZHOU Hongfang; DANG Xiaohui

    2007-01-01

    The procedure of hypertext induced topic search (HITS) based on a semantic relation model is analyzed, and the reason for the topic drift of the HITS algorithm is found to be that Web pages are projected onto a wrong latent semantic basis. A new concept, generalized similarity, is introduced and, based on this, a new topic distillation algorithm, GSTDA (generalized similarity based topic distillation algorithm), is presented to improve the quality of topic distillation. GSTDA is applied not only to avoid topic drift, but also to explore topics related to the user query. The experimental results on 10 queries show that GSTDA reduces the topic drift rate by 10% to 58% compared to that of the HITS algorithm, and discovers several related topics for queries that have multiple meanings.

  2. Semantic Features for Classifying Referring Search Terms

    Energy Technology Data Exchange (ETDEWEB)

    May, Chandler J.; Henry, Michael J.; McGrath, Liam R.; Bell, Eric B.; Marshall, Eric J.; Gregory, Michelle L.

    2012-05-11

    When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors' needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests by country of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods, which are susceptible to obfuscation by dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.
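
    A minimal sketch of such a referrer-term classifier with scikit-learn; the example terms, labels, and purely lexical features are invented for illustration (the paper's point is that adding semantic features improves on a baseline like this):

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Toy referring search terms with country-of-origin labels
        terms  = ["cheap flights london", "wetter berlin", "paris hotels pas cher",
                  "train times manchester", "bundesliga ergebnisse", "recette crepes"]
        labels = ["UK", "DE", "FR", "UK", "DE", "FR"]

        clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
        clf.fit(terms, labels)
        print(clf.predict(["billets de train paris"]))  # lexically FR-like query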

  3. Optimal directed searches for continuous gravitational waves

    CERN Document Server

    Ming, Jing; Papa, Maria Alessandra; Aulbert, Carsten; Fehrmann, Henning

    2015-01-01

    Wide parameter space searches for long-lived continuous gravitational wave signals are computationally limited. It is therefore critically important that available computational resources are used rationally. In this paper we consider directed searches, i.e. targets for which the sky position is known accurately but the frequency and spindown parameters are completely unknown. Given a list of such potential astrophysical targets, we therefore need to prioritize. On which target(s) should we spend scarce computing resources? What parameter space region in frequency and spindown should we search? Finally, what is the optimal search set-up that we should use? Here we present a general framework that allows us to solve all three of these problems. This framework is based on maximizing the probability of making a detection subject to a constraint on the maximum available computational cost. We illustrate the method for a simplified problem.

  4. Accurate tracking control in LOM application

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The fabrication of an accurate prototype directly from a CAD model in a short time depends on accurate tracking control and reference trajectory planning in Laminated Object Manufacturing (LOM) applications. An improvement in contour accuracy is acquired by the introduction of a tracking controller and a trajectory generation policy. A model of the X-Y positioning system of the LOM machine is developed as the design basis of the tracking controller. The ZPETC (Zero Phase Error Tracking Controller) is used to eliminate single-axis following error and thus reduce the contour error. The simulation is developed on a Matlab model based on a retrofitted LOM machine, and satisfactory results are acquired.

  5. Durham Zoo: Powering a Search-&-Innovation Engine with Collective Intelligence

    Directory of Open Access Journals (Sweden)

    Richard Absalom

    2015-02-01

    graphical representations has been used. Practical implications – Concept searching is seen as having two categories: prior art searching, which is searching for what already exists, and solution searching: a search for a novel solution to an existing problem. Prior art searching is not as efficient a process, as all-encompassing in scope, or as accurate in result, as it could and probably should be. The prior art includes library collections, journals, conference proceedings and everything else that has been written, drawn, spoken or made public in any way. Much technical information is only published in patents. There is a good reason to improve prior art searching: research, industry, and indeed humanity faces the spectre of patent thickets: an impenetrable legal space that effectively hinders innovation rather than promotes it. Improved prior-art searching would help with the gardening and result in fewer and higher-quality patents. Poor-quality patents can reward patenting activity per se, which is not what the system was designed for. Improved prior-art searching could also result in less duplication in research, and/or lead to improved collaboration. As regards solution search, the authors of the paper believe that much better use could be made of the existing literature to find solutions from non-obvious areas of science and technology. The so-called cross-industry innovation could be joined by biomimetics, the inspiration of solutions from nature. Crowdsourcing the concept shorthand could produce a system ‘by the people, for the people’, to quote Abraham Lincoln out of context. A Citizen Science and Technology initiative that developed a working search engine could generate revenue for academia. Any monies accruing could be invested in research for the common good, such as the development of climate change mitigation technologies, or the discovery of new antibiotics. Originality – The authors know of no similar systems in development.

  6. Modeling of Hysteresis in Piezoelectric Actuator Based on Segment Similarity

    Directory of Open Access Journals (Sweden)

    Rui Xiong

    2015-11-01

    Full Text Available To successfully exploit the full potential of piezoelectric actuators in micro/nano positioning systems, it is essential to model their hysteresis behavior accurately. A novel hysteresis model for piezoelectric actuators is proposed in this paper. Firstly, segment-similarity, which describes the similarity relationship between hysteresis curve segments with different turning points, is proposed. Time-scale similarity, which describes the similarity relationship between hysteresis curves with different rates, is used to solve the problem of dynamic effects. The proposed model is formulated using these similarities. Finally, experiments are performed on a micro/nanometer movement platform system. The effectiveness of the proposed model is verified by comparison with the Preisach model. The experimental results show that the proposed model is able to precisely predict the hysteresis trajectories of piezoelectric actuators and performs better than the Preisach model.

  7. Cultural similarity, cultural competence, and nurse workforce diversity.

    Science.gov (United States)

    McGinnis, Sandra L; Brush, Barbara L; Moore, Jean

    2010-11-01

    Proponents of health workforce diversity argue that increasing the number of minority health care providers will enhance cultural similarity between patients and providers as well as the health system's capacity to provide culturally competent care. Measuring cultural similarity has been difficult, however, given that current benchmarks of workforce diversity categorize health workers by major racial/ethnic classifications rather than by cultural measures. This study examined the use of national racial/ethnic categories in both patient and registered nurse (RN) populations and found them to be a poor indicator of cultural similarity. Rather, we found that cultural similarity between RN and patient populations needs to be established at the level of local labor markets and broadened to include other cultural parameters such as country of origin, primary language, and self-identified ancestry. Only then can the relationship between cultural similarity and cultural competence be accurately determined and its outcomes measured.

  8. Accurate Switched-Voltage voltage averaging circuit

    OpenAIRE

    金光, 一幸; 松本, 寛樹

    2006-01-01

    Abstract: This paper proposes an accurate Switched-Voltage (SV) voltage averaging circuit. It is presented to compensate for NMOS mismatch error in a MOS differential-type voltage averaging circuit. The proposed circuit consists of a voltage averaging circuit and an SV sample/hold (S/H) circuit. It can operate using non-overlapping three-phase clocks. The performance of this circuit is verified by PSpice simulations.

  9. Accurate overlaying for mobile augmented reality

    NARCIS (Netherlands)

    Pasman, W; van der Schaaf, A; Lagendijk, RL; Jansen, F.W.

    1999-01-01

    Mobile augmented reality requires accurate alignment of virtual information with objects visible in the real world. We describe a system for mobile communications to be developed to meet these strict alignment criteria using a combination of computer vision, inertial tracking and low-latency renderi

  10. Semantic Web Based Efficient Search Using Ontology and Mathematical Model

    Directory of Open Access Journals (Sweden)

    K.Palaniammal

    2014-01-01

    Full Text Available The semantic web is the forthcoming technology in the world of search engines. Its main focus is search that is meaningful, rather than the purely syntactic search prevailing now. This proposed work concerns semantic search in the educational domain. In this paper, we propose a semantic web based efficient search using an ontology and a mathematical model that takes into account misleading or unmatched service information, lack of relevant domain knowledge, and wrong service queries. To solve these issues, the framework is designed to make three major contributions: an ontology knowledge base, Natural Language Processing (NLP) techniques, and a search model. The ontology knowledge base stores domain-specific service ontologies and service description entity (SDE) metadata. The search model, which incorporates the mathematical model, retrieves SDE metadata efficiently for users in the education domain. The NLP techniques provide spell-checking and synonym-based search. The results are retrieved and stored in an ontology, which prevents data redundancy. The results are more accurate, sensitive to spelling errors, and aware of synonymous context. This approach reduces the user's time and the complexity of finding correct results for a search text, and our model provides more accurate results. A series of experiments is conducted to evaluate both the mechanism and the employed mathematical model.

  11. Predicting consumer behavior with Web search.

    Science.gov (United States)

    Goel, Sharad; Hofman, Jake M; Lahaie, Sébastien; Pennock, David M; Watts, Duncan J

    2010-10-12

    Recent work has demonstrated that Web search volume can "predict the present," meaning that it can be used to accurately track outcomes such as unemployment levels, auto and home sales, and disease prevalence in near real time. Here we show that what consumers are searching for online can also predict their collective future behavior days or even weeks in advance. Specifically we use search query volume to forecast the opening weekend box-office revenue for feature films, first-month sales of video games, and the rank of songs on the Billboard Hot 100 chart, finding in all cases that search counts are highly predictive of future outcomes. We also find that search counts generally boost the performance of baseline models fit on other publicly available data, where the boost varies from modest to dramatic, depending on the application in question. Finally, we reexamine previous work on tracking flu trends and show that, perhaps surprisingly, the utility of search data relative to a simple autoregressive model is modest. We conclude that in the absence of other data sources, or where small improvements in predictive performance are material, search queries provide a useful guide to the near future.
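
    The comparison the paper draws can be sketched as two ordinary-least-squares fits, an autoregressive baseline and the same model augmented with lagged search counts; the function below is illustrative, not the authors' actual specification:

        import numpy as np

        def fit_ar1_with_search(y, s):
            # y: outcome series (e.g. weekly sales); s: matching search counts.
            # Compare an AR(1) baseline with AR(1) plus lagged search volume.
            Y, ylag, slag = y[1:], y[:-1], s[:-1]
            X0 = np.column_stack([np.ones_like(ylag), ylag])          # baseline
            X1 = np.column_stack([np.ones_like(ylag), ylag, slag])    # + search
            r2 = []
            for X in (X0, X1):
                beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
                resid = Y - X @ beta
                r2.append(1 - resid.var() / Y.var())
            return {"ar1_r2": r2[0], "ar1_plus_search_r2": r2[1]}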

  12. A Survey of Meta Search Engines

    Institute of Scientific and Technical Information of China (English)

    张卫丰; 徐宝文; 周晓宇; 李东; 许蕾

    2001-01-01

    With the explosive increase of network information, it is more and more difficult for people to look up information. The emergence of Web search engines overcomes this problem to some degree. However, because different search engines use different mechanisms, scopes and algorithms, the overlap of the search results for the same query is no more than 34%. To get relatively comprehensive, accurate search results, multiple search engines should be used, and hence meta search engines have emerged. In this paper, meta search engines are surveyed. First, the history, the principles and the elements of meta search engines are discussed. Then, the related criteria of meta search engines are analyzed and several typical meta search engines are compared. Finally, on this basis, the trend of the meta search engine is introduced.

  13. Partial Recurrent Laryngeal Nerve Paralysis or Paresis? In Search for the Accurate Diagnosis

    Directory of Open Access Journals (Sweden)

    Alexander Delides

    2015-01-01

    Full Text Available “Partial paralysis” of the larynx is a term often used to describe a hypomobile vocal fold as is the term “paresis.” We present a case of a dysphonic patient with a mobility disorder of the vocal fold, for whom idiopathic “partial paralysis” was the diagnosis made after laryngeal electromyography, and discuss a proposition for a different implementation of the term.

  14. Keep Searching and You’ll Find

    DEFF Research Database (Denmark)

    Laursen, Keld

    2012-01-01

    triggers for different kinds of search. It argues that the initial focus on local search was a consequence, in part, of the attention in evolutionary economics to path-dependent behavior, but that as localized behavior was increasingly accepted as the standard mode, studies began to question whether local...... search was the best solution in all cases. More recently, the literature has focused on the trade-offs being created, by firms having to balance local and non-local search. We account also for the apparent “variety paradox” in the stylized fact that organizations within the same industry tend to follow...... different search strategies, but end up with very similar technological profiles in fast-growing technologies. The article concludes by highlighting what we have learnt from the literature and suggesting some new avenues for research....

  15. Searching Databases with Keywords

    Institute of Scientific and Technical Information of China (English)

    Shan Wang; Kun-Long Zhang

    2005-01-01

    Traditionally, the SQL query language is used to search data in databases. However, it is inappropriate for end-users, since it is complex and hard to learn. End-users need to search databases with keywords, as in web search engines. This paper presents a survey of work on keyword search in databases. It also includes a brief introduction to the SEEKER system, which has been developed.

  16. Integrated vs. Federated Search

    DEFF Research Database (Denmark)

    Løvschall, Kasper

    2009-01-01

    Presentation on the differences and similarities between integrated and federated search in a library context. Given at the theme day "Integrated Search - samsøgning i alle kilder" (federated searching across all sources) at Danmarks Biblioteksskole on 22 January 2009.

  17. Searching and Indexing Genomic Databases via Kernelization

    Directory of Open Access Journals (Sweden)

    Travis Gagie

    2015-02-01

    Full Text Available The rapid advance of DNA sequencing technologies has yielded databases of thousands of genomes. To search and index these databases effectively, it is important that we take advantage of the similarity between those genomes. Several authors have recently suggested searching or indexing only one reference genome and the parts of the other genomes where they differ. In this paper we survey the twenty-year history of this idea and discuss its relation to kernelization in parameterized complexity.
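
    In sketch form, the idea is to scan the shared reference once and then only the short regions where each genome differs; the diffs format below is invented for illustration, and matches straddling a variant boundary would need context padding that is omitted here:

        def kernelized_search(pattern, reference, diffs):
            # diffs maps genome name -> list of (position, variant_sequence).
            # The reference is scanned once; each genome contributes only the
            # regions where it departs from the reference.
            k = len(pattern)
            hits = {"reference": [i for i in range(len(reference) - k + 1)
                                  if reference[i:i + k] == pattern]}
            for genome, regions in diffs.items():
                hits[genome] = [(pos, i) for pos, seq in regions
                                for i in range(len(seq) - k + 1)
                                if seq[i:i + k] == pattern]
            return hits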

  18. The Information Search

    Science.gov (United States)

    Doraiswamy, Uma

    2011-01-01

    This paper in the form of story discusses a college student's information search process. In this story we see Kuhlthau's information search process: initiation, selection, exploration, formulation, collection, and presentation. Katie is a student who goes in search of information for her class research paper. Katie's class readings, her interest…

  19. Search and the city

    NARCIS (Netherlands)

    P.A. Gautier; C.N. Teulings

    2009-01-01

    We develop a model of an economy with several regions, which differ in scale. Within each region, workers have to search for a job-type that matches their skill. They face a trade-off between match quality and the cost of extended search. This trade-off differs between regions, because search is mor

  20. How doctors search

    DEFF Research Database (Denmark)

    Lykke, Marianne; Price, Susan; Delcambre, Lois

    2012-01-01

    to context-specific aspects of the main topic of the documents. We have tested the model in an interactive searching study with family doctors with the purpose to explore doctors’ querying behaviour, how they applied the means for specifying a search, and how these features contributed to the search outcome...

  1. Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches.

    Directory of Open Access Journals (Sweden)

    Fumio Matsuda

    Full Text Available BACKGROUND: In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for evaluating false discovery rates (FDR) is presented in this study. Based on the FDR analyses, several aspects of elemental composition searching are discussed, including how to set a threshold, how to estimate the FDR, and which types of elemental composition databases are most reliable for searching. METHODOLOGY/PRINCIPAL FINDINGS: The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF) MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that the FDR could be improved by using a compound database with fewer but more complete entries. CONCLUSIONS/SIGNIFICANCE: High-accuracy mass analysis, such as Fourier transform (FT) MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.
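
    One ingredient of such an analysis is the chance hit rate of random masses against the database, which a short Monte Carlo run estimates; this sketch simplifies the paper's four-parameter model to a single decoy-query simulation:

        import numpy as np

        def chance_hit_rate(db_masses, n_trials=10000, tol_ppm=5,
                            lo=100.0, hi=1000.0, seed=0):
            # Rate at which uniformly random m/z values hit the database
            # within tol_ppm; together with the observed hit rate of real
            # queries this bounds the FDR (roughly chance_rate / observed_rate).
            rng = np.random.default_rng(seed)
            db = np.sort(np.asarray(db_masses, dtype=float))
            q = rng.uniform(lo, hi, n_trials)
            idx = np.clip(np.searchsorted(db, q), 1, len(db) - 1)
            nearest = np.minimum(np.abs(db[idx] - q), np.abs(db[idx - 1] - q))
            return float(np.mean(nearest / q * 1e6 <= tol_ppm))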

  2. Efficient Proposed Framework for Semantic Search Engine using New Semantic Ranking Algorithm

    Directory of Open Access Journals (Sweden)

    M. M. El-gayar

    2015-08-01

    Full Text Available The amount of information grows by billions of database records every year, and there is an urgent need to search that information with a specialized tool, the search engine. Many search engines are available today, but their main challenge is that most cannot retrieve meaningful information intelligently. Semantic web technology is a solution that keeps data in a machine-readable format, helping machines match that data smartly with related information based on meaning. In this paper, we introduce a proposed semantic framework that includes four phases: crawling, indexing, ranking and retrieval. This semantic framework operates over sorted RDF using an efficient proposed ranking algorithm and an enhanced crawling algorithm. The enhanced crawling algorithm crawls relevant forum content from the web with minimal overhead. The proposed ranking algorithm orders and evaluates similar meaningful data so that the retrieval process becomes faster, easier and more accurate. We applied our work to a standard database and achieved 99 percent semantic performance effectiveness in minimal time, with an error rate of less than 1 percent, compared with other semantic systems.

  3. Accurate colorimetric feedback for RGB LED clusters

    Science.gov (United States)

    Man, Kwong; Ashdown, Ian

    2006-08-01

    We present an empirical model of LED emission spectra that is applicable to both InGaN and AlInGaP high-flux LEDs, and which accurately predicts their relative spectral power distributions over a wide range of LED junction temperatures. We further demonstrate with laboratory measurements that changes in LED spectral power distribution with temperature can be accurately predicted with first- or second-order equations. This provides the basis for a real-time colorimetric feedback system for RGB LED clusters that can maintain the chromaticity of white light at constant intensity to within +/-0.003 Δuv over a range of 45 degrees Celsius, and to within 0.01 Δuv when dimmed over an intensity range of 10:1.
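
    The first- or second-order prediction reduces to a polynomial fit; in this sketch the temperature and chromaticity values are invented for illustration:

        import numpy as np

        # Illustrative measurements of one channel's chromaticity coordinate
        # u' versus junction temperature (values invented for the sketch).
        T = np.array([25.0, 35.0, 45.0, 55.0, 65.0, 70.0])   # deg C
        u = np.array([0.2526, 0.2531, 0.2538, 0.2547, 0.2558, 0.2564])

        model = np.poly1d(np.polyfit(T, u, deg=2))   # second-order model
        print(model(50.0))                           # predicted u' at 50 deg C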

  4. A toolbox for representational similarity analysis.

    Directory of Open Access Journals (Sweden)

    Hamed Nili

    2014-04-01

    Full Text Available Neuronal population codes are increasingly being investigated with multivariate pattern-information analyses. A key challenge is to use measured brain-activity patterns to test computational models of brain information processing. One approach to this problem is representational similarity analysis (RSA, which characterizes a representation in a brain or computational model by the distance matrix of the response patterns elicited by a set of stimuli. The representational distance matrix encapsulates what distinctions between stimuli are emphasized and what distinctions are de-emphasized in the representation. A model is tested by comparing the representational distance matrix it predicts to that of a measured brain region. RSA also enables us to compare representations between stages of processing within a given brain or model, between brain and behavioral data, and between individuals and species. Here, we introduce a Matlab toolbox for RSA. The toolbox supports an analysis approach that is simultaneously data- and hypothesis-driven. It is designed to help integrate a wide range of computational models into the analysis of multichannel brain-activity measurements as provided by modern functional imaging and neuronal recording techniques. Tools for visualization and inference enable the user to relate sets of models to sets of brain regions and to statistically test and compare the models using nonparametric inference methods. The toolbox supports searchlight-based RSA, to continuously map a measured brain volume in search of a neuronal population code with a specific geometry. Finally, we introduce the linear-discriminant t value as a measure of representational discriminability that bridges the gap between linear decoding analyses and RSA. In order to demonstrate the capabilities of the toolbox, we apply it to both simulated and real fMRI data. The key functions are equally applicable to other modalities of brain-activity measurement. The
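
    The core RSA computation is compact; the sketch below builds correlation-distance RDMs and rank-correlates them, which is the comparison at the heart of the toolbox (written here with SciPy rather than the toolbox itself):

        import numpy as np
        from scipy.spatial.distance import pdist
        from scipy.stats import spearmanr

        def rdm(patterns):
            # Representational dissimilarity matrix: correlation distance
            # between the response patterns (stimuli x channels) elicited
            # by each stimulus; returned in condensed (upper-triangle) form.
            return pdist(patterns, metric="correlation")

        def compare_rdms(brain_patterns, model_patterns):
            # Test a model by rank-correlating its RDM with a brain RDM
            rho, p = spearmanr(rdm(brain_patterns), rdm(model_patterns))
            return rho, p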

  5. Efficient Accurate Context-Sensitive Anomaly Detection

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    For program behavior-based anomaly detection, the only way to ensure accurate monitoring is to construct an efficient and precise program behavior model. A new program behavior-based anomaly detection model,called combined pushdown automaton (CPDA) model was proposed, which is based on static binary executable analysis. The CPDA model incorporates the optimized call stack walk and code instrumentation technique to gain complete context information. Thereby the proposed method can detect more attacks, while retaining good performance.

  6. On accurate determination of contact angle

    Science.gov (United States)

    Concus, P.; Finn, R.

    1992-01-01

    Methods are proposed that exploit a microgravity environment to obtain highly accurate measurement of contact angle. These methods, which are based on our earlier mathematical results, do not require detailed measurement of a liquid free-surface, as they incorporate discontinuous or nearly-discontinuous behavior of the liquid bulk in certain container geometries. Physical testing is planned in the forthcoming IML-2 space flight and in related preparatory ground-based experiments.

  7. Accurate Control of Josephson Phase Qubits

    Science.gov (United States)

    2016-04-14

    Physical Review B 68, 224518 (2003). Accurate control of Josephson phase qubits. Matthias Steffen, John M. Martinis, and Isaac L. Chuang. Center for Bits and Atoms and Department of Physics, MIT, Cambridge, Massachusetts 02139, USA; Solid State and Photonics Laboratory, Stanford University. [Only citation fragments of the scanned report survive, among them: K. Kraus, States, Effects, and Operations: Fundamental Notions of Quantum Theory, Lecture Notes in Physics, Vol. 190 (Springer-Verlag).]

  8. Accurate guitar tuning by cochlear implant musicians.

    Science.gov (United States)

    Lu, Thomas; Huang, Juan; Zeng, Fan-Gang

    2014-01-01

    Modern cochlear implant (CI) users understand speech but find difficulty in music appreciation due to poor pitch perception. Still, some deaf musicians continue to perform with their CI. Here we show unexpected results that CI musicians can reliably tune a guitar by CI alone and, under controlled conditions, match simultaneously presented tones. Analysis showed that accurate tuning was achieved by listening to beats rather than by discriminating pitch, effectively turning a spectral task into a temporal discrimination task.

  9. Synthesizing Accurate Floating-Point Formulas

    OpenAIRE

    Ioualalen, Arnault; Martel, Matthieu

    2013-01-01

    Many critical embedded systems perform floating-point computations yet their accuracy is difficult to assert and strongly depends on how formulas are written in programs. In this article, we focus on the synthesis of accurate formulas mathematically equal to the original formulas occurring in source codes. In general, an expression may be rewritten in many ways. To avoid any combinatorial explosion, we use an intermediate representation, called APEG, enabling us to rep...

  10. Automatic Planning of External Search Engine Optimization

    Directory of Open Access Journals (Sweden)

    Vita Jasevičiūtė

    2015-07-01

    Full Text Available This paper describes an investigation of an external search engine optimization (SEO) action planning tool, dedicated to automatically extracting a small set of the most important keywords for each month over a whole-year period. The keywords in the set are extracted according to externally measured parameters, such as the average number of searches during the year and for every month individually. Additionally, the position of the optimized web site for each keyword is taken into account. The generated optimization plan is similar to the optimization plans prepared manually by SEO professionals and can be successfully used as a support tool for web site search engine optimization.

  11. Accurate structural correlations from maximum likelihood superpositions.

    Directory of Open Access Journals (Sweden)

    Douglas L Theobald

    2008-02-01

    Full Text Available The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method ("PCA plots") for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology.
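    The final step of the pipeline, PCA of an estimated correlation matrix, is easy to sketch. The toy below substitutes a plain sample correlation for the maximum likelihood estimate the paper describes (that estimator is the paper's contribution and is not reproduced here); the array shapes and names are illustrative.

        import numpy as np

        coords = np.random.rand(50, 30)           # 50 structures x 30 coordinates
        corr = np.corrcoef(coords, rowvar=False)  # 30 x 30 sample correlation matrix
        evals, evecs = np.linalg.eigh(corr)       # symmetric eigendecomposition
        order = np.argsort(evals)[::-1]           # sort modes by variance explained
        modes = evecs[:, order[:3]]               # three dominant correlation modes
        print(modes.shape)                        # (30, 3)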

  12. Cerebral fat embolism: Use of MR spectroscopy for accurate diagnosis

    Directory of Open Access Journals (Sweden)

    Laxmi Kokatnur

    2015-01-01

    Full Text Available Cerebral fat embolism (CFE) is an uncommon but serious complication following orthopedic procedures. It usually presents with altered mental status, and can be a part of fat embolism syndrome (FES) if associated with cutaneous and respiratory manifestations. Because of the presence of other common factors affecting the mental status, particularly in the postoperative period, the diagnosis of CFE can be challenging. Magnetic resonance imaging (MRI) of the brain typically shows multiple lesions distributed predominantly in the subcortical region, which appear as hyperintense lesions on T2 and diffusion weighted images. Although the location offers a clue, the MRI findings are not specific for CFE. Watershed infarcts, hypoxic encephalopathy, disseminated infections, demyelinating disorders and diffuse axonal injury can also show similar changes on brain MRI. The presence of fat in these hyperintense lesions, identified by MR spectroscopy as raised lipid peaks, helps in the accurate diagnosis of CFE. Normal brain tissue, and conditions producing similar MRI changes, will not show any lipid peak on MR spectroscopy. We present a case of CFE initially misdiagnosed as brain stem stroke based on clinical presentation and cranial computed tomography (CT) scan; later, MR spectroscopy elucidated the accurate diagnosis.

  13. Scattering from Rough Surfaces with Extended Self-Similarity

    Institute of Scientific and Technical Information of China (English)

    张延冬; 吴振森

    2002-01-01

    An extended self-similarity (ESS) model is developed by extending the self-similarity condition in fractional Brownian motion (FBM), then the incremental Fourier synthesis algorithm is introduced to generate ESS rough surfaces, and an estimation algorithm is presented to extract the generalized multiscale Hurst parameter, which can also be modified to estimate the Hurst parameter for FBM more accurately. Finally, the scattering coefficient from ESS rough surfaces is calculated with the scalar Kirchhoff approximation, and its variation with the parameters in the ESS model is obtained. Compared with experimental measurements, it can be concluded that the ESS model provides a good tool to model natural rough surfaces.

  14. Similarity-based denoising of point-sampled surface

    Institute of Scientific and Technical Information of China (English)

    Ren-fang WANG; Wen-zhi CHEN; San-yuan ZHANG; Yin ZHANG; Xiu-zi YE

    2008-01-01

    A non-local denoising (NLD) algorithm for point-sampled surfaces (PSSs) is presented based on similarities, including the geometry intensity and features of sample points. Using the trilateral filtering operator, the differential signal of each sample point is determined and called the "geometry intensity". Based on covariance analysis, a regular grid of geometry intensity is constructed for each sample point, and the geometry-intensity similarity of two points is measured according to their grids. Based on mean shift clustering, the PSSs are clustered in terms of local geometry-feature similarity. The smoothed geometry intensity, i.e., offset distance, of each sample point is estimated according to the two similarities. Using the resulting intensity, the noise component is finally removed from the PSSs by adjusting the position of each sample point along its own normal direction. Experimental results demonstrate that the algorithm is robust and produces a more accurate denoising result while better preserving features.

  15. Similarity-Based Classification in Partially Labeled Networks

    Science.gov (United States)

    Zhang, Qian-Ming; Shang, Ming-Sheng; Lü, Linyuan

    Two main difficulties in the problem of classification in partially labeled networks are the sparsity of the known labeled nodes and the inconsistency of label information. To address these two difficulties, we propose a similarity-based method, where the basic assumption is that two nodes are more likely to be categorized into the same class if they are more similar. In this paper, we introduce ten similarity indices defined on the network structure. Empirical results on the co-purchase network of political books show that the similarity-based method can, to some extent, overcome these two difficulties and give more accurate classification than the relational-neighbors method, especially when the labeled nodes are sparse. Furthermore, we find that when the information on known labeled nodes is sufficient, the indices considering only local information can perform as well as the global indices while having much lower computational complexity.
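    As a concrete illustration of the assumption that similar nodes share a class, the sketch below classifies unlabeled nodes with a single local structural index, common neighbors; the paper evaluates ten indices, so this is only a minimal stand-in with made-up data.

        import numpy as np

        def common_neighbors(adj):
            # s_ij = number of shared neighbors of i and j, a local index.
            return adj @ adj

        def classify(adj, labels):
            # Give each unlabeled node (label None) the class it is most
            # similar to in aggregate over the labeled nodes.
            sim, out = common_neighbors(adj), list(labels)
            classes = {c for c in labels if c is not None}
            for i, lab in enumerate(labels):
                if lab is None:
                    out[i] = max(classes, key=lambda c: sum(
                        sim[i, j] for j, l in enumerate(labels) if l == c))
            return out

        adj = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]])
        print(classify(adj, ['a', 'a', None, 'b']))  # -> ['a', 'a', 'a', 'b']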

  16. Personalized Predictive Modeling and Risk Factor Identification using Patient Similarity.

    Science.gov (United States)

    Ng, Kenney; Sun, Jimeng; Hu, Jianying; Wang, Fei

    2015-01-01

    Personalized predictive models are customized for an individual patient and trained using information from similar patients. Compared to global models trained on all patients, they have the potential to produce more accurate risk scores and capture more relevant risk factors for individual patients. This paper presents an approach for building personalized predictive models and generating personalized risk factor profiles. A locally supervised metric learning (LSML) similarity measure is trained for diabetes onset and used to find clinically similar patients. Personalized risk profiles are created by analyzing the parameters of the trained personalized logistic regression models. A 15,000-patient data set, derived from electronic health records, is used to evaluate the approach. The predictive results show that the personalized models can outperform the global model. Cluster analysis of the risk profiles shows groups of patients with similar risk factors, differences in the top risk factors for different groups of patients, and differences between the individual and global risk factors.

  17. Keyword Search in Databases

    CERN Document Server

    Yu, Jeffrey Xu; Chang, Lijun

    2009-01-01

    It has become highly desirable to provide users with flexible ways to query and search information over databases, as simple as Google-style keyword search. This book surveys the recent developments on keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. Such structural information to be returned can be either trees or subgraphs representing how the objects, that contain the required keywords, are interconnected in a relational database or in an XML database. The structural keyword search is completely different from

  18. Faceted Semantic Search for Personalized Social Search

    CERN Document Server

    Mas, Massimiliano Dal

    2012-01-01

    Current social networks (like Facebook, Twitter, Linkedin, ...) need to deal with vagueness and ontological indeterminacy. This paper analyzes the prototyping of a faceted semantic search for personalized social search using the "joint meaning" in a community environment. User research in a "collaborative" environment defined by folksonomies can be supported by the most common features of faceted semantic search. A solution for context-aware personalized search is based on "joint meaning", understood as a joint construal by the creators of the contents and the user of the contents, using a faceted taxonomy with the Semantic Web. A proof-of-concept prototype shows how the proposed methodological approach can also be applied to existing presentation components, built with different languages and/or component technologies.

  19. A New Efficient Method for Calculating Similarity Between Web Services

    Directory of Open Access Journals (Sweden)

    T. RACHAD

    2014-08-01

    Full Text Available Web services allow communication between heterogeneous systems in a distributed environment. Their enormous success and increased use have led to thousands of Web services being present on the Internet. This significant and ever-increasing number of Web services has made them difficult to locate and classify, problems encountered mainly during the operations of web service discovery and substitution. Traditional keyword-based search is not successful in this context: its results do not reflect the structure of Web services, and it considers only the identifiers of the web service description language (WSDL) interface elements. Semantics-based methods (WSDL-S, OWL-S, SAWSDL…), which augment the WSDL description of a Web service with a semantic description, partially address this problem, but their complexity and difficulty delay their adoption in real cases. Measuring the similarity between web service interfaces is the most suitable solution for this kind of problem: it classifies available web services so as to identify those that best match the searched profile and those that do not. Thus, the main goal of this work is to study the degree of similarity between any two web services by offering a new method that is more effective than existing works.

  20. Genetic algorithms as global random search methods

    Science.gov (United States)

    Peck, Charles C.; Dhawan, Atam P.

    1995-01-01

    Genetic algorithm behavior is described in terms of the construction and evolution of the sampling distributions over the space of candidate solutions. This novel perspective is motivated by analysis indicating that the schema theory is inadequate for completely and properly explaining genetic algorithm behavior. Based on the proposed theory, it is argued that the similarities of candidate solutions should be exploited directly, rather than encoding candidate solutions and then exploiting their similarities. Proportional selection is characterized as a global search operator, and recombination is characterized as the search process that exploits similarities. Sequential algorithms and many deletion methods are also analyzed. It is shown that by properly constraining the search breadth of recombination operators, convergence of genetic algorithms to a global optimum can be ensured.
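    The two operators the abstract characterizes, proportional selection as a global search operator and recombination as the exploiter of similarities, fit in a few lines. Below is a minimal, generic genetic algorithm on bit strings with a toy one-max objective; it illustrates the operators only and is not the authors' analysis.

        import random

        def fitness(bits):                    # toy objective: count the ones
            return sum(bits)

        def proportional_select(pop):         # fitness-proportional (global) selection
            total = sum(fitness(p) for p in pop)
            r, acc = random.uniform(0, total), 0.0
            for p in pop:
                acc += fitness(p)
                if acc >= r:
                    return p
            return pop[-1]

        def recombine(a, b):                  # one-point crossover exploits similarities
            cut = random.randrange(1, len(a))
            return a[:cut] + b[cut:]

        pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
        for _ in range(50):
            pop = [recombine(proportional_select(pop), proportional_select(pop))
                   for _ in range(len(pop))]
        print(max(fitness(p) for p in pop))   # approaches 20 as the population converges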

  1. Cascade category-aware visual search.

    Science.gov (United States)

    Zhang, Shiliang; Tian, Qi; Huang, Qingming; Gao, Wen; Rui, Yong

    2014-06-01

    Incorporating image classification into an image retrieval system brings many attractive advantages. For instance, the search space can be narrowed down by rejecting images in categories irrelevant to the query. The retrieved images can be more consistent in semantics by indexing and returning images in the relevant categories together. However, due to their different goals on recognition accuracy and retrieval scalability, it is hard to efficiently incorporate most image classification works into large-scale image search. To study this problem, we propose cascade category-aware visual search, which utilizes a weak category clue to achieve better retrieval accuracy, efficiency, and memory consumption. To capture the category and visual clues of an image, we first learn category-visual words, which are discriminative and repeatable local features labeled with categories. By identifying category-visual words in database images, we are able to discard noisy local features and extract image visual and category clues, which are then recorded in a hierarchical index structure. Our retrieval system narrows down the search space by: 1) filtering the noisy local features in the query; 2) rejecting irrelevant categories in the database; and 3) performing discriminative visual search in relevant categories. The proposed algorithm is tested on object search, landmark search, and large-scale similar image search on the large-scale LSVRC10 data set. Although the category clue introduced is weak, our algorithm still shows substantial advantages in retrieval accuracy, efficiency, and memory consumption over the state-of-the-art.

  2. Automatization and training in visual search.

    Science.gov (United States)

    Czerwinski, M; Lightfoot, N; Shiffrin, R M

    1992-01-01

    In several search tasks, the amount of practice on particular combinations of targets and distractors was equated in varied-mapping (VM) and consistent-mapping (CM) conditions. The results indicate the importance of distinguishing between memory and visual search tasks, and implicate a number of factors that play important roles in visual search and its learning. Visual search was studied in Experiment 1. VM and CM performance were almost equal, and slope reductions occurred during practice for both, suggesting the learning of efficient attentive search based on features, and no important role for automatic attention attraction. However, positive transfer effects occurred when previous CM targets were re-paired with previous CM distractors, even though these targets and distractors had not been trained together. Also, the introduction of a demanding simultaneous task produced advantages of CM over VM. These latter two results demonstrated the operation of automatic attention attraction. Visual search was further studied in Experiment 2, using novel characters for which feature overlap and similarity were controlled. The design and many of the findings paralleled Experiment 1. In addition, enormous search improvement was seen over 35 sessions of training, suggesting the operation of perceptual unitization for the novel characters. Experiment 3 showed a large, persistent advantage for CM over VM performance in memory search, even when practice on particular combinations of targets and distractors was equated in the two training conditions. A multifactor theory of automatization and attention is put forth to account for these findings and others in the literature.

  3. When Gravity Fails: Local Search Topology

    Science.gov (United States)

    Frank, Jeremy; Cheeseman, Peter; Stutz, John; Lau, Sonie (Technical Monitor)

    1997-01-01

    Local search algorithms for combinatorial search problems frequently encounter a sequence of states in which it is impossible to improve the value of the objective function; moves through these regions, called plateau moves, dominate the time spent in local search. We analyze and characterize plateaus for three different classes of randomly generated Boolean Satisfiability problems. We identify several interesting features of plateaus that impact the performance of local search algorithms. We show that local minima tend to be small but occasionally may be very large. We also show that local minima can be escaped without unsatisfying a large number of clauses, but that systematically searching for an escape route may be computationally expensive if the local minimum is large. We show that plateaus with exits, called benches, tend to be much larger than minima, and that some benches have very few exit states which local search can use to escape. We show that the solutions (i.e. global minima) of randomly generated problem instances form clusters, which behave similarly to local minima. We revisit several enhancements of local search algorithms and explain their performance in light of our results. Finally we discuss strategies for creating the next generation of local search algorithms.

  4. University Students' Online Information Searching Strategies in Different Search Contexts

    Science.gov (United States)

    Tsai, Meng-Jung; Liang, Jyh-Chong; Hou, Huei-Tse; Tsai, Chin-Chung

    2012-01-01

    This study investigates the role of search context played in university students' online information searching strategies. A total of 304 university students in Taiwan were surveyed with questionnaires in which two search contexts were defined as searching for learning, and searching for daily life information. Students' online search strategies…

  5. [Advanced online search techniques and dedicated search engines for physicians].

    Science.gov (United States)

    Nahum, Yoav

    2008-02-01

    In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.

  6. Automated Discovery of New Chemical Reactions and Accurate Calculation of Their Rates

    Science.gov (United States)

    2015-06-02

    AFRL-OSR-VA-TR-2015-0169. Automated discovery of new chemical reactions and accurate calculation of their rates. William Green, Massachusetts Institute of Technology. [Only fragments of the scanned report survive, concerning thermochemistry calculations and pruning of the search space: even for a small hydrocarbon system, the number of reaction pathways that can be generated is very large, and a reaction is considered too endothermic to be interesting if its standard enthalpy of reaction (denoted as Hr0 in Tables 2-5) is higher than 20 kcal...]

  7. Multi Agent Architecture for Search Engine

    Directory of Open Access Journals (Sweden)

    Disha Verma

    2016-03-01

    Full Text Available The process of retrieving information is becoming more ambiguous day by day due to the huge collection of documents present on the web. A single keyword produces millions of results related to a given query, but these results are not up to user expectations. The search results produced by traditional text search engines may be relevant or irrelevant. The underlying reason is that Web documents are HTML documents that do not contain semantic descriptors and annotations. This paper proposes a multi-agent architecture to produce fewer but personalized results. The purpose of the research is to provide a platform for domain-specific personalized search. Personalized search allows delivering web pages in accordance with the user's interest and domain. The proposed architecture uses client-side as well as server-side personalization to provide the user with fewer but more accurate personalized results. The multi-agent search engine architecture uses the concept of semantic descriptors for acquiring knowledge about a given domain, leading to personalized search results. Semantic descriptors are represented as a network graph that holds the relationships within a given problem in the form of a hierarchy. This hierarchical classification is termed a taxonomy.

  8. Accurate measurement of unsteady state fluid temperature

    Science.gov (United States)

    Jaremkiewicz, Magdalena

    2017-03-01

    In this paper, two accurate methods for determining the transient fluid temperature are presented. Measurements were conducted for boiling water since its temperature is known. At the beginning the thermometers are at ambient temperature; they are then immediately immersed into saturated water. The measurements were carried out with two thermometers of different construction but with the same housing outer diameter, equal to 15 mm. One of them is a K-type industrial thermometer widely available commercially. The temperature indicated by the thermometer was corrected by treating the thermometer as a first- or second-order inertia device. A new design of thermometer was proposed and also used to measure the temperature of boiling water. Its characteristic feature is a cylinder-shaped housing with the sheath thermocouple located in its center. The temperature of the fluid was determined based on measurements taken in the axis of the solid cylindrical element (housing) using the inverse space marching method. Measurements of the transient temperature of the air flowing through a wind tunnel using the same thermometers were also carried out. The proposed measurement technique provides more accurate results than measurements using industrial thermometers in conjunction with a simple temperature correction based on a first- or second-order inertial thermometer model. By comparing the results, it was demonstrated that the new thermometer obtains the fluid temperature much faster and with higher accuracy than the industrial thermometer. Accurate measurement of fast-changing fluid temperature is possible thanks to the low-inertia thermometer and the fast space marching method applied to solve the inverse heat conduction problem.
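    The first-order inertia correction mentioned above amounts to adding tau times the time derivative of the indicated temperature. A minimal sketch, assuming a hypothetical time constant tau = 1 s and readings synthesized from a step to 100 degrees C (none of these numbers come from the paper):

        # First-order correction: T_fluid(t) ~= T_ind(t) + tau * dT_ind/dt,
        # with the derivative estimated by a centered difference.
        def correct_first_order(times, readings, tau=1.0):
            corrected = []
            for i in range(1, len(readings) - 1):
                dTdt = (readings[i + 1] - readings[i - 1]) / (times[i + 1] - times[i - 1])
                corrected.append(readings[i] + tau * dTdt)
            return corrected

        t = [0.0, 0.5, 1.0, 1.5, 2.0]
        ind = [20.0, 51.5, 70.6, 82.2, 89.2]   # sluggish step response toward 100 C
        print(correct_first_order(t, ind))     # recovers values close to 100 C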

  9. The Hofmethode: Computing Semantic Similarities between E-Learning Products

    Directory of Open Access Journals (Sweden)

    Oliver Michel

    2009-11-01

    Full Text Available The key task in building useful e-learning repositories is to develop a system with an algorithm allowing users to retrieve information that corresponds to their specific requirements. To achieve this, products (or their verbal descriptions, i.e. presented in metadata need to be compared and structured according to the results of this comparison. Such structuring is crucial insofar as there are many search results that correspond to the entered keyword. The Hofmethode is an algorithm (based on psychological considerations to compute semantic similarities between texts and therefore offer a way to compare e-learning products. The computed similarity values are used to build semantic maps in which the products are visually arranged according to their similarities. The paper describes how the Hofmethode is implemented in the online database edulap, and how it contributes to help the user to explore the data in which he is interested.

  10. Accurate estimation of indoor travel times

    DEFF Research Database (Denmark)

    Prentow, Thor Siiger; Blunck, Henrik; Stisen, Allan

    2014-01-01

    The InTraTime method accurately estimates indoor travel times via mining of historical and real-time indoor position traces. The method learns, during operation, both travel routes and travel times together with their respective likelihoods, for routes traveled as well as for sub-routes thereof. ... are collected within the building complex. Results indicate that InTraTime is superior with respect to metrics such as deployment cost, maintenance cost and estimation accuracy, yielding an average deviation from actual travel times of 11.7%. This accuracy was achieved despite using a minimal-effort setup...

  11. Accurate diagnosis is essential for amebiasis

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

    Amebiasis is one of the three most common causes of death from parasitic disease, and Entamoeba histolytica is among the most widely distributed parasites in the world. In particular, Entamoeba histolytica infection in developing countries is a significant health problem in amebiasis-endemic areas, with a significant impact on infant mortality[1]. In recent years a worldwide increase in the number of patients with amebiasis has refocused attention on this important infection. On the other hand, improvements in the quality of parasitological methods and the widespread use of accurate techniques have improved our knowledge about the disease.

  12. The first accurate description of an aurora

    Science.gov (United States)

    Schröder, Wilfried

    2006-12-01

    As technology has advanced, the scientific study of auroral phenomena has increased by leaps and bounds. A look back at the earliest descriptions of aurorae offers an interesting look into how medieval scholars viewed the subjects that we study.Although there are earlier fragmentary references in the literature, the first accurate description of the aurora borealis appears to be that published by the German Catholic scholar Konrad von Megenberg (1309-1374) in his book Das Buch der Natur (The Book of Nature). The book was written between 1349 and 1350.

  13. New law requires 'medically accurate' lesson plans.

    Science.gov (United States)

    1999-09-17

    The California Legislature has passed a bill requiring all textbooks and materials used to teach about AIDS be medically accurate and objective. Statements made within the curriculum must be supported by research conducted in compliance with scientific methods, and published in peer-reviewed journals. Some of the current lesson plans were found to contain scientifically unsupported and biased information. In addition, the bill requires material to be "free of racial, ethnic, or gender biases." The legislation is supported by a wide range of interests, but opposed by the California Right to Life Education Fund, because they believe it discredits abstinence-only material.

  14. Universality: Accurate Checks in Dyson's Hierarchical Model

    Science.gov (United States)

    Godina, J. J.; Meurice, Y.; Oktay, M. B.

    2003-06-01

    In this talk we present high-accuracy calculations of the susceptibility near βc for Dyson's hierarchical model in D = 3. Using linear fitting, we estimate the leading (γ) and subleading (Δ) exponents. Independent estimates are obtained by calculating the first two eigenvalues of the linearized renormalization group transformation. We found γ = 1.29914073 ± 10^-8 and Δ = 0.4259469 ± 10^-7, independently of the choice of local integration measure (Ising or Landau-Ginzburg). After a suitable rescaling, the approximate fixed points for a large class of local measures coincide accurately with a fixed point constructed by Koch and Wittwer.

  15. Semantic Search among Heterogeneous Biological Databases Based on Gene Ontology

    Institute of Scientific and Technical Information of China (English)

    Shun-Liang CAO; Lei QIN; Wei-Zhong HE; Yang ZHONG; Yang-Yong ZHU; Yi-Xue LI

    2004-01-01

    Semantic search is a key issue in the integration of heterogeneous biological databases. In this paper, we present a methodology for implementing semantic search in BioDW, an integrated biological data warehouse. Two tables are presented: the DB2GO table, to correlate Gene Ontology (GO) annotated entries from BioDW data sources with GO, and the semantic similarity table, to record similarity scores derived from any pair of GO terms. Based on the two tables, multifarious ways for semantic search are provided, and the corresponding entries in heterogeneous biological databases can be expediently searched in semantic terms.

  16. Quantifying the Search Behaviour of Different Demographics Using Google Correlate.

    Directory of Open Access Journals (Sweden)

    Adrian Letchford

    Full Text Available Vast records of our everyday interests and concerns are being generated by our frequent interactions with the Internet. Here, we investigate how the searches of Google users vary across U.S. states with different birth rates and infant mortality rates. We find that users in states with higher birth rates search for more information about pregnancy, while those in states with lower birth rates search for more information about cats. Similarly, we find that users in states with higher infant mortality rates search for more information about credit, loans and diseases. Our results provide evidence that Internet search data could offer new insight into the concerns of different demographics.

  17. Personalized online information search and visualization

    Directory of Open Access Journals (Sweden)

    Orthner Helmuth F

    2005-03-01

    Full Text Available Abstract Background The rapid growth of online publications such as Medline and other sources raises the question of how to get relevant information efficiently. It is important for a bench scientist, e.g., to monitor related publications constantly; it is also important for a clinician, e.g., to access patient records anywhere and anytime. Although time-consuming, this kind of searching procedure is usually similar and simple. Typically, it involves a search engine and a visualization interface, and different words or combinations of words reflect different research topics. The objective of this study is to automate this tedious procedure by recording those words/terms in a database and online sources, and to use that information for automated search and retrieval. The retrieved information is made available anytime and anywhere through a secure web server. Results We developed such a database, storing search terms, journals and so on, and implemented software for automatically searching medical subject heading-indexed sources such as Medline and other online sources. The returned information is stored locally, as is, on a server and is visible through a Web-based interface. The search is performed daily or as otherwise scheduled, and users log on to the website anytime without typing any words. The system has the potential to retrieve similarly from non-medical subject heading-indexed literature or from privileged information sources such as a clinical information system. Issues such as security, presentation and visualization of the retrieved information were thus addressed, and wireless access was experimented with as one of the presentation issues. A user survey showed that the personalized online searches saved time and increased relevancy. Handheld devices could also be used to access the stored information, but less satisfactorily. Conclusion The Web-searching software or a similar system has the potential to be an efficient

  18. Inequalities between similarities for numerical data

    NARCIS (Netherlands)

    Warrens, Matthijs J.

    2016-01-01

    Similarity measures are entities that can be used to quantify the similarity between two vectors with real numbers. We present inequalities between seven well known similarities. The inequalities are valid if the vectors contain non-negative real numbers.

  19. Adaptive Levy processes and area-restricted search in human foraging.

    Directory of Open Access Journals (Sweden)

    Thomas T Hills

    Full Text Available A considerable amount of research has claimed that animals' foraging behaviors display movement lengths with power-law distributed tails, characteristic of Lévy flights and Lévy walks. Though these claims have recently come into question, the proposal that many animals forage using Lévy processes nonetheless remains. A Lévy process does not consider when or where resources are encountered, and samples movement lengths independently of past experience. However, Lévy processes too have come into question based on the observation that in patchy resource environments, resource-sensitive foraging strategies, like area-restricted search, perform better than Lévy flights yet can still generate heavy-tailed distributions of movement lengths. To investigate these questions further, we tracked humans as they searched for hidden resources in an open-field virtual environment, with either patchy or dispersed resource distributions. Supporting previous research, for both conditions logarithmic binning methods were consistent with Lévy flights, and rank-frequency methods (comparing alternative distributions using maximum likelihood methods) showed the strongest support for bounded power-law distributions (truncated Lévy flights). However, goodness-of-fit tests found that even bounded power-law distributions only accurately characterized movement behavior for 4 (out of 32) participants. Moreover, paths in the patchy environment (but not the dispersed environment) showed a transition to intensive search following resource encounters, characteristic of area-restricted search. Transferring paths between environments revealed that paths generated in the patchy environment were adapted to that environment. Our results suggest that though power-law distributions do not accurately reflect human search, Lévy processes may still describe movement in dispersed environments, but not in patchy environments, where search was area-restricted. Furthermore, our results

  20. How Accurately can we Calculate Thermal Systems?

    Energy Technology Data Exchange (ETDEWEB)

    Cullen, D; Blomquist, R N; Dean, C; Heinrichs, D; Kalugin, M A; Lee, M; Lee, Y; MacFarlan, R; Nagaya, Y; Trkov, A

    2004-04-20

    I would like to determine how accurately a variety of neutron transport code packages (code and cross section libraries) can calculate simple integral parameters, such as K_eff, for systems that are sensitive to thermal neutron scattering. Since we will only consider theoretical systems, we cannot really determine absolute accuracy compared to any real system. Therefore rather than accuracy, it would be more precise to say that I would like to determine the spread in answers that we obtain from a variety of code packages. This spread should serve as an excellent indicator of how accurately we can really model and calculate such systems today. Hopefully, eventually this will lead to improvements in both our codes and the thermal scattering models that they use in the future. In order to accomplish this I propose a number of extremely simple systems that involve thermal neutron scattering that can be easily modeled and calculated by a variety of neutron transport codes. These are theoretical systems designed to emphasize the effects of thermal scattering, since that is what we are interested in studying. I have attempted to keep these systems very simple, and yet at the same time they include most, if not all, of the important thermal scattering effects encountered in a large, water-moderated, uranium fueled thermal system, i.e., our typical thermal reactors.

  1. Accurate pattern registration for integrated circuit tomography

    Energy Technology Data Exchange (ETDEWEB)

    Levine, Zachary H.; Grantham, Steven; Neogi, Suneeta; Frigo, Sean P.; McNulty, Ian; Retsch, Cornelia C.; Wang, Yuxin; Lucatorto, Thomas B.

    2001-07-15

    As part of an effort to develop high resolution microtomography for engineered structures, a two-level copper integrated circuit interconnect was imaged using 1.83 keV x rays at 14 angles employing a full-field Fresnel zone plate microscope. A major requirement for high resolution microtomography is the accurate registration of the reference axes in each of the many views needed for a reconstruction. A reconstruction with 100 nm resolution would require registration accuracy of 30 nm or better. This work demonstrates that even images that have strong interference fringes can be used to obtain accurate fiducials through the use of Radon transforms. We show that we are able to locate the coordinates of the rectilinear circuit patterns to 28 nm. The procedure is validated by agreement between an x-ray parallax measurement of 1.41±0.17 μm and a measurement of 1.58±0.08 μm from a scanning electron microscope image of a cross section.

  2. Accurate determination of characteristic relative permeability curves

    Science.gov (United States)

    Krause, Michael H.; Benson, Sally M.

    2015-09-01

    A recently developed technique to accurately characterize sub-core scale heterogeneity is applied to investigate the factors responsible for flowrate-dependent effective relative permeability curves measured on core samples in the laboratory. The dependency of laboratory measured relative permeability on flowrate has long been both supported and challenged by a number of investigators. Studies have shown that this apparent flowrate dependency is a result of both sub-core scale heterogeneity and outlet boundary effects. However this has only been demonstrated numerically for highly simplified models of porous media. In this paper, flowrate dependency of effective relative permeability is demonstrated using two rock cores, a Berea Sandstone and a heterogeneous sandstone from the Otway Basin Pilot Project in Australia. Numerical simulations of steady-state coreflooding experiments are conducted at a number of injection rates using a single set of input characteristic relative permeability curves. Effective relative permeability is then calculated from the simulation data using standard interpretation methods for calculating relative permeability from steady-state tests. Results show that simplified approaches may be used to determine flowrate-independent characteristic relative permeability provided flow rate is sufficiently high, and the core heterogeneity is relatively low. It is also shown that characteristic relative permeability can be determined at any typical flowrate, and even for geologically complex models, when using accurate three-dimensional models.

  3. Accurate taxonomic assignment of short pyrosequencing reads.

    Science.gov (United States)

    Clemente, José C; Jansson, Jesper; Valiente, Gabriel

    2010-01-01

    Ambiguities in the taxonomy dependent assignment of pyrosequencing reads are usually resolved by mapping each read to the lowest common ancestor in a reference taxonomy of all those sequences that match the read. This conservative approach has the drawback of mapping a read to a possibly large clade that may also contain many sequences not matching the read. A more accurate taxonomic assignment of short reads can be made by mapping each read to the node in the reference taxonomy that provides the best precision and recall. We show that given a suffix array for the sequences in the reference taxonomy, a short read can be mapped to the node of the reference taxonomy with the best combined value of precision and recall in time linear in the size of the taxonomy subtree rooted at the lowest common ancestor of the matching sequences. An accurate taxonomic assignment of short reads can thus be made with about the same efficiency as when mapping each read to the lowest common ancestor of all matching sequences in a reference taxonomy. We demonstrate the effectiveness of our approach on several metagenomic datasets of marine and gut microbiota.
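    The node-selection rule described above, maximizing a combination of precision and recall over the reference taxonomy, can be sketched directly. The toy below scores every node by the F-measure of precision (fraction of the node's leaves that match the read) and recall (fraction of all matching leaves under the node); the tree layout and names are ours, and the paper's suffix-array matching and linear-time traversal are not reproduced.

        def leaves_under(tree, node):
            kids = tree.get(node, [])
            if not kids:
                return {node}
            out = set()
            for k in kids:
                out |= leaves_under(tree, k)
            return out

        def best_assignment(tree, root, matches):
            # Pick the taxonomy node with the best F-measure of
            # precision and recall over the matching leaves.
            best, best_f, stack = None, -1.0, [root]
            while stack:
                node = stack.pop()
                stack.extend(tree.get(node, []))
                under = leaves_under(tree, node)
                hit = len(under & matches)
                if hit:
                    p, r = hit / len(under), hit / len(matches)
                    f = 2 * p * r / (p + r)
                    if f > best_f:
                        best, best_f = node, f
            return best, best_f

        tree = {'root': ['A', 'B'], 'A': ['a1', 'a2'], 'B': ['b1', 'b2', 'b3']}
        print(best_assignment(tree, 'root', {'a1', 'a2', 'b1'}))  # -> ('A', 0.8)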

  4. Location-based Services using Image Search

    DEFF Research Database (Denmark)

    Vertongen, Pieter-Paulus; Hansen, Dan Witzner

    2008-01-01

    Recent developments in image search have made it sufficiently efficient to be used in real-time applications. GPS has become a popular navigation tool. While GPS information provides reasonably good accuracy, it is not always present in all hand held devices, nor is it accurate in all situations, for example in urban environments. We propose a system to provide location-based services using image searches without requiring GPS. The goal of this system is to assist tourists in cities with additional information using their mobile phones and built-in cameras. Based upon the result of the image search engine and database image location knowledge, the location of the query image is determined and associated data can be presented to the user.

  5. GeoSearch: A lightweight broking middleware for geospatial resources discovery

    Science.gov (United States)

    Gui, Z.; Yang, C.; Liu, K.; Xia, J.

    2012-12-01

    With petabytes of geodata and thousands of geospatial web services available over the Internet, it is critical to support geoscience research and applications by finding the best-fit geospatial resources from the massive and heterogeneous resources. Past decades' developments witnessed the operation of many service components to facilitate geospatial resource management and discovery. However, efficient and accurate geospatial resource discovery is still a big challenge due to the following reasons: 1) The entry barriers (also called "learning curves") hinder the usability of discovery services to end users. Different portals and catalogues always adopt various access protocols, metadata formats and GUI styles to organize, present and publish metadata. It is hard for end users to learn all these technical details and differences. 2) The cost of federating heterogeneous services is high. To provide sufficient resources and facilitate data discovery, many registries adopt a periodic harvesting mechanism to retrieve metadata from other federated catalogues. These time-consuming processes lead to network and storage burdens, data redundancy, and the overhead of maintaining data consistency. 3) The heterogeneous semantics issues in data discovery. Since keyword matching is still the primary search method in many operational discovery services, the search accuracy (precision and recall) is hard to guarantee. Semantic technologies (such as semantic reasoning and similarity evaluation) offer a solution to these issues. However, integrating semantic technologies with existing services is challenging due to the expandability limitations of the service frameworks and metadata templates. 4) The capabilities to help users make a final selection are inadequate. Most of the existing search portals lack intuitive and diverse information visualization methods and functions (sort, filter) to present, explore and analyze search results. Furthermore, the presentation of the value

  6. Classification of similar medical images in the lifting domain

    Science.gov (United States)

    Sallee, Chad W.; Tashakkori, Rahman

    2002-03-01

    In this paper, lifting is used for similarity analysis and classification of sets of similar medical images. The lifting scheme is an invertible wavelet transform that maps integers to integers. Lifting provides efficient in-place calculation of transform coefficients and is widely used for the analysis of similar image sets. Images of a similar set show high degrees of correlation with one another. This inter-set redundancy can be exploited for the purposes of prediction, compression, feature extraction, and classification. This research intends to show that there is a higher degree of correlation between images of a similar set in the lifting domain than in the pixel domain. Such high correlation results in more accurate classification and prediction of images in a similar set. Several lifting schemes from the Calderbank-Daubechies-Feauveau family were used in this research. The research shows that some of these lifting schemes decorrelate the images of similar sets more effectively than others. The research presents the statistical analysis of the data in scatter plots and regression models.
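    As a concrete reference point, here is one level of a generic integer-to-integer lifting step (a Haar/S-transform-style predict-update pair) of the kind the paper builds on; this is a minimal sketch and not necessarily one of the schemes used in the study.

        def lift_forward(x):
            even, odd = x[0::2], x[1::2]
            detail = [o - e for e, o in zip(even, odd)]             # predict step
            approx = [e + (d >> 1) for e, d in zip(even, detail)]   # update step
            return approx, detail

        def lift_inverse(approx, detail):
            even = [a - (d >> 1) for a, d in zip(approx, detail)]
            odd = [d + e for e, d in zip(even, detail)]
            return [v for pair in zip(even, odd) for v in pair]

        x = [12, 14, 13, 90, 91, 89, 12, 13]
        a, d = lift_forward(x)
        assert lift_inverse(a, d) == x    # integer-to-integer, perfectly invertible
        print(a, d)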

  7. Search on Rugged Landscapes

    DEFF Research Database (Denmark)

    Billinger, Stephan; Stieglitz, Nils; Schumacher, Terry

    2014-01-01

    This paper presents findings from a laboratory experiment on human decision-making in a complex combinatorial task. We find strong evidence for a behavioral model of adaptive search. Success narrows down search to the neighborhood of the status quo, while failure promotes gradually more explorative...... search. Task complexity does not have a direct effect on behavior, but systematically affects the feedback conditions that guide success-induced exploitation and failure-induced exploration. The analysis also shows that human participants were prone to over-exploration, since they broke off the search...... for local improvements too early. We derive stylized decision rules that generate the search behavior observed in the experiment and discuss the implications of our findings for individual decision-making and organizational search....

  8. Supporting complex search tasks

    DEFF Research Database (Denmark)

    Gäde, Maria; Hall, Mark; Huurdeman, Hugo;

    2015-01-01

    There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain specific collections, and both professionally and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks......, is fragmented at best. The workshop addressed the many open research questions: What are the obvious use cases and applications of complex search? What are essential features of work tasks and search tasks to take into account? And how do these evolve over time? With a multitude of information, varying from...... introductory to specialized, and from authoritative to speculative or opinionated, when to show what sources of information? How does the information seeking process evolve and what are relevant differences between different stages? With complex task and search process management, blending searching, browsing...

  9. Adaptive Large Neighbourhood Search

    DEFF Research Database (Denmark)

    Røpke, Stefan

    Large neighborhood search is a metaheuristic that has gained popularity in recent years. The heuristic repeatedly moves from solution to solution by first partially destroying the solution and then repairing it. The best solution observed during this search is presented as the final solution. This tutorial introduces the large neighborhood search metaheuristic and the variant adaptive large neighborhood search, which dynamically tunes parameters of the heuristic while it is running. Both heuristics belong to a broader class of heuristics that search a solution space using very large neighborhoods. The tutorial also presents applications of adaptive large neighborhood search, mostly related to vehicle routing problems, for which the heuristic has been extremely successful. We discuss how the heuristic can be parallelized and thereby take advantage of modern desktop computers...
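    The destroy-repair loop with adaptive operator weights fits in a short sketch. The toy below minimizes the number of inversions in a list, so repair is greedy reinsertion; the problem, the two destroy operators and the reward constant are all illustrative choices, not taken from the tutorial.

        import random

        def cost(sol):   # toy objective: number of inversions (0 = sorted)
            return sum(1 for i in range(len(sol)) for j in range(i + 1, len(sol))
                       if sol[i] > sol[j])

        def destroy(k):  # destroy operator: remove k random elements
            def op(sol):
                sol = sol[:]
                removed = [sol.pop(random.randrange(len(sol))) for _ in range(k)]
                return sol, removed
            return op

        def repair(state):  # repair: greedy best-position reinsertion
            sol, removed = state
            for v in removed:
                p = min(range(len(sol) + 1),
                        key=lambda p: cost(sol[:p] + [v] + sol[p:]))
                sol = sol[:p] + [v] + sol[p:]
            return sol

        def alns(sol, ops, iters=300):
            weights, best = [1.0] * len(ops), sol
            for _ in range(iters):
                i = random.choices(range(len(ops)), weights)[0]
                cand = repair(ops[i](sol))
                if cost(cand) <= cost(sol):
                    sol = cand
                    weights[i] += 0.1   # adaptively reward successful operators
                if cost(cand) < cost(best):
                    best = cand
            return best

        start = random.sample(range(20), 20)
        print(cost(start), cost(alns(start, [destroy(2), destroy(5)])))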

  10. Supporting complex search tasks

    DEFF Research Database (Denmark)

    Gäde, Maria; Hall, Mark; Huurdeman, Hugo

    2015-01-01

    There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain specific collections, and both professionally and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. ... and recommendations, and supporting exploratory search to sensemaking and analytics, UI and UX design pose an overconstrained challenge. How do we know that our approach is any good? Supporting complex search tasks requires new collaborations across the whole field of IR, and the proposed workshop will bring together...

  11. Data mining technique for fast retrieval of similar waveforms in Fusion massive databases

    Energy Technology Data Exchange (ETDEWEB)

    Vega, J. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain)], E-mail: jesus.vega@ciemat.es; Pereira, A.; Portas, A. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain); Dormido-Canto, S.; Farias, G.; Dormido, R.; Sanchez, J.; Duro, N. [Departamento de Informatica y Automatica, UNED, Madrid (Spain); Santos, M. [Departamento de Arquitectura de Computadores y Automatica, UCM, Madrid (Spain); Sanchez, E. [Asociacion EURATOM/CIEMAT Para Fusion, Madrid (Spain); Pajares, G. [Departamento de Arquitectura de Computadores y Automatica, UCM, Madrid (Spain)

    2008-01-15

    Fusion measurement systems generate similar waveforms for reproducible behavior. A major difficulty related to data analysis is the identification, in a rapid and automated way, of a set of discharges with comparable behaviour, i.e. discharges with 'similar' waveforms. Here we introduce a new technique for rapid searching and retrieval of 'similar' signals. The approach consists of building a classification system that avoids traversing the whole database looking for similarities. The classification system diminishes the problem dimensionality (by means of waveform feature extraction) and reduces the searching space to just the most probable 'similar' waveforms (clustering techniques). In the searching procedure, the input waveform is classified into one of the existing clusters. Then, a similarity measure is computed between the input signal and all cluster elements in order to identify the most similar waveforms. The inner product of normalized vectors is used as the similarity measure, as it allows the searching process to be independent of signal gain and polarity. This development has recently been applied to TJ-II stellarator databases and has been integrated into its remote participation system.
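    The similarity measure itself, the inner product of normalized vectors, is one line; taking its absolute value is what makes the match independent of polarity as well as gain. A minimal sketch with synthetic signals:

        import numpy as np

        def waveform_similarity(a, b):
            # Normalized inner product (cosine); abs() discards polarity.
            a, b = np.asarray(a, float), np.asarray(b, float)
            return abs(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

        t = np.linspace(0.0, 1.0, 500)
        s = np.sin(2 * np.pi * 5 * t)
        print(waveform_similarity(s, 3.0 * s))   # gain-invariant     -> 1.0
        print(waveform_similarity(s, -s))        # polarity-invariant -> 1.0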

  12. A New Method for Measuring Text Similarity in Learning Management Systems Using WordNet

    Science.gov (United States)

    Alkhatib, Bassel; Alnahhas, Ammar; Albadawi, Firas

    2014-01-01

    As text sources are getting broader, measuring text similarity is becoming more compelling. Automatic text classification, search engines and auto answering systems are samples of applications that rely on text similarity. Learning management systems (LMS) are becoming more important since electronic media is getting more publicly available. As…

  13. Accurate Classification of RNA Structures Using Topological Fingerprints

    Science.gov (United States)

    Li, Kejie; Gribskov, Michael

    2016-01-01

    While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity–an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint. PMID:27755571

  14. A Distributed Weighted Voting Approach for Accurate Eye Center Estimation

    Directory of Open Access Journals (Sweden)

    Gagandeep Singh

    2013-05-01

    Full Text Available This paper proposes a novel approach for accurate estimation of the eye center in face images. A distributed voting based approach, in which every pixel votes, is adopted for potential eye center candidates. The votes are distributed over a subset of pixels which lie in the direction opposite to the gradient direction, and the weightage of votes is distributed according to a novel mechanism. First, the image is normalized to eliminate illumination variations and its edge map is generated using the Canny edge detector. Distributed voting is applied on the edge image to generate different eye center candidates. Morphological closing and local maxima search are used to reduce the number of candidates. A classifier based on spatial and intensity information is used to choose the correct candidates for the locations of the eye center. The proposed approach was tested on the BioID face database and resulted in a better iris detection rate than the state-of-the-art. The proposed approach is robust against illumination variation, small pose variations, presence of eye glasses and partial occlusion of the eyes. Defence Science Journal, 2013, 63(3), pp. 292-297, DOI: http://dx.doi.org/10.14429/dsj.63.2763

  15. Mastering ElasticSearch

    CERN Document Server

    Kuc, Rafal

    2013-01-01

    A practical tutorial that covers the difficult design, implementation, and management of search solutions. Mastering ElasticSearch is aimed at intermediate users who want to extend their knowledge of ElasticSearch. The topics described in the book are detailed, but we assume that you already know the basics, like the query DSL or data indexing. Advanced users will also find this book useful, as the examples go deep into the internals where needed.

  16. ElasticSearch cookbook

    CERN Document Server

    Paro, Alberto

    2013-01-01

    Written in an engaging, easy-to-follow style, the recipes will help you to extend the capabilities of ElasticSearch to manage your data effectively. If you are a developer who implements ElasticSearch in your web applications, manages data, or has decided to start using ElasticSearch, this book is ideal for you. This book assumes that you have working knowledge of JSON and Java.

  17. Fast Structural Search in Phylogenetic Databases

    Directory of Open Access Journals (Sweden)

    William H. Piel

    2005-01-01

    Full Text Available As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The "closeness" is a measure of the topological relationships in P that are found to be the same or similar in a tree of D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees, where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
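
    A toy version of such a "closeness" measure, encoding each rooted tree as the set of its clades (the leaf set below each internal node) and filtering the database on the fraction of shared clades. The measure and threshold are illustrative assumptions, not the paper's algorithm.

        def clade_similarity(tree_p, tree_t):
            """Fraction of the pattern tree's clades found in the target tree."""
            return len(tree_p & tree_t) / len(tree_p)

        def search(pattern, database, threshold=0.7):
            # cheap filter first; a real system refines survivors more carefully
            return [name for name, tree in database.items()
                    if clade_similarity(pattern, tree) >= threshold]

        P = {frozenset({"a", "b"}), frozenset({"a", "b", "c"})}
        D = {"t1": {frozenset({"a", "b"}), frozenset({"a", "b", "c"}),
                    frozenset({"d", "e"})},
             "t2": {frozenset({"a", "c"}), frozenset({"a", "c", "d"})}}
        print(search(P, D))  # -> ['t1']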

  18. Delaying information search

    Directory of Open Access Journals (Sweden)

    Yaniv Shani

    2012-11-01

    Full Text Available In three studies, we examined factors that may temporarily attenuate information search. People are generally curious and dislike uncertainty, which typically encourages them to look for relevant information. Despite these strong forces that promote information search, people sometimes deliberately delay obtaining valuable information. We find they may do so when they are concerned that the information might interfere with future pleasurable activities. Interestingly, the decision to search or to postpone searching for information is influenced not only by the value and importance of the information itself but also by well-being maintenance goals related to possible detrimental effects that negative knowledge may have on unrelated future plans.

  19. Google Power Search

    CERN Document Server

    Spencer, Stephan

    2011-01-01

    Behind Google's deceptively simple interface is immense power for both market and competitive research - if you know how to use it well. Sure, basic searches are easy, but complex searches require specialized skills. This concise book takes you through the full range of Google's powerful search-refinement features, so you can quickly find the specific information you need. Learn techniques ranging from simple Boolean logic to URL parameters and other advanced tools, and see how they're applied to real-world market research examples. Incorporate advanced search operators such as filetype:, intit…

  20. Toward Accurate and Quantitative Comparative Metagenomics

    Science.gov (United States)

    Nayfach, Stephen; Pollard, Katherine S.

    2016-01-01

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand the roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and by biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  1. Apparatus for accurately measuring high temperatures

    Science.gov (United States)

    Smith, D.D.

    The present invention is a thermometer used for measuring furnace temperatures in the range of about 1800° to 2700°C. The thermometer comprises a broadband multicolor thermal radiation sensor positioned to be in optical alignment with the end of a blackbody sight tube extending into the furnace. A valve-shutter arrangement is positioned between the radiation sensor and the sight tube, and a chamber for containing a charge of high-pressure gas is positioned between the valve-shutter arrangement and the radiation sensor. A momentary opening of the valve-shutter arrangement allows a pulse of the high-pressure gas to purge the sight tube of airborne thermal radiation contaminants, which permits the radiation sensor to accurately measure the thermal radiation emanating from the end of the sight tube.

  2. Accurate renormalization group analyses in neutrino sector

    Energy Technology Data Exchange (ETDEWEB)

    Haba, Naoyuki [Graduate School of Science and Engineering, Shimane University, Matsue 690-8504 (Japan); Kaneta, Kunio [Kavli IPMU (WPI), The University of Tokyo, Kashiwa, Chiba 277-8568 (Japan); Takahashi, Ryo [Graduate School of Science and Engineering, Shimane University, Matsue 690-8504 (Japan); Yamaguchi, Yuya [Department of Physics, Faculty of Science, Hokkaido University, Sapporo 060-0810 (Japan)

    2014-08-15

    We investigate accurate renormalization group analyses in the neutrino sector between the ν-oscillation and seesaw energy scales. We consider decoupling effects of the top quark and the Higgs boson on the renormalization group equations of the light neutrino mass matrix. Since the decoupling effects arise at the standard model scale and are independent of high energy physics, our method can be applied to essentially any model beyond the standard model. We find that the decoupling effects of the Higgs boson are negligible, while those of the top quark are not. In particular, the decoupling effects of the top quark affect the neutrino mass eigenvalues, which are important for analyzing predictions such as mass squared differences and neutrinoless double beta decay in an underlying theory existing at a high energy scale.

  3. Accurate Weather Forecasting for Radio Astronomy

    Science.gov (United States)

    Maddalena, Ronald J.

    2010-01-01

    The NRAO Green Bank Telescope routinely observes at wavelengths from 3 mm to 1 m. As with all mm-wave telescopes, observing conditions depend upon the variable atmospheric water content. The site provides over 100 days/yr when opacities are low enough for good observing at 3 mm, but winds on the open-air structure reduce the time suitable for 3-mm observing, where pointing is critical. Thus, to maximize productivity the observing wavelength needs to match weather conditions. For 6 years the telescope has used a dynamic scheduling system (recently upgraded; www.gb.nrao.edu/DSS) that requires accurate multi-day forecasts for winds and opacities. Since opacity forecasts are not provided by the National Weather Service (NWS), I have developed an automated system that takes available forecasts, derives forecasted opacities, and deploys the results on the web in user-friendly graphical overviews (www.gb.nrao.edu/~rmaddale/Weather). The system relies on the "North American Mesoscale" models, which are updated by the NWS every 6 hrs, have a 12 km horizontal resolution, 1 hr temporal resolution, run to 84 hrs, and have 60 vertical layers that extend to 20 km. Each forecast consists of a time series of ground conditions, cloud coverage, etc., and, most importantly, temperature, pressure, and humidity as a function of height. I use Liebe's MWP model (Radio Science, 20, 1069, 1985) to determine the absorption in each layer for each hour for 30 observing wavelengths. Radiative transfer provides, for each hour and wavelength, the total opacity and the radio brightness of the atmosphere, which contributes substantially at some wavelengths to Tsys and the observational noise. Comparisons of measured and forecasted Tsys at 22.2 and 44 GHz imply that the forecasted opacities are good to about 0.01 nepers, which is sufficient for forecasting and accurate calibration. Reliability is high out to 2 days and degrades slowly for longer-range forecasts.
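
    A sketch of the layered radiative-transfer step described above: given each layer's temperature and absorption (which a model such as Liebe's MWP would supply), accumulate the total zenith opacity and the atmospheric brightness seen from the ground. The layer values below are made-up placeholders.

        import math

        def sky_brightness(layers):
            """layers: list of (temperature_K, opacity_nepers), ground upward."""
            tau_total, t_b, tau_below = 0.0, 0.0, 0.0
            for temp, tau in layers:
                # emission of this layer, attenuated by the layers beneath it
                t_b += temp * (1.0 - math.exp(-tau)) * math.exp(-tau_below)
                tau_below += tau
                tau_total += tau
            return tau_total, t_b

        layers = [(280.0, 0.010), (260.0, 0.006), (230.0, 0.002)]  # hypothetical
        tau, tb = sky_brightness(layers)
        print(f"zenith opacity {tau:.3f} Np, sky brightness {tb:.1f} K")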

  4. Monad Transformers for Backtracking Search

    Directory of Open Access Journals (Sweden)

    Jules Hedges

    2014-06-01

    Full Text Available This paper extends Escardo and Oliva's selection monad to the selection monad transformer, a general monadic framework for expressing backtracking search algorithms in Haskell. The use of the closely related continuation monad transformer for similar purposes is also discussed, including an implementation of a DPLL-like SAT solver with no explicit recursion. Continuing a line of work exploring connections between selection functions and game theory, we use the selection monad transformer with the nondeterminism monad to obtain an intuitive notion of backward induction for a certain class of nondeterministic games.

  5. The FLUKA Code: An Accurate Simulation Tool for Particle Therapy.

    Science.gov (United States)

    Battistoni, Giuseppe; Bauer, Julia; Boehlen, Till T; Cerutti, Francesco; Chin, Mary P W; Dos Santos Augusto, Ricardo; Ferrari, Alfredo; Ortega, Pablo G; Kozłowska, Wioletta; Magro, Giuseppe; Mairani, Andrea; Parodi, Katia; Sala, Paola R; Schoofs, Philippe; Tessonnier, Thomas; Vlachoudis, Vasilis

    2016-01-01

    Monte Carlo (MC) codes are increasingly spreading in the hadrontherapy community due to their detailed description of radiation transport and interaction with matter. The suitability of a MC code for application to hadrontherapy demands accurate and reliable physical models capable of handling all components of the expected radiation field. This becomes extremely important for correctly performing not only physical but also biologically based dose calculations, especially in cases where ions heavier than protons are involved. In addition, accurate prediction of emerging secondary radiation is of utmost importance in innovative areas of research aiming at in vivo treatment verification. This contribution will address the recent developments of the FLUKA MC code and its practical applications in this field. Refinements of the FLUKA nuclear models in the therapeutic energy interval lead to an improved description of the mixed radiation field as shown in the presented benchmarks against experimental data with both (4)He and (12)C ion beams. Accurate description of ionization energy losses and of particle scattering and interactions lead to the excellent agreement of calculated depth-dose profiles with those measured at leading European hadron therapy centers, both with proton and ion beams. In order to support the application of FLUKA in hospital-based environments, Flair, the FLUKA graphical interface, has been enhanced with the capability of translating CT DICOM images into voxel-based computational phantoms in a fast and well-structured way. The interface is capable of importing also radiotherapy treatment data described in DICOM RT standard. In addition, the interface is equipped with an intuitive PET scanner geometry generator and automatic recording of coincidence events. Clinically, similar cases will be presented both in terms of absorbed dose and biological dose calculations describing the various available features.

  6. Interactive searching of facial image databases

    Science.gov (United States)

    Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

    1995-09-01

    A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness's verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm that can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method that does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
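
    A toy genetic search over an encoded face database, with the witness's similarity ratings standing in as the fitness function. The descriptor vectors, population size, and rating function are placeholders; a real system would present actual images and map crossover offspring back to database entries.

        import random

        def evolve(database, rate, generations=20, pop_size=6):
            """database: list of descriptor vectors; rate(vec) -> [0, 1]."""
            pop = random.sample(database, pop_size)
            for _ in range(generations):
                scored = sorted(pop, key=rate, reverse=True)
                parents = scored[: pop_size // 2]        # keep best-rated faces
                children = []
                while len(children) < pop_size - len(parents):
                    a, b = random.sample(parents, 2)
                    # uniform crossover of descriptor values
                    children.append([random.choice(pair) for pair in zip(a, b)])
                pop = parents + children
            return max(pop, key=rate)

        db = [[random.random() for _ in range(8)] for _ in range(100)]
        target = db[42]     # the face the witness has in mind
        rate = lambda v: 1.0 - sum(abs(x - y) for x, y in zip(v, target)) / len(v)
        print(rate(evolve(db, rate)))   # similarity of the best face found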

  7. JSC Search System Usability Case Study

    Science.gov (United States)

    Meza, David; Berndt, Sarah

    2014-01-01

    The advanced nature of "search" has facilitated the movement from keyword match to the delivery of every conceivable information topic from career, commerce, entertainment, learning... the list is infinite. At NASA Johnson Space Center (JSC) the search interface is an important means of knowledge transfer. By indexing multiple sources between directorates and organizations, the system's potential is culture changing in that, through search, knowledge of the unique accomplishments in engineering and science can be seamlessly passed between generations. This paper reports the findings of an initial survey, the first of a four-part study to help determine user sentiment on the intranet, or local (JSC) enterprise search environment as well as the larger NASA enterprise. The survey is a means through which end users provide direction on the development and transfer of knowledge by way of the search experience. The ideal is to identify what is working and what needs to be improved from the users' vantage point by documenting: (1) where users are satisfied/dissatisfied, (2) the perceived value of interface components, and (3) gaps which cause any disappointment in the search experience. The near-term goal is to inform JSC search in order to improve users' ability to utilize existing services and infrastructure to perform tasks with a shortened life cycle. Continuing steps include an agency-based focus with modified questions to accomplish a similar purpose.

  8. Accurate measurement of streamwise vortices using dual-plane PIV

    Science.gov (United States)

    Waldman, Rye M.; Breuer, Kenneth S.

    2012-11-01

    Low Reynolds number aerodynamic experiments with flapping animals (such as bats and small birds) are of particular interest due to their application to micro air vehicles which operate in a similar parameter space. Previous PIV wake measurements described the structures left by bats and birds and provided insight into the time history of their aerodynamic force generation; however, these studies have faced difficulty drawing quantitative conclusions based on said measurements. The highly three-dimensional and unsteady nature of the flows associated with flapping flight are major challenges for accurate measurements. The challenge of animal flight measurements is finding small flow features in a large field of view at high speed with limited laser energy and camera resolution. Cross-stream measurement is further complicated by the predominately out-of-plane flow that requires thick laser sheets and short inter-frame times, which increase noise and measurement uncertainty. Choosing appropriate experimental parameters requires compromise between the spatial and temporal resolution and the dynamic range of the measurement. To explore these challenges, we do a case study on the wake of a fixed wing. The fixed model simplifies the experiment and allows direct measurements of the aerodynamic forces via load cell. We present a detailed analysis of the wake measurements, discuss the criteria for making accurate measurements, and present a solution for making quantitative aerodynamic load measurements behind free-flyers.

  9. Accurate complex scaling of three dimensional numerical potentials.

    Science.gov (United States)

    Cerioni, Alessandro; Genovese, Luigi; Duchemin, Ivan; Deutsch, Thierry

    2013-05-28

    The complex scaling method, which consists in continuing spatial coordinates into the complex plane, is a well-established method that allows one to compute resonant eigenfunctions of the time-independent Schrödinger operator. Whenever it is desirable to apply complex scaling to investigate resonances in physical systems defined on numerical discrete grids, the most direct approach relies on the application of a similarity transformation to the original, unscaled Hamiltonian. We show that such an approach can be conveniently implemented in the Daubechies wavelet basis set, featuring a very promising level of generality, high accuracy, and no need for artificial convergence parameters. Complex scaling of three-dimensional numerical potentials can be efficiently and accurately performed. By carrying out an illustrative resonant state computation in the case of a one-dimensional model potential, we then show that our wavelet-based approach may disclose new exciting opportunities in the field of computational non-Hermitian quantum mechanics.
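
    For orientation, the standard complex-scaling relations (textbook form, not specific to the wavelet implementation above) can be written as

        \mathbf{r} \;\to\; \mathbf{r}\, e^{i\theta}, \qquad
        H_\theta = U_\theta H U_\theta^{-1}, \qquad
        E = E_{\mathrm{res}} - \tfrac{i}{2}\,\Gamma ,

    where the non-Hermitian scaled Hamiltonian H_theta has complex eigenvalues whose resonance position E_res and width Γ are independent of the scaling angle θ once θ is large enough to uncover the resonance pole.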

  10. Approaching system equilibrium with accurate or not accurate feedback information in a two-route system

    Science.gov (United States)

    Zhao, Xiao-mei; Xie, Dong-fan; Li, Qi

    2015-02-01

    With the development of intelligent transport systems, advanced information feedback strategies have been developed to reduce traffic congestion and enhance capacity. However, previous strategies provide accurate information to travelers, and our simulation results show that accurate information brings negative effects, especially when the information is delayed. Travelers prefer the route reported to be in the best condition, but delayed information reflects past rather than current traffic conditions, so travelers make wrong routing decisions, decreasing the capacity, increasing oscillations, and driving the system away from equilibrium. To avoid these negative effects, bounded rationality is taken into account by introducing a boundedly rational threshold BR: when the difference between the two routes is less than BR, the routes have equal probability of being chosen. Bounded rationality helps improve efficiency in terms of capacity, oscillation, and the gap from the system equilibrium.
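
    A toy route-choice rule with a boundedly rational threshold BR, as described above: when the (possibly delayed) reported travel times of the two routes differ by less than BR, the traveler picks either route with probability 1/2. The numbers are illustrative only.

        import random

        def choose_route(reported_t1, reported_t2, br):
            if abs(reported_t1 - reported_t2) < br:
                return random.choice([1, 2])   # indifferent within the threshold
            return 1 if reported_t1 < reported_t2 else 2

        print(choose_route(10.2, 10.5, br=1.0))  # random: difference below BR
        print(choose_route(10.2, 14.0, br=1.0))  # 1: route 1 clearly faster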

  11. Tales from the Field: Search Strategies Applied in Web Searching

    Directory of Open Access Journals (Sweden)

    Soohyung Joo

    2010-08-01

    Full Text Available In their web search processes users apply multiple types of search strategies, which consist of different search tactics. This paper identifies eight types of information search strategies with associated cases based on sequences of search tactics during the information search process. Thirty-one participants representing the general public were recruited for this study. Search logs and verbal protocols offered rich data for the identification of different types of search strategies. Based on the findings, the authors further discuss how to enhance web-based information retrieval (IR systems to support each type of search strategy.

  12. Learning-Based Video Superresolution Reconstruction Using Spatiotemporal Nonlocal Similarity

    Directory of Open Access Journals (Sweden)

    Meiyu Liang

    2015-01-01

    Full Text Available Aiming to improve video visual resolution quality and detail clarity, a novel learning-based video superresolution reconstruction algorithm using spatiotemporal nonlocal similarity is proposed in this paper. Objective high-resolution (HR) estimations of low-resolution (LR) video frames can be obtained by learning LR-HR correlation mappings and fusing spatiotemporal nonlocal similarities between video frames. With the objective of improving algorithm efficiency while guaranteeing superresolution quality, a novel visual saliency-based LR-HR correlation mapping strategy between LR and HR patches is proposed based on semicoupled dictionary learning. Moreover, aiming to improve the performance and efficiency of spatiotemporal similarity matching and fusion, an improved spatiotemporal nonlocal fuzzy registration scheme is established, using a similarity weighting strategy based on pseudo-Zernike moment feature similarity and structural similarity, together with a self-adaptive regional correlation evaluation strategy. The proposed spatiotemporal fuzzy registration scheme does not rely on accurate estimation of subpixel motion; it can therefore adapt to complex motion patterns and is robust to noise and rotation. Experimental results demonstrate that the proposed algorithm achieves competitive superresolution quality compared with other state-of-the-art algorithms in terms of both subjective and objective evaluations.
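
    A sketch of the nonlocal-similarity weighting that underlies this kind of fusion step: each candidate patch is weighted by exp(-d/h²) where d is a patch distance. The paper combines pseudo-Zernike moment and structural similarity; plain Euclidean distance and the bandwidth h below are simplifying assumptions.

        import numpy as np

        def nonlocal_fuse(ref_patch, candidate_patches, h=10.0):
            """Weighted average of candidates by similarity to the reference."""
            dists = np.array([np.sum((ref_patch - p) ** 2)
                              for p in candidate_patches])
            weights = np.exp(-dists / (h * h))
            weights /= weights.sum()
            return np.tensordot(weights, np.stack(candidate_patches), axes=1)

        ref = np.random.rand(8, 8)
        cands = [ref + 0.01 * np.random.randn(8, 8) for _ in range(5)]
        fused = nonlocal_fuse(ref, cands)   # close to ref, noise averaged down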

  13. Anesthetizing animals: Similar to humans yet, peculiar?

    Science.gov (United States)

    Kurdi, Madhuri S; Ramaswamy, Ashwini H

    2015-01-01

    From time immemorial, animals have served as models for humans. Like humans, animals too have to undergo several types of elective and emergency surgeries. Several anesthetic techniques and drugs used in humans are also used in animals. However, unlike humans, the animal kingdom includes a wide variety of species, breeds, and sizes. Different species have variable pharmacological responses, anatomy, temperament, behavior, and lifestyles. The anesthetic techniques and drugs have to suit different species and breeds. Nevertheless, there are several drugs and many peculiar anesthetic techniques used in animals but not in human beings. Keeping this in mind, the literature was hand searched and electronically searched using the terms "veterinary anesthesia" and "anesthetic drugs and techniques in animals" with the Google search engine. The interesting information so collected is presented in this article, which highlights some challenging and amazing aspects of anesthetizing animals, including preanesthetic assessment, preparation, premedication, monitoring, induction of general anesthesia, intubation, equipment, regional blocks, neuraxial block, and perioperative complications.

  14. How Users Search the Library from a Single Search Box

    Science.gov (United States)

    Lown, Cory; Sierra, Tito; Boyer, Josh

    2013-01-01

    Academic libraries are turning increasingly to unified search solutions to simplify search and discovery of library resources. Unfortunately, very little research has been published on library user search behavior in single search box environments. This study examines how users search a large public university library using a prominent, single…

  15. Citation Searching: Search Smarter & Find More

    Science.gov (United States)

    Hammond, Chelsea C.; Brown, Stephanie Willen

    2008-01-01

    The staff at University of Connecticut are participating in Elsevier's Student Ambassador Program (SAmP) in which graduate students train their peers on "citation searching" research using Scopus and Web of Science, two tremendous citation databases. They are in the fourth semester of these training programs, and they are wildly successful: They…

  16. Fixing Dataset Search

    Science.gov (United States)

    Lynnes, Chris

    2014-01-01

    Three current search engines are queried for ozone data at the GES DISC. The results range from sub-optimal to counter-intuitive. We propose a method to fix dataset search by implementing a robust relevancy ranking scheme. The relevancy ranking scheme is based on several heuristics culled from more than 20 years of helping users select datasets.

  17. Distributed deep web search

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien-Tsoi Theodorus Egbert

    2013-01-01

    The World Wide Web contains billions of documents (and counting); hence, it is likely that some document will contain the answer or content you are searching for. While major search engines like Bing and Google often manage to return relevant results to your query, there are plenty of situations in

  18. With News Search Engines

    Science.gov (United States)

    Gunn, Holly

    2005-01-01

    Although there are many news search engines on the Web, finding the news items one wants can be challenging. Choosing appropriate search terms is one of the biggest challenges. Unless one has seen the article that one is seeking, it is often difficult to select words that were used in the headline or text of the article. The limited archives of…

  19. ElasticSearch cookbook

    CERN Document Server

    Paro, Alberto

    2015-01-01

    If you are a developer who implements ElasticSearch in your web applications and want to sharpen your understanding of the core elements and applications, this is the book for you. It is assumed that you've got working knowledge of JSON and, if you want to extend ElasticSearch, of Java and related technologies.

  20. Towards Accessible Search Systems

    NARCIS (Netherlands)

    Serdyukov, Pavel; Hiemstra, Djoerd; Ruthven, Ian

    2010-01-01

    The SIGIR workshop Towards Accessible Search Systems was the first workshop in the field to raise the discussion on how to make search engines accessible for different types of users. We report on the results of the workshop that was held on 23 July 2010 in conjunction with the 33rd Annual ACM SIGIR

  1. An advanced search engine for patent analytics in medicinal chemistry.

    Science.gov (United States)

    Pasche, Emilie; Gobeill, Julien; Teodoro, Douglas; Gaudinat, Arnaud; Vishnykova, Dina; Lovis, Christian; Ruch, Patrick

    2012-01-01

    Patent collections contain an important amount of medical-related knowledge, but existing tools have been reported to lack useful functionalities. We present here the development of TWINC, an advanced search engine dedicated to patent retrieval in the domain of health and life sciences. Our tool embeds two search modes: an ad hoc search to retrieve relevant patents given a short query, and a related-patent search to retrieve similar patents given a patent. Both search modes rely on tuning experiments performed during several patent retrieval competitions. Moreover, TWINC is enhanced with interactive modules, such as chemical query expansion, which is of particular importance for coping with the various ways of naming biomedical entities. While the related-patent search showed promising performance, the ad hoc search produced fairly contrasted results. Nonetheless, TWINC performed well during the Chemathlon task of the PatOlympics competition, and experts appreciated its usability.

  2. COMPARISON OF EXPLORATION STRATEGIES FOR MULTI-ROBOT SEARCH

    Directory of Open Access Journals (Sweden)

    Miroslav Kulich

    2015-06-01

    Full Text Available Searching for a stationary object in an unknown environment can be formulated as an iterative procedure consisting of map updating, selection of a next goal, and navigation to this goal; it finishes when the object of interest is found. This formulation and the general search structure are similar to the related exploration problem. The only difference is in goal selection, as the search and exploration objectives are not the same. Although search is a key task in many search and rescue scenarios, the robotics community has paid little attention to the problem, and there is no goal-selection strategy that has been designed specifically for search. In this paper, we study four state-of-the-art strategies for multi-robot exploration and evaluate their performance in various environments with respect to the expected time needed to find an object, i.e. to achieve the objective of the search.

  3. A Quantum-Based Similarity Method in Virtual Screening.

    Science.gov (United States)

    Al-Dabbagh, Mohammed Mumtaz; Salim, Naomie; Himmat, Mubarak; Ahmed, Ali; Saeed, Faisal

    2015-10-02

    One of the most widely used techniques for ligand-based virtual screening is similarity searching. This study adopts concepts from quantum mechanics to present a state-of-the-art similarity method for molecules inspired by quantum theory. The representation of molecular compounds in a mathematical quantum space plays a vital role in the development of the quantum-based similarity approach. One of the key concepts of quantum theory is the use of complex numbers; hence, this study proposes three different techniques to embed and re-represent molecular compounds in complex-number format. The quantum-based similarity method developed in this study, which depends on a complex pure Hilbert space of molecules, is called Standard Quantum-Based (SQB). The recall of retrieved active molecules was measured at the top 1% and top 5%, and a significance test was used to evaluate the proposed methods. The MDL Drug Data Report (MDDR), Maximum Unbiased Validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for the experiments and were represented by 2D fingerprints. Simulated virtual screening experiments show that the effectiveness of the SQB method increased significantly, due to the representational power of molecular compounds in complex-number form, compared with the Tanimoto benchmark similarity measure.

  4. A Quantum-Based Similarity Method in Virtual Screening

    Directory of Open Access Journals (Sweden)

    Mohammed Mumtaz Al-Dabbagh

    2015-10-01

    Full Text Available One of the most widely used techniques for ligand-based virtual screening is similarity searching. This study adopts concepts from quantum mechanics to present a state-of-the-art similarity method for molecules inspired by quantum theory. The representation of molecular compounds in a mathematical quantum space plays a vital role in the development of the quantum-based similarity approach. One of the key concepts of quantum theory is the use of complex numbers; hence, this study proposes three different techniques to embed and re-represent molecular compounds in complex-number format. The quantum-based similarity method developed in this study, which depends on a complex pure Hilbert space of molecules, is called Standard Quantum-Based (SQB). The recall of retrieved active molecules was measured at the top 1% and top 5%, and a significance test was used to evaluate the proposed methods. The MDL Drug Data Report (MDDR), Maximum Unbiased Validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for the experiments and were represented by 2D fingerprints. Simulated virtual screening experiments show that the effectiveness of the SQB method increased significantly, due to the representational power of molecular compounds in complex-number form, compared with the Tanimoto benchmark similarity measure.
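
    A hedged sketch of a complex-valued fingerprint similarity in the spirit of SQB: embed a binary 2D fingerprint as a complex vector and compare fingerprints by the magnitude of their normalized inner product. The position-dependent phase encoding below is one illustrative choice, not one of the paper's three specific re-representations.

        import numpy as np

        def to_complex(bits):
            n = len(bits)
            phases = np.exp(1j * np.pi * np.arange(n) / n)  # assumed encoding
            return bits * phases

        def quantum_similarity(a_bits, b_bits):
            a = to_complex(np.asarray(a_bits, float))
            b = to_complex(np.asarray(b_bits, float))
            return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b)
                                         + 1e-12)

        fp1 = np.array([1, 0, 1, 1, 0, 1])
        fp2 = np.array([1, 0, 1, 0, 0, 1])
        print(quantum_similarity(fp1, fp2))   # in [0, 1], cf. Tanimoto baseline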

  5. The Search for Another Earth - Part II

    Indian Academy of Sciences (India)

    2016-10-01

    In the first part, we discussed the various methods for thedetection of planets outside the solar system known as theexoplanets. In this part, we will describe various kinds ofexoplanets. The habitable planets discovered so far and thepresent status of our search for a habitable planet similar tothe Earth will also be discussed.

  6. Insights: Talent Searches from Parents' Perspectives

    Science.gov (United States)

    Willis, Mariam

    2012-01-01

    Talent Searches offer an opportunity for gifted children to experience learning on prestigious college campuses around the nation, and as importantly, an opportunity to form relationships with like-minded, similar-age peers. Few opportunities open doors for intellectual, social, and emotional growth in gifted children as efficiently as…

  7. Chemical-text hybrid search engines.

    Science.gov (United States)

    Zhou, Yingyao; Zhou, Bin; Jiang, Shumei; King, Frederick J

    2010-01-01

    As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from the inclusion of "false negative" and "false positive" results, and cannot accommodate the diverse repertoire of formats currently available for chemical documents. To address these concerns, we developed an approach called Entity-Canonical Keyword Indexing (ECKI), which converts a chemical entity embedded in a data source into its canonical keyword representation prior to being indexed by text search engines. We implemented ECKI using Microsoft Office SharePoint Server Search, and the resultant hybrid search engine not only supported complex mixed chemical and keyword queries but also was applied to both intranet and Internet environments. We envision that the adoption of ECKI will empower researchers to pose more complex search questions that were not readily attainable previously and to obtain answers at much improved speed and accuracy.
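
    A sketch of the canonicalization step ECKI describes: map each chemical entity found in a document to one canonical key before handing the text to an ordinary keyword indexer. This uses RDKit's InChIKey as the canonical form, assuming RDKit is installed; the abstract does not specify ECKI's actual canonical representation.

        from rdkit import Chem

        def canonical_key(entity):
            """Canonical key for a SMILES string, or the entity unchanged."""
            mol = Chem.MolFromSmiles(entity)
            return Chem.MolToInchiKey(mol) if mol is not None else entity

        # Aspirin written two different ways indexes to the same key:
        print(canonical_key("CC(=O)Oc1ccccc1C(=O)O"))
        print(canonical_key("O=C(C)Oc1ccccc1C(O)=O"))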

  8. Generating Personalized Web Search Using Semantic Context

    Directory of Open Access Journals (Sweden)

    Zheng Xu

    2015-01-01

    Full Text Available The "one size fits all" criticism of search engines is that when queries are submitted, the same results are returned to different users. To solve this problem, personalized search has been proposed, since it can provide different search results based upon the preferences of users. However, existing methods concentrate more on long-term and independent user profiles, which reduces the effectiveness of personalized search. In this paper, our method captures the user context to provide accurate preferences of users for effective personalized search. First, a short-term query context is generated to identify related concepts of the query. Second, the user context is generated based on users' click-through data. Finally, a forgetting factor is introduced to merge the independent user contexts in a user session, which maintains the evolution of user preferences. Experimental results confirm that our approach can successfully represent user context according to individual users' information needs.
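
    A minimal sketch of merging session context into a user profile with a forgetting factor, as the abstract describes. Profiles are concept-to-weight dictionaries; the decay value is an illustrative assumption.

        def merge_profile(profile, session, forget=0.8):
            """Old interests decay by `forget`; session interests are added."""
            merged = {c: w * forget for c, w in profile.items()}
            for concept, weight in session.items():
                merged[concept] = merged.get(concept, 0.0) + weight
            return merged

        profile = {"python": 1.0, "travel": 0.6}
        session = {"python": 0.3, "asyncio": 0.5}
        print(merge_profile(profile, session))
        # {'python': 1.1, 'travel': 0.48, 'asyncio': 0.5}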

  9. Search and Disrupt

    DEFF Research Database (Denmark)

    Ørding Olsen, Anders

    This paper analyzes how external search is affected by strategic interest alignment among knowledge sources. I focus on misalignment arising from the heterogeneous effects of disruptive technologies by analyzing the influence of incumbents on 2,855 non-incumbents' external knowledge search efforts. The efforts most likely to solve innovation problems obtained funding from the European Commission's 7th Framework Program (2007-2013). The results show that involving incumbents improves search in complementary technologies, while demoting it when strategic interests are misaligned in disruptive technologies. However, incumbent sources engaged in capability reconfiguration to accommodate disruption improve search efforts in disruptive technologies. The paper concludes that the value of external sources is contingent on more than their knowledge. Specifically, interdependence of sources in search gives rise…

  10. Reconstructing propagation networks with temporal similarity metrics

    CERN Document Server

    Liao, Hao

    2014-01-01

    Node similarity is a significant property driving the growth of real networks. In this paper, based on observed spreading results, we apply node similarity metrics to reconstruct propagation networks. We find that the reconstruction accuracy of the similarity metrics is strongly influenced by the infection rate of the spreading process. Moreover, there is a range of infection rates in which the reconstruction accuracy of some similarity metrics drops to nearly zero. To improve the similarity-based reconstruction method, we propose a temporal similarity metric that takes into account the time information of the spreading. The reconstruction results are remarkably improved with the new method.
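
    A toy version of similarity-based reconstruction: score each node pair by co-occurrence across observed spreading cascades, discounting pairs whose infection times are far apart (the "temporal" refinement). The exponential time kernel is an illustrative choice, not the paper's exact metric.

        import itertools, math

        def score_pairs(cascades, tau=1.0):
            """cascades: list of dicts mapping node -> infection time."""
            scores = {}
            for cascade in cascades:
                for u, v in itertools.combinations(sorted(cascade), 2):
                    dt = abs(cascade[u] - cascade[v])
                    scores[(u, v)] = scores.get((u, v), 0.0) + math.exp(-dt / tau)
            return sorted(scores.items(), key=lambda kv: -kv[1])  # likely edges

        cascades = [{"a": 0.0, "b": 0.5, "c": 3.0}, {"a": 1.0, "b": 1.4}]
        print(score_pairs(cascades))  # ('a','b') outranks ('a','c'), ('b','c')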

  11. Similarity landscapes: An improved method for scientific visualization of information from protein and DNA database searches

    Energy Technology Data Exchange (ETDEWEB)

    Dogget, N.; Myers, G. [Los Alamos National Lab., NM (United States); Wills, C.J. [Univ. of California, San Diego, CA (United States)

    1998-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The authors have used computer simulations and examination of a variety of databases to answer a wide range of evolutionary questions. They have found a clear distinction in the evolution of HIV-1 and HIV-2, with the former and more virulent virus evolving more rapidly at a functional level. They have discovered highly non-random patterns in the evolution of HIV-1 that can be attributed to a variety of selective pressures. In the course of examining microsatellite DNA (short repeat regions) in microorganisms, the authors have found clear differences between prokaryotes and eukaryotes in their distribution, differences that can be tied to different selective pressures. They have developed a new method (topiary pruning) for enhancing the phylogenetic information contained in DNA sequences. Most recently, the authors have discovered effects in complex rainforest ecosystems that indicate strong frequency-dependent interactions between host species and their parasites, leading to the maintenance of ecosystem variability.

  12. Searching for similarities : transfer-oriented learning in health education at secondary schools

    NARCIS (Netherlands)

    Peters, L.W.H.

    2012-01-01

    Secondary education is inundated with teaching packages about healthy behavior. These mostly target single behavioral domains (for example smoking, alcohol, nutrition, or safe sex). It would be more efficient if a single teaching package, with limited teaching time, had effects on several behavioral domains at the same time…

  13. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    Science.gov (United States)

    2002-01-01


  14. NFFinder: an online bioinformatics tool for searching similar transcriptomics experiments in the context of drug repositioning.

    Science.gov (United States)

    Setoain, Javier; Franch, Mònica; Martínez, Marta; Tabas-Madrid, Daniel; Sorzano, Carlos O S; Bakker, Annette; Gonzalez-Couto, Eduardo; Elvira, Juan; Pascual-Montano, Alberto

    2015-07-01

    Drug repositioning, using known drugs for treating conditions different from those the drug was originally designed to treat, is an important drug discovery tool that allows for a faster and cheaper development process by using drugs that are already approved or in an advanced trial stage for another purpose. This is especially relevant for orphan diseases, because they affect too few people to make de novo drug research economically viable. In this paper we present NFFinder, a bioinformatics tool for identifying potentially useful drugs in the context of orphan diseases. NFFinder uses transcriptomic data to find relationships between drugs, diseases, and a phenotype of interest, as well as to identify experts who have published in that domain. The application shows in a dashboard a series of graphics and tables designed to help researchers formulate repositioning hypotheses and identify potential biological relationships between drugs and diseases. NFFinder is freely available at http://nffinder.cnb.csic.es.
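
    A sketch of the signature-matching idea behind tools of this kind: compare a query transcriptomic signature against stored drug signatures by rank correlation and return the most strongly correlated (or anti-correlated) drugs. The data, and the use of Spearman correlation via SciPy, are illustrative assumptions rather than NFFinder's published pipeline.

        from scipy.stats import spearmanr

        def rank_drugs(query_sig, drug_sigs):
            """query_sig: gene -> expression change; drug_sigs: name -> same."""
            genes = sorted(query_sig)
            q = [query_sig[g] for g in genes]
            hits = []
            for name, sig in drug_sigs.items():
                rho, _ = spearmanr(q, [sig.get(g, 0.0) for g in genes])
                hits.append((name, rho))
            # strong negative correlation suggests a phenotype-reversing drug
            return sorted(hits, key=lambda kv: kv[1])

        query = {"g1": 2.0, "g2": -1.5, "g3": 0.4, "g4": -0.2}
        drugs = {"drugA": {"g1": -1.8, "g2": 1.2, "g3": -0.5, "g4": 0.1},
                 "drugB": {"g1": 1.9, "g2": -1.0, "g3": 0.3, "g4": -0.4}}
        print(rank_drugs(query, drugs))  # drugA most anti-correlated (reversal)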

  15. A fingerprint based metric for measuring similarities of crystalline structures

    Energy Technology Data Exchange (ETDEWEB)

    Zhu, Li; Fuhrer, Tobias; Schaefer, Bastian; Grauzinyte, Migle; Goedecker, Stefan, E-mail: stefan.goedecker@unibas.ch [Department of Physics, Universität Basel, Klingelbergstr. 82, 4056 Basel (Switzerland); Amsler, Maximilian [Department of Physics, Universität Basel, Klingelbergstr. 82, 4056 Basel (Switzerland); Department of Materials Science and Engineering, Northwestern University, Evanston, Illinois 60208 (United States); Faraji, Somayeh; Rostami, Samare; Ghasemi, S. Alireza [Institute for Advanced Studies in Basic Sciences, P.O. Box 45195-1159, Zanjan (Iran, Islamic Republic of); Sadeghi, Ali [Physics Department, Shahid Beheshti University, G. C., Evin, 19839 Tehran (Iran, Islamic Republic of); Wolverton, Chris [Department of Materials Science and Engineering, Northwestern University, Evanston, Illinois 60208 (United States)

    2016-01-21

    Measuring similarities/dissimilarities between atomic structures is important for the exploration of potential energy landscapes. However, the cell vectors together with the coordinates of the atoms, which are generally used to describe periodic systems, are quantities not directly suitable as fingerprints to distinguish structures. Based on a characterization of the local environment of all atoms in a cell, we introduce crystal fingerprints that can be calculated easily and define configurational distances between crystalline structures that satisfy the mathematical properties of a metric. This distance between two configurations is a measure of their similarity/dissimilarity and it allows in particular to distinguish structures. The new method can be a useful tool within various energy landscape exploration schemes, such as minima hopping, random search, swarm intelligence algorithms, and high-throughput screenings.

  16. A fingerprint based metric for measuring similarities of crystalline structures

    CERN Document Server

    Zhu, Li; Fuhrer, Tobias; Schaefer, Bastian; Faraji, Somayeh; Rostami, Samara; Ghasemi, S Alireza; Sadeghi, Ali; Grauzinyte, Migle; Wolverton, Christopher; Goedecker, Stefan

    2015-01-01

    Measuring similarities/dissimilarities between atomic structures is important for the exploration of potential energy landscapes. However, the cell vectors together with the coordinates of the atoms, which are generally used to describe periodic systems, are quantities not suitable as fingerprints to distinguish structures. Based on a characterization of the local environment of all atoms in a cell, we introduce crystal fingerprints that can be calculated easily and allow one to define configurational distances between crystalline structures that satisfy the mathematical properties of a metric. This distance between two configurations is a measure of their similarity/dissimilarity, and in particular it allows one to distinguish structures. The new method is a useful tool within various energy landscape exploration schemes, such as minima hopping, random search, swarm intelligence algorithms, and high-throughput screenings.

  17. A fingerprint based metric for measuring similarities of crystalline structures.

    Science.gov (United States)

    Zhu, Li; Amsler, Maximilian; Fuhrer, Tobias; Schaefer, Bastian; Faraji, Somayeh; Rostami, Samare; Ghasemi, S Alireza; Sadeghi, Ali; Grauzinyte, Migle; Wolverton, Chris; Goedecker, Stefan

    2016-01-21

    Measuring similarities/dissimilarities between atomic structures is important for the exploration of potential energy landscapes. However, the cell vectors together with the coordinates of the atoms, which are generally used to describe periodic systems, are quantities not directly suitable as fingerprints to distinguish structures. Based on a characterization of the local environment of all atoms in a cell, we introduce crystal fingerprints that can be calculated easily and define configurational distances between crystalline structures that satisfy the mathematical properties of a metric. This distance between two configurations is a measure of their similarity/dissimilarity and it allows in particular to distinguish structures. The new method can be a useful tool within various energy landscape exploration schemes, such as minima hopping, random search, swarm intelligence algorithms, and high-throughput screenings.
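
    A sketch of an assignment-based configurational distance between two crystals, each described by per-atom environment fingerprint vectors. Matching atoms with the Hungarian algorithm makes the distance invariant to atom ordering; the fingerprint vectors themselves are assumed given (from some local-environment descriptor) and are random placeholders here, so this illustrates the metric construction rather than the paper's exact fingerprint.

        import numpy as np
        from scipy.optimize import linear_sum_assignment
        from scipy.spatial.distance import cdist

        def config_distance(fps_a, fps_b):
            """fps_*: (n_atoms, n_features) fingerprint matrices, same n_atoms."""
            cost = cdist(fps_a, fps_b)               # pairwise fingerprint dists
            rows, cols = linear_sum_assignment(cost) # optimal atom matching
            return cost[rows, cols].mean()

        rng = np.random.default_rng(0)
        a = rng.normal(size=(4, 8))
        b = a[[2, 0, 3, 1]] + 0.01 * rng.normal(size=(4, 8))  # permuted, noisy
        print(config_distance(a, b))   # small despite the permutation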

  18. User oriented trajectory search for trip recommendation

    KAUST Repository

    Shang, Shuo

    2012-01-01

    Trajectory sharing and searching have received significant attention in recent years. In this paper, we propose and investigate a novel problem called User Oriented Trajectory Search (UOTS) for trip recommendation. In contrast to conventional trajectory search by locations (spatial domain only), we consider both spatial and textual domains in the new UOTS query. Given a trajectory data set, the query input contains a set of intended places given by the traveler and a set of textual attributes describing the traveler's preference. If a trajectory is connecting/close to the specified query locations, and the textual attributes of the trajectory are similar to the traveler's preference, it will be recommended to the traveler for reference. This type of query can bring significant benefits to travelers in many popular applications such as trip planning and recommendation. There are two challenges in the UOTS problem: (i) how to constrain the searching range in the two domains and (ii) how to schedule multiple query sources effectively. To overcome the challenges and answer the UOTS query efficiently, a novel collaborative searching approach is developed. Conceptually, the UOTS query processing is conducted in the spatial and textual domains alternately. A pair of upper and lower bounds are devised to constrain the searching range in the two domains. In the meantime, a heuristic searching strategy based on priority ranking is adopted for scheduling the multiple query sources, which can further reduce the searching range and enhance the query efficiency notably. Furthermore, the devised collaborative searching approach can be extended to situations where the query locations are ordered. The performance of the proposed UOTS query is verified by extensive experiments based on real and synthetic trajectory data in road networks. © 2012 ACM.

  19. User Oriented Trajectory Search for Trip Recommendation

    KAUST Repository

    Ding, Ruogu

    2012-09-08

    Trajectory sharing and searching have received significant attention in recent years. In this thesis, we propose and investigate the methods to find and recommend the best trajectory to the traveler, and mainly focus on a novel technique named User Oriented Trajectory Search (UOTS) query processing. In contrast to conventional trajectory search by locations (spatial domain only), we consider both spatial and textual domains in the new UOTS query. Given a trajectory data set, the query input contains a set of intended places given by the traveler and a set of textual attributes describing the traveler’s preference. If a trajectory is connecting/close to the specified query locations, and the textual attributes of the trajectory are similar to the traveler’s preference, it will be recommended to the traveler. This type of queries can enable many popular applications such as trip planning and recommendation. There are two challenges in UOTS query processing, (i) how to constrain the searching range in two domains and (ii) how to schedule multiple query sources effectively. To overcome the challenges and answer the UOTS query efficiently, a novel collaborative searching approach is developed. Conceptually, the UOTS query processing is conducted in the spatial and textual domains alternately. A pair of upper and lower bounds are devised to constrain the searching range in two domains. In the meantime, a heuristic searching strategy based on priority ranking is adopted for scheduling the multiple query sources, which can further reduce the searching range and enhance the query efficiency notably. Furthermore, the devised collaborative searching approach can be extended to situations where the query locations are ordered. Extensive experiments are conducted on both real and synthetic trajectory data in road networks. Our approach is verified to be effective in reducing both CPU time and disk I/O time.

  20. User Oriented Trajectory Search for Trip Recommendation

    KAUST Repository

    Ding, Ruogu

    2012-07-08

    Trajectory sharing and searching have received significant attention in recent years. In this thesis, we propose and investigate the methods to find and recommend the best trajectory to the traveler, and mainly focus on a novel technique named User Oriented Trajectory Search (UOTS) query processing. In contrast to conventional trajectory search by locations (spatial domain only), we consider both spatial and textual domains in the new UOTS query. Given a trajectory data set, the query input contains a set of intended places given by the traveler and a set of textual attributes describing the traveler’s preference. If a trajectory is connecting/close to the specified query locations, and the textual attributes of the trajectory are similar to the traveler’s preference, it will be recommended to the traveler. This type of queries can enable many popular applications such as trip planning and recommendation. There are two challenges in UOTS query processing, (i) how to constrain the searching range in two domains and (ii) how to schedule multiple query sources effectively. To overcome the challenges and answer the UOTS query efficiently, a novel collaborative searching approach is developed. Conceptually, the UOTS query processing is conducted in the spatial and textual domains alternately. A pair of upper and lower bounds are devised to constrain the searching range in two domains. In the meantime, a heuristic searching strategy based on priority ranking is adopted for scheduling the multiple query sources, which can further reduce the searching range and enhance the query efficiency notably. Furthermore, the devised collaborative searching approach can be extended to situations where the query locations are ordered. Extensive experiments are conducted on both real and synthetic trajectory data in road networks. Our approach is verified to be effective in reducing both CPU time and disk I/O time.
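
    An illustrative combined score for the UOTS setting: rank a trajectory by a weighted sum of its spatial distance to the query locations and its textual dissimilarity to the traveler's preference keywords. The linear combination and the weight alpha are assumptions for illustration; the papers above prune candidates with upper/lower bounds rather than scoring every trajectory.

        import math

        def spatial_cost(query_pts, traj_pts):
            """Sum over query points of distance to the nearest trajectory point."""
            return sum(min(math.dist(q, p) for p in traj_pts) for q in query_pts)

        def text_cost(pref, attrs):
            if not (pref or attrs):
                return 0.0
            return 1.0 - len(pref & attrs) / len(pref | attrs)

        def uots_score(query_pts, pref, traj, alpha=0.5):
            return alpha * spatial_cost(query_pts, traj["points"]) \
                + (1 - alpha) * text_cost(pref, traj["attrs"])

        traj = {"points": [(0, 0), (1, 1), (2, 2)], "attrs": {"scenic", "coffee"}}
        print(uots_score([(0.1, 0.2), (1.9, 2.1)], {"scenic", "museum"}, traj))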

  1. Fast and accurate exhaled breath ammonia measurement.

    Science.gov (United States)

    Solga, Steven F; Mudalel, Matthew L; Spacek, Lisa A; Risby, Terence H

    2014-06-11

    This exhaled breath ammonia method uses a fast and highly sensitive spectroscopic method known as quartz-enhanced photoacoustic spectroscopy (QEPAS) that uses a quantum cascade based laser. The monitor is coupled to a sampler that measures mouth pressure and carbon dioxide. The system is temperature controlled and specifically designed to address the reactivity of this compound. The sampler provides immediate feedback to the subject and the technician on the quality of the breath effort. Together with the quick response time of the monitor, this system is capable of accurately measuring exhaled breath ammonia representative of deep lung systemic levels. Because the system is easy to use and produces real-time results, it has enabled experiments to identify factors that influence measurements. For example, mouth rinse and oral pH reproducibly and significantly affect results and therefore must be controlled. Temperature and mode of breathing are other examples. As our understanding of these factors evolves, error is reduced, and clinical studies become more meaningful. This system is very reliable and individual measurements are inexpensive. The sampler is relatively inexpensive and quite portable, but the monitor is neither. This limits options for some clinical studies and provides a rationale for future innovations.

  2. Noninvasive hemoglobin monitoring: how accurate is enough?

    Science.gov (United States)

    Rice, Mark J; Gravenstein, Nikolaus; Morey, Timothy E

    2013-10-01

    Evaluating the accuracy of medical devices has traditionally been a blend of statistical analyses, at times without contextualizing the clinical application. There have been a number of recent publications on the accuracy of a continuous noninvasive hemoglobin measurement device, the Masimo Radical-7 Pulse Co-oximeter, focusing on the traditional statistical metrics of bias and precision. In this review, which contains material presented at the Innovations and Applications of Monitoring Perfusion, Oxygenation, and Ventilation (IAMPOV) Symposium at Yale University in 2012, we critically investigated these metrics as applied to the new technology, exploring what is required of a noninvasive hemoglobin monitor and whether the conventional statistics adequately answer our questions about clinical accuracy. We discuss the glucose error grid, well known in the glucose monitoring literature, and describe an analogous version for hemoglobin monitoring. This hemoglobin error grid can be used to evaluate the required clinical accuracy (±g/dL) of a hemoglobin measurement device to provide more conclusive evidence on whether to transfuse an individual patient. The important decision to transfuse a patient usually requires both an accurate hemoglobin measurement and a physiologic reason to elect transfusion. It is our opinion that the published accuracy data of the Masimo Radical-7 is not good enough to make the transfusion decision.
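
    A hypothetical illustration of the error-grid idea for hemoglobin: an error matters clinically when the reference and measured values fall on opposite sides of a transfusion trigger. The 7 g/dL trigger, the 1 g/dL band, and the zone labels below are assumptions for illustration, not the published grid.

        def grid_zone(reference, measured, trigger=7.0):
            """Classify a (reference, measured) Hb pair in g/dL."""
            if (reference < trigger) == (measured < trigger):
                return "A: same transfusion decision"
            if abs(reference - measured) < 1.0:
                return "B: decision differs, small error near the trigger"
            return "C: decision differs, clinically significant error"

        print(grid_zone(6.5, 6.9))   # A
        print(grid_zone(6.8, 7.3))   # B
        print(grid_zone(6.0, 9.0))   # C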

  3. Accurate free energy calculation along optimized paths.

    Science.gov (United States)

    Chen, Changjun; Xiao, Yi

    2010-05-01

    The path-based methods of free energy calculation, such as thermodynamic integration and free energy perturbation, are simple in theory, but difficult in practice because in most cases smooth paths do not exist, especially for large molecules. In this article, we present a novel method to build the transition path of a peptide. We use harmonic potentials to restrain its nonhydrogen atom dihedrals in the initial state and set the equilibrium angles of the potentials as those in the final state. Through a series of steps of geometrical optimization, we can construct a smooth and short path from the initial state to the final state. This path can be used to calculate free energy difference. To validate this method, we apply it to a small 10-ALA peptide and find that the calculated free energy changes in helix-helix and helix-hairpin transitions are both self-convergent and cross-convergent. We also calculate the free energy differences between different stable states of beta-hairpin trpzip2, and the results show that this method is more efficient than the conventional molecular dynamics method in accurate free energy calculation.
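
    The path-based estimate referred to above is the standard thermodynamic-integration identity (textbook form; the paper's contribution is the construction of the restrained path itself):

        \Delta F = F_1 - F_0
                 = \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda}
                   \right\rangle_{\lambda} d\lambda , \qquad
        H(\lambda) = (1-\lambda)\, H_0 + \lambda\, H_1 ,

    where the average is taken over an equilibrium ensemble at fixed λ; a smooth, short path keeps the integrand well behaved and the estimate convergent.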

  4. Accurate fission data for nuclear safety

    CERN Document Server

    Solders, A; Jokinen, A; Kolhinen, V S; Lantz, M; Mattera, A; Penttila, H; Pomp, S; Rakopoulos, V; Rinta-Antila, S

    2013-01-01

    The Accurate fission data for nuclear safety (AlFONS) project aims at high-precision measurements of fission yields, using the renewed IGISOL mass separator facility in combination with a new high-current light-ion cyclotron at the University of Jyvaskyla. The 30 MeV proton beam will be used to create fast and thermal neutron spectra for the study of neutron-induced fission yields. Thanks to a series of mass separating elements, culminating with the JYFLTRAP Penning trap, it is possible to achieve a mass resolving power on the order of a few hundred thousand. In this paper we present the experimental setup and the design of a neutron converter target for IGISOL. The goal is to have a flexible design: for studies of exotic nuclei far from stability a high neutron flux (10^12 neutrons/s) at energies of 1-30 MeV is desired, while for reactor applications neutron spectra that resemble those of thermal and fast nuclear reactors are preferred. It is also desirable to be able to produce (semi-)monoenergetic neutrons...

  5. Towards Accurate Modeling of Moving Contact Lines

    CERN Document Server

    Holmgren, Hanna

    2015-01-01

    A main challenge in numerical simulations of moving contact line problems is that the adherence, or no-slip boundary condition leads to a non-integrable stress singularity at the contact line. In this report we perform the first steps in developing the macroscopic part of an accurate multiscale model for a moving contact line problem in two space dimensions. We assume that a micro model has been used to determine a relation between the contact angle and the contact line velocity. An intermediate region is introduced where an analytical expression for the velocity exists. This expression is used to implement boundary conditions for the moving contact line at a macroscopic scale, along a fictitious boundary located a small distance away from the physical boundary. Model problems where the shape of the interface is constant throughout the simulation are introduced. For these problems, experiments show that the errors in the resulting contact line velocities converge with the grid size $h$ at a rate of convergence $...

  6. Does a pneumotach accurately characterize voice function?

    Science.gov (United States)

    Walters, Gage; Krane, Michael

    2016-11-01

    A study is presented which addresses how a pneumotach might adversely affect clinical measurements of voice function. A pneumotach is a device, typically a mask, worn over the mouth, in order to measure time-varying glottal volume flow. By measuring the time-varying difference in pressure across a known aerodynamic resistance element in the mask, the glottal volume flow waveform is estimated. Because it adds aerodynamic resistance to the vocal system, there is some concern that using a pneumotach may not accurately portray the behavior of the voice. To test this hypothesis, experiments were performed in a simplified airway model with the principal dimensions of an adult human upper airway. A compliant constriction, fabricated from silicone rubber, modeled the vocal folds. Variations of transglottal pressure, time-averaged volume flow, model vocal fold vibration amplitude, and radiated sound with subglottal pressure were performed, with and without the pneumotach in place, and differences noted. We acknowledge support of NIH Grant 2R01DC005642-10A1.
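
    The flow estimate described here is a direct application of the pressure-drop relation Q(t) = dP(t)/R across the known resistance element; the resistance value and pressure samples below are made-up numbers for illustration.

```python
# Estimate glottal volume flow from the pressure drop across a known
# aerodynamic resistance, Q(t) = dP(t) / R. Values are illustrative only.
R = 0.06  # assumed resistance, kPa per (L/s)
pressure_drop = [0.010, 0.024, 0.031, 0.018, 0.006]  # kPa, sampled over time
flow = [dp / R for dp in pressure_drop]              # L/s
print(["%.3f" % q for q in flow])
```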

  7. Accurate lineshape spectroscopy and the Boltzmann constant.

    Science.gov (United States)

    Truong, G-W; Anstie, J D; May, E F; Stace, T M; Luiten, A N

    2015-10-14

    Spectroscopy has an illustrious history delivering serendipitous discoveries and providing a stringent testbed for new physical predictions, including applications from trace materials detection, to understanding the atmospheres of stars and planets, and even constraining cosmological models. Reaching fundamental-noise limits permits optimal extraction of spectroscopic information from an absorption measurement. Here, we demonstrate a quantum-limited spectrometer that delivers high-precision measurements of the absorption lineshape. These measurements yield a very accurate value for the excited-state (6P1/2) hyperfine splitting in Cs and reveal a breakdown in the well-known Voigt spectral profile. We develop a theoretical model that accounts for this breakdown, explaining the observations to within the shot-noise limit. Our model enables us to infer the thermal velocity dispersion of the Cs vapour with an uncertainty of 35 p.p.m. within an hour. This allows us to determine a value for Boltzmann's constant with a precision of 6 p.p.m., and an uncertainty of 71 p.p.m.
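
    For context, the Voigt profile whose breakdown is reported here is the convolution of a Gaussian (Doppler) and a Lorentzian (pressure/lifetime) component. A standard way to evaluate it numerically uses the Faddeeva function, as in this sketch (the linewidth parameters are arbitrary; requires SciPy):

```python
import numpy as np
from scipy.special import wofz

def voigt_profile(x, sigma, gamma):
    """Voigt profile: Gaussian (std sigma) convolved with Lorentzian
    (half-width gamma), evaluated via the Faddeeva function wofz."""
    z = (x + 1j * gamma) / (sigma * np.sqrt(2.0))
    return np.real(wofz(z)) / (sigma * np.sqrt(2.0 * np.pi))

x = np.linspace(-5.0, 5.0, 11)  # detuning grid, arbitrary units
print(voigt_profile(x, sigma=1.0, gamma=0.5))
```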

  8. Accurate thermoplasmonic simulation of metallic nanoparticles

    Science.gov (United States)

    Yu, Da-Miao; Liu, Yan-Nan; Tian, Fa-Lin; Pan, Xiao-Min; Sheng, Xin-Qing

    2017-01-01

    Thermoplasmonics leads to enhanced heat generation due to the localized surface plasmon resonances. The measurement of heat generation is fundamentally a complicated task, which necessitates the development of theoretical simulation techniques. In this paper, an efficient and accurate numerical scheme is proposed for applications with complex metallic nanostructures. Light absorption and temperature increase are, respectively, obtained by solving the volume integral equation (VIE) and the steady-state heat diffusion equation through the method of moments (MoM). Previously, methods based on surface integral equations (SIEs) were utilized to obtain light absorption. However, computing light absorption from the equivalent current is as expensive as O(NsNv), where Ns and Nv, respectively, denote the number of surface and volumetric unknowns. Our approach reduces the cost to O(Nv) by using VIE. The accuracy, efficiency and capability of the proposed scheme are validated by multiple simulations. The simulations show that our proposed method is more efficient than the approach based on SIEs under comparable accuracy, especially for cases where many incident fields are of interest. The simulations also indicate that the temperature profile can be tuned by several factors, such as the geometric configuration of the array, beam direction, and light wavelength.

  9. Fast and Provably Accurate Bilateral Filtering.

    Science.gov (United States)

    Chaudhury, Kunal N; Dabhade, Swapnil D

    2016-06-01

    The bilateral filter is a non-linear filter that uses a range filter along with a spatial filter to perform edge-preserving smoothing of images. A direct computation of the bilateral filter requires O(S) operations per pixel, where S is the size of the support of the spatial filter. In this paper, we present a fast and provably accurate algorithm for approximating the bilateral filter when the range kernel is Gaussian. In particular, for box and Gaussian spatial filters, the proposed algorithm can cut down the complexity to O(1) per pixel for any arbitrary S. The algorithm has a simple implementation involving N+1 spatial filterings, where N is the approximation order. We give a detailed analysis of the filtering accuracy that can be achieved by the proposed approximation in relation to the target bilateral filter. This allows us to estimate the order N required to obtain a given accuracy. We also present comprehensive numerical results to demonstrate that the proposed algorithm is competitive with the state-of-the-art methods in terms of speed and accuracy.
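
    For reference, a direct O(S)-per-pixel bilateral filter with Gaussian spatial and range kernels looks like the sketch below; the paper's contribution is approximating this with N+1 spatial filterings, which is not reproduced here.

```python
import numpy as np

def bilateral_filter(img, spatial_sigma=2.0, range_sigma=0.1, radius=3):
    """Direct bilateral filter (O(S) per pixel) for a 2-D float image in [0, 1]."""
    h, w = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * spatial_sigma**2))  # spatial kernel
    padded = np.pad(img, radius, mode="edge")
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # range kernel: penalize intensity difference to the center pixel
            weights = spatial * np.exp(-(window - img[i, j])**2 / (2 * range_sigma**2))
            out[i, j] = np.sum(weights * window) / np.sum(weights)
    return out

noisy = np.clip(np.eye(8) + 0.1 * np.random.default_rng(0).standard_normal((8, 8)), 0, 1)
print(bilateral_filter(noisy).round(2))  # edges preserved, noise smoothed
```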

  10. Conditional Similarity Solutions of the Boussinesq Equation

    Institute of Scientific and Technical Information of China (English)

    TANG Xiao-Yan; LIN Ji; LOU Sen-Yue

    2001-01-01

    The direct method proposed by Clarkson and Kruskal is modified to obtain some conditional similarity solutions of a nonlinear physics model. Taking the (1+1)-dimensional Boussinesq equation as a simple example, six types of conditional similarity reductions are obtained.
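
    For reference, the (1+1)-dimensional Boussinesq equation appears in several sign conventions in the literature; one common constant-coefficient form (the coefficients and signs below are a representative assumption, not necessarily the form used in the paper) is:

```latex
% One common form of the (1+1)-dimensional Boussinesq equation;
% sign conventions for \alpha and \beta vary between papers.
u_{tt} = u_{xx} + \alpha\,\left(u^{2}\right)_{xx} + \beta\,u_{xxxx}
```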

  11. Identifying cis-regulatory sequences by word profile similarity.

    Directory of Open Access Journals (Sweden)

    Garmay Leung

    Full Text Available BACKGROUND: Recognizing regulatory sequences in genomes is a continuing challenge, despite a wealth of available genomic data and a growing number of experimentally validated examples. METHODOLOGY/PRINCIPAL FINDINGS: We discuss here a simple approach to search for regulatory sequences based on the compositional similarity of genomic regions and known cis-regulatory sequences. This method, which is not limited to searching for predefined motifs, recovers sequences known to be under similar regulatory control. The words shared by the recovered sequences often correspond to known binding sites. Furthermore, we show that although local word profile clustering is predictive for the regulatory sequences involved in blastoderm segmentation, local dissimilarity is a more universal feature of known regulatory sequences in Drosophila. CONCLUSIONS/SIGNIFICANCE: Our method leverages sequence motifs within a known regulatory sequence to identify co-regulated sequences without explicitly defining binding sites. We also show that regulatory sequences can be distinguished from surrounding sequences by local sequence dissimilarity, a novel feature in identifying regulatory sequences across a genome. Source code for WPH-finder is available for download at http://rana.lbl.gov/downloads/wph.tar.gz.
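
    A minimal sketch of the compositional ("word profile") idea, assuming fixed-length words (k-mers) and cosine similarity; the actual WPH-finder scoring is more involved, and the sequences below are synthetic.

```python
from collections import Counter
import math

def kmer_profile(seq, k=6):
    """Count all overlapping k-letter words in a DNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(p, q):
    """Cosine similarity between two sparse word-count profiles."""
    dot = sum(p[w] * q[w] for w in set(p) & set(q))
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

known_enhancer = "ACGTACGTTTGACGTCAAACGTACGT" * 4   # synthetic example
candidate = "TTGACGTCAAACGTACGTACGTACGT" * 4
print(round(cosine_similarity(kmer_profile(known_enhancer),
                              kmer_profile(candidate)), 3))
```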

  12. Contextual factors for finding similar experts

    NARCIS (Netherlands)

    K. Hofmann; K. Balog; T. Bogers; M. de Rijke

    2010-01-01

    Expertise-seeking research studies how people search for expertise and choose whom to contact in the context of a specific task. An important outcome are models that identify factors that influence expert finding. Expertise retrieval addresses the same problem, expert finding, but from a system-centered perspective.

  13. 2-Jump DNA Search Multiple Pattern Matching Algorithm

    OpenAIRE

    Raju Bhukya; D. V. L. N. Somayajulu

    2011-01-01

    Pattern matching in a DNA sequence or searching for a pattern in a large database is a major research area in computational biology. Extracting a matching pattern from a large sequence takes considerable time; in order to reduce the searching time, we have proposed an approach that reduces the search time while accurately retrieving the matched pattern in the sequence. As performance plays a major role in extracting patterns from a given DNA sequence or from a database independent of the size of the sequence....

  14. Shape Similarity Measures of Linear Entities

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The essence of feature matching technology lies in how to measure the similarity of spatial entities. Among all the possible similarity measures, the shape similarity measure is one of the most important because its necessary parameters are easy to collect and it matches human intuition well. In this paper a new shape similarity measure of linear entities, based on the differences of direction change along each line, is presented and its effectiveness is illustrated.
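
    A minimal sketch of comparing two polylines by their direction changes, assuming equal vertex counts and a simple mean absolute difference of turning angles; the measure defined in the paper differs in detail.

```python
import math

def direction_changes(points):
    """Turning angles (radians) at the interior vertices of a polyline."""
    headings = [math.atan2(y2 - y1, x2 - x1)
                for (x1, y1), (x2, y2) in zip(points, points[1:])]
    return [(b - a + math.pi) % (2 * math.pi) - math.pi
            for a, b in zip(headings, headings[1:])]

def shape_similarity(line_a, line_b):
    """Similarity in [0, 1] from the mean absolute turning-angle difference."""
    da, db = direction_changes(line_a), direction_changes(line_b)
    mean_diff = sum(abs(x - y) for x, y in zip(da, db)) / len(da)
    return 1.0 - mean_diff / math.pi

road = [(0, 0), (1, 0), (2, 1), (3, 1)]            # made-up coordinates
candidate = [(0, 0), (1, 0.1), (2, 1.0), (3, 1.2)]
print(round(shape_similarity(road, candidate), 3))
```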

  15. Similarity boosted quantitative structure-activity relationship--a systematic study of enhancing structural descriptors by molecular similarity.

    Science.gov (United States)

    Girschick, Tobias; Almeida, Pedro R; Kramer, Stefan; Stålring, Jonna

    2013-05-24

    The concept of molecular similarity is one of the most central in the fields of predictive toxicology and quantitative structure-activity relationship (QSAR) research. Many toxicological responses result from a multimechanistic process and, consequently, structural diversity among the active compounds is likely. Combining this knowledge, we introduce similarity boosted QSAR modeling, where we calculate molecular descriptors using similarities with respect to representative reference compounds to aid a statistical learning algorithm in distinguishing between different structural classes. We present three approaches for the selection of reference compounds, one by literature search and two by clustering. Our experimental evaluation on seven publicly available data sets shows that the similarity descriptors used on their own perform quite well compared to structural descriptors. We show that the combination of similarity and structural descriptors enhances the performance and that a simple stacking approach is able to use the complementary information encoded by the different descriptor sets to further improve predictive results. All software necessary for our experiments is available within the cheminformatics software framework AZOrange.
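
    In the spirit of this approach, similarity descriptors can be built as a vector of fingerprint similarities to chosen reference compounds. The sketch below uses RDKit Morgan fingerprints and Tanimoto similarity as stand-ins; the paper's setup within AZOrange differs in detail, and the reference SMILES are arbitrary examples.

```python
# Requires RDKit. Each compound is described by its Tanimoto similarity
# to a few reference compounds, usable as extra QSAR descriptors.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smiles):
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

references = [fingerprint(s)
              for s in ("c1ccccc1O", "CCN(CC)CC", "CC(=O)Oc1ccccc1C(=O)O")]

def similarity_descriptors(smiles):
    """Vector of similarities to the reference compounds."""
    fp = fingerprint(smiles)
    return [DataStructs.TanimotoSimilarity(fp, ref) for ref in references]

print(similarity_descriptors("CC(=O)Nc1ccc(O)cc1"))  # paracetamol-like example
```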

  16. Ghosts of the Milky Way: a search for topology in new quasar catalogues

    Science.gov (United States)

    Weatherley, S. J.; Warren, S. J.; Croom, S. M.; Smith, R. J.; Boyle, B. J.; Shanks, T.; Miller, L.; Baltovic, M. P.

    2003-06-01

    We revisit the possibility that we inhabit a compact multi-connected flat, or nearly flat, Universe. Analysis of COBE data has shown that, for such a case, the size of the fundamental domain must be a substantial fraction of the horizon size. Nevertheless, there could be several copies of the Universe within the horizon. If the Milky Way was once a quasar we might detect its `ghost' images. Using new large quasar catalogues we repeat the search by Fagundes & Wichoski for antipodal quasar pairs. By applying linear theory to account for the peculiar velocity of the Local Group, we are able to narrow the search radius to 134 arcsec. We find seven candidate antipodal quasar pairs within this search radius. However, a similar number would be expected by chance. We argue that, even with larger quasar catalogues, and more accurate values of the cosmological parameters, it is unlikely to be possible to identify putative ghost pairs unambiguously, because of the uncertainty of the correction for peculiar motion of the Milky Way.
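
    The antipodal test itself is simple spherical geometry: compare each quasar with the antipode of another and keep pairs separated by less than the search radius. A sketch under that reading (coordinates are made up):

```python
import numpy as np

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation (degrees) between two sky positions."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (np.sin(dec1) * np.sin(dec2)
               + np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2))
    return np.degrees(np.arccos(np.clip(cos_sep, -1.0, 1.0)))

def is_antipodal_pair(q1, q2, radius_arcsec=134.0):
    """True if q2 lies within radius_arcsec of the antipode of q1."""
    ra, dec = q1
    antipode = ((ra + 180.0) % 360.0, -dec)
    return angular_separation(*antipode, *q2) * 3600.0 < radius_arcsec

print(is_antipodal_pair((10.0, 25.0), (190.02, -25.01)))  # True: ~75 arcsec apart
```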

  17. Ghosts of the Milky Way: a search for topology in new quasar catalogues

    CERN Document Server

    Weatherley, S J; Croom, S M; Smith, R J; Boyle, B J; Shanks, T; Millar, L; Baltovic, M P

    2003-01-01

    We revisit the possibility that we inhabit a compact multi-connected flat, or nearly-flat, Universe. Analysis of COBE data has shown that, for such a case, the size of the fundamental domain must be a substantial fraction of the horizon size. Nevertheless, there could be several copies of the Universe within the horizon. If the Milky Way was once a quasar we might detect its `ghost' images. Using new large quasar catalogues we repeat the search by Fagundes & Wichoski for antipodal quasar pairs. By applying linear theory to account for the peculiar velocity of the local group, we are able to narrow the search radius to 134 arcsec. We find seven candidate antipodal quasar pairs within this search radius. However, a similar number would be expected by chance. We argue that, even with larger quasar catalogues, and more accurate values of the cosmological parameters, it is unlikely to be possible to identify putative ghost pairs unambiguously, because of the uncertainty of the correction for peculiar motion of...

  18. Edge-SIFT: discriminative binary descriptor for scalable partial-duplicate mobile search.

    Science.gov (United States)

    Zhang, Shiliang; Tian, Qi; Lu, Ke; Huang, Qingming; Gao, Wen

    2013-07-01

    As the basis of large-scale partial duplicate visual search on mobile devices, image local descriptor is expected to be discriminative, efficient, and compact. Our study shows that the popularly used histogram-based descriptors, such as scale invariant feature transform (SIFT) are not optimal for this task. This is mainly because histogram representation is relatively expensive to compute on mobile platforms and loses significant spatial clues, which are important for improving discriminative power and matching near-duplicate image patches. To address these issues, we propose to extract a novel binary local descriptor named Edge-SIFT from the binary edge maps of scale- and orientation-normalized image patches. By preserving both locations and orientations of edges and compressing the sparse binary edge maps with a boosting strategy, the final Edge-SIFT shows strong discriminative power with compact representation. Furthermore, we propose a fast similarity measurement and an indexing framework with flexible online verification. Hence, the Edge-SIFT allows an accurate and efficient image search and is ideal for computation sensitive scenarios such as a mobile image search. Experiments on a large-scale dataset manifest that the Edge-SIFT shows superior retrieval accuracy to Oriented BRIEF (ORB) and is superior to SIFT in the aspects of retrieval precision, efficiency, compactness, and transmission cost.
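
    A practical advantage of a binary descriptor like Edge-SIFT is that similarity reduces to a Hamming distance, which is just an XOR plus a popcount. A generic brute-force sketch (not the paper's indexing framework; the descriptors are random stand-ins):

```python
import numpy as np

def hamming_distance(a, b):
    """Hamming distance between two binary descriptors stored as uint8 arrays."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def match(query, database, max_distance=40):
    """Brute-force nearest neighbour under Hamming distance."""
    best_idx, best_d = -1, max_distance + 1
    for i, desc in enumerate(database):
        d = hamming_distance(query, desc)
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx, best_d

rng = np.random.default_rng(1)
db = [rng.integers(0, 256, 32, dtype=np.uint8) for _ in range(100)]  # 256-bit descriptors
query = db[42] ^ np.uint8(1)  # near-duplicate of entry 42 (one bit per byte flipped)
print(match(query, db))       # finds index 42
```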

  19. Myanmar Language Search Engine

    Directory of Open Access Journals (Sweden)

    Pann Yu Mon

    2011-03-01

    Full Text Available With the enormous growth of the World Wide Web, search engines play a critical role in retrieving information from the borderless Web. Although many search engines are available for the major languages, they are not very proficient for less computerized languages, including Myanmar. The main reason is that those search engines do not consider the specific features of those languages. A search engine which is capable of searching Web documents written in those languages is highly needed, especially when more and more Web sites are coming up with localized content in multiple languages. In this study, the design and the architecture of a language-specific search engine for the Myanmar language are proposed. The main features of the system are: (1) it can search multiple encodings of Myanmar Web pages, and (2) it is designed to comply with the specific features of the Myanmar language. Finally, an experiment was performed to verify that the system meets the design requirements.

  20. Skewed Binary Search Trees

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Moruz, Gabriel

    2006-01-01

    It is well-known that to minimize the number of comparisons a binary search tree should be perfectly balanced. Previous work has shown that a dominating factor over the running time for a search is the number of cache faults performed, and that an appropriate memory layout of a binary search tree can reduce the number of cache faults by several hundred percent. Motivated by the fact that during a search branching to the left or right at a node does not necessarily have the same cost, e.g. because of branch prediction schemes, we in this paper study the class of skewed binary search trees. For all nodes in a skewed binary search tree the ratio between the size of the left subtree and the size of the tree is a fixed constant (a ratio of 1/2 gives perfectly balanced trees). In this paper we present an experimental study of various memory layouts of static skewed binary search trees, where each
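
    A minimal sketch of building a static skewed binary search tree from a sorted array, where the skew factor fixes the ratio of the left subtree size to the tree size (0.5 reproduces the perfectly balanced tree); the skew value below is an arbitrary choice.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build_skewed(keys, skew=0.3):
    """Build a static skewed BST from sorted keys: the left subtree always
    holds a fixed fraction `skew` of the remaining nodes."""
    if not keys:
        return None
    split = int(skew * (len(keys) - 1))
    return Node(keys[split],
                build_skewed(keys[:split], skew),
                build_skewed(keys[split + 1:], skew))

def depth_of(root, key, depth=0):
    if root.key == key:
        return depth
    child = root.left if key < root.key else root.right
    return depth_of(child, key, depth + 1)

tree = build_skewed(list(range(1023)), skew=0.3)
# Small keys sit on cheap, shallow left paths; large keys on deeper right paths.
print(depth_of(tree, 0), depth_of(tree, 1022))
```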

  1. Accurate ionization potential of semiconductors from efficient density functional calculations

    Science.gov (United States)

    Ye, Lin-Hui

    2016-07-01

    Despite its huge successes in total-energy-related applications, the Kohn-Sham scheme of density functional theory cannot get reliable single-particle excitation energies for solids. In particular, it has not been able to calculate the ionization potential (IP), one of the most important material parameters, for semiconductors. We illustrate that an approximate exact-exchange optimized effective potential (EXX-OEP), the Becke-Johnson exchange, can be used to largely solve this long-standing problem. For a group of 17 semiconductors, we have obtained the IPs to an accuracy similar to that of the much more sophisticated GW approximation (GWA), with the computational cost of only the local-density approximation/generalized gradient approximation. The EXX-OEP, therefore, is likely as useful for solids as for finite systems. For solid surfaces, the asymptotic behavior of the exchange-correlation potential v_xc has effects similar to those in finite systems which, when neglected, typically cause the semiconductor IPs to be underestimated. This may partially explain why the standard GWA systematically underestimates the IPs and why using the same GWA procedures has not been able to get an accurate IP and band gap at the same time.

  2. Towards Accurate Application Characterization for Exascale (APEX)

    Energy Technology Data Exchange (ETDEWEB)

    Hammond, Simon David [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    Sandia National Laboratories has been engaged in hardware and software codesign activities for a number of years; indeed, it might be argued that prototyping of clusters as far back as the CPLANT machines and many large capability resources including ASCI Red and RedStorm were examples of codesigned solutions. As the research supporting our codesign activities has moved closer to investigating on-node runtime behavior, a natural hunger has grown for detailed analysis of both hardware and algorithm performance from the perspective of low-level operations. The Application Characterization for Exascale (APEX) LDRD was a project conceived to address some of these concerns. Primarily, the research was intended to focus on generating accurate and reproducible low-level performance metrics using tools that could scale to production-class code bases. Alongside this research was an advocacy and analysis role associated with evaluating tools for production use, working with leading industry vendors to develop and refine solutions required by our code teams, and directly engaging with production code developers to form a context for the application analysis and a bridge to the research community within Sandia. On each of these accounts significant progress has been made, particularly, as this report will cover, in the low-level analysis of operations for important classes of algorithms. This report summarizes the development of a collection of tools under the APEX research program and leaves to other SAND and L2 milestone reports the description of codesign progress with Sandia’s production users/developers.

  3. Optimizing cell arrays for accurate functional genomics

    Directory of Open Access Journals (Sweden)

    Fengler Sven

    2012-07-01

    Full Text Available Abstract Background Cellular responses emerge from a complex network of dynamic biochemical reactions. In order to investigate them, it is necessary to develop methods that allow perturbing a high number of gene products in a flexible and fast way. Cell arrays (CA) enable such experiments on microscope slides via reverse transfection of cellular colonies growing on spotted genetic material. In contrast to multi-well plates, CA are susceptible to contamination among neighboring spots, hindering accurate quantification in cell-based screening projects. Here we have developed a quality control protocol for quantifying and minimizing contamination in CA. Results We imaged checkered CA that express two distinct fluorescent proteins and segmented images into single cells to quantify the transfection efficiency and interspot contamination. Compared with standard procedures, we measured a 3-fold reduction of contaminants when arrays containing HeLa cells were washed shortly after cell seeding. We proved that nucleic acid uptake during cell seeding rather than migration among neighboring spots was the major source of contamination. Arrays of MCF7 cells developed without the washing step showed a 7-fold lower percentage of contaminant cells, demonstrating that contamination depends on specific cell properties. Conclusions Previously published methodological works have focused on achieving a high transfection rate in densely packed CA. Here, we focused on an equally important parameter: the interspot contamination. The presented quality control is essential for estimating the rate of contamination, a major source of false positives and negatives in current microscopy-based functional genomics screenings. We have demonstrated that a washing step after seeding enhances CA quality for HeLa cells but is not necessary for MCF7. The described method provides a way to find optimal seeding protocols for cell lines intended to be used for the first time in CA.

  4. Accurate paleointensities - the multi-method approach

    Science.gov (United States)

    de Groot, Lennart

    2016-04-01

    The accuracy of models describing rapid changes in the geomagnetic field over the past millennia critically depends on the availability of reliable paleointensity estimates. Over the past decade methods to derive paleointensities from lavas (the only recorder of the geomagnetic field that is available all over the globe and through geologic times) have seen significant improvements and various alternative techniques were proposed. The 'classical' Thellier-style approach was optimized and selection criteria were defined in the 'Standard Paleointensity Definitions' (Paterson et al, 2014). The Multispecimen approach was validated, and the importance of additional tests and criteria to assess Multispecimen results must be emphasized. Recently, a non-heating, relative paleointensity technique was proposed, the pseudo-Thellier protocol, which shows great potential in both accuracy and efficiency but currently lacks a solid theoretical underpinning. Here I present work using all three of the aforementioned paleointensity methods on suites of young lavas taken from the volcanic islands of Hawaii, La Palma, Gran Canaria, Tenerife, and Terceira. Many of the sampled cooling units are <100 years old; the actual field strength at the time of cooling is therefore reasonably well known. Rather intuitively, flows that produce coherent results from two or more different paleointensity methods yield the most accurate estimates of the paleofield. Furthermore, the results for some flows pass the selection criteria for one method, but fail in other techniques. Scrutinizing and combining all acceptable results yielded reliable paleointensity estimates for 60-70% of all sampled cooling units - an exceptionally high success rate. This 'multi-method paleointensity approach' therefore has high potential to provide the much-needed paleointensities to improve geomagnetic field models for the Holocene.

  5. How flatbed scanners upset accurate film dosimetry.

    Science.gov (United States)

    van Battum, L J; Huizenga, H; Verdaasdonk, R M; Heukelom, S

    2016-01-21

    Film is an excellent dosimeter for verification of dose distributions due to its high spatial resolution. Irradiated film can be digitized with low-cost, transmission, flatbed scanners. However, a disadvantage is their lateral scan effect (LSE): a scanner readout change over its lateral scan axis. Although anisotropic light scattering was presented as the origin of the LSE, this paper presents an alternative cause. To this end, the LSE for two flatbed scanners (Epson 1680 Expression Pro and Epson 10000XL) and Gafchromic film (EBT, EBT2, EBT3) was investigated, focused on three effects: cross talk, optical path length and polarization. Cross talk was examined using triangular sheets of various optical densities. The optical path length effect was studied using absorptive and reflective neutral density filters with well-defined optical characteristics (OD range 0.2-2.0). Linear polarizer sheets were used to investigate light polarization on the CCD signal in absence and presence of (un)irradiated Gafchromic film. Film dose values ranged from 0.2 to 9 Gy, i.e. an optical density range of 0.25 to 1.1. Measurements were performed in the scanner's transmission mode, with red-green-blue channels. The LSE was found to depend on scanner construction and film type. Its magnitude depends on dose: for 9 Gy it increases up to 14% at the maximum lateral position. Cross talk was only significant in high-contrast regions, up to 2% for very small fields. The optical path length effect introduced by film on the scanner causes a 3% effect for pixels at the extreme lateral position. Light polarization due to film and the scanner's optical mirror system is the main contributor, different in magnitude for the red, green and blue channels. We concluded that any Gafchromic EBT-type film scanned with a flatbed scanner will face these optical effects. Accurate dosimetry requires correction of the LSE and, therefore, determination of the LSE per color channel and per dose delivered to the film.

  6. Improved Gravitational Search Algorithm (GSA Using Fuzzy Logic

    Directory of Open Access Journals (Sweden)

    Omid Mokhlesi

    2013-04-01

    Full Text Available Researchers' tendency to use different kinds of collective intelligence as search methods to optimize complex engineering problems has increased because of the high performance of these algorithms. The gravitational search algorithm (GSA) is among them. This algorithm is inspired by Newton's laws of physics and gravitational attraction. Random masses are the agents that search the space. This paper presents a new fuzzy-population GSA model called FPGSA. The proposed method is a combination of a parametric fuzzy controller and the gravitational search algorithm, so that the space is searched in a reasoned and accurate manner. In collective intelligence algorithms, population size influences the final answer: with a large population a better response is obtained, but the algorithm execution time is longer. To overcome this problem, a new parameter called the dispersion coefficient is added to the algorithm. Implementation results show that by controlling this factor, system performance can be improved.
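
    A compact sketch of one textbook GSA iteration on a toy objective; this is the standard algorithm, not the paper's fuzzy-controlled FPGSA variant (the dispersion coefficient and fuzzy parameter control are not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):                                        # toy objective to minimize
    return np.sum(x**2, axis=1)

n_agents, dim, iters, G0, eps = 20, 2, 100, 100.0, 1e-9
X = rng.uniform(-5, 5, (n_agents, dim))               # agent positions
V = np.zeros_like(X)

for t in range(iters):
    fit = sphere(X)
    best, worst = fit.min(), fit.max()
    m = (worst - fit) / (worst - best + eps)          # raw masses (better = heavier)
    M = m / m.sum()                                   # normalized masses
    G = G0 * np.exp(-20.0 * t / iters)                # decaying gravity constant
    A = np.zeros_like(X)
    for i in range(n_agents):                         # total force -> acceleration
        for j in range(n_agents):
            if i != j:
                R = np.linalg.norm(X[i] - X[j])
                A[i] += rng.random() * G * M[j] * (X[j] - X[i]) / (R + eps)
    V = rng.random(X.shape) * V + A                   # velocity and position update
    X = X + V

print("best solution:", X[np.argmin(sphere(X))].round(3))
```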

  7. Performance Evaluation of search engines via user effort measures

    Directory of Open Access Journals (Sweden)

    Rajesh Kumar Goutam

    2012-07-01

    Full Text Available Many metrics exist to perform the task of search engine evaluation; they rely either on expert judgments or on searchers' decisions about the relevancy of web documents. However, search logs can provide us information about how real users search. This paper explains our attempt to incorporate users' searching behavior into the formulation of a user-effort-centric evaluation metric. We also incorporate a two-dimensional user-traversal approach into the ERR (expected reciprocal rank) metric. After formulating the evaluation metric, the authors judge its goodness and find that the presented metric fulfills all the requirements needed for a metric to be mathematically sound. The findings obtained from the experiments present a complete description of the search engine evaluation procedure.
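
    For context, the standard cascade-model ERR metric mentioned above can be computed from graded relevance as the expected reciprocal rank at which a user stops; a minimal implementation of that standard definition (not the authors' two-dimensional extension) is sketched below.

```python
def err(grades, max_grade=4):
    """Expected Reciprocal Rank for a ranked list of relevance grades.
    Stop probability at a result with grade g is (2^g - 1) / 2^max_grade."""
    p_continue, score = 1.0, 0.0
    for rank, g in enumerate(grades, start=1):
        r = (2**g - 1) / 2**max_grade
        score += p_continue * r / rank     # user reaches rank, stops here
        p_continue *= 1.0 - r              # user continues past this rank
    return score

print(round(err([4, 2, 0, 1]), 4))  # graded relevance of the top 4 results
```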

  8. Analysis of search in an online clinical laboratory manual.

    Science.gov (United States)

    Blechner, Michael; Kish, Joshua; Chadaga, Vivek; Dighe, Anand S

    2006-08-01

    Online laboratory manuals have developed into an important gateway to the laboratory. Clinicians increasingly expect up-to-date laboratory test information to be readily available online. During the past decade, sophisticated Internet search technology has developed, permitting rapid and accurate retrieval of a wide variety of content. We studied the role of search in an online laboratory manual. We surveyed the utilization of search technology in publicly available online manuals and examined how users interact with the search feature of a laboratory handbook. We show how a laboratory can improve its online handbook through insights gained by collecting information about each user's activity. We also discuss future applications for search-related technologies and the potential role of the online laboratory manual as the primary laboratory information portal.

  9. Quantum searching application in search based software engineering

    Science.gov (United States)

    Wu, Nan; Song, FangMin; Li, Xiangdong

    2013-05-01

    The Search Based Software Engineering (SBSE) approach is widely used in software engineering for identifying optimal solutions. However, traditional SBSE algorithms have no polynomial-time solution, which makes their cost very high. In this paper, we analyze and compare several quantum search algorithms that could be applied to SBSE: the quantum adiabatic evolution searching algorithm, fixed-point quantum search (FPQS), quantum walks, and a rapid modified Grover quantum searching method. Grover's algorithm is considered the best choice for large-scale unstructured data searching; theoretically, it is applicable to any search-space structure and any type of search problem.
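
    A small classical simulation of Grover's algorithm illustrates why it is attractive for unstructured search: after roughly (pi/4)*sqrt(N) iterations of oracle-plus-diffusion, the marked state dominates the amplitude. This is a generic sketch, not a statement about any particular SBSE encoding.

```python
import numpy as np

n_qubits = 6
N = 2**n_qubits
marked = 42                                   # index of the sought solution

state = np.full(N, 1.0 / np.sqrt(N))          # uniform superposition
iterations = int(np.pi / 4 * np.sqrt(N))

for _ in range(iterations):
    state[marked] *= -1.0                     # oracle: flip the marked amplitude
    state = 2.0 * state.mean() - state        # diffusion: inversion about the mean

print(iterations, "iterations; P(marked) =", round(state[marked]**2, 4))
```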

  10. AREAL FEATURE MATCHING BASED ON SIMILARITY USING CRITIC METHOD

    Directory of Open Access Journals (Sweden)

    J. Kim

    2015-10-01

    Full Text Available In this paper, we propose an areal feature matching method that can be applied for many-to-many matching, which involves matching a simple entity with an aggregate of several polygons, or two aggregates of several polygons, with less user intervention. To this end, an affine transformation is applied to two datasets by using polygon pairs for which the building name is the same. Then, the two datasets are overlaid, and intersecting polygon pairs are selected as candidate matching pairs. If many polygons intersect at this time, we calculate the inclusion function between such polygons. When the value is more than 0.4, the polygons are aggregated into a single polygon by using a convex hull. Finally, the shape similarity is calculated between the candidate pairs as the linear sum of the position similarity, shape-ratio similarity, and overlap similarity, with the weights computed by the CRITIC method. The candidate pairs for which the shape similarity is more than 0.7 are determined to be matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map in South Korea. As a result, the visual evaluation showed that matching polygons were well detected by the proposed method. The statistical evaluation indicates that the proposed method is accurate when using our test dataset, with a high F-measure of 0.91.
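
    The CRITIC weights used here derive from each criterion's contrast (its standard deviation) and its conflict with the other criteria (one minus the pairwise correlation). A generic sketch with made-up similarity scores:

```python
import numpy as np

def critic_weights(scores):
    """CRITIC weights for a criteria matrix (rows: candidate pairs; columns:
    criteria such as position / shape-ratio / overlap similarity)."""
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    z = (scores - lo) / (hi - lo)               # normalize each criterion to [0, 1]
    std = z.std(axis=0, ddof=1)                 # contrast of each criterion
    corr = np.corrcoef(z, rowvar=False)
    conflict = (1.0 - corr).sum(axis=0)         # disagreement with other criteria
    c = std * conflict                          # information carried by criterion
    return c / c.sum()

scores = np.array([[0.9, 0.8, 0.95],            # illustrative candidate pairs
                   [0.4, 0.7, 0.50],
                   [0.7, 0.2, 0.80],
                   [0.6, 0.9, 0.40]])
w = critic_weights(scores)
print("weights:", w.round(3), "combined similarity:", (scores @ w).round(3))
```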

  11. Predicting the evolution of complex networks via similarity dynamics

    Science.gov (United States)

    Wu, Tao; Chen, Leiting; Zhong, Linfeng; Xian, Xingping

    2017-01-01

    Almost all real-world networks are subject to constant evolution, and plenty of them have been investigated empirically to uncover the underlying evolution mechanism. However, the evolution prediction of dynamic networks still remains a challenging problem. The crux of this matter is to estimate the future network links of dynamic networks. This paper studies the evolution prediction of dynamic networks within the link prediction paradigm. To estimate the likelihood of the existence of links more accurately, an effective and robust similarity index is presented by adaptively exploiting the network structure. Moreover, most of the existing link prediction methods do not make a clear distinction between future links and missing links. In order to predict the future links, the networks are regarded as dynamic systems in this paper, and a similarity updating method, the spatial-temporal position drift model, is developed to simulate the evolutionary dynamics of node similarity. The updated similarities are then used as input information for estimating the likelihood of future links. Extensive experiments on real-world networks suggest that the proposed similarity index performs better than baseline methods and that the position drift model performs well for evolution prediction in real-world evolving networks.
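
    As a baseline for the likelihood-estimation step, classical structural indices such as resource allocation score a node pair by its common neighbours. A minimal sketch on a toy graph (the paper's adaptive index and drift model are not reproduced here):

```python
from collections import defaultdict

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def resource_allocation(u, v):
    """Resource-allocation index: sum of 1/degree over common neighbours.
    Higher scores suggest a higher likelihood of a (future) link u-v."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])

# score all currently unlinked pairs and rank them
nodes = sorted(adj)
candidates = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]
              if v not in adj[u]]
for u, v in sorted(candidates, key=lambda p: -resource_allocation(*p)):
    print(u, v, round(resource_allocation(u, v), 3))
```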

  12. Improving Search Engine Reliability

    Science.gov (United States)

    Pruthi, Jyoti; Kumar, Ela

    2010-11-01

    Search engines on the Internet are used daily to access and find information. While these services provide an easy way to find information globally, they also suffer from artificially created false results. This paper describes two techniques that are being used to manipulate search engines: spam pages (used to achieve higher rankings on the result page) and cloaking (used to feed falsified data into search engines). This paper also describes two proposed methods to fight this kind of misuse, with algorithms for both of the aforementioned cases of spamdexing.

  13. ElasticSearch server

    CERN Document Server

    Rogozinski, Marek

    2014-01-01

    This book is a detailed, practical, hands-on guide packed with real-life scenarios and examples which will show you how to implement an ElasticSearch search engine on your own websites. If you are a web developer or a user who wants to learn more about ElasticSearch, then this is the book for you. You do not need to know anything about ElasticSearch, Java, or Apache Lucene in order to use this book, though basic knowledge about databases and queries is required.

  14. Stability of similarity measurements for bipartite networks

    CERN Document Server

    Liu, Jian-Guo; Pan, Xue; Guo, Qiang; Zhou, Tao

    2015-01-01

    Similarity is a fundamental measure in network analyses and machine learning algorithms, with wide applications ranging from personalized recommendation to socio-economic dynamics. We argue that an effective similarity measurement should guarantee stability even under some information loss. With six bipartite networks, we investigate the stabilities of fifteen similarity measurements by comparing the similarity matrices of two data samples randomly divided from the original data sets. Results show that the fifteen measurements can be well classified into three clusters according to their stabilities, and measurements in the same cluster have similar mathematical definitions. In addition, we develop a top-$n$-stability method for personalized recommendation, and find that the unstable similarities would recommend false information to users, and that the performance of recommendation would be largely improved by using stable similarity measurements. This work provides a novel dimension to analyze and eval...
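
    The stability test described here can be sketched in a few lines: split the events of a bipartite data set in half at random, compute the similarity matrix on each half, and correlate the two matrices. A toy version with cosine similarity on synthetic data (the paper evaluates fifteen measures on real networks):

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = (rng.random((200, 50)) < 0.2).astype(float)   # users x items bipartite data

def cosine_matrix(m):
    """Item-item cosine similarity matrix from a users-x-items matrix."""
    u = m / (np.linalg.norm(m, axis=0, keepdims=True) + 1e-12)
    return u.T @ u

mask = rng.random(ratings.shape) < 0.5                   # random split of the events
s1 = cosine_matrix(ratings * mask)
s2 = cosine_matrix(ratings * ~mask)

iu = np.triu_indices(ratings.shape[1], k=1)              # compare upper triangles
stability = np.corrcoef(s1[iu], s2[iu])[0, 1]
print("split-half stability of cosine similarity:", round(stability, 3))
```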

  15. Discovering More Accurate Frequent Web Usage Patterns

    CERN Document Server

    Bayir, Murat Ali; Cosar, Ahmet; Fidan, Guven

    2008-01-01

    Web usage mining is a type of web mining, which exploits data mining techniques to discover valuable information from navigation behavior of World Wide Web users. As in classical data mining, data preparation and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes the session reconstruction operation from server logs. Session reconstruction success directly affects the quality of the frequent patterns discovered in the next phase. In reactive web usage mining techniques, the source data is web server logs and the topology of the web pages served by the web server domain. Other kinds of information collected during the interactive browsing of web site by user, such as cookies or web logs containing similar information, are not used. The next phase of web usage mining is discovering frequent user navigation patterns. In this phase, pattern discovery methods are applied on the reconstructed sessions obtained in the first phas...

  16. An in-depth look at saccadic search in infancy.

    Science.gov (United States)

    Hessels, Roy S; Hooge, Ignace T C; Kemner, Chantal

    2016-06-01

    Two questions were posed in the present study: (1) Do infants search for discrepant items in the absence of instructions? We outline where previous research has been inconclusive in answering this question. (2) In what manner do infants search, and what are the fixation and saccade characteristics in saccadic search? A thorough characterization of saccadic search in infancy is of great importance as a reference for future eye-movement studies in infancy. We presented 10-month-old infants with 24 visual search displays in two separate sessions within two weeks. We report that infant saccadic search performance at 10 months is above what may be expected by our model of chance, and is dependent on the specific target. Infant fixation and saccade characteristics show similarities to adult fixation and saccade characteristics in saccadic search. All findings were highly consistent across two separate sessions on the group level. An examination of the reliability of saccadic search revealed that test-retest reliability for oculomotor characteristics was high, particularly for fixation duration. We suggest that future research into saccadic search in infancy adopt the presented model of chance as a baseline against which to compare search performance. Researchers investigating both the typical and atypical development of visual search may benefit from the presented results.

  17. Lie algebraic similarity transformed Hamiltonians for lattice model systems

    Science.gov (United States)

    Wahlen-Strothman, Jacob M.; Jiménez-Hoyos, Carlos A.; Henderson, Thomas M.; Scuseria, Gustavo E.

    2015-01-01

    We present a class of Lie algebraic similarity transformations generated by exponentials of two-body on-site Hermitian operators whose Hausdorff series can be summed exactly without truncation. The correlators are defined over the entire lattice and include the Gutzwiller factor n_{i↑}n_{i↓}, and two-site products of density (n_{i↑}+n_{i↓}) and spin (n_{i↑}−n_{i↓}) operators. The resulting non-Hermitian many-body Hamiltonian can be solved in a biorthogonal mean-field approach with polynomial computational cost. The proposed similarity transformation generates locally weighted orbital transformations of the reference determinant. Although the energy of the model is unbound, projective equations in the spirit of coupled cluster theory lead to well-defined solutions. The theory is tested on the one- and two-dimensional repulsive Hubbard model where it yields accurate results for small and medium sized interaction strengths.

  18. Diphoton searches (CMS)

    CERN Document Server

    Quittnat, Milena Eleonore

    2016-01-01

    Many physics scenarios beyond the standard model predict the existence of heavy resonances decaying to diphotons. This talk presents searches for BSM physics in the diphoton final state at CMS, focusing on the recent results.

  19. Automated search for supernovae

    Energy Technology Data Exchange (ETDEWEB)

    Kare, J.T.

    1984-11-15

    This thesis describes the design, development, and testing of a search system for supernovae, based on the use of current computer and detector technology. This search uses a computer-controlled telescope and charge coupled device (CCD) detector to collect images of hundreds of galaxies per night of observation, and a dedicated minicomputer to process these images in real time. The system is now collecting test images of up to several hundred fields per night, with a sensitivity corresponding to a limiting magnitude (visual) of 17. At full speed and sensitivity, the search will examine some 6000 galaxies every three nights, with a limiting magnitude of 18 or fainter, yielding roughly two supernovae per week (assuming one supernova per galaxy per 50 years) at 5 to 50 percent of maximum light. An additional 500 nearby galaxies will be searched every night, to locate about 10 supernovae per year at one or two percent of maximum light, within hours of the initial explosion.

  20. Transplant Center Search Form

    Science.gov (United States)

    Welcome to the Blood & Marrow ... Transplant Center Search Form, which searches transplant centers for patients with a particular disease.

  1. Chemical Search Web Utility

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Chemical Search Web Utility is an intuitive web application that allows the public to easily find the chemical that they are interested in using, and which...

  2. Testing Self-Similarity Through Lamperti Transformations

    KAUST Repository

    Lee, Myoungji

    2016-07-14

    Self-similar processes have been widely used in modeling real-world phenomena occurring in environmetrics, network traffic, image processing, and stock pricing, to name but a few. The estimation of the degree of self-similarity has been studied extensively, while statistical tests for self-similarity are scarce and limited to processes indexed in one dimension. This paper proposes a statistical hypothesis test procedure for self-similarity of a stochastic process indexed in one dimension and multi-self-similarity for a random field indexed in higher dimensions. If self-similarity is not rejected, our test provides a set of estimated self-similarity indexes. The key is to test stationarity of the inverse Lamperti transformations of the process. The inverse Lamperti transformation of a self-similar process is a strongly stationary process, revealing a theoretical connection between the two processes. To demonstrate the capability of our test, we test self-similarity of fractional Brownian motions and sheets, their time deformations and mixtures with Gaussian white noise, and the generalized Cauchy family. We also apply the self-similarity test to real data: annual minimum water levels of the Nile River, network traffic records, and surface heights of food wrappings. © 2016, International Biometric Society.
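
    The key identity behind the test is easy to demonstrate numerically: for an H-self-similar process X, the inverse Lamperti transform Y(s) = exp(-Hs) X(exp(s)) is stationary. For Brownian motion (H = 1/2) the transform yields an Ornstein-Uhlenbeck-like stationary process, as this sketch suggests by comparing segment variances along each path:

```python
import numpy as np

rng = np.random.default_rng(0)

# Brownian motion is self-similar with H = 1/2.
H, n, t_max = 0.5, 200_000, 12.0
dt = t_max / n
t = np.linspace(dt, t_max, n)
X = np.cumsum(rng.standard_normal(n)) * np.sqrt(dt)   # X(t), Var X(t) = t

# Inverse Lamperti transform: Y(s) = exp(-H s) * X(exp(s)).
s = np.linspace(0.0, np.log(t_max), 400)
idx = np.searchsorted(t, np.exp(s))
Y = np.exp(-H * s) * X[idx]

# If X is self-similar, Y is stationary: its variance should not grow with s.
print("Var X, early vs late:", X[: n // 10].var().round(2), X[-n // 10:].var().round(2))
print("Var Y, early vs late:", Y[:200].var().round(2), Y[200:].var().round(2))
```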

  3. Semantic Clustering of Search Engine Results

    Directory of Open Access Journals (Sweden)

    Sara Saad Soliman

    2015-01-01

    Full Text Available This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantics similarities among documents and applies activation spreading technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the proposed solution. The result of the experiment confirmed that the proposed solution achieves remarkable results in terms of precision.

  4. Semantic Clustering of Search Engine Results.

    Science.gov (United States)

    Soliman, Sara Saad; El-Sayed, Maged F; Hassan, Yasser F

    2015-01-01

    This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantics similarities among documents and applies activation spreading technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the proposed solution. The result of the experiment confirmed that the proposed solution achieves remarkable results in terms of precision.
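
    A minimal sketch of the activation-spreading step on a document similarity graph, assuming symmetric edge weights and a simple decay factor; the full pipeline (including how the lexical and semantic similarities are computed) is richer, and the weights below are made up.

```python
import numpy as np

# similarity graph over 5 retrieved documents (symmetric weights, made up)
W = np.array([[0.0, 0.8, 0.1, 0.0, 0.7],
              [0.8, 0.0, 0.2, 0.1, 0.6],
              [0.1, 0.2, 0.0, 0.9, 0.0],
              [0.0, 0.1, 0.9, 0.0, 0.1],
              [0.7, 0.6, 0.0, 0.1, 0.0]])

def spread_activation(W, seed, decay=0.5, steps=3):
    """Propagate activation from a seed document along similarity edges."""
    a = np.zeros(len(W))
    a[seed] = 1.0
    for _ in range(steps):
        a = a + decay * W @ a          # each node passes decayed activation on
        a /= a.max()                   # keep activations bounded
    return a

print(spread_activation(W, seed=0).round(2))  # docs 1 and 4 cluster with doc 0
```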

  5. Search and Disrupt

    OpenAIRE

    Ørding Olsen, Anders

    2015-01-01

    This paper analyzes how external search is affected by strategic interest alignment among knowledge sources. I focus on misalignment arising from the heterogeneous effects of disruptive technologies by analyzing the influence of incumbents on 2,855 non-incumbents' external knowledge search efforts. The efforts most likely to solve innovation problems obtained funding from the European Commission's 7th Framework Program (2007-2013). The results show that involving incumbents improv...

  6. Supersymmetry searches in ATLAS

    CERN Document Server

    Torro Pastor, Emma; The ATLAS collaboration

    2016-01-01

    Weak scale supersymmetry remains one of the best motivated and studied Standard Model extensions. This talk summarises recent ATLAS results for searches for supersymmetric (SUSY) particles. Weak and strong production in both R-Parity conserving and R-Parity violating SUSY scenarios are considered. The searches involved final states including jets, missing transverse momentum, light leptons, taus or photons, as well as long-lived particle signatures.

  7. SUSY Searches in ATLAS

    CERN Document Server

    Zhuang, Xuai; The ATLAS collaboration

    2016-01-01

    Despite the absence of experimental evidence, weak scale supersymmetry remains one of the best motivated and studied Standard Model extensions. This talk summarises recent ATLAS results for searches for supersymmetric (SUSY) particles, with focus on those obtained using proton-proton collisions at a centre of mass energy of 13 TeV using 2015+2016 data. The searches with final states including jets, missing transverse momentum, light leptons will be presented.

  8. General Search Market Equilibrium

    OpenAIRE

    Albrecht, James W.; Axell, Bo

    1982-01-01

    In this paper we extend models of “search market equilibrium” to incorporate general equilibrium considerations. The model we treat is one with a single product market and a single labor market. Imperfectly informed individuals follow optimal strategies in searching for a suitably low price and high wage. For any distribution of price and wage offers across firms these optimal strategies generate product demand and labor supply schedules. Firms then choose prices and wages to maximize expecte...

  9. Harmony Search as a Metaheuristic Algorithm

    CERN Document Server

    Yang, Xin-She

    2010-01-01

    This first chapter intends to review and analyze the powerful new Harmony Search (HS) algorithm in the context of metaheuristic algorithms. I will first outline the fundamental steps of Harmony Search, and how it works. I then try to identify the characteristics of metaheuristics and analyze why HS is a good meta-heuristic algorithm. I then review briefly other popular metaheuristics such as particle swarm optimization so as to find their similarities and differences from HS. Finally, I will discuss the ways to improve and develop new variants of HS, and make suggestions for further research including open questions.
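
    The fundamental steps of Harmony Search (harmony memory, memory consideration, pitch adjustment, and random selection) fit in a short sketch; the parameter values below are common defaults, not prescriptions from the chapter.

```python
import random

def harmony_search(f, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.1, iters=2000):
    """Minimize f over box bounds with basic Harmony Search."""
    memory = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if random.random() < hmcr:               # memory consideration
                x = random.choice(memory)[d]
                if random.random() < par:            # pitch adjustment
                    x += random.uniform(-bw, bw)
            else:                                    # random selection
                x = random.uniform(lo, hi)
            new.append(min(max(x, lo), hi))
        worst = max(range(hms), key=lambda i: f(memory[i]))
        if f(new) < f(memory[worst]):                # replace the worst harmony
            memory[worst] = new
    return min(memory, key=f)

sphere = lambda x: sum(v * v for v in x)
print([round(v, 4) for v in harmony_search(sphere, [(-5, 5)] * 3)])
```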

  10. A novel adjustable multiple cross-hexagonal search algorithm for fast block motion estimation

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we propose a novel adjustable multiple cross-hexagonal search (AMCHS) algorithm for fast block motion estimation. It employs adjustable multiple cross search patterns (AMCSP) in the first step and then uses half-way-skip and half-way-stop technique to determine whether to employ two hexagonal search patterns (HSPs) subsequently. The AMCSP can be used to find small motion vectors efficiently while the HSPs can be used to find large ones accurately to ensure prediction quality. Simulation results showed that our proposed AMCHS achieves faster search speed, and provides better distortion performance than other popular fast search algorithms, such as CDS and CDHS.

  11. Searching versus surfing: how different ways of acquiring content online affect cognitive processing.

    Science.gov (United States)

    Wise, Kevin; Kim, Hyo Jung

    2008-06-01

    An experiment tested whether people orient to and encode pictures selected from a Web site differently, depending on whether the pictures were selected by searching or surfing. Participants in the search condition spent more time selecting pictures than the participants in the surf condition spent. The pictures chosen in the search condition elicited cardiac orienting, while pictures chosen in the surf condition did not. Participants recognized pictures acquired by searching more accurately than they recognized those acquired by surfing, indicating that searching led to better encoding than surfing.

  12. Global OpenSearch

    Science.gov (United States)

    Newman, D. J.; Mitchell, A. E.

    2015-12-01

    At AGU 2014, NASA EOSDIS demonstrated a case-study of an OpenSearch framework for Earth science data discovery. That framework leverages the IDN and CWIC OpenSearch API implementations to provide seamless discovery of data through the 'two-step' discovery process as outlined by the Federation of Earth Science Information Partners (ESIP) OpenSearch Best Practices. But how would an Earth scientist leverage this framework and what are the benefits? Using a client that understands the OpenSearch specification and, for further clarity, the various best practices and extensions, a scientist can discover a plethora of data not normally accessible either by traditional methods (NASA Earth Data Search, Reverb, etc.) or direct methods (going to the source of the data). We will demonstrate, via the CWICSmart web client, how an earth scientist can access regional data on regional phenomena in a uniform and aggregated manner. We will demonstrate how an earth scientist can 'globalize' their discovery. You want to find local data on 'sea surface temperature of the Indian Ocean'? We can help you with that. 'European meteorological data'? Yes. 'Brazilian rainforest satellite imagery'? That too. CWIC allows you to get earth science data in a uniform fashion from a large number of disparate, world-wide agencies. This is what we mean by Global OpenSearch.

  13. Molecular quantum similarity using conceptual DFT descriptors

    Indian Academy of Sciences (India)

    Patrick Bultinck; Ramon carbó-dorca

    2005-09-01

    This paper reports a Molecular Quantum Similarity study for a set of congeneric steroid molecules, using as basic similarity descriptors the electron density ρ(r), the shape function σ(r), the Fukui functions f+(r) and f−(r), and the local softnesses s+(r) and s−(r). Correlations are investigated between the similarity indices for each pair of descriptors used, to assess whether these different descriptors sample different information and to investigate what information is revealed by each descriptor.
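
    Similarity indices in such studies are typically of the Carbó type, R_AB = ∫ρ_A ρ_B dr / sqrt(∫ρ_A² dr ∫ρ_B² dr). A sketch on densities discretized over a grid, with Gaussian toy densities standing in for real molecular descriptors:

```python
import numpy as np

def carbo_index(rho_a, rho_b, dv=1.0):
    """Carbo similarity index between two densities sampled on the same grid."""
    overlap = np.sum(rho_a * rho_b) * dv
    norm = np.sqrt(np.sum(rho_a**2) * dv * np.sum(rho_b**2) * dv)
    return overlap / norm

# toy 1-D 'densities': two Gaussians with a slight offset
x = np.linspace(-5, 5, 1001)
dv = x[1] - x[0]
rho_a = np.exp(-x**2)
rho_b = np.exp(-(x - 0.5)**2)
print(round(carbo_index(rho_a, rho_b, dv), 4))   # 1.0 would mean identical shapes
```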

  14. Saccadic search performance: the effect of element spacing.

    Science.gov (United States)

    Vlaskamp, Björn N S; Over, Eelco A B; Hooge, Ignace Th C

    2005-11-01

    In a saccadic search task, we investigated whether spacing between elements affects search performance. Since it has been suggested in the literature that element spacing can affect the eye movement strategy in several ways, its effects on search time per element are hard to predict. In the first experiment, we varied the element spacing (3.4 degrees to 7.1 degrees distance between elements) and target-distracter similarity. As expected, search time per element increased with target-distracter similarity. Decreasing element spacing decreased the search time per element. However, this effect was surprisingly small in comparison to the effect of varying target-distracter similarity. In a second experiment, we elaborated on this finding and decreased element spacing even further (between 0.8 degrees and 3.2 degrees). Here, we did not find an effect on search time per element for element spacings from 3.2 degrees to spacings as small as 1.5 degrees. It was only at distances smaller than 1.5 degrees that search time per element increased with decreasing element spacing. In order to explain the remarkable finding that search time per element was not affected for such a wide range of element spacings, we propose that irrespective of the spacing crowding kept the number of elements processed per fixation more or less constant.

  15. Similarity effects in visual working memory.

    Science.gov (United States)

    Jiang, Yuhong V; Lee, Hyejin J; Asaad, Anthony; Remington, Roger

    2016-04-01

    Perceptual similarity is an important property of multiple stimuli. Its computation supports a wide range of cognitive functions, including reasoning, categorization, and memory recognition. It is important, therefore, to determine why previous research has found conflicting effects of inter-item similarity on visual working memory. Studies reporting a similarity advantage have used simple stimuli whose similarity varied along a featural continuum. Studies reporting a similarity disadvantage have used complex stimuli from either a single or multiple categories. To elucidate stimulus conditions for similarity effects in visual working memory, we tested memory for complex stimuli (faces) whose similarity varied along a morph continuum. Participants encoded 3 morphs generated from a single face identity in the similar condition, or 3 morphs generated from different face identities in the dissimilar condition. After a brief delay, a test face appeared at one of the encoding locations for participants to make a same/different judgment. Two experiments showed that similarity enhanced memory accuracy without changing the response criterion. These findings support previous computational models that incorporate featural variance as a component of working memory load. They delineate limitations of models that emphasize cortical resources or response decisions.

  16. A student's guide to searching the literature using online databases

    CERN Document Server

    Miller, Casey W; Messina, Troy C

    2010-01-01

    A method is described to empower students to efficiently perform general and literature searches using online resources. The method was tested on undergraduate and graduate students with varying backgrounds with scientific literature. Students involved in this study showed marked improvement in their awareness of how and where to find accurate scientific information.

  17. Using Clinicians’ Search Query Data to Monitor Influenza Epidemics

    Science.gov (United States)

    Santillana, Mauricio; Nsoesie, Elaine O.; Mekaru, Sumiko R.; Scales, David; Brownstein, John S.

    2014-01-01

    Search query information from a clinician's database, UpToDate, is shown to predict influenza epidemics in the United States in a timely manner. Our results show that digital disease surveillance tools based on experts' databases may be able to provide an alternative, reliable, and stable signal for accurate predictions of influenza outbreaks. PMID:25115873

  18. Using clinicians' search query data to monitor influenza epidemics.

    Science.gov (United States)

    Santillana, Mauricio; Nsoesie, Elaine O; Mekaru, Sumiko R; Scales, David; Brownstein, John S

    2014-11-15

    Search query information from a clinician's database, UpToDate, is shown to predict influenza epidemics in the United States in a timely manner. Our results show that digital disease surveillance tools based on experts' databases may be able to provide an alternative, reliable, and stable signal for accurate predictions of influenza outbreaks.

  19. CAST: a new program package for the accurate characterization of large and flexible molecular systems.

    Science.gov (United States)

    Grebner, Christoph; Becker, Johannes; Weber, Daniel; Bellinger, Daniel; Tafipolski, Maxim; Brückner, Charlotte; Engels, Bernd

    2014-09-15

    The presented program package, Conformational Analysis and Search Tool (CAST), allows the accurate treatment of large and flexible (macro)molecular systems. For the determination of thermally accessible minima, CAST offers the newly developed TabuSearch algorithm, but algorithms such as Monte Carlo (MC), MC with minimization, and molecular dynamics are implemented as well. For the determination of reaction paths, CAST provides the PathOpt, the Nudged Elastic Band, and the umbrella sampling approaches. Access to free energies is possible through the free energy perturbation approach. Along with a number of standard force fields, a newly developed symmetry-adapted perturbation theory-based force field is included. Semiempirical computations are possible through DFTB+ and MOPAC interfaces. For calculations based on density functional theory, a Message Passing Interface (MPI) interface to the Graphics Processing Unit (GPU)-accelerated TeraChem program is available. The program is available on request.

  20. Children's Search Engines from an Information Search Process Perspective.

    Science.gov (United States)

    Broch, Elana

    2000-01-01

    Describes cognitive and affective characteristics of children and teenagers that may affect their Web searching behavior. Reviews literature on children's searching in online public access catalogs (OPACs) and using digital libraries. Profiles two Web search engines. Discusses some of the difficulties children have searching the Web, in the…

  1. SearchResultFinder: federated search made easy

    NARCIS (Netherlands)

    Trieschnigg, Dolf; Tjin-Kam-Jet, Kien; Hiemstra, Djoerd

    2013-01-01

    Building a federated search engine based on a large number of existing web search engines is a challenge: implementing the programming interface (API) for each search engine is an exacting and time-consuming job. In this demonstration we present SearchResultFinder, a browser plugin which speeds up dete…

  2. Motion Vector Estimation Using Line-Square Search Block Matching Algorithm for Video Sequences

    Directory of Open Access Journals (Sweden)

    Guo Bao-long

    2004-09-01

    Full Text Available Motion estimation and compensation techniques are widely used for video coding applications, but real-time motion estimation is not easily achieved due to its enormous computations. In this paper, a new fast motion estimation algorithm based on line search is presented, in which computational complexity is greatly reduced by using the line search strategy and a parallel search pattern. Moreover, accurate search is achieved because a small square search pattern is used. It has a best-case scenario of only 9 search points, which is 4 search points fewer than the diamond search algorithm. Simulation results show that, compared with previous techniques, the LSPS algorithm significantly reduces the computational requirements for finding motion vectors and also delivers comparable performance in terms of motion compensation errors.
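
    As a rough illustration of this family of block-matching strategies (a generic line-then-square search, not the paper's exact LSPS procedure), the sketch below scores candidate blocks by sum of absolute differences, scans a horizontal and a vertical line of candidates, and refines the winner with a small square pattern; the frame arrays, block size, and search reach are all assumptions for illustration.

```python
import numpy as np

def sad(block, ref, y, x):
    """Sum of absolute differences between a block and the block at (y, x)
    in the reference frame."""
    h, w = block.shape
    return np.abs(block.astype(int) - ref[y:y+h, x:x+w].astype(int)).sum()

def line_square_search(block, ref, y0, x0, reach=7):
    """Toy line-then-square search: scan a horizontal and a vertical line of
    candidates, then refine around the best with a 3x3 square pattern."""
    h, w = block.shape
    def valid(y, x):
        return 0 <= y <= ref.shape[0] - h and 0 <= x <= ref.shape[1] - w
    # Line search: candidates along the two axes through (y0, x0).
    candidates = [(y0, x0 + d) for d in range(-reach, reach + 1)]
    candidates += [(y0 + d, x0) for d in range(-reach, reach + 1)]
    best = min((c for c in candidates if valid(*c)),
               key=lambda c: sad(block, ref, *c))
    # Square refinement: 3x3 neighborhood around the line-search winner.
    by, bx = best
    square = [(by + dy, bx + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    best = min((c for c in square if valid(*c)),
               key=lambda c: sad(block, ref, *c))
    return best[0] - y0, best[1] - x0  # motion vector (dy, dx)
```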

  3. A Hybrid Model Ranking Search Result for Research Paper Searching on Social Bookmarking

    Directory of Open Access Journals (Sweden)

    pijitra jomsri

    2015-11-01

    Full Text Available Social bookmarking and publication sharing systems are essential tools for web resource discovery. The performance and capabilities of search results from research paper bookmarking systems are vital. Many researchers use social bookmarking to search for papers related to their topics of interest. This paper proposes a combination of similarity-based indexing ("tag title and abstract") and static ranking to improve search results. In this particular study, the year of the published paper and the type of publication are combined with the similarity ranking (called HybridRank). Different weighting scores are employed, and the retrieval performance of these weighted combination rankings is evaluated using mean NDCG values. The results suggest that HybridRank with a similarity:static weighting of 75:25 has the highest NDCG scores. From the preliminary experimental results, the combination ranking technique provides more relevant research paper search results. Furthermore, the chosen heuristic ranking can improve the efficiency of research paper searching on social bookmarking websites.
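
    A minimal sketch of how such a weighted combination might look, assuming hypothetical paper records that carry a precomputed similarity score, a publication year, and a publication type; the 75:25 split mirrors the weighting the abstract reports as best, but the static-score details below are invented for illustration.

```python
def hybrid_rank(papers, w_sim=0.75, w_static=0.25):
    """Rank papers by a weighted blend of a text-similarity score and a
    static score (here: recency plus publication type)."""
    def static_score(p):
        recency = 1.0 / (1 + 2024 - p["year"])   # hypothetical reference year
        type_bonus = {"journal": 1.0, "conference": 0.8, "other": 0.5}
        return 0.5 * recency + 0.5 * type_bonus.get(p["type"], 0.5)
    return sorted(papers,
                  key=lambda p: w_sim * p["sim"] + w_static * static_score(p),
                  reverse=True)

papers = [
    {"title": "A", "sim": 0.92, "year": 2009, "type": "conference"},
    {"title": "B", "sim": 0.85, "year": 2015, "type": "journal"},
]
print([p["title"] for p in hybrid_rank(papers)])
```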

  4. Fast and accurate hashing via iterative nearest neighbors expansion.

    Science.gov (United States)

    Jin, Zhongming; Zhang, Debing; Hu, Yao; Lin, Shiding; Cai, Deng; He, Xiaofei

    2014-11-01

    Recently, hashing techniques have been widely applied to the approximate nearest neighbor search problem in many real applications. The basic idea of these approaches is to generate binary codes for data points that preserve the similarity between any two of them. Given a query, instead of performing a linear scan of the entire database, the hashing method can perform a linear scan of the points whose Hamming distance to the query is not greater than r_h, where r_h is a constant. However, in order to find the true nearest neighbors, both the locating time and the linear scan time are proportional to O(∑_{i=0}^{r_h} C(c, i)) (where c is the code length and C(c, i) is the binomial coefficient), which increases exponentially as r_h increases. To address this limitation, we propose a novel algorithm named iterative expanding hashing in this paper, which builds an auxiliary index based on an offline-constructed nearest neighbor table to avoid large r_h. This auxiliary index can easily be combined with all the traditional hashing methods. Extensive experimental results over various real large-scale datasets demonstrate the superiority of the proposed approach.
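
    The exponential cost of growing the Hamming radius, and the remedy of expanding candidates through a precomputed nearest-neighbor table, can be sketched as follows. The single-table index and the `nn_table` (point id mapped to offline-computed neighbor ids) are simplifying assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations

def build_index(codes):
    """Map each binary code (an int) to the ids of points hashing to it."""
    index = defaultdict(list)
    for i, c in enumerate(codes):
        index[c].append(i)
    return index

def hamming_ball(code, length, radius):
    """All codes within the given Hamming radius; the count grows as
    sum_{i<=r} C(length, i), which explodes as the radius increases."""
    yield code
    for r in range(1, radius + 1):
        for bits in combinations(range(length), r):
            flipped = code
            for b in bits:
                flipped ^= 1 << b
            yield flipped

def query(index, code, length, radius, nn_table, hops=1):
    """Probe a small Hamming ball, then expand candidates through a
    precomputed nearest-neighbor table instead of enlarging the radius."""
    cand = {i for c in hamming_ball(code, length, radius)
            for i in index.get(c, [])}
    for _ in range(hops):
        cand |= {n for i in list(cand) for n in nn_table.get(i, [])}
    return cand
```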

  5. Accelerating chemical database searching using graphics processing units.

    Science.gov (United States)

    Liu, Pu; Agrafiotis, Dimitris K; Rassokhin, Dmitrii N; Yang, Eric

    2011-08-22

    The utility of chemoinformatics systems depends on the accurate computer representation and efficient manipulation of chemical compounds. In such systems, a small molecule is often digitized as a large fingerprint vector, where each element indicates the presence/absence or the number of occurrences of a particular structural feature. Since in theory the number of unique features can be exceedingly large, these fingerprint vectors are usually folded into much shorter ones using hashing and modulo operations, allowing fast "in-memory" manipulation and comparison of molecules. There is increasing evidence that lossless fingerprints can substantially improve retrieval performance in chemical database searching (substructure or similarity), which has led to the development of several lossless fingerprint compression algorithms. However, any gains in storage and retrieval afforded by compression need to be weighed against the extra computational burden required for decompression before these fingerprints can be compared. Here we demonstrate that graphics processing units (GPU) can greatly alleviate this problem, enabling the practical application of lossless fingerprints on large databases. More specifically, we show that, with the help of a ~$500 ordinary video card, the entire PubChem database of ~32 million compounds can be searched in ~0.2-2 s on average, which is 2 orders of magnitude faster than a conventional CPU. If multiple query patterns are processed in batch, the speedup is even more dramatic (less than 0.02-0.2 s/query for 1000 queries). In the present study, we use the Elias gamma compression algorithm, which results in a compression ratio as high as 0.097.
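
    Since the abstract names the Elias gamma code, a small self-contained sketch of that code may help; applying it to the gaps between set bits of a sparse fingerprint is the usual trick, though the exact pipeline here is an assumption rather than the paper's implementation.

```python
def elias_gamma_encode(n):
    """Elias gamma code of a positive integer: (len-1) zeros followed by
    the integer's binary representation."""
    assert n >= 1
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def elias_gamma_decode(bits):
    """Decode a concatenated stream of gamma codes back into integers."""
    out, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":      # count the unary length prefix
            zeros += 1
            i += 1
        out.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return out

# Hypothetical gaps between set bits of a fingerprint:
stream = "".join(elias_gamma_encode(gap) for gap in [1, 3, 42, 7])
assert elias_gamma_decode(stream) == [1, 3, 42, 7]
```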

  6. Mastering Search Analytics Measuring SEO, SEM and Site Search

    CERN Document Server

    Chaters, Brent

    2011-01-01

    Many companies still approach Search Engine Optimization (SEO) and paid search as separate initiatives. This in-depth guide shows you how to use these programs as part of a comprehensive strategy-not just to improve your site's search rankings, but to attract the right people and increase your conversion rate. Learn how to measure, test, analyze, and interpret all of your search data with a wide array of analytic tools. Gain the knowledge you need to determine the strategy's return on investment. Ideal for search specialists, webmasters, and search marketing managers, Mastering Search Analyt…

  7. Accurate, low-cost 3D-models of gullies

    Science.gov (United States)

    Onnen, Nils; Gronz, Oliver; Ries, Johannes B.; Brings, Christine

    2015-04-01

    Soil erosion is a widespread problem in arid and semi-arid areas, and its most severe form is gully erosion. Gullies often cut into agricultural farmland and can render an area completely unproductive. To understand the development and processes inside and around gullies, we calculated detailed 3D models of gullies in the Souss Valley in South Morocco. Near Taroudant, we had four study areas with five gullies differing in size, volume, and activity. Using a Canon HF G30 camcorder, we recorded series of Full HD videos at 25 fps. Afterwards, we used Structure from Motion (SfM) to create the models. To generate accurate models while maintaining feasible runtimes, it is necessary to select around 1500-1700 images from the video, and the overlap of neighboring images should be at least 80%. In addition, it is very important to avoid selecting photos that are blurry or out of focus: nearby pixels of a blurry image tend to have similar color values. We therefore used a MATLAB script to compare the derivatives of the images; for similar objects, the higher the sum of the derivatives, the sharper the image. MATLAB subdivides the video into image intervals, and from each interval the image with the highest sum is selected. For example, a 20-min video at 25 fps yields 30,000 single images; the program inspects the first 20 images, saves the sharpest, moves on to the next 20 images, and so on. Using this algorithm, we selected 1500 images for our modeling. With VisualSFM, we calculated features and the matches between all images and produced a point cloud. Then, MeshLab was used to build a surface from it using the Poisson surface reconstruction approach. Afterwards we are able to calculate the size and the volume of the gullies. It is also possible to determine soil erosion rates if we compare the data with old recordings. The final step would be the combination of the terrestrial data with the data from our aerial photography. So far, the method works well and we…
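
    The frame-selection heuristic described (pick the sharpest frame per fixed interval, judging sharpness by the summed image derivative) is easy to sketch. This Python version merely stands in for the authors' MATLAB script; `frames` is assumed to be a list of grayscale arrays decoded from the video.

```python
import numpy as np

def sharpness(frame):
    """Sum of absolute image gradients; blurrier frames, whose nearby
    pixels have similar values, yield smaller sums."""
    f = frame.astype(float)
    return np.abs(np.diff(f, axis=0)).sum() + np.abs(np.diff(f, axis=1)).sum()

def select_sharpest(frames, interval=20):
    """Split the frame sequence into fixed intervals and keep the sharpest
    frame of each, mirroring the interval scheme the abstract describes."""
    keep = []
    for start in range(0, len(frames), interval):
        chunk = frames[start:start + interval]
        keep.append(max(chunk, key=sharpness))
    return keep
```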

  8. Enhancing Web Search with Semantic Identification of User Preferences

    Directory of Open Access Journals (Sweden)

    Naglaa Fathy

    2011-11-01

    Full Text Available Personalized web search can satisfy individuals' information needs by modeling long-term and short-term user interests based on user actions, browsed documents, or past queries, and incorporating these in the search process. In this paper, we propose a personalized search approach that models the user's search preferences in an ontological user profile and semantically compares this model against the user's current query context to re-rank search results. Our user profile is based on the predefined Open Directory Project (ODP) ontology, so that after a user's search, relevant web pages are classified into topics in the ontology using semantic and cosine similarity measures. Moreover, interest scores are assigned to topics based on the user's ongoing behavior. Our experiments show that re-ranking based on the semantic evidence of the updated user profile efficiently satisfies user information needs, with the most relevant results being brought to the top of the returned results.
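
    A toy sketch of profile-based re-ranking in this spirit, assuming each result and each profile topic already has a vector representation and that the profile stores per-topic interest scores; the field names and structures are hypothetical, not the paper's.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def rerank(results, profile_topics, interest):
    """Re-rank results by how well each page's topic vector matches the
    interest-weighted topics of a (toy) ontological user profile."""
    def score(page_vec):
        return sum(interest[t] * cosine(page_vec, vec)
                   for t, vec in profile_topics.items())
    return sorted(results, key=lambda r: score(r["vec"]), reverse=True)
```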

  9. Search techniques in intelligent classification systems

    CERN Document Server

    Savchenko, Andrey V

    2016-01-01

    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined including the special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases is featured. Rough set theory, sequential analysis and granular computing are used to improve performance of the hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated commands recognition in voice control system and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  10. The Development of Landmark and Beacon Use in Young Children: Evidence from a Touchscreen Search Task

    Science.gov (United States)

    Sutton, Jennifer E.

    2006-01-01

    Children ages 2, 3 and 4 years participated in a novel hide-and-seek search task presented on a touchscreen monitor. On beacon trials, the target hiding place could be located using a beacon cue, but on landmark trials, searching required the use of a nearby landmark cue. In Experiment 1, 2-year-olds performed less accurately than older children…

  11. Similarity Structure of Wave-Collapse

    DEFF Research Database (Denmark)

    Rypdal, Kristoffer; Juul Rasmussen, Jens; Thomsen, Kenneth

    1985-01-01

    Similarity transformations of the cubic Schrödinger equation (CSE) are investigated. The transformations are used to remove the explicit time variation in the CSE and reduce it to differential equations in the spatial variables only. Two different methods for similarity reduction are employed and...

  12. Perceived Similarity, Proactive Adjustment, and Organizational Socialization

    Science.gov (United States)

    Kammeyer-Mueller, John D.; Livingston, Beth A.; Liao, Hui

    2011-01-01

    The present study explores how perceived demographic and attitudinal similarity can influence proactive behavior among organizational newcomers. We propose that newcomers who perceive themselves as similar to their co-workers will be more willing to seek new information or build relationships, which in turn will lead to better long-term…

  13. Appropriate Similarity Measures for Author Cocitation Analysis

    NARCIS (Netherlands)

    N.J.P. van Eck (Nees Jan); L. Waltman (Ludo)

    2007-01-01

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similar…

  14. Interleaving Helps Students Distinguish among Similar Concepts

    Science.gov (United States)

    Rohrer, Doug

    2012-01-01

    When students encounter a set of concepts (or terms or principles) that are similar in some way, they often confuse one with another. For instance, they might mistake one word for another word with a similar spelling (e.g., allusion instead of illusion) or choose the wrong strategy for a mathematics problem because it resembles a different kind of…

  15. Mining Diagnostic Assessment Data for Concept Similarity

    Science.gov (United States)

    Madhyastha, Tara; Hunt, Earl

    2009-01-01

    This paper introduces a method for mining multiple-choice assessment data for similarity of the concepts represented by the multiple choice responses. The resulting similarity matrix can be used to visualize the distance between concepts in a lower-dimensional space. This gives an instructor a visualization of the relative difficulty of concepts…

  16. Similar methodological analysis involving the user experience.

    Science.gov (United States)

    Almeida e Silva, Caio Márcio; Okimoto, Maria Lúcia R L; Tanure, Raffaela Leane Zenni

    2012-01-01

    This article deals with the use of a protocol for the analysis of similar methodologies related to user experience. Articles recounting experiments in the area were selected, analyzed on the basis of the similar-analysis protocol, and finally synthesized and associated.

  17. Similarity indices I: what do they measure.

    Energy Technology Data Exchange (ETDEWEB)

    Johnston, J.W.

    1976-11-01

    A method for estimating the effects of environmental effusions on ecosystems is described. The characteristics of 25 similarity indices used in studies of ecological communities were investigated. The type of data structure to which these indices are frequently applied was described as consisting of vectors of measurements on attributes (species) observed in a set of samples. A general similarity index was characterized as the result of a two-step process defined on a pair of vectors. In the first step an attribute similarity score is obtained for each attribute by comparing the attribute values observed in the pair of vectors. The result is a vector of attribute similarity scores. These are combined in the second step to arrive at the similarity index. The operation in the first step was characterized as a function, g, defined on pairs of attribute values. The second operation was characterized as a function, F, defined on the vector of attribute similarity scores from the first step. Usually, F was a simple sum or weighted sum of the attribute similarity scores. It is concluded that similarity indices should not be used as the test statistic to discriminate between two ecological communities.
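
    The two-step construction (an attribute-level function g, then a combining function F) can be written directly. The Czekanowski-style instantiation below (g = min, F = a normalized sum) is one classical example, chosen here purely for illustration; it is not singled out by the report.

```python
def similarity_index(u, v, g, F):
    """Two-step similarity index: g scores each attribute pair, then F
    combines the vector of attribute scores (here F also sees u and v
    so it can normalize)."""
    return F([g(a, b) for a, b in zip(u, v)], u, v)

# Czekanowski-style quantitative index: g = min, F = normalized sum.
czekanowski = lambda scores, u, v: 2 * sum(scores) / (sum(u) + sum(v))

sample1 = [12, 0, 3, 5]   # species counts in sample 1
sample2 = [10, 1, 0, 5]   # species counts in sample 2
print(similarity_index(sample1, sample2, min, czekanowski))  # -> 0.833...
```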

  18. Measure of Node Similarity in Multilayer Networks

    CERN Document Server

    Mollgaard, Anders; Dammeyer, Jesper; Jensen, Mogens H; Lehmann, Sune; Mathiesen, Joachim

    2016-01-01

    The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographic variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a...

  19. Self-learning search engines

    NARCIS (Netherlands)

    Schuth, A.

    2015-01-01

    How does a search engine such as Google know which search results to display? There are many competing algorithms that generate search results, but which one works best? We developed a new probabilistic method for quickly comparing large numbers of search algorithms by examining the results users cl…
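
    Schuth's own method is a probabilistic multileaving approach; the sketch below shows team-draft interleaving, a simpler classic from the same family, to illustrate how clicks on a combined result list credit one ranker over another. The document lists, k, and click positions are illustrative assumptions.

```python
import random

def team_draft_interleave(list_a, list_b, k=10):
    """Team-draft interleaving: the two rankers, in random draft order each
    round, contribute their best not-yet-picked result; clicks on the
    combined list are credited to the team that contributed the result."""
    combined, teams, used = [], [], set()
    while len(combined) < k and (set(list_a) | set(list_b)) - used:
        for team, ranking in random.sample([("A", list_a), ("B", list_b)], 2):
            pick = next((d for d in ranking if d not in used), None)
            if pick is not None and len(combined) < k:
                combined.append(pick)
                teams.append(team)
                used.add(pick)
    return combined, teams

def credit(teams, clicked_positions):
    """Count clicks per team; the team with more clicks wins the comparison."""
    wins = {"A": 0, "B": 0}
    for pos in clicked_positions:
        wins[teams[pos]] += 1
    return wins
```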

  20. The Evolution of Web Searching.

    Science.gov (United States)

    Green, David

    2000-01-01

    Explores the interrelation between Web publishing and information retrieval technologies and lists new approaches to Web indexing and searching. Highlights include Web directories; search engines; portalisation; Internet service providers; browser providers; meta search engines; popularity based analysis; natural language searching; links-based…

  1. Standardization of Keyword Search Mode

    Science.gov (United States)

    Su, Di

    2010-01-01

    In spite of its popularity, keyword search mode has not been standardized. Though information professionals are quick to adapt to various presentations of keyword search mode, novice end-users may find keyword search confusing. This article compares keyword search mode in some major reference databases and calls for standardization. (Contains 3…

  2. SUSY Searches at the Tevatron

    Energy Technology Data Exchange (ETDEWEB)

    Zivkovic, L.

    2011-07-01

    In this article results from supersymmetry searches at D0 and CDF are reported. Searches for third generation squarks, searches for gauginos, and searches for models with R-parity violation are described. As no signs of supersymmetry for these models are observed, the most stringent limits to date are presented.

  3. The Search for Planet Nine

    Science.gov (United States)

    Brown, Michael E.; Batygin, Konstantin

    2016-10-01

    We use an extensive suite of numerical simulations to constrain the mass and orbit of Planet Nine, and we use these constraints to begin the search for this newly proposed planet in new and in archival data. Here, we compare our simulations to the observed population of aligned eccentric high semimajor axis Kuiper belt objects and determine which simulation parameters are statistically compatible with the observations. We find that only a narrow range of orbital elements can reproduce the observations. In particular, the combination of semimajor axis, eccentricity, and mass of Planet Nine strongly dictates the semimajor axis range of the orbital confinement of the distant eccentric Kuiper belt objects. Allowed orbits, which confine Kuiper belt objects with semimajor axis beyond 380 AU, have perihelia roughly between 150 and 350 AU, semimajor axes between 380 and 980 AU, and masses between 5 and 20 Earth masses. Orbitally confined objects also generally have orbital planes similar to that of the planet, suggesting that the planet is inclined approximately 30 degrees to the ecliptic. We compare the allowed orbital positions and estimated brightness of Planet Nine to previous and ongoing surveys which would be sensitive to the planet's detection and use these surveys to rule out approximately two-thirds of the planet's orbit. Planet Nine is likely near aphelion with an approximate brightness of 22 < V < 25. We discuss the state of our current and archival searches for this newly predicted planet.

  4. Searching for What I Want

    DEFF Research Database (Denmark)

    Liu, Fei; Xiao, Bo Sophia; Lim, Eric

    2016-01-01

    Inefficiencies associated with online information search are amplifying in the current era of big data. Despite growing scholarly interest in studying Internet users’ information search behaviour, there is a paucity of theory-guided investigation in this regard. In this paper, we draw on the theory of anticipatory system as our theoretical foundation to articulate the relationships between two salient types of search controls, namely search anticipation and search efficiency. We empirically validate our research model by conducting a field survey with 77 university students on an online restaurant review website that is modelled after its actual counterpart and populated with real restaurant review data. Findings from this study suggest that both search determination control and search manipulation control enhance search result anticipation, which in turn improves search efficiency. Theoretical…

  5. Mining Object Similarity for Predicting Next Locations

    Institute of Scientific and Technical Information of China (English)

    Meng Chen; Xiaohui Yu; Yang Liu

    2016-01-01

    Next location prediction is of great importance for many location-based applications. By virtue of their solid theoretical foundations, Markov-based approaches have gained success along this direction. In this paper, we seek to enhance the prediction performance by understanding the similarity between objects. In particular, we propose a novel method, called weighted Markov model (weighted-MM), which exploits both the sequence of just-passed locations and the object similarity in mining the mobility patterns. To this end, we first train a Markov model for each object with its own trajectory records, and then quantify the similarities between different objects from two aspects: spatial locality similarity and trajectory similarity. Finally, we incorporate the object similarity into the Markov model by using the similarity as the weight of the probability of reaching each possible next location, and return the top-ranked locations as results. We have conducted extensive experiments on a real dataset, and the results demonstrate significant improvements in prediction accuracy over existing solutions.
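
    A minimal sketch of the weighting idea, reduced to first-order transitions (the paper conditions on a sequence of just-passed locations): each similar object's transition row is blended in, scaled by its similarity score. The matrix shapes and similarity values are assumed inputs, not the authors' exact formulation.

```python
import numpy as np

def weighted_markov_predict(own_T, neighbor_Ts, sims, current_loc, top_k=3):
    """Blend an object's own transition row with those of similar objects,
    weighting each neighbor's row by its similarity score."""
    row = own_T[current_loc].astype(float).copy()
    for T, s in zip(neighbor_Ts, sims):
        row += s * T[current_loc]
    row /= row.sum()                     # renormalize to a distribution
    return np.argsort(row)[::-1][:top_k]  # top-ranked next locations
```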

  6. Measure of Node Similarity in Multilayer Networks.

    Directory of Open Access Journals (Sweden)

    Anders Mollgaard

    Full Text Available The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographic variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dissimilarity for the nodes connected by the strongest links. We finally analyze the overlap between layers in the network for different levels of acquaintanceships.

  7. Efficient Privacy Preserving Protocols for Similarity Join

    Directory of Open Access Journals (Sweden)

    Bilal Hawashin

    2012-04-01

    Full Text Available During the similarity join process, one or more sources may not allow sharing their data with other sources. In this case, a privacy-preserving similarity join is required. We showed in our previous work [4] that using long attributes, such as paper abstracts, movie summaries, product descriptions, and user feedback, can improve the similarity join accuracy using supervised learning. However, the existing secure protocols for similarity join methods cannot be used to join sources using these long attributes. Moreover, the majority of existing privacy-preserving protocols do not consider semantic similarities during the similarity join process. In this paper, we introduce a secure, efficient protocol to semantically join sources when the join attributes are long attributes. We provide two secure protocols, for the scenario where a training set exists and for the scenario where no training set is available. Furthermore, we introduce a multi-label supervised secure protocol and an expandable supervised secure protocol. Results show that our protocols can efficiently join sources using the long attributes by considering the semantic relationships among the long string values, thereby improving the overall secure similarity join performance.

  8. Measure of Node Similarity in Multilayer Networks

    Science.gov (United States)

    Mollgaard, Anders; Zettler, Ingo; Dammeyer, Jesper; Jensen, Mogens H.; Lehmann, Sune; Mathiesen, Joachim

    2016-01-01

    The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographic variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dissimilarity for the nodes connected by the strongest links. We finally analyze the overlap between layers in the network for different levels of acquaintanceships. PMID:27300084

  9. Radioastronomical Searches for Interstellar Biomolecules

    Science.gov (United States)

    Kuan, Y.-J.; Huang, H.-C.; Charnley, S. B.; Markwick, A.; Botta, O.; Ehrenfreund, P.; Kisiel, Z.; Butner, H. M.

    2003-01-01

    Impacts of comets and asteroids could have delivered large amounts of organic matter to the early Earth; such material may retain a significant interstellar signature, and observations of recent bright comets indicate that they have a molecular inventory consistent with their ices being largely unmodified interstellar material. Many simple organic molecules with biochemical significance, observed in circumstellar envelopes and in molecular clouds similar to that from which the Solar System formed, may have acted as the precursors of the more complex organics found in meteorites. Therefore, there is potentially a strong link between interstellar organics and prebiotic chemical evolution. Radioastronomical observations, particularly at millimeter wavelengths, allow us to determine the chemical composition and characteristics of the molecular inventory in interstellar space. Here we report some of our recent results from extensive astronomical searches for astrobiologically important interstellar organics.

  10. GPU accelerated chemical similarity calculation for compound library comparison.

    Science.gov (United States)

    Ma, Chao; Wang, Lirong; Xie, Xiang-Qun

    2011-07-25

    Chemical similarity calculation plays an important role in compound library design, virtual screening, and "lead" optimization. In this manuscript, we present a novel GPU-accelerated algorithm for all-vs-all Tanimoto matrix calculation and nearest neighbor search. By taking advantage of multicore GPU architecture and CUDA parallel programming technology, the algorithm is up to 39 times faster than the existing commercial software that runs on CPUs. Because of the utilization of intrinsic GPU instructions, this approach is nearly 10 times faster than the existing GPU-accelerated sparse-vector algorithm when Unity fingerprints are used for Tanimoto calculation. The GPU program that implements this new method takes about 20 min to complete the calculation of Tanimoto coefficients between 32 M PubChem compounds and 10 K Active Probes compounds, i.e., 324 G Tanimoto coefficients, on a 128-CUDA-core GPU.
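
    The Tanimoto coefficient itself is simple; the sketch below computes it over fingerprints packed into Python integers, which is the scalar analogue of what a GPU kernel would parallelize across a whole matrix. The example fingerprints are invented.

```python
def popcount(x):
    """Number of set bits in an integer fingerprint."""
    return bin(x).count("1")

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two bit fingerprints:
    T = c / (a + b - c), with c the number of shared set bits."""
    a, b = popcount(fp_a), popcount(fp_b)
    c = popcount(fp_a & fp_b)
    return c / (a + b - c) if (a + b - c) else 0.0

query = 0b1011_0110
target = 0b1001_0111
print(tanimoto(query, target))   # 4 shared bits / (5 + 5 - 4) = 0.666...
```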

  11. Collaborative Personalized Web Recommender System using Entropy based Similarity Measure

    CERN Document Server

    Mehta, Harita; Bedi, Punam; Dixit, V S

    2012-01-01

    On the Internet, web surfers searching for information often rely on recommendations. Generating recommendations becomes harder as the information domain grows exponentially day by day. In this paper, we calculate an entropy-based similarity between users to address the scalability problem. Using this concept, we have implemented an online user-based collaborative web recommender system. In this model-based collaborative system, the user session is divided into two levels, and entropy is calculated at both levels. It is shown that, from the set of valuable recommenders obtained at level I, only those having lower entropy at level II than at level I serve as trustworthy recommenders. Finally, top-N recommendations are generated from these trustworthy recommenders for an online user.
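
    One common recipe for an entropy-based user similarity (not necessarily the authors' exact formula) scores user pairs by the Shannon entropy of their rating differences on co-rated items: consistent differences mean low entropy and hence high similarity.

```python
import math
from collections import Counter

def entropy_similarity(ratings_u, ratings_v):
    """Similarity from the entropy of rating differences on co-rated items;
    a hypothetical instantiation of entropy-based similarity."""
    common = set(ratings_u) & set(ratings_v)
    diffs = [ratings_u[i] - ratings_v[i] for i in common]
    if not diffs:
        return 0.0
    counts = Counter(diffs)
    h = -sum((n / len(diffs)) * math.log2(n / len(diffs))
             for n in counts.values())
    return 1.0 / (1.0 + h)

u = {"page1": 5, "page2": 3, "page3": 4}
v = {"page1": 4, "page2": 2, "page3": 3}   # uniformly one lower: entropy 0
print(entropy_similarity(u, v))            # -> 1.0
```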

  12. Faster and More Accurate Sequence Alignment with SNAP

    CERN Document Server

    Zaharia, Matei; Curtis, Kristal; Fox, Armando; Patterson, David; Shenker, Scott; Stoica, Ion; Karp, Richard M; Sittler, Taylor

    2011-01-01

    We present the Scalable Nucleotide Alignment Program (SNAP), a new short and long read aligner that is both more accurate (i.e., aligns more reads with fewer errors) and 10-100x faster than state-of-the-art tools such as BWA. Unlike recent aligners based on the Burrows-Wheeler transform, SNAP uses a simple hash index of short seed sequences from the genome, similar to BLAST's. However, SNAP greatly reduces the number and cost of local alignment checks performed through several measures: it uses longer seeds to reduce the false positive locations considered, leverages larger memory capacities to speed index lookup, and excludes most candidate locations without fully computing their edit distance to the read. The result is an algorithm that scales well for reads from one hundred to thousands of bases long and provides a rich error model that can match classes of mutations (e.g., longer indels) that today's fast aligners ignore. We calculate that SNAP can align a dataset with 30x coverage of a human genome in le...
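
    The seed-index idea is easy to sketch: hash every short substring of the genome to its positions, then let seeds drawn from a read vote for the candidate alignment locations they imply; high-vote locations would go on to full alignment. The seed length and stride below are illustrative defaults, not SNAP's tuned values.

```python
from collections import defaultdict

def build_seed_index(genome, seed_len=16):
    """Hash every fixed-length seed in the genome string to its positions,
    in the spirit of a simple hash index of short seeds."""
    index = defaultdict(list)
    for i in range(len(genome) - seed_len + 1):
        index[genome[i:i + seed_len]].append(i)
    return index

def candidate_locations(read, index, seed_len=16, stride=8):
    """Look up several seeds from the read and vote for the genome
    locations they imply (position minus read offset)."""
    votes = defaultdict(int)
    for off in range(0, len(read) - seed_len + 1, stride):
        for pos in index.get(read[off:off + seed_len], []):
            votes[pos - off] += 1
    return sorted(votes, key=votes.get, reverse=True)
```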

  13. Accurate measurement of streamwise vortices in low speed aerodynamic flows

    Science.gov (United States)

    Waldman, Rye M.; Kudo, Jun; Breuer, Kenneth S.

    2010-11-01

    Low Reynolds number experiments with flapping animals (such as bats and small birds) are of current interest in understanding biological flight mechanics, and due to their application to Micro Air Vehicles (MAVs), which operate in a similar parameter space. Previous PIV wake measurements have described the structures left by bats and birds, and provided insight into the time history of their aerodynamic force generation; however, these studies have faced difficulty drawing quantitative conclusions due to significant experimental challenges associated with the highly three-dimensional and unsteady nature of the flows, and the low wake velocities associated with lifting bodies that only weigh a few grams. This requires the high-speed resolution of small flow features in a large field of view using limited laser energy and finite camera resolution. Cross-stream measurements are further complicated by the high out-of-plane flow which requires thick laser sheets and short interframe times. To quantify and address these challenges we present data from a model study on the wake behind a fixed wing at conditions comparable to those found in biological flight. We present a detailed analysis of the PIV wake measurements, discuss the criteria necessary for accurate measurements, and present a new dual-plane PIV configuration to resolve these issues.

  14. Similarity-based pattern analysis and recognition

    CERN Document Server

    Pelillo, Marcello

    2013-01-01

    This accessible text/reference presents a coherent overview of the emerging field of non-Euclidean similarity learning. The book presents a broad range of perspectives on similarity-based pattern analysis and recognition methods, from purely theoretical challenges to practical, real-world applications. The coverage includes both supervised and unsupervised learning paradigms, as well as generative and discriminative models. Topics and features: explores the origination and causes of non-Euclidean (dis)similarity measures, and how they influence the performance of traditional classification alg…

  15. A new method to search for high-redshift clusters using photometric redshifts

    Energy Technology Data Exchange (ETDEWEB)

    Castignani, G.; Celotti, A. [SISSA, Via Bonomea 265, I-34136 Trieste (Italy); Chiaberge, M.; Norman, C., E-mail: castigna@sissa.it [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States)

    2014-09-10

    We describe a new method (Poisson probability method, PPM) to search for high-redshift galaxy clusters and groups by using photometric redshift information and galaxy number counts. The method relies on Poisson statistics and is primarily introduced to search for megaparsec-scale environments around a specific beacon. The PPM is tailored to both the properties of the FR I radio galaxies in the Chiaberge et al. sample, which are selected within the COSMOS survey, and to the specific data set used. We test the efficiency of our method of searching for cluster candidates against simulations. Two different approaches are adopted. (1) We use two z ∼ 1 X-ray detected cluster candidates found in the COSMOS survey and we shift them to higher redshift up to z = 2. We find that the PPM detects the cluster candidates up to z = 1.5, and it correctly estimates both the redshift and size of the two clusters. (2) We simulate spherically symmetric clusters of different size and richness, and we locate them at different redshifts (i.e., z = 1.0, 1.5, and 2.0) in the COSMOS field. We find that the PPM detects the simulated clusters within the considered redshift range with a statistical 1σ redshift accuracy of ∼0.05. The PPM is an efficient alternative method for high-redshift cluster searches that may also be applied to both present and future wide field surveys such as SDSS Stripe 82, LSST, and Euclid. Accurate photometric redshifts and a survey depth similar to or better than that of COSMOS (e.g., I < 25) are required.
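
    The core Poisson test behind such a method can be sketched in a few lines: given an expected background count per cell, how unlikely is the observed galaxy count? The cell counts below are invented for illustration and the full PPM adds photometric-redshift selection and spatial tiling not shown here.

```python
from math import exp, factorial

def poisson_tail(n_obs, lam):
    """P(N >= n_obs) for a Poisson background with mean lam: small values
    flag a significant galaxy overdensity in a cell."""
    return 1.0 - sum(exp(-lam) * lam**k / factorial(k) for k in range(n_obs))

# A cell holding 12 photo-z-selected galaxies over a background mean of 4:
print(poisson_tail(12, 4.0))   # ~9e-4 -> unlikely to be a fluctuation
```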

  16. Abyss or Shelter? On the Relevance of Web Search Engines' Search Results When People Google for Suicide.

    Science.gov (United States)

    Haim, Mario; Arendt, Florian; Scherr, Sebastian

    2017-02-01

    Despite evidence that suicide rates can increase after suicides are widely reported in the media, appropriate depictions of suicide in the media can help people to overcome suicidal crises and can thus elicit preventive effects. We argue on the level of individual media users that a similar ambivalence can be postulated for search results on online suicide-related search queries. Importantly, the filter bubble hypothesis (Pariser, 2011) states that search results are biased by algorithms based on a person's previous search behavior. In this study, we investigated whether suicide-related search queries, including either potentially suicide-preventive or -facilitative terms, influence subsequent search results. This might thus protect or harm suicidal Internet users. We utilized a 3 (search history: suicide-related harmful, suicide-related helpful, and suicide-unrelated) × 2 (reactive: clicking the top-most result link and no clicking) experimental design applying agent-based testing. While findings show no influences either of search histories or of reactivity on search results in a subsequent situation, the presentation of a helpline offer raises concerns about possible detrimental algorithmic decision-making: Algorithms "decided" whether or not to present a helpline, and this automated decision, then, followed the agent throughout the rest of the observation period. Implications for policy-making and search providers are discussed.

  17. Effects of antecedent variables on disruptive behavior and accurate responding in young children in outpatient settings.

    Science.gov (United States)

    Boelter, Eric W; Wacker, David P; Call, Nathan A; Ringdahl, Joel E; Kopelman, Todd; Gardner, Andrew W

    2007-01-01

    The effects of manipulations of task variables on inaccurate responding and disruption were investigated with 3 children who engaged in noncompliance. With 2 children in an outpatient clinic, task directives were first manipulated to identify directives that guided accurate responding; then, additional dimensions of the task were manipulated to evaluate their influence on disruptive behavior. With a 3rd child, similar procedures were employed at school. Results showed that one-step directives set the occasion for accurate responding and that other dimensions of the task (e.g., preference) functioned as motivating operations for negative reinforcement.

  18. Searching for supersymmetry scalelessly

    Energy Technology Data Exchange (ETDEWEB)

    Schlaffer, M. [Deutsches Elektronen-Synchrotron (DESY), Hamburg (Germany); Weizmann Institute of Science, Rehovot (Israel). Dept. of Patricle Physics and Astrophysics; Spannowsky, M. [Durham Univ. (United Kingdom). Inst. for Particle Physics Phenomenology; Weiler, A. [Technische Univ. Muenchen, Garching (Germany). Physik Dept. T75

    2016-03-15

    In this paper we propose a scale invariant search strategy for hadronic top or bottom plus missing energy final states. We present a method which shows flat efficiencies and background rejection factors over broad ranges of parameters and masses. The resulting search can be easily recast into a limit on alternative models. We show the strength of the method in a natural SUSY setup where stop and sbottom squarks are pair produced and decay into hadronically decaying top quarks or bottom quarks and higgsinos.

  19. Searching for supersymmetry scalelessly

    CERN Document Server

    Schlaffer, Matthias; Weiler, Andreas

    2016-01-01

    In this paper we propose a scale invariant search strategy for hadronic top or bottom plus missing energy final states. We present a method which shows flat efficiencies and background rejection factors over broad ranges of parameters and masses. The resulting search can be easily recast into a limit on alternative models. We show the strength of the method in a natural SUSY setup where stop and sbottom squarks are pair produced and decay into hadronically decaying top quarks or bottom quarks and higgsinos.

  20. Direct policy search

    DEFF Research Database (Denmark)

    Heidrich-Meisner, V.; Igel, Christian

    2010-01-01

    …process. Exploration is realized by stochastic perturbations, which can be applied at different levels. When considering direct policy search in the space of neural network policies, exploration can be applied on the synaptic level or on the level of neuronal activity. We propose neuroevolution strategies (NeuroESs) for direct policy search in RL. Learning using NeuroESs can be interpreted as modelling of extrinsic perturbations on the level of synaptic weights. In contrast, policy gradient methods (PGMs) can be regarded as intrinsic perturbation of neuronal activity. We compare these two approaches…

  1. Supersymmetry searches in ATLAS

    CERN Document Server

    Meloni, Federico; The ATLAS collaboration

    2015-01-01

    Despite the absence of experimental evidence, weak scale supersymmetry remains one of the best motivated and studied Standard Model extensions. This talk summarises recent ATLAS results for searches for supersymmetric (SUSY) particles. Weak and strong production in both R-Parity conserving and R-Parity violating SUSY scenarios are considered. The searches involved final states including jets, missing transverse momentum, light leptons, taus or photons, as well as long-lived particle signatures. Sensitivity projections for the data that will be collected in 2015 are also presented.

  2. Supersymmetry searches in ATLAS

    CERN Document Server

    Meloni, Federico; The ATLAS collaboration

    2015-01-01

    This document summarises recent ATLAS results for searches for supersymmetric particles using LHC proton-proton collision data. Despite the absence of experimental evidence, weak scale supersymmetry remains one of the best motivated and studied Standard Model extensions. We consider both R-Parity conserving and R-Parity violating SUSY scenarios. The searches involve final states including jets, missing transverse momentum, light leptons, taus or photons, as well as long-lived particle signatures. Sensitivity projections for the data that will be collected in 2015 are also presented.

  3. Search for $2\

    CERN Document Server

    Auty, D J; Barbeau, P S; Beck, D; Belov, V; Breidenbach, M; Brunner, T; Burenkov, A; Cao, G F; Chambers, C; Chaves, J; Cleveland, B; Coon, M; Craycraft, A; Daniels, T; Danilov, M; Daugherty, S J; Davis, J; Delaquis, S; Der Mesrobian-Kabakian, A; DeVoe, R; Didberidze, T; Dilling, J; Dolgolenko, A; Dolinski, M J; Dunford, M; Fairbank, W; Farine, J; Feldmeier, W; Feyzbakhsh, S; Fierlinger, P; Fudenberg, D; Gornea, R; Graham, K; Gratta, G; Hall, C; Hughes, M; Jewell, M J; Johnson, A; Johnson, T N; Johnston, S; Karelin, A; Kaufman, L J; Killick, R; King, J; Koffas, T; Kravitz, S; Krücken, R; Kuchenkov, A; Kumar, K S; Leonard, D S; Licciardi, C; Lin, Y H; Ling, J; MacLellan, R; Marino, M G; Mong, B; Moore, D; Njoya, O; Nelson, R; Odian, A; Ostrovskiy, I; Piepke, A; Pocar, A; Prescott, C Y; Retière, F; Rowson, P C; Russell, J J; Schubert, A; Sinclair, D; Smith, E; Stekhanov, V; Tarka, M; Tolba, T; Tsang, R; Twelker, K; Vogel, P; Vuilleumier, J -L; Waite, A; Walton, J; Walton, T; Weber, M; Wen, L J; Wichoski, U; Winick, T A; Wood, J; Xu, Q Y; Yang, L; Yen, Y -R; Zeldovich, O Ya

    2015-01-01

    EXO-200 is a single-phase liquid xenon detector designed to search for neutrinoless double-beta decay of $^{136}$Xe to the ground state of $^{136}$Ba. We report here on a search for the two-neutrino double-beta decay of $^{136}$Xe to the first $0^+$ excited state, $0^+_1$, of $^{136}$Ba based on a 100 kg$\cdot$yr exposure of $^{136}$Xe. Using a specialized analysis employing a machine learning algorithm, we obtain a 90% CL half-life sensitivity of $1.7 \times 10^{24}$ yr. We find no statistically significant evidence for the $2\

  4. Decoherence in Search Algorithms

    CERN Document Server

    Abal, G; Marquezino, F L; Oliveira, A C; Portugal, R

    2009-01-01

    Recently several quantum search algorithms based on quantum walks were proposed. Those algorithms differ from Grover's algorithm in many aspects. The goal is to find a marked vertex in a graph faster than classical algorithms. Since the implementation of those new algorithms in quantum computers or in other quantum devices is error-prone, it is important to analyze their robustness under decoherence. In this work we analyze the impact of decoherence on quantum search algorithms implemented on two-dimensional grids and on hypercubes.

  5. Searching for excellence

    Science.gov (United States)

    Yager, Robert E.

    Visits to six school districts which were identified by the National Science Teachers Association's Search for Excellence program were made during 1983 by teams of 17 researchers. The reports were analyzed in search for common characteristics that can explain the requirements necessary for excellent science programs. The results indicate that creative ideas, administrative and community involvement, local ownership and pride, and well-developed in-service programs and implementation strategies are vital. Exceptional teachers with boundless energies also seem to exist where exemplary science programs are found.

  6. Search for glueballs

    Energy Technology Data Exchange (ETDEWEB)

    Toki, W. [Colorado State Univ., Ft. Collins, CO (United States). Dept. of Physics

    1997-06-01

    In these Summer School lectures, the author reviews the results of recent glueball searches. He begins with a brief review of glueball phenomenology and meson spectroscopy, including a discussion of resonance behavior. The results on the f_0(1500) and f_J(1700) resonances from proton-antiproton experiments and radiative J/ψ decays are discussed. Finally, ππ and ηπ studies from D_s decays and exotic meson searches are reviewed. 46 refs., 40 figs.

  7. Critically damped quantum search.

    Science.gov (United States)

    Mizel, Ari

    2009-04-17

    Although measurement and unitary processes can accomplish any quantum evolution in principle, thinking in terms of dissipation and damping can be powerful. We propose a modification of Grover's algorithm in which the idea of damping plays a natural role. Remarkably, we find that there is a critical damping value that divides between the quantum O(√N) and classical O(N) search regimes. In addition, by allowing the damping to vary in a fashion we describe, one obtains a fixed-point quantum search algorithm in which ignorance of the number of targets increases the number of oracle queries only by a factor of 1.5.

  8. Searching for supersymmetry scalelessly

    Energy Technology Data Exchange (ETDEWEB)

    Schlaffer, M. [DESY, Hamburg (Germany); Weizmann Institute of Science, Department of Particle Physics and Astrophysics, Rehovot (Israel); Spannowsky, M. [Durham University, Department of Physics, Institute for Particle Physics Phenomenology, Durham (United Kingdom); Weiler, A. [Technische Universitaet Muenchen, Physik Department T75, Garching (Germany)

    2016-08-15

    In this paper we propose a scale invariant search strategy for hadronic top or bottom plus missing energy final states. We present a method which shows flat efficiencies and background rejection factors over broad ranges of parameters and masses. The resulting search can easily be recast into a limit on alternative models. We show the strength of the method in a natural SUSY setup where stop and sbottom squarks are pair produced and decay into hadronically decaying top quarks or bottom quarks and higgsinos. (orig.)

  9. Distances and similarities in intuitionistic fuzzy sets

    CERN Document Server

    Szmidt, Eulalia

    2014-01-01

    This book presents the state-of-the-art in theory and practice regarding similarity and distance measures for intuitionistic fuzzy sets. Quantifying similarity and distances is crucial for many applications, e.g. data mining, machine learning, decision making, and control. The work provides readers with a comprehensive set of theoretical concepts and practical tools for both defining and determining similarity between intuitionistic fuzzy sets. It describes an automatic algorithm for deriving intuitionistic fuzzy sets from data, which can aid in the analysis of information in large databases. The book also discusses other important applications, e.g. the use of similarity measures to evaluate the extent of agreement between experts in the context of decision making.

  10. Discovering Music Structure via Similarity Fusion

    DEFF Research Database (Denmark)

    Arenas-García, Jerónimo; Parrado-Hernandez, Emilio; Meng, Anders;

    Automatic methods for music navigation and music recommendation exploit the structure in the music to carry out a meaningful exploration of the “song space”. To get a satisfactory performance from such systems, one should incorporate as much information about songs' similarity as possible; however … semantics”, in such a way that all observed similarities can be satisfactorily explained using the latent semantics. Therefore, one can think of these semantics as the real structure in music, in the sense that they can explain the observed similarities among songs. The suitability of the PLSA model … for representing music structure is studied in a simplified scenario consisting of 4412 songs and two similarity measures among them. The results suggest that the PLSA model is a useful framework to combine different sources of information, and provides a reasonable space for song representation.

  11. Similarity Theory of Withdrawn Water Temperature Experiment

    Directory of Open Access Journals (Sweden)

    Yunpeng Han

    2015-01-01

    Full Text Available Selective withdrawal from a thermally stratified reservoir is widely used in managing reservoir water withdrawal. Besides theoretical analysis and numerical simulation, model tests are also necessary for studying the temperature of withdrawn water. However, information on the similarity theory of the withdrawn water temperature model remains lacking. Considering the flow features of selective withdrawal, the similarity theory of the withdrawn water temperature model was analyzed theoretically based on modified governing equations, the Boussinesq approximation, and some simplifications. The similarity conditions between the model and the prototype are suggested, and the conversion of withdrawn water temperature between the model and the prototype is proposed. Meanwhile, the fundamental theory of temperature distribution conversion is proposed for the first time, which can significantly improve experimental efficiency when the basic temperature of the model differs from that of the prototype. Based on the similarity theory, an experiment on the withdrawn water temperature was performed and verified by a numerical method.

  12. Distance and Similarity Measures for Soft Sets

    CERN Document Server

    Kharal, Athar

    2010-01-01

    In [P. Majumdar, S. K. Samanta, Similarity measure of soft sets, New Mathematics and Natural Computation 4(1)(2008) 1-12], the authors use matrix-representation-based distances of soft sets to introduce matching-function and distance-based similarity measures. We first give counterexamples to show that their Definition 2.7 and Lemma 3.5(3) contain errors, then improve their Lemma 4.4, making it a corollary of our result. The fundamental assumption of Majumdar et al. is shown to be flawed. This motivates us to introduce set-operations-based measures. We present a case (Example 28) where the Majumdar-Samanta similarity measure produces an erroneous result but the measure proposed herein decides correctly. Several properties of the new measures are presented, and finally the new similarity measures are applied to the problem of financial diagnosis of firms.

  13. Bilateral Trade Flows and Income Distribution Similarity.

    Science.gov (United States)

    Martínez-Zarzoso, Inmaculada; Vollmer, Sebastian

    2016-01-01

    Current models of bilateral trade neglect the effects of income distribution. This paper addresses the issue by accounting for non-homothetic consumer preferences and hence investigating the role of income distribution in the context of the gravity model of trade. A theoretically justified gravity model is estimated for disaggregated trade data (dollar volume is used as the dependent variable) using a sample of 104 exporters and 108 importers for 1980-2003 to achieve two main goals. We define and calculate new measures of income distribution similarity and empirically confirm that greater similarity of income distribution between countries implies more trade. Using distribution-based measures as a proxy for demand similarities in gravity models, we find consistent and robust support for the hypothesis that countries with more similar income distributions trade more with each other. The hypothesis is also confirmed at the disaggregated level for differentiated product categories.
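
    As one generic illustration of an income-distribution similarity measure (the paper defines its own measures, which are not reproduced here), the overlap coefficient of two bracket-share vectors runs from 0 (disjoint) to 1 (identical); the bracket shares below are invented.

```python
def distribution_overlap(p, q):
    """Overlap coefficient of two discrete income distributions given as
    population shares per income bracket."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))

exporter = [0.30, 0.40, 0.20, 0.10]   # share of population per bracket
importer = [0.25, 0.35, 0.25, 0.15]
print(distribution_overlap(exporter, importer))   # -> 0.9
```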

  14. Interpersonal Congruency, Attitude Similarity, and Interpersonal Attraction

    Science.gov (United States)

    Touhey, John C.

    1975-01-01

    As no experimental study has examined the effects of congruency on attraction, the present investigation orthogonally varied attitude similarity and interpersonal congruency in order to compare the two independent variables as determinants of interpersonal attraction. (Author/RK)

  15. Correlation between social proximity and mobility similarity

    CERN Document Server

    Fan, Chao; Huang, Junming; Rong, Zhihai; Zhou, Tao

    2016-01-01

    Human behaviors exhibit ubiquitous correlations at many levels: individual and collective, temporal and spatial, in content, and across social and geographical layers. With rich Internet data on online behaviors becoming available, exploring human mobility similarity from the perspective of social network proximity has attracted academic interest. Existing analyses show a strong correlation between online social proximity and offline mobility similarity; namely, mobility records of friends are significantly more similar than those of strangers, and those of friends with common neighbors are even more similar. We argue for the importance of the number and diversity of common friends, with the counterintuitive finding that the number of common friends has no positive impact on mobility similarity while their diversity plays a key role, disagreeing with previous studies. Our analysis provides a novel view for better understanding the coupling between human online and offline behaviors, and will...
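
    As an illustrative sketch only (the paper's exact metric is not reproduced here), mobility similarity between two users is often quantified as the cosine similarity of their location-visit frequency vectors, along the lines below:

        # Illustrative sketch only (the paper's exact metric is not
        # reproduced here): cosine similarity of location-visit counts.
        from collections import Counter
        from math import sqrt

        def mobility_similarity(visits_a, visits_b):
            """visits_*: iterables of location IDs visited by each user."""
            ca, cb = Counter(visits_a), Counter(visits_b)
            dot = sum(ca[loc] * cb[loc] for loc in ca.keys() & cb.keys())
            norm = (sqrt(sum(v * v for v in ca.values()))
                    * sqrt(sum(v * v for v in cb.values())))
            return dot / norm if norm else 0.0

        print(mobility_similarity(["home", "work", "gym", "home"],
                                  ["home", "work", "cafe"]))  # ~0.707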

  16. Exploiting Data Similarity to Reduce Memory Footprints

    Science.gov (United States)

    2011-01-01

    [Fragmented benchmark table: leslie3d (Fortran, computational fluid dynamics); 122.tachyon (C, parallel ray tracing); 128.GAPgeofem (C and Fortran, simulates...)] ...benefits most from SBLLmalloc; LAMMPS, which shows moderate similarity, primarily from zero pages; and 122.tachyon, a parallel ray-tracing application... similarity across MPI tasks. These are primarily zero pages, although a small fraction (≈10%) are non-zero pages. 122.tachyon is an image rendering
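
    As a hedged, toy sketch of the underlying idea (not SBLLmalloc's actual implementation), identical memory pages across tasks can be detected by hashing fixed-size blocks, with zero pages as the dominant case:

        # Toy sketch of the underlying idea (not SBLLmalloc itself): find
        # duplicate fixed-size pages across tasks by hashing their contents,
        # so identical pages (zero pages are the common case) can be shared.
        import hashlib

        PAGE = 4096  # assumed page size in bytes

        def duplicate_pages(buffers):
            """Map page-content hash -> list of (task, page_index) holders."""
            seen = {}
            for task, buf in enumerate(buffers):
                for i in range(0, len(buf), PAGE):
                    digest = hashlib.sha1(buf[i:i + PAGE]).hexdigest()
                    seen.setdefault(digest, []).append((task, i // PAGE))
            return {h: locs for h, locs in seen.items() if len(locs) > 1}

        # Two tasks whose buffers are mostly zero pages:
        bufs = [bytes(2 * PAGE), bytes(PAGE) + b"\x01" * PAGE]
        print(duplicate_pages(bufs))  # the zero page occurs three times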

  17. Some more similarities between Peirce and Skinner

    OpenAIRE

    Moxley, Roy A.

    2002-01-01

    C. S. Peirce is noted for pioneering a variety of views, and the case is made here for the similarities and parallels between his views and B. F. Skinner's radical behaviorism. In addition to parallels previously noted, these similarities include an advancement of experimental science, a behavioral psychology, a shift from nominalism to realism, an opposition to positivism, a selectionist account for strengthening behavior, the importance of a community of selves, a recursive approach to meth...

  18. Interlinguistic similarity and language death dynamics

    CERN Document Server

    Mira, J

    2005-01-01

    We analyze the time evolution of a system of two coexisting languages (Castilian Spanish and Galician, both spoken in northwest Spain) in the framework of a model given by Abrams and Strogatz [Nature 424, 900 (2003)]. It is shown that, contrary to the model's initial prediction, a stable bilingual situation is possible if the languages in competition are similar enough. Similarity is described by a simple parameter, whose value can be estimated from fits of the data.
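
    For reference, the Abrams-Strogatz dynamics that the paper modifies reads, with x the fraction of speakers of language X, s its relative status, c a rate constant, and a a fitted exponent:

        \frac{dx}{dt} = (1 - x)\,c\,x^{a}\,s - x\,c\,(1 - x)^{a}\,(1 - s)

    The modification studied here introduces an interlinguistic similarity parameter (and with it a bilingual group); for sufficiently similar languages this admits a stable fixed point in which both languages survive. The exact modified equations are given in the paper.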

  19. On distributional assumptions and whitened cosine similarities

    DEFF Research Database (Denmark)

    Loog, Marco

    2008-01-01

    Recently, an interpretation of the whitened cosine similarity measure as a Bayes decision rule was proposed (C. Liu, "The Bayes Decision Rule Induced Similarity Measures," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1086-1090, June 2007). This communication makes the observation that some of the distributional assumptions made to derive this measure are very restrictive and, considered simultaneously, even inconsistent.
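
    For context, the whitened cosine similarity under discussion is the ordinary cosine measure applied after a whitening transform \Sigma^{-1/2} (notation assumed here for illustration):

        \delta_{\mathrm{WC}}(x, y) = \frac{x^{\top}\Sigma^{-1}y}{\sqrt{x^{\top}\Sigma^{-1}x}\,\sqrt{y^{\top}\Sigma^{-1}y}}

    Equivalently, \cos(\Sigma^{-1/2}x, \Sigma^{-1/2}y); the restrictiveness at issue concerns the distributional assumptions needed to make this a Bayes decision rule.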

  20. Motives for e-marketplace participation: differences and similarities between buyers and suppliers

    DEFF Research Database (Denmark)

    Rask, Morten; Kragh, Hanne

    2004-01-01

    ...e-marketplaces to find new or alternative suppliers. Similarly, even though demands from existing customers have spurred their initial decision to participate in e-marketplaces, many suppliers also use the marketplaces to search for new customers. When expressing their motives for engaging in e-marketplace activities...