WorldWideScience

Sample records for searching article retrieval

  1. Impact of PubMed search filters on the retrieval of evidence by physicians.

    Science.gov (United States)

    Shariff, Salimah Z; Sontrop, Jessica M; Haynes, R Brian; Iansavichus, Arthur V; McKibbon, K Ann; Wilczynski, Nancy L; Weir, Matthew A; Speechley, Mark R; Thind, Amardeep; Garg, Amit X

    2012-02-21

    Physicians face challenges when searching PubMed for research evidence, and they may miss relevant articles while retrieving too many nonrelevant articles. We investigated whether the use of search filters in PubMed improves searching by physicians. We asked a random sample of Canadian nephrologists to answer unique clinical questions derived from 100 systematic reviews of renal therapy. Physicians provided the search terms that they would type into PubMed to locate articles to answer these questions. We entered the physician-provided search terms into PubMed and applied two types of search filters alone or in combination: a methods-based filter designed to identify high-quality studies about treatment (clinical queries "therapy") and a topic-based filter designed to identify studies with renal content. We evaluated the comprehensiveness (proportion of relevant articles found) and efficiency (ratio of relevant to nonrelevant articles) of the filtered and nonfiltered searches. Primary studies included in the systematic reviews served as the reference standard for relevant articles. The average physician-provided search terms retrieved 46% of the relevant articles, while 6% of the retrieved articles were relevant (corrected) (the ratio of relevant to nonrelevant articles was 1:16). The use of both filters together produced a marked improvement in efficiency, resulting in a ratio of relevant to nonrelevant articles of 1:5 (16 percentage point improvement; 99% confidence interval 9% to 22%; p PubMed search filters improves the efficiency of physician searches. Improved search performance may enhance the transfer of research into practice and improve patient care.
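
    The abstract above reports efficiency both as a percentage of retrieved articles that were relevant (6%) and as a ratio of relevant to nonrelevant articles (1:16). A minimal sketch of how the two forms relate, assuming the percentage is ordinary precision; the function names are illustrative and do not come from the study:

```python
def nonrelevant_per_relevant(precision: float) -> float:
    """Convert precision (fraction of retrieved articles that are relevant)
    into the number of nonrelevant articles retrieved per relevant one."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return (1 - precision) / precision


def precision_from_ratio(nonrelevant_per_rel: float) -> float:
    """Inverse conversion: a 1:k ratio corresponds to precision 1 / (1 + k)."""
    return 1 / (1 + nonrelevant_per_rel)


# 6% precision is roughly 1 relevant article per 16 nonrelevant ones.
print(round(nonrelevant_per_relevant(0.06)))
# A 1:5 ratio corresponds to a precision of 1/6.
print(precision_from_ratio(5))
```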

  2. Using PubMed search strings for efficient retrieval of manual therapy research literature.

    Science.gov (United States)

    Pillastrini, Paolo; Vanti, Carla; Curti, Stefania; Mattioli, Stefano; Ferrari, Silvano; Violante, Francesco Saverio; Guccione, Andrew

    2015-02-01

    The aim of this study was to construct PubMed search strings that could efficiently retrieve studies on manual therapy (MT), especially for time-constrained clinicians. Our experts chose 11 Medical Subject Heading terms describing MT along with 84 additional potential terms. For each term that was able to retrieve more than 100 abstracts, we systematically extracted a sample of abstracts from which we estimated the proportion of studies potentially relevant to MT. We then constructed 2 search strings: 1 narrow (threshold of pertinent articles ≥40%) and 1 expanded (including all terms for which a proportion had been calculated). We tested these search strings against articles on 2 conditions relevant to MT (thoracic and temporomandibular pain). We calculated the number of abstracts needed to read (NNR) to identify 1 potentially pertinent article in the context of these conditions. Finally, we evaluated the efficiency of the proposed PubMed search strings to identify relevant articles included in a systematic review on spinal manipulative therapy for chronic low back pain. Fifty-five search terms were able to extract more than 100 citations. The NNR to find 1 potentially pertinent article using the narrow string was 1.2 for thoracic pain and 1.3 for temporomandibular pain, and the NNR for the expanded string was 1.9 and 1.6, respectively. The narrow search strategy retrieved all the randomized controlled trials included in the systematic review selected for comparison. The proposed PubMed search strings may help health care professionals locate potentially pertinent articles and review a large number of MT studies efficiently to better implement evidence-based practice. Copyright © 2015 National University of Health Sciences. Published by Elsevier Inc. All rights reserved.
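
    The number needed to read (NNR) used above can be illustrated with a short calculation. This sketch assumes the usual definition, abstracts read divided by pertinent abstracts found (the reciprocal of precision); the counts in the example are hypothetical and are not taken from the study:

```python
def number_needed_to_read(pertinent: int, abstracts_read: int) -> float:
    """NNR: how many abstracts must be read, on average, to find one
    pertinent article. Equivalent to 1 / precision."""
    if pertinent <= 0 or abstracts_read < pertinent:
        raise ValueError("need 0 < pertinent <= abstracts_read")
    return abstracts_read / pertinent


# Hypothetical sample: 83 pertinent abstracts among 100 read.
print(round(number_needed_to_read(83, 100), 1))
```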

  3. Retrieving clinical evidence: a comparison of PubMed and Google Scholar for quick clinical searches.

    Science.gov (United States)

    Shariff, Salimah Z; Bejaimal, Shayna Ad; Sontrop, Jessica M; Iansavichus, Arthur V; Haynes, R Brian; Weir, Matthew A; Garg, Amit X

    2013-08-15

    Physicians frequently search PubMed for information to guide patient care. More recently, Google Scholar has gained popularity as another freely accessible bibliographic database. To compare the performance of searches in PubMed and Google Scholar. We surveyed nephrologists (kidney specialists) and provided each with a unique clinical question derived from 100 renal therapy systematic reviews. Each physician provided the search terms they would type into a bibliographic database to locate evidence to answer the clinical question. We executed each of these searches in PubMed and Google Scholar and compared results for the first 40 records retrieved (equivalent to 2 default search pages in PubMed). We evaluated the recall (proportion of relevant articles found) and precision (ratio of relevant to nonrelevant articles) of the searches performed in PubMed and Google Scholar. Primary studies included in the systematic reviews served as the reference standard for relevant articles. We further documented whether relevant articles were available as free full-texts. Compared with PubMed, the average search in Google Scholar retrieved twice as many relevant articles (PubMed: 11%; Google Scholar: 22%; PGoogle Scholar: 8%; P=.07). Google Scholar provided significantly greater access to free full-text publications (PubMed: 5%; Google Scholar: 14%; PGoogle Scholar returns twice as many relevant articles as PubMed and provides greater access to free full-text articles.

  4. Information retrieval implementing and evaluating search engines

    CERN Document Server

    Büttcher, Stefan; Cormack, Gordon V

    2016-01-01

    Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus -- a multiuser open-source information retrieval system developed by one of the authors and available online -- provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. In addition to its classroom use, Information Retrieval will be a valuable reference for professionals in computer science, computer engineering, and software engineering.

  5. Affinity between information retrieval system and search topic

    International Nuclear Information System (INIS)

    Ebinuma, Yukio

    1979-01-01

    Ten search profiles are tested on the INIS system at the Japan Atomic Energy Research Institute. The results are plotted on a recall-precision chart ranging from 100% recall to 100% precision. The curves are neither purely system-dependent nor purely search-dependent; they are determined substantially by the "affinity" between the system and the search topic. The curves are named "affinity curves of search topics with information retrieval systems", and retrieval affinity factors are derived from them. The factors are obtained not only for individual search topics but also for averages in the system. Such a quantitative examination allows a reasonable comparison of the difference in affinity among search topics in a given system, of the same search topic among various systems, and of systems with respect to the same group of search topics. (author)

  6. Retrieval of articles in personal computer

    International Nuclear Information System (INIS)

    Choi, Byung Gil; Park, Seog Hee; Kim, Sung Hoon; Shinn, Kyung Sub

    1994-01-01

    Although many useful articles appear in journals published in Korea, researchers do not always cite them, mainly because no efficient search system exists. The authors wrote a program with 6 predefined filtering forms to find published articles rapidly and accurately. The program was coded using the database management system CA-Clipper Version 5.2 (Computer Associates International, Inc.) after 1 year of preliminary work. We used a 486 DX II computer (8 Mbyte RAM, VGA, 200 Mbyte hard disk), an ink-jet printer (Hewlett Packard Company), and MS-DOS Version 5.0 (Microsoft Co). We entered a total of 1986 articles published in the Journal of the Korean Radiological Society from 1981 to 1993. The search time was 10 to 15 seconds per use. The program has a very flexible user interface and simplified search methods, but more complicated filtering can also be performed. Although the previous version had some bugs, this upgraded version resolves them and is suited to searching for articles. The program would be valuable for radiologists searching for articles published not only in the Journal of the Korean Radiological Society but also in the Journal of the Korean Society of Medicine Ultrasound and the Korean Journal of Nuclear Medicine.

  7. Is searching full text more effective than searching abstracts?

    Science.gov (United States)

    Lin, Jimmy

    2009-02-03

    With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: BM25 and the ranking algorithm implemented in the open-source Lucene search engine. Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that the highest overall effectiveness may be achieved by combining evidence from spans and full articles. Users searching full text are more likely to find relevant articles than those searching only abstracts. This finding affirms the value of full-text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations.
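
    For readers unfamiliar with BM25, one of the two retrieval models named above, the following is a minimal sketch of the standard Okapi BM25 scoring formula. It is not the paper's implementation: the parameter values k1=1.2 and b=0.75 are common defaults assumed here, the IDF uses a widely used non-negative variant, and the toy documents are invented. Each "document" can stand for an abstract, a full article, or a span, which is the choice of indexing unit the study varies.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with BM25.

    query: list of terms; docs: non-empty list of token lists.
    k1 and b are assumed default values, not taken from the paper.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


toy_docs = [
    ["gene", "expression", "in", "yeast"],
    ["protein", "folding"],
    ["gene", "gene", "therapy"],
]
print(bm25_scores(["gene"], toy_docs))
```

    In this toy collection the third document scores highest because the query term occurs twice in a document of average length, while the second document scores zero.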

  8. Is searching full text more effective than searching abstracts?

    Directory of Open Access Journals (Sweden)

    Lin Jimmy

    2009-02-01

    Background: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: bm25 and the ranking algorithm implemented in the open-source Lucene search engine. Results: Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraphs-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that highest overall effectiveness may be achieved by combining evidence from spans and full articles. Conclusion: Users searching full text are more likely to find relevant articles than searching only abstracts. This finding affirms the value of full text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly-growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations.

  9. Enhancing Image Retrieval System Using Content Based Search ...

    African Journals Online (AJOL)

    The output shows more efficiency in retrieval because instead of performing the search on the entire image database, the image category option directs the retrieval engine to the specified category. Also, there is provision to update or modify the different image categories in the image database as need arise. Keywords: ...

  10. PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval

    Directory of Open Access Journals (Sweden)

    Lin Jimmy

    2008-06-01

    Background: Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually-created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed® search interface, a MEDLINE® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web. Results: We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics. Conclusion: The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain.

  11. PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval.

    Science.gov (United States)

    Lin, Jimmy

    2008-06-06

    Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually-created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed® search interface, a MEDLINE® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web. We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics. The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain.
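
    The reranking idea can be sketched in a few lines. The abstract does not say how the PageRank and retrieval scores were combined, so the linear interpolation below (weight `lam`), the damping factor 0.85, and the toy graph and scores are all assumptions made for illustration, not the author's method:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank. graph maps each node to a list of
    neighbor nodes (here: 'related citations'); every neighbor must
    also be a key of graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1 - damping) / n for v in nodes}
        for v in nodes:
            # A node without neighbors spreads its rank over all nodes.
            targets = graph[v] or nodes
            share = damping * rank[v] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


def rerank(retrieval_scores, graph, lam=0.7):
    """Order documents by a weighted mix of max-normalized retrieval
    score and max-normalized PageRank (assumed combination scheme)."""
    pr = pagerank(graph)
    top_r = max(retrieval_scores.values())
    top_p = max(pr.values())
    mixed = {
        d: lam * retrieval_scores[d] / top_r + (1 - lam) * pr[d] / top_p
        for d in retrieval_scores
    }
    return sorted(mixed, key=mixed.get, reverse=True)


# Hypothetical related-citation graph and retrieval-engine scores.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
toy_scores = {"a": 1.0, "b": 2.0, "c": 1.9, "d": 0.5}
print(rerank(toy_scores, toy_graph))
```

    In this toy case document "c" is ranked second by the retrieval engine alone but moves to the top after reranking, because it is the most central node in the graph.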

  12. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles

    Science.gov (United States)

    Liu, Rey-Long

    2015-01-01

    Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794

  13. Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles.

    Directory of Open Access Journals (Sweden)

    Rey-Long Liu

    Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.

  14. Description and search labor for information retrieval

    OpenAIRE

    Warner, Julian

    2007-01-01

    Selection power is taken as the fundamental value for information retrieval systems. Selection power is regarded as produced by selection labor, which itself separates historically into description and search labor. As forms of mental labor, description and search labor participate in the conditions for labor and for mental labor. Concepts and distinctions applicable to physical and mental labor are indicated, introducing the necessity of labor for survival, the idea of technology as a human ...

  15. Information Retrieval for Education: Making Search Engines Language Aware

    Science.gov (United States)

    Ott, Niels; Meurers, Detmar

    2010-01-01

    Search engines have been a major factor in making the web the successful and widely used information source it is today. Generally speaking, they make it possible to retrieve web pages on a topic specified by the keywords entered by the user. Yet web searching currently does not take into account which of the search results are comprehensible for…

  16. Retrieval of publications addressing shared decision making: an evaluation of full-text searches on medical journal websites.

    Science.gov (United States)

    Blanc, Xavier; Collet, Tinh-Hai; Auer, Reto; Iriarte, Pablo; Krause, Jan; Légaré, France; Cornuz, Jacques; Clair, Carole

    2015-04-07

    Full-text searches of articles increase the recall, defined by the proportion of relevant publications that are retrieved. However, this method is rarely used in medical research due to resource constraints. For the purpose of a systematic review of publications addressing shared decision making, a full-text search method was required to retrieve publications where shared decision making does not appear in the title or abstract. The objective of our study was to assess the efficiency and reliability of full-text searches in major medical journals for identifying shared decision making publications. A full-text search was performed on the websites of 15 high-impact journals in general internal medicine to look up publications of any type from 1996-2011 containing the phrase "shared decision making". The search method was compared with a PubMed search of titles and abstracts only. The full-text search was further validated by requesting all publications from the same time period from the individual journal publishers and searching through the collected dataset. The full-text search for "shared decision making" on journal websites identified 1286 publications in 15 journals compared to 119 through the PubMed search. The search within the publisher-provided publications of 6 journals identified 613 publications compared to 646 with the full-text search on the respective journal websites. The concordance rate was 94.3% between both full-text searches. Full-text searching on medical journal websites is an efficient and reliable way to identify relevant articles in the field of shared decision making for review or other purposes. It may be more widely used in biomedical research in other fields in the future, with the collaboration of publishers and journals toward open-access data.

  17. Automatic evidence retrieval for systematic reviews.

    Science.gov (United States)

    Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G; Tsafnat, Guy

    2014-10-01

    Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing's effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Our goal was to evaluate an automatic method for citation snowballing's capacity to identify and retrieve the full text and/or abstracts of cited articles. Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews.
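
    The F1 scores reported above can be checked from the stated counts and precision. A minimal sketch, assuming F1 is the harmonic mean of precision and recall and that recall is the 633 correctly identified citations divided by the relevant denominator (949 included citations, or 740 present in MAS):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision = 0.977            # reported precision
recall_all = 633 / 949       # against all included citations
recall_mas = 633 / 740       # against citations present in MAS

print(round(100 * recall_all, 1), round(100 * f1_score(precision, recall_all), 1))
print(round(100 * recall_mas, 1), round(100 * f1_score(precision, recall_mas), 1))
```

    Both pairs reproduce the figures in the abstract (66.7% and 79.3%; 85.5% and 91.2%).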

  18. Searching for evidence or approval? A commentary on database search in systematic reviews and alternative information retrieval methodologies.

    Science.gov (United States)

    Delaney, Aogán; Tamás, Peter A

    2018-03-01

    Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.

  19. Recent advances in intelligent image search and video retrieval

    CERN Document Server

    2017-01-01

    This book initially reviews the major feature representation and extraction methods and effective learning and recognition approaches, which have broad applications in the context of intelligent image search and video retrieval. It subsequently presents novel methods, such as improved soft assignment coding, Inheritable Color Space (InCS) and the Generalized InCS framework, the sparse kernel manifold learner method, the efficient Support Vector Machine (eSVM), and the Scale-Invariant Feature Transform (SIFT) features in multiple color spaces. Lastly, the book presents clothing analysis for subject identification and retrieval, and performance evaluation methods of video analytics for traffic monitoring. Digital images and videos are proliferating at an amazing speed in the fields of science, engineering and technology, media and entertainment. With the huge accumulation of such data, keyword searches and manual annotation schemes may no longer be able to meet the practical demand for retrieving relevant conte...

  20. Parallel content-based sub-image retrieval using hierarchical searching.

    Science.gov (United States)

    Yang, Lin; Qi, Xin; Xing, Fuyong; Kurc, Tahsin; Saltz, Joel; Foran, David J

    2014-04-01

    The capacity to systematically search through large image collections and ensembles and detect regions exhibiting similar morphological characteristics is central to pathology diagnosis. Unfortunately, the primary methods used to search digitized, whole-slide histopathology specimens are slow and prone to inter- and intra-observer variability. The central objective of this research was to design, develop, and evaluate a content-based image retrieval system to assist doctors for quick and reliable content-based comparative search of similar prostate image patches. Given a representative image patch (sub-image), the algorithm will return a ranked ensemble of image patches throughout the entire whole-slide histology section which exhibits the most similar morphologic characteristics. This is accomplished by first performing hierarchical searching based on a newly developed hierarchical annular histogram (HAH). The set of candidates is then further refined in the second stage of processing by computing a color histogram from eight equally divided segments within each square annular bin defined in the original HAH. A demand-driven master-worker parallelization approach is employed to speed up the searching procedure. Using this strategy, the query patch is broadcasted to all worker processes. Each worker process is dynamically assigned an image by the master process to search for and return a ranked list of similar patches in the image. The algorithm was tested using digitized hematoxylin and eosin (H&E) stained prostate cancer specimens. We have achieved an excellent image retrieval performance. The recall rate within the first 40 rank retrieved image patches is ∼90%. Both the testing data and source code can be downloaded from http://pleiad.umdnj.edu/CBII/Bioinformatics/.

  1. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals.

    Science.gov (United States)

    Russell-Rose, Tony; Chamberlain, Jon

    2017-10-02

    Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met. The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking 4 hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results. Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems and there is demand for improved functionality, in particular regarding the development and management of search strategies. ©Tony Russell-Rose, Jon Chamberlain. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.10.2017.

  2. Retrieval Search and Strength Evoke Dissociable Brain Activity during Episodic Memory Recall

    Science.gov (United States)

    Reas, Emilie T.; Brewer, James B.

    2014-01-01

    Neuroimaging studies of episodic memory retrieval have revealed activations in the human frontal, parietal, and medial-temporal lobes that are associated with memory strength. However, it remains unclear whether these brain responses are veritable signals of memory strength or are instead regulated by concomitant subcomponents of retrieval such as retrieval effort or mental search. This study used event-related fMRI during cued recall of previously memorized word-pair associates to dissociate brain responses modulated by memory search from those modulated by the strength of a recalled memory. Search-related deactivations, dissociated from activity due to memory strength, were observed in regions of the default network, whereas distinctly strength-dependent activations were present in superior and inferior parietal and dorsolateral PFC. Both search and strength regulated activity in dorsal anterior cingulate and anterior insula. These findings suggest that, although highly correlated and partially subserved by overlapping cognitive control mechanisms, search and memory strength engage dissociable regions of frontoparietal attention and default networks. PMID:23190328

  3. Information retrieval for children based on the aggregated search paradigm

    NARCIS (Netherlands)

    Duarte Torres, Sergio

    This report presents research to develop information services for children by expanding and adapting current Information retrieval technologies according to the search characteristics and needs of children. Concretely, we will employ the aggregated search paradigm as theoretical framework. The

  4. Effects of Diacritics on Web Search Engines’ Performance for Retrieval of Yoruba Documents

    Directory of Open Access Journals (Sweden)

    Toluwase Victor Asubiaro

    2014-06-01

    This paper aims to find out the possible effect of the use or nonuse of diacritics in Yoruba search queries on the performance of major search engines, AOL, Bing, Google and Yahoo!, in retrieving documents. 30 Yoruba queries created from the most searched keywords from Nigeria on Google search logs were submitted to the search engines. The search queries were posed to the search engines without diacritics and then with diacritics. All of the search engines retrieved more sites in response to the queries without diacritics. Also, they all retrieved more precise results for queries without diacritics. The search engines also answered more queries without diacritics. There was no significant difference in the precision values of any two of the four search engines for diacritized and undiacritized queries. There was a significant difference in the effectiveness of AOL and Yahoo when diacritics were applied and when they were not applied. The findings of the study indicate that the search engines do not find a relationship between the diacritized Yoruba words and the undiacritized versions. Therefore, there is a need for search engines to add normalization steps to pre-process Yoruba queries and indexes. This study concentrates on a problem with search engines that has not been previously investigated.
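    The normalization step recommended in this abstract could look roughly like the sketch below. It is only an illustration, not the method of the study: the helper names `strip_diacritics` and `normalize_query` are hypothetical, and the approach (Unicode decomposition followed by removal of combining marks) is one common way to fold diacritics. Folding tone marks and under-dots also conflates Yoruba words that differ only in diacritics, so a real system would likely need to weigh that trade-off.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (tone marks, under-dots) from text.

    Hypothetical helper: decomposes each character (NFD), drops every
    code point with a non-zero combining class, then recomposes (NFC).
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def normalize_query(query: str) -> str:
    """Fold a query to a diacritic-free, case-insensitive form."""
    return strip_diacritics(query).casefold()


# Diacritized and undiacritized spellings map to the same index key.
print(normalize_query("Yorùbá"))
print(normalize_query("Yoruba") == normalize_query("Yorùbá"))
```

    Applying the same function to both the indexed text and the incoming query is what lets the two spellings match.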

  5. Optimal search filters for renal information in EMBASE.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Shariff, Salimah Z; Weir, Matthew; Wilczynski, Nancy L; McKibbon, Ann; Rehman, Faisal; Garg, Amit X

    2010-07-01

    EMBASE is a popular database used to retrieve biomedical information. Our objective was to develop and test search filters to help clinicians and researchers efficiently retrieve articles with renal information in EMBASE. We used a diagnostic test assessment framework because filters operate similarly to screening tests. We divided a sample of 5,302 articles from 39 journals into development and validation sets of articles. Information retrieval properties were assessed by treating each search filter as a "diagnostic test" or screening procedure for the detection of relevant articles. We tested the performance of 1,936,799 search filters made of unique renal terms and their combinations. The reference standard was manual review of each article. We calculated the sensitivity and specificity of each filter to identify articles with renal information. The best renal filters consisted of multiple search terms, such as "renal replacement therapy," "renal," "kidney disease," and "proteinuria," and the truncated terms "kidney," "dialy," "neph," "glomerul," and "hemodial." These filters achieved peak sensitivities of 98.7% (95% CI, 97.9-99.6) and specificities of 98.5% (95% CI, 98.0-99.0). The retrieval performance of these filters remained excellent in the validation set of independent articles. The retrieval performance of any search will vary depending on the quality of all search concepts used, not just renal terms. We empirically developed and validated high-performance renal search filters for EMBASE. These filters can be programmed into the search engine or used on their own to improve the efficiency of searching.
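    Treating a search filter as a diagnostic test, as this abstract describes, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the substring matching of truncated terms, and the five toy article titles with their labels are all hypothetical and are not data from the study.

```python
from typing import Iterable, List, Tuple


def matches(text: str, terms: Iterable[str]) -> bool:
    """True if any (truncated) term occurs in the text; substring match is an assumption."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def evaluate_filter(articles: List[Tuple[str, bool]], terms: List[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a term filter against manual labels."""
    tp = fp = fn = tn = 0
    for text, is_relevant in articles:
        retrieved = matches(text, terms)
        if retrieved and is_relevant:
            tp += 1
        elif retrieved:
            fp += 1
        elif is_relevant:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: (title, manually judged as renal?)
articles = [
    ("Hemodialysis outcomes in chronic kidney disease", True),
    ("Proteinuria as a marker of nephropathy", True),
    ("Glomerular filtration after transplant", True),
    ("Statin therapy after myocardial infarction", False),
    ("Adrenal insufficiency in adults", False),
]

print(evaluate_filter(articles, ["kidney", "dialy", "neph"]))
print(evaluate_filter(articles, ["kidney", "dialy", "neph", "renal"]))
```

    In this toy set, adding "renal" as a bare substring also matches "Adrenal", which lowers specificity without raising sensitivity. Real database interfaces match on word boundaries, so the effect here is an artifact of the simplified matcher.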

  6. Understanding vaccination resistance: vaccine search term selection bias and the valence of retrieved information.

    Science.gov (United States)

    Ruiz, Jeanette B; Bell, Robert A

    2014-10-07

    Dubious vaccination-related information on the Internet leads some parents to opt out of vaccinating their children. This study aimed to determine whether negative, neutral, and positive search terms retrieve vaccination information that differs in valence and confirms searchers' assumptions about vaccination. A content analysis of first-page Google search results was conducted using three negative, three neutral, and three positive search terms for the concepts "vaccine," "vaccination," and "MMR"; 84 of the 90 websites retrieved met inclusion requirements. Two coders independently and reliably coded for the presence or absence of each of 15 myths about vaccination (e.g., "vaccines cause autism"), statements that countered these myths, and recommendations for or against vaccination. Data were analyzed using descriptive statistics. Across all websites, at least one myth was perpetuated on 16.7% of websites and at least one myth was countered on 64.3% of websites. The mean number of myths perpetuated on websites retrieved with negative, neutral, and positive search terms, respectively, was 1.93, 0.53, and 0.40. The mean number of myths countered on websites retrieved with negative, neutral, and positive search terms, respectively, was 3.0, 3.27, and 2.87. Explicit recommendations regarding vaccination were offered on 22.6% of websites. A recommendation against vaccination was more often made on websites retrieved with negative search terms (37.5% of recommendations) than on websites retrieved with neutral (12.5%) or positive (0%) search terms. The concerned parent who seeks information about the risks of childhood immunizations will find more websites that perpetuate vaccine myths and recommend against vaccination than the parent who seeks information about the benefits of vaccination. This suggests that search term valence can lead to online information that supports concerned parents' misconceptions about vaccines. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Term Relevance Feedback and Mediated Database Searching: Implications for Information Retrieval Practice and Systems Design.

    Science.gov (United States)

    Spink, Amanda

    1995-01-01

    This study uses the human approach to examine the sources and effectiveness of search terms selected during 40 mediated interactive database searches and focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. (Author/JKP)

  8. Personal health records: retrieving contextual information with Google Custom Search.

    Science.gov (United States)

    Ahsan, Mahmud; Seldon, H Lee; Sayeed, Shohel

    2012-01-01

    Ubiquitous personal health records, which can accompany a person everywhere, are a necessary requirement for ubiquitous healthcare. Contextual information related to health events is important for the diagnosis and treatment of disease and for the maintenance of good health, yet it is seldom recorded in a health record. We describe a dual cellphone-and-Web-based personal health record system which can include 'external' contextual information. Much contextual information is available on the Internet and we can use ontologies to help identify relevant sites and information. But a search engine is required to retrieve information from the Web and developing a customized search engine is beyond our scope, so we can use Google Custom Search API Web service to get contextual data. In this paper we describe a framework which combines a health-and-environment 'knowledge base' or ontology with the Google Custom Search API to retrieve relevant contextual information related to entries in a ubiquitous personal health record.

  9. Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining

    Science.gov (United States)

    Murtagh, Fionn; Guillaume, Damien

    Information search and retrieval has become by nature a distributed task. We look at tools and techniques which are of importance in this area. Current technological evolution can be summarized as the growing stability and cohesiveness of distributed architectures of searchable objects. The objects themselves are more often than not multimedia, including published articles or grey literature reports, yellow page services, image data, catalogs, presentation and online display materials, and ``operations'' information such as scheduling and publicly accessible proposal information. The evolution towards distributed architectures, protocols and formats, and the direction of our own work, are focussed on in this paper.

  10. Retrieval of diagnostic and treatment studies for clinical use through PubMed and PubMed's Clinical Queries filters.

    Science.gov (United States)

    Lokker, Cynthia; Haynes, R Brian; Wilczynski, Nancy L; McKibbon, K Ann; Walter, Stephen D

    2011-01-01

    Clinical Queries filters were developed to improve the retrieval of high-quality studies in searches on clinical matters. The study objective was to determine the yield of relevant citations and physician satisfaction while searching for diagnostic and treatment studies using the Clinical Queries page of PubMed compared with searching PubMed without these filters. Forty practicing physicians, presented with standardized treatment and diagnosis questions and one question of their choosing, entered search terms which were processed in a random, blinded fashion through PubMed alone and PubMed Clinical Queries. Participants rated search retrievals for applicability to the question at hand and satisfaction. For treatment, the primary outcome of retrieval of relevant articles was not significantly different between the groups, but a higher proportion of articles from the Clinical Queries searches met methodologic criteria (p=0.049), and more articles were published in core internal medicine journals (p=0.056). For diagnosis, the filtered results returned more relevant articles (p=0.031) and fewer irrelevant articles (overall retrieval less, p=0.023); participants needed to screen fewer articles before arriving at the first relevant citation (p<0.05). Relevance was also influenced by content terms used by participants in searching. Participants varied greatly in their search performance. Clinical Queries filtered searches returned more high-quality studies, though the retrieval of relevant articles was only statistically different between the groups for diagnosis questions. Retrieving clinically important research studies from Medline is a challenging task for physicians. Methodological search filters can improve search retrieval.

  11. Faceted Search

    CERN Document Server

    Tunkelang, Daniel

    2009-01-01

    We live in an information age that requires us, more than ever, to represent, access, and use information. Over the last several decades, we have developed a modern science and technology for information retrieval, relentlessly pursuing the vision of a "memex" that Vannevar Bush proposed in his seminal article, "As We May Think." Faceted search plays a key role in this program. Faceted search addresses weaknesses of conventional search approaches and has emerged as a foundation for interactive information retrieval. User studies demonstrate that faceted search provides more

  12. A Heuristic Hierarchical Scheme for Academic Search and Retrieval

    DEFF Research Database (Denmark)

    Amolochitis, Emmanouil; Christou, Ioannis T.; Tan, Zheng-Hua

    2013-01-01

    and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate......, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can...... be easily plugged in any existing search engine for retrieval of academic publications....

  13. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases.

    Science.gov (United States)

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-06-01

    The first step in each systematic review is selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of databases were calculated. The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4% respectively. The percentage of unique retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles.
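    Uniqueness and overlap between databases can be computed with plain set operations, as in the sketch below. The abstract does not give its formulas, so the definitions used here are assumptions: uniqueness is taken as the share of one database's retrieved articles found in no other database, and overlap as the intersection of two result sets divided by their union. The database names and article identifiers are hypothetical.

```python
from typing import Dict, Set


def uniqueness(results: Dict[str, Set[int]], name: str) -> float:
    """Share of `name`'s articles that no other database retrieved (assumed definition)."""
    others = set().union(*(ids for db, ids in results.items() if db != name))
    own = results[name]
    return len(own - others) / len(own) if own else 0.0


def overlap(results: Dict[str, Set[int]], a: str, b: str) -> float:
    """Intersection over union of two databases' result sets (assumed definition)."""
    union = results[a] | results[b]
    return len(results[a] & results[b]) / len(union) if union else 0.0


# Hypothetical retrieved-article identifiers per database.
results = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {4, 6},
}

print(uniqueness(results, "A"))
print(overlap(results, "A", "B"))
```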

  14. Inefficiency and Bias of Search Engines in Retrieving References Containing Scientific Names of Fossil Amphibians

    Science.gov (United States)

    Brown, Lauren E.; Dubois, Alain; Shepard, Donald B.

    2008-01-01

    Retrieval efficiencies of paper-based references in journals and other serials containing 10 scientific names of fossil amphibians were determined for seven major search engines. Retrievals were compared to the number of references obtained covering the period 1895-2006 by a Comprehensive Search. The latter was primarily a traditional…

  15. EARS: An Online Bibliographic Search and Retrieval System Based on Ordered Explosion.

    Science.gov (United States)

    Ramesh, R.; Drury, Colin G.

    1987-01-01

    Provides overview of Ergonomics Abstracts Retrieval System (EARS), an online bibliographic search and retrieval system in the area of human factors engineering. Other online systems are described, the design of EARS based on inverted file organization is explained, and system expansions including a thesaurus are discussed. (Author/LRW)
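    For readers unfamiliar with inverted file organization, the idea can be sketched in a few lines. This is a generic illustration, not the EARS design: the function names, whitespace tokenization, and the three sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercased token to the set of record ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return dict(index)


def search_all(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of records containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical bibliographic records.
docs = {
    1: "seat design and posture",
    2: "display design",
    3: "posture at work",
}
index = build_inverted_index(docs)
print(search_all(index, "design posture"))
```

    Looking up terms in the index avoids scanning every record at query time, which is the main reason bibliographic systems adopt this layout.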

  16. Improving Web Page Retrieval using Search Context from Clicked Domain Names

    NARCIS (Netherlands)

    Li, R.

    Search context is a crucial factor that helps to understand a user’s information need in ad-hoc Web page retrieval. A query log of a search engine contains rich information on issued queries and their corresponding clicked Web pages. The clicked data implies its relevance to the query and can be

  17. A Hybrid Neural Network Model for Sales Forecasting Based on ARIMA and Search Popularity of Article Titles.

    Science.gov (United States)

    Omar, Hani; Hoang, Van Hai; Liu, Duen-Ren

    2016-01-01

    Enhancing sales and operations planning through forecasting analysis and business intelligence is demanded in many industries and enterprises. Publishing industries usually pick attractive titles and headlines for their stories to increase sales, since popular article titles and headlines can attract readers to buy magazines. In this paper, information retrieval techniques are adopted to extract words from article titles. The popularity measures of article titles are then analyzed by using the search indexes obtained from Google search engine. Backpropagation Neural Networks (BPNNs) have successfully been used to develop prediction models for sales forecasting. In this study, we propose a novel hybrid neural network model for sales forecasting based on the prediction result of time series forecasting and the popularity of article titles. The proposed model uses the historical sales data, popularity of article titles, and the prediction result of a time series, Autoregressive Integrated Moving Average (ARIMA) forecasting method to learn a BPNN-based forecasting model. Our proposed forecasting model is experimentally evaluated by comparing with conventional sales prediction techniques. The experimental result shows that our proposed forecasting method outperforms conventional techniques which do not consider the popularity of title words.

  18. A Hybrid Neural Network Model for Sales Forecasting Based on ARIMA and Search Popularity of Article Titles

    Science.gov (United States)

    Omar, Hani; Hoang, Van Hai; Liu, Duen-Ren

    2016-01-01

    Enhancing sales and operations planning through forecasting analysis and business intelligence is demanded in many industries and enterprises. Publishing industries usually pick attractive titles and headlines for their stories to increase sales, since popular article titles and headlines can attract readers to buy magazines. In this paper, information retrieval techniques are adopted to extract words from article titles. The popularity measures of article titles are then analyzed by using the search indexes obtained from Google search engine. Backpropagation Neural Networks (BPNNs) have successfully been used to develop prediction models for sales forecasting. In this study, we propose a novel hybrid neural network model for sales forecasting based on the prediction result of time series forecasting and the popularity of article titles. The proposed model uses the historical sales data, popularity of article titles, and the prediction result of a time series, Autoregressive Integrated Moving Average (ARIMA) forecasting method to learn a BPNN-based forecasting model. Our proposed forecasting model is experimentally evaluated by comparing with conventional sales prediction techniques. The experimental result shows that our proposed forecasting method outperforms conventional techniques which do not consider the popularity of title words. PMID:27313605

  19. The retrieval efficiency test of descriptors and free vocabulary terms in INIS on-line search

    International Nuclear Information System (INIS)

    Ebinuma, Yukio; Takahashi, Satoko

    1981-01-01

    The test was done for 1) search topics with appropriate descriptors, 2) search topics with considerably broader descriptors, 3) search topics with no appropriate descriptors. As to (1) and (2) the retrieval efficiency was the same both on descriptor system and on keyword system (descriptors + free terms), and the search formulas were easily constructed. As to (3) the descriptor system ensured the recall ratio but decreased the precision ratio. On the other hand the keyword system made the construction of search formulas easy and resulted in good retrieval efficiency. The search system which is available both for full match method of descriptors and truncation method of keywords is desirable because each method can be selected according to the searcher's strategy and search topics. Free-term system seems unnecessary. (author)

  20. Searching Harvard Business Review Online. . . Lessons in Searching a Full Text Database.

    Science.gov (United States)

    Tenopir, Carol

    1985-01-01

    This article examines the Harvard Business Review Online (HBRO) database (bibliographic description fields, abstracts, extracted information, full text, subject descriptors) and reports on 31 sample HBRO searches conducted in Bibliographic Retrieval Services to test differences between searching full text and searching bibliographic record. Sample…

  1. The medline UK filter: development and validation of a geographic search filter to retrieve research about the UK from OVID medline.

    Science.gov (United States)

    Ayiku, Lynda; Levay, Paul; Hudson, Tom; Craven, Jenny; Barrett, Elizabeth; Finnegan, Amy; Adams, Rachel

    2017-07-13

    A validated geographic search filter for the retrieval of research about the United Kingdom (UK) from bibliographic databases had not previously been published. The aim was to develop and validate a geographic search filter to retrieve research about the UK from OVID medline with high recall and precision. Three gold standard sets of references were generated using the relative recall method. The sets contained references to studies about the UK which had informed National Institute for Health and Care Excellence (NICE) guidance. The first and second sets were used to develop and refine the medline UK filter. The third set was used to validate the filter. Recall, precision and number-needed-to-read (NNR) were calculated using a case study. The validated medline UK filter demonstrated 87.6% relative recall against the third gold standard set. In the case study, the medline UK filter demonstrated 100% recall, 11.4% precision and a NNR of nine. A validated geographic search filter to retrieve research about the UK with high recall and precision has been developed. The medline UK filter can be applied to systematic literature searches in OVID medline for topics with a UK focus. © 2017 Crown copyright. Health Information and Libraries Journal © 2017 Health Libraries Group. This article is published with the permission of the Controller of HMSO and the Queen's Printer for Scotland.
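    The reported precision and NNR are consistent with the usual definition of number-needed-to-read as the reciprocal of precision. The sketch below works through that arithmetic; the function name is hypothetical, and rounding up to a whole record is an assumption that happens to match the reported value of nine.

```python
import math


def number_needed_to_read(precision: float) -> float:
    """NNR as 1 / precision: records read, on average, per relevant record found."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return 1.0 / precision


nnr = number_needed_to_read(0.114)  # 11.4% precision from the case study
print(round(nnr, 2))   # 8.77
print(math.ceil(nnr))  # 9
```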

  2. Children’s information retrieval: beyond examining search strategies and interfaces

    NARCIS (Netherlands)

    Jochmann-Mannak, Hanna; Huibers, Theo W.C.; Sanders, T.J.M.

    2008-01-01

    The study of children’s information retrieval is still for the greater part untouched territory. Meanwhile, children can become lost in the digital information world, because they are confronted with search interfaces, both designed by and for adults. Most current research on children’s information

  3. OntoADR a semantic resource describing adverse drug reactions to support searching, coding, and information retrieval.

    Science.gov (United States)

    Souvignet, Julien; Declerck, Gunnar; Asfari, Hadyl; Jaulent, Marie-Christine; Bousquet, Cédric

    2016-10-01

    Efficient searching and coding in databases that use terminological resources requires that they support efficient data retrieval. The Medical Dictionary for Regulatory Activities (MedDRA) is a reference terminology for several countries and organizations to code adverse drug reactions (ADRs) for pharmacovigilance. Ontologies that are available in the medical domain provide several advantages such as reasoning to improve data retrieval. The field of pharmacovigilance does not yet benefit from a fully operational ontology to formally represent the MedDRA terms. Our objective was to build a semantic resource based on formal description logic to improve MedDRA term retrieval and aid the generation of on-demand custom groupings by appropriately and efficiently selecting terms: OntoADR. The method consists of the following steps: (1) mapping between MedDRA terms and SNOMED-CT, (2) generation of semantic definitions using semi-automatic methods, (3) storage of the resource and (4) manual curation by pharmacovigilance experts. We built a semantic resource for ADRs enabling a new type of semantics-based term search. OntoADR adds new search capabilities relative to previous approaches, overcoming the usual limitations of computation using lightweight description logic, such as the intractability of unions or negation queries, bringing it closer to user needs. Our automated approach for defining MedDRA terms enabled the association of at least one defining relationship with 67% of preferred terms. The curation work performed on our sample showed an error level of 14% for this automated approach. We tested OntoADR in practice, which allowed us to build custom groupings for several medical topics of interest. The methods we describe in this article could be adapted and extended to other terminologies which do not benefit from a formal semantic representation, thus enabling better data retrieval performance. Our custom groupings of MedDRA terms were used while performing signal

  4. Learning to merge search results for efficient Distributed Information Retrieval

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien; Hiemstra, Djoerd

    2010-01-01

    Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM which would learn a function that merges results based on information that is readily available: i.e. the ranks, titles, summaries and URLs contained in the

  5. Episodic retrieval and feature facilitation in intertrial priming of visual search

    DEFF Research Database (Denmark)

    Asgeirsson, Arni Gunnar; Kristjánsson, Árni

    2011-01-01

    Huang, Holcombe, and Pashler (Memory & Cognition, 32, 12–20, 2004) found that priming from repetition of different features of a target in a visual search task resulted in significant response time (RT) reductions when both target brightness and size were repeated. But when only one...... feature was repeated and the other changed, RTs were longer than when neither feature was repeated. From this, they argued that priming in visual search reflected episodic retrieval of memory traces, rather than facilitation of repeated features. We tested different variations of the search task...

  6. SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

    Science.gov (United States)

    Ramu, Chenna

    2003-07-01

    SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.

  7. Supporting Keyword Search for Image Retrieval with Integration of Probabilistic Annotation

    Directory of Open Access Journals (Sweden)

    Tie Hua Zhou

    2015-05-01

    The ever-increasing quantities of digital photo resources are annotated with enriching vocabularies to form semantic annotations. Photo-sharing social networks have boosted the need for efficient and intuitive querying to respond to user requirements in large-scale image collections. In order to help users formulate efficient and effective image retrieval, we present a novel integration of a probabilistic model based on keyword query architecture that models the probability distribution of image annotations, allowing users to obtain satisfactory results from image retrieval via the integration of multiple annotations. We focus on the annotation integration step in order to specify the meaning of each image annotation, thus leading to the most representative annotations of the intent of a keyword search. For this demonstration, we show how a probabilistic model has been integrated with semantic annotations to allow users to intuitively define explicit and precise keyword queries in order to retrieve satisfactory image results distributed in heterogeneous large data sources. Our experiments on the SBU database (collected by Stony Brook University) show that (i) our integrated annotation contains higher quality representatives and semantic matches; and (ii) annotation integration can indeed improve image search result quality.

  8. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI.

    Science.gov (United States)

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A; McWilliam, Hamish; Lopez, Rodrigo

    2015-07-01

    The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Effect of Mental State on the Rate of Identifying the Relevancy of Documents Retrieved in a Search

    Directory of Open Access Journals (Sweden)

    Faezeh Farhoudi

    2009-07-01

    The present study investigates the link between users' mental states while searching information systems and the outcome of the resulting documents retrieved. Various factors such as user knowledge, search skills, motivation and aims influence the decisions and evaluation of users regarding documents retrieved. The MMPI instrument was used to identify users' mental states. The sample was drawn from female senior students of librarianship, using systematic random sampling. The findings indicated that anxiety and depression have a significant inverse relationship to the rate of relevancy identification of the documents retrieved by the users.

  10. Web-based information search and retrieval: effects of strategy use and age on search success.

    Science.gov (United States)

    Stronge, Aideen J; Rogers, Wendy A; Fisk, Arthur D

    2006-01-01

    The purpose of this study was to investigate the relationship between strategy use and search success on the World Wide Web (i.e., the Web) for experienced Web users. An additional goal was to extend understanding of how the age of the searcher may influence strategy use. Current investigations of information search and retrieval on the Web have provided an incomplete picture of Web strategy use because participants have not been given the opportunity to demonstrate their knowledge of Web strategies while also searching for information on the Web. Using both behavioral and knowledge-engineering methods, we investigated searching behavior and system knowledge for 16 younger adults (M = 20.88 years of age) and 16 older adults (M = 67.88 years). Older adults were less successful than younger adults in finding correct answers to the search tasks. Knowledge engineering revealed that the age-related effect resulted from ineffective search strategies and amount of Web experience rather than age per se. Our analysis led to the development of a decision-action diagram representing search behavior for both age groups. Older adults had more difficulty than younger adults when searching for information on the Web. However, this difficulty was related to the selection of inefficient search strategies, which may have been attributable to a lack of knowledge about available Web search strategies. Actual or potential applications of this research include training Web users to search more effectively and suggestions to improve the design of search engines.

  11. Design implications for task-specific search utilities for retrieval and re-engineering of code

    Science.gov (United States)

    Iqbal, Rahat; Grzywaczewski, Adam; Halloran, John; Doctor, Faiyaz; Iqbal, Kashif

    2017-05-01

    The importance of information retrieval systems is unquestionable in the modern society and both individuals as well as enterprises recognise the benefits of being able to find information effectively. Current code-focused information retrieval systems such as Google Code Search, Codeplex or Koders produce results based on specific keywords. However, these systems do not take into account developers' context such as development language, technology framework, goal of the project, project complexity and developer's domain expertise. They also impose additional cognitive burden on users in switching between different interfaces and clicking through to find the relevant code. Hence, they are not used by software developers. In this paper, we discuss how software engineers interact with information and general-purpose information retrieval systems (e.g. Google, Yahoo!) and investigate to what extent domain-specific search and recommendation utilities can be developed in order to support their work-related activities. In order to investigate this, we conducted a user study and found that software engineers followed many identifiable and repeatable work tasks and behaviours. These behaviours can be used to develop implicit relevance feedback-based systems based on the observed retention actions. Moreover, we discuss the implications for the development of task-specific search and collaborative recommendation utilities embedded with the Google standard search engine and Microsoft IntelliSense for retrieval and re-engineering of code. Based on implicit relevance feedback, we have implemented a prototype of the proposed collaborative recommendation system, which was evaluated in a controlled environment simulating the real-world situation of professional software engineers. The evaluation has achieved promising initial results on the precision and recall performance of the system.

  12. Surfing for suicide methods and help: content analysis of websites retrieved with search engines in Austria and the United States.

    Science.gov (United States)

    Till, Benedikt; Niederkrotenthaler, Thomas

    2014-08-01

    The Internet provides a variety of resources for individuals searching for suicide-related information. Structured content-analytic approaches to assess intercultural differences in web contents retrieved with method-related and help-related searches are scarce. We used the 2 most popular search engines (Google and Yahoo/Bing) to retrieve US-American and Austrian search results for the term suicide, method-related search terms (e.g., suicide methods, how to kill yourself, painless suicide, how to hang yourself), and help-related terms (e.g., suicidal thoughts, suicide help) on February 11, 2013. In total, 396 websites retrieved with US search engines and 335 websites from Austrian searches were analyzed with content analysis on the basis of current media guidelines for suicide reporting. We assessed the quality of websites and compared findings across search terms and between the United States and Austria. In both countries, protective outweighed harmful website characteristics by approximately 2:1. Websites retrieved with method-related search terms (e.g., how to hang yourself) contained more harmful (United States: P search engines generally had more protective characteristics (P search engines. Resources with harmful characteristics were better ranked than those with protective characteristics (United States: P < .01, Austria: P < .05). The quality of suicide-related websites obtained depends on the search terms used. Preventive efforts to improve the ranking of preventive web content, particularly regarding method-related search terms, seem necessary. © Copyright 2014 Physicians Postgraduate Press, Inc.

  13. Dialysis search filters for PubMed, Ovid MEDLINE, and Embase databases.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Lee, Christopher W C; Wilczynski, Nancy L; McKibbon, Ann; Shariff, Salimah Z; Blake, Peter G; Lindsay, Robert M; Garg, Amit X

    2012-10-01

    Physicians frequently search bibliographic databases, such as MEDLINE via PubMed, for best evidence for patient care. The objective of this study was to develop and test search filters to help physicians efficiently retrieve literature related to dialysis (hemodialysis or peritoneal dialysis) from all other articles indexed in PubMed, Ovid MEDLINE, and Embase. A diagnostic test assessment framework was used to develop and test robust dialysis filters. The reference standard was a manual review of the full texts of 22,992 articles from 39 journals to determine whether each article contained dialysis information. Next, 1,623,728 unique search filters were developed, and their ability to retrieve relevant articles was evaluated. The high-performance dialysis filters consisted of up to 65 search terms in combination. These terms included the words "dialy" (truncated), "uremic," "catheters," and "renal transplant wait list." These filters reached peak sensitivities of 98.6% and specificities of 98.5%. The filters' performance remained robust in an independent validation subset of articles. These empirically derived and validated high-performance search filters should enable physicians to effectively retrieve dialysis information from PubMed, Ovid MEDLINE, and Embase.
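
    A minimal Python sketch of how sensitivity and specificity figures like those quoted above can be computed once every article in a reference standard has been labelled relevant or not. The function name and all counts below are hypothetical illustrations, not data from the study.

```python
def filter_performance(tp, fp, fn, tn):
    """Diagnostic-test style metrics for a search filter.

    tp: relevant articles retrieved by the filter
    fp: non-relevant articles retrieved by the filter
    fn: relevant articles the filter missed
    tn: non-relevant articles the filter correctly left out
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of relevant articles found
        "specificity": tn / (tn + fp),   # share of non-relevant articles excluded
        "precision": tp / (tp + fp),     # share of retrieved articles that are relevant
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = filter_performance(tp=90, fp=50, fn=10, tn=950)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Note that a filter can score high on both sensitivity and specificity and still have modest precision when relevant articles are rare in the database.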

  14. Beyond information retrieval: information discovery and multimedia information retrieval

    OpenAIRE

    Roberto Raieli

    2017-01-01

    The paper compares the current methodologies for search and discovery of information and information resources: terminological search and term-based language, characteristic of information retrieval (IR); semantic search and information discovery, being developed mainly through the language of linked data; semiotic search and content-based language, as explored in multimedia information retrieval (MIR). The MIR semiotic methodology is then detailed.

  15. Challenging Google, Microsoft Unveils a Search Tool for Scholarly Articles

    Science.gov (United States)

    Carlson, Scott

    2006-01-01

    Microsoft has introduced a new search tool to help people find scholarly articles online. The service, which includes journal articles from prominent academic societies and publishers, puts Microsoft in direct competition with Google Scholar. The new free search tool, which should work on most Web browsers, is called Windows Live Academic Search…

  16. Millennial Undergraduate Research Strategies in Web and Library Information Retrieval Systems

    Science.gov (United States)

    Porter, Brandi

    2011-01-01

    This article summarizes the author's dissertation regarding search strategies of millennial undergraduate students in Web and library online information retrieval systems. Millennials bring a unique set of search characteristics and strategies to their research since they have never known a world without the Web. Through the use of search engines,…

  17. PubMed Interact: an Interactive Search Application for MEDLINE/PubMed

    Science.gov (United States)

    Muin, Michael; Fontelo, Paul; Ackerman, Michael

    2006-01-01

    Online search and retrieval systems are important resources for medical literature research. Progressive Web 2.0 technologies provide opportunities to improve search strategies and user experience. Using PHP, Document Object Model (DOM) manipulation and Asynchronous JavaScript and XML (Ajax), PubMed Interact allows greater functionality so users can refine search parameters with ease and interact with the search results to retrieve and display relevant information and related articles. PMID:17238658

  18. Evidence-based medicine - searching the medical literature. Part 1.

    African Journals Online (AJOL)

    Ann Burgess

    password for access via HINARIa2 use that to log in. Then you can retrieve articles from the 6000 journals that will be available to you. You cannot retrieve the full text from journals that do not allow free access or HINARI access. How to search the literature on the internet. Before you start your search take a moment to think ...

  19. Novel citation-based search method for scientific literature: application to meta-analyses.

    Science.gov (United States)

    Janssens, A Cecile J W; Gwinn, M

    2015-10-13

    Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of co-citation with one or more "known" articles before reviewing their eligibility. In two independent studies, we aimed to reproduce the results of literature searches for sets of published meta-analyses (n = 10 and n = 42). For each meta-analysis, we extracted co-citations for the randomly selected 'known' articles from the Web of Science database, counted their frequencies and screened all articles with a score above a selection threshold. In the second study, we extended the method by retrieving direct citations for all selected articles. In the first study, we retrieved 82% of the studies included in the meta-analyses while screening only 11% as many articles as were screened for the original publications. Articles that we missed were published in non-English languages, published before 1975, published very recently, or available only as conference abstracts. In the second study, we retrieved 79% of included studies while screening half the original number of articles. Citation searching appears to be an efficient and reasonably accurate method for finding articles similar to one or more articles of interest for meta-analysis and reviews.
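
    The co-citation ranking idea described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is assumed to be a mapping from each citing paper to the set of references it cites, a candidate's score is assumed to be the number of citing papers that also cite at least one "known" article, and all identifiers and function names are hypothetical.

```python
from collections import Counter


def cocitation_scores(citing_to_refs, known):
    """Count, for each candidate article, how many citing papers
    cite it together with at least one known article."""
    known = set(known)
    scores = Counter()
    for refs in citing_to_refs.values():
        refs = set(refs)
        if refs & known:                 # this paper cites a known article
            for candidate in refs - known:
                scores[candidate] += 1   # candidate is co-cited with it
    return scores


def screen_candidates(scores, threshold):
    """Return candidates whose score is above the threshold, highest first."""
    return [article for article, score in scores.most_common() if score > threshold]


# Hypothetical citation data: citing paper -> references it cites.
citations = {
    "p1": {"K", "A", "B"},
    "p2": {"K", "A"},
    "p3": {"A", "C"},        # cites no known article, so it is ignored
    "p4": {"K2", "A", "B"},
}
scores = cocitation_scores(citations, known={"K", "K2"})
to_screen = screen_candidates(scores, threshold=1)
print(to_screen)
```

    Raising the threshold shortens the list to screen at the cost of possibly missing eligible studies, which is the trade-off the abstract reports.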

  20. A 54 year analysis of articles from Mpilo Central Hospital, Bulawayo ...

    African Journals Online (AJOL)

    PubMed and Google Scholar were searched to obtain articles originating from Mpilo Central Hospital, Bulawayo, Zimbabwe - 1958 to August 2011 (54 years). 168 articles cited 999 times were retrieved giving about 6 citations per article. Analysis of publication trends over time as well as publication avenues is made.

  1. G-Bean: an ontology-graph based web tool for biomedical literature retrieval.

    Science.gov (United States)

    Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S

    2014-01-01

    Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. 
    PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean

  2. The development of PubMed search strategies for patient preferences for treatment outcomes.

    Science.gov (United States)

    van Hoorn, Ralph; Kievit, Wietske; Booth, Andrew; Mozygemba, Kati; Lysdahl, Kristin Bakke; Refolo, Pietro; Sacchini, Dario; Gerhardus, Ansgar; van der Wilt, Gert Jan; Tummers, Marcia

    2016-07-29

    The importance of respecting patients' preferences when making treatment decisions is increasingly recognized. Efficiently retrieving papers from the scientific literature reporting on the presence and nature of such preferences can help to achieve this goal. The objective of this study was to create a search filter for PubMed to help retrieve evidence on patient preferences for treatment outcomes. A total of 27 journals were hand-searched for articles on patient preferences for treatment outcomes published in 2011. Selected articles served as a reference set. To develop optimal search strategies to retrieve this set, all articles in the reference set were randomly split into a development and a validation set. MeSH-terms and keywords retrieved using PubReMiner were tested individually and as combinations in PubMed and evaluated for retrieval performance (e.g. sensitivity (Se) and specificity (Sp)). Of 8238 articles, 22 were considered to report empirical evidence on patient preferences for specific treatment outcomes. The best search filters reached Se of 100 % [95 % CI 100-100] with Sp of 95 % [94-95 %] and Sp of 97 % [97-98 %] with 75 % Se [74-76 %]. In the validation set these queries reached values of Se of 90 % [89-91 %] with Sp 94 % [93-95 %] and Se of 80 % [79-81 %] with Sp of 97 % [96-96 %], respectively. Narrow and broad search queries were developed which can help in retrieving literature on patient preferences for treatment outcomes. Identifying such evidence may in turn enhance the incorporation of patient preferences in clinical decision making and health technology assessment.

  3. Development and use of a content search strategy for retrieving studies on patients' views and preferences.

    Science.gov (United States)

    Selva, Anna; Solà, Ivan; Zhang, Yuan; Pardo-Hernandez, Hector; Haynes, R Brian; Martínez García, Laura; Navarro, Tamara; Schünemann, Holger; Alonso-Coello, Pablo

    2017-08-30

    Identifying scientific literature addressing patients' views and preferences is complex due to the wide range of studies that can be informative and the poor indexing of this evidence. Given the lack of guidance, we developed a search strategy to retrieve this type of evidence. We assembled an initial list of terms from several sources, including a review of the terms and indexing of topic-related studies, the methods research literature, and other relevant projects and systematic reviews. We used the relative recall approach, evaluating the capacity of the designed search strategy to retrieve studies included in relevant systematic reviews on the topic. We applied the final version of the search strategy in practice when conducting systematic reviews and guidelines, and calculated the search's precision and the number of references needed to read (NNR). We assembled an initial version of the search strategy, which had a relative recall of 87.4% (132 out of 151 studies). We then added terms from the studies not initially identified and re-tested this improved version against the studies included in a new set of systematic reviews, reaching a relative recall of 85.8% (151 out of 176 studies, 95% CI 79.9 to 90.2). This final version of the strategy includes two sets of terms related to two domains: "Patient Preferences and Decision Making" and "Health State Utilities Values". When we used the search strategy for the development of systematic reviews and clinical guidelines, we obtained low precision values (ranging from 2% to 5%) and an NNR from 20 to 50. This search strategy fills an important research gap in this field. It will help systematic reviewers, clinical guideline developers, and policy-makers to retrieve published research on patients' views and preferences. In turn, this will facilitate the inclusion of this critical aspect when formulating health care decisions, including recommendations.
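
    The figures in this abstract can be checked with a few lines of Python. The sketch assumes the usual definitions: relative recall is the share of studies from the reference reviews that the strategy retrieves, and NNR is the reciprocal of precision. The function names are illustrative.

```python
def relative_recall(retrieved_included, total_included):
    """Share of the studies included in reference reviews that the search found."""
    return retrieved_included / total_included


def number_needed_to_read(precision):
    """References a reviewer must read to find one relevant study (1 / precision)."""
    return 1 / precision


# Counts and precision values as reported in the abstract.
initial = relative_recall(132, 151)
final = relative_recall(151, 176)
print(f"initial relative recall: {initial:.1%}")
print(f"final relative recall:   {final:.1%}")
print(f"NNR at 2% precision: {number_needed_to_read(0.02):.0f}")
print(f"NNR at 5% precision: {number_needed_to_read(0.05):.0f}")
```

    The precision range of 2% to 5% therefore corresponds to the reported NNR range of 50 down to 20.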

  4. Uncovering Web search strategies in South African higher education

    Directory of Open Access Journals (Sweden)

    Surika Civilcharran

    2016-11-01

    Full Text Available Background: In spite of the enormous amount of information available on the Web and the fact that search engines are continuously evolving to enhance the search experience, students are nevertheless faced with the difficulty of effectively retrieving information. It is, therefore, imperative for the interaction between students and search tools to be understood and search strategies to be identified, in order to promote successful information retrieval. Objectives: This study identifies the Web search strategies used by postgraduate students and forms part of a wider study into information retrieval strategies used by postgraduate students at the University of KwaZulu-Natal (UKZN), Pietermaritzburg campus, South Africa. Method: Largely underpinned by Thatcher’s cognitive search strategies, the mixed-methods approach was utilised for this study, in which questionnaires were employed in Phase 1 and structured interviews in Phase 2. This article reports and reflects on the findings of Phase 2, which focus on identifying the Web search strategies employed by postgraduate students. The Phase 1 results were reported in Civilcharran, Hughes and Maharaj (2015). Results: Findings reveal the Web search strategies used for academic information retrieval. In spite of easy access to the invisible Web and the advent of meta-search engines, the use of Web search engines still remains the preferred search tool. The UKZN online library databases and especially the UKZN online library, Online Public Access Catalogue system, are being underutilised. Conclusion: Being ranked in the top three percent of the world’s universities, UKZN is investing in search tools that are not being used to their full potential. This evidence suggests an urgent need for students to be trained in Web searching and to have a greater exposure to a variety of search tools. This article is intended to further contribute to the design of undergraduate training programmes in order to deal

  5. Web-Scale Discovery Services Retrieve Relevant Results in Health Sciences Topics Including MEDLINE Content

    Directory of Open Access Journals (Sweden)

    Elizabeth Margaret Stovold

    2017-06-01

    Full Text Available A Review of: Hanneke, R., & O’Brien, K. K. (2016). Comparison of three web-scale discovery services for health sciences research. Journal of the Medical Library Association, 104(2), 109-117. http://dx.doi.org/10.3163/1536-5050.104.2.004 Abstract Objective – To compare the results of health sciences search queries in three web-scale discovery (WSD) services for relevance, duplicate detection, and retrieval of MEDLINE content. Design – Comparative evaluation and bibliometric study. Setting – Six university libraries in the United States of America. Subjects – Three commercial WSD services: Primo, Summon, and EBSCO Discovery Service (EDS). Methods – The authors collected data at six universities, including their own. They tested each of the three WSDs at two data collection sites. However, since one of the sites was using a legacy version of Summon that was due to be upgraded, data collected for Summon at this site were considered obsolete and excluded from the analysis. The authors generated three questions for each of six major health disciplines, then designed simple keyword searches to mimic typical student search behaviours. They captured the first 20 results from each query run at each test site, to represent the first “page” of results, giving a total of 2,086 search results. These were independently assessed for relevance to the topic. Authors resolved disagreements by discussion, and calculated a kappa inter-observer score. They retained duplicate records within the results so that the duplicate detection by the WSDs could be compared. They assessed MEDLINE coverage by the WSDs in several ways. Using precise strategies to generate a relevant set of articles, they conducted one search from each of the six disciplines in PubMed so that they could compare retrieval of MEDLINE content. These results were cross-checked against the first 20 results from the corresponding query in the WSDs. To aid investigation of overall

  6. Global polar geospatial information service retrieval based on search engine and ontology reasoning

    Science.gov (United States)

    Chen, Nengcheng; E, Dongcheng; Di, Liping; Gong, Jianya; Chen, Zeqiang

    2007-01-01

    In order to improve the access precision of polar geospatial information services on the web, a new methodology for retrieving global spatial information services based on geospatial service search and ontology reasoning is proposed. The geospatial service search is implemented to find coarse services on the web, and the ontology reasoning is designed to find refined services among the coarse services. The proposed framework includes standardized distributed geospatial web services, a geospatial service search engine, an extended UDDI registry, and a multi-protocol geospatial information service client. Key technologies addressed include service discovery based on a search engine, and service ontology modeling and reasoning in the Antarctic geospatial context. Finally, an Antarctica multi-protocol OWS portal prototype based on the proposed methodology is introduced.

  7. Information Retrieval in Biomedical Research: From Articles to Datasets

    Science.gov (United States)

    Wei, Wei

    2017-01-01

    Information retrieval techniques have been applied to biomedical research for a variety of purposes, such as textual document retrieval and molecular data retrieval. As biomedical research evolves over time, information retrieval is also constantly facing new challenges, including the growing number of available data, the emerging new data types,…

  8. Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

    Science.gov (United States)

    Porter, Brandi

    2009-01-01

    Millennial students make up a large portion of undergraduate students attending colleges and universities, and they have a variety of online resources available to them to complete academically related information searches, primarily Web based and library-based online information retrieval systems. The content, ease of use, and required search…

  9. The development of PubMed search strategies for patient preferences for treatment outcomes

    Directory of Open Access Journals (Sweden)

    Ralph van Hoorn

    2016-07-01

    Full Text Available Abstract Background: The importance of respecting patients’ preferences when making treatment decisions is increasingly recognized. Efficiently retrieving papers from the scientific literature reporting on the presence and nature of such preferences can help to achieve this goal. The objective of this study was to create a search filter for PubMed to help retrieve evidence on patient preferences for treatment outcomes. Methods: A total of 27 journals were hand-searched for articles on patient preferences for treatment outcomes published in 2011. Selected articles served as a reference set. To develop optimal search strategies to retrieve this set, all articles in the reference set were randomly split into a development and a validation set. MeSH-terms and keywords retrieved using PubReMiner were tested individually and as combinations in PubMed and evaluated for retrieval performance (e.g. sensitivity (Se) and specificity (Sp)). Results: Of 8238 articles, 22 were considered to report empirical evidence on patient preferences for specific treatment outcomes. The best search filters reached Se of 100 % [95 % CI 100-100] with Sp of 95 % [94–95 %] and Sp of 97 % [97–98 %] with 75 % Se [74–76 %]. In the validation set these queries reached values of Se of 90 % [89–91 %] with Sp 94 % [93–95 %] and Se of 80 % [79–81 %] with Sp of 97 % [96–96 %], respectively. Conclusions: Narrow and broad search queries were developed which can help in retrieving literature on patient preferences for treatment outcomes. Identifying such evidence may in turn enhance the incorporation of patient preferences in clinical decision making and health technology assessment.

  10. Developing optimal search strategies for detecting clinically sound prognostic studies in MEDLINE: an analytic survey

    Directory of Open Access Journals (Sweden)

    Haynes R Brian

    2004-06-01

    Full Text Available Abstract Background: Clinical end users of MEDLINE have a difficult time retrieving articles that are both scientifically sound and directly relevant to clinical practice. Search filters have been developed to assist end users in increasing the success of their searches. Many filters have been developed for the literature on therapy and reviews but little has been done in the area of prognosis. The objective of this study is to determine how well various methodologic textwords, Medical Subject Headings, and their Boolean combinations retrieve methodologically sound literature on the prognosis of health disorders in MEDLINE. Methods: An analytic survey was conducted, comparing hand searches of journals with retrievals from MEDLINE for candidate search terms and combinations. Six research assistants read all issues of 161 journals for the publishing year 2000. All articles were rated using purpose and quality indicators and categorized into clinically relevant original studies, review articles, general papers, or case reports. The original and review articles were then categorized as 'pass' or 'fail' for methodologic rigor in the areas of prognosis and other clinical topics. Candidate search strategies were developed for prognosis and run in MEDLINE – the retrievals being compared with the hand search data. The sensitivity, specificity, precision, and accuracy of the search strategies were calculated. Results: 12% of studies classified as prognosis met basic criteria for scientific merit for testing clinical applications. Combinations of terms reached peak sensitivities of 90%. Compared with the best single term, multiple terms increased sensitivity for sound studies by 25.2% (absolute increase), and increased specificity, but by a much smaller amount (1.1%) when sensitivity was maximized. Combining terms to optimize both sensitivity and specificity achieved sensitivities and specificities of approximately 83% for each. Conclusion: Empirically derived

  11. PubMed vs. HighWire Press: a head-to-head comparison of two medical literature search engines.

    Science.gov (United States)

    Vanhecke, Thomas E; Barnes, Michael A; Zimmerman, Janet; Shoichet, Sandor

    2007-09-01

    PubMed and HighWire Press are both useful medical literature search engines available for free to anyone on the internet. We measured retrieval accuracy, number of results generated, retrieval speed, features and search tools on HighWire Press and PubMed using the quick search features of each. We found that using HighWire Press resulted in a higher likelihood of retrieving the desired article and higher number of search results than the same search on PubMed. PubMed was faster than HighWire Press in delivering search results regardless of search settings. There are considerable differences in search features between these two search engines.

  12. The Measurement of Relevance Amount of Documents That By Using of Google cross-language retrieval About Agriculture Subject Area are Retrieved

    Directory of Open Access Journals (Sweden)

    Fatemeh Jamshidi Ghahfarokhi

    2014-02-01

    Full Text Available This study investigates the relevance of documents retrieved with Google's cross-language retrieval tools in the agriculture subject area. For this purpose, Persian phrases and subject terms, together with their English equivalents, were extracted from Persian journal articles that had English abstracts. Thirty phrases and subject terms from the agriculture area were extracted in three classes: first, subject phrases that are used only in agriculture; second, agricultural subject terms that are also used in other fields; third, agricultural subject terms that are regarded as general terms outside the field. Documents were then searched using these phrases and terms, and the relevance of the search results was assessed. The results showed that Google's cross-language retrieval tools did not succeed in retrieving relevant agricultural documents for two classes of phrases and terms: agricultural subject terms that are also used in other fields, and agricultural subject terms that are regarded as general terms outside agriculture. For subject phrases and terms used only in agriculture, the tools performed rather better than for the other two classes.

  13. iPixel: a visual content-based and semantic search engine for retrieving digitized mammograms by using collective intelligence.

    Science.gov (United States)

    Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A

    2012-09-01

    Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.

  14. PubMed search strategies for the identification of etiologic associations between hypothalamic-pituitary disorders and other medical conditions.

    Science.gov (United States)

    Guaraldi, Federica; Grottoli, Silvia; Arvat, Emanuela; Mattioli, Stefano; Ghigo, Ezio; Gori, Davide

    2013-12-01

    Biomedical literature has grown enormously in recent decades and has become broadly available through online databases. Ad hoc search methods, created on the basis of research field and goals, are required to enhance the quality of searching. The aim of this study was to formulate efficient, evidence-based PubMed search strategies to retrieve articles assessing etiologic associations between a condition of interest and hypothalamic-pituitary disorders (HPD). Based on expert knowledge, 17 MeSH (Medical Subject Headings) terms and 79 free terms related to HPD were identified to search PubMed. Using random samples of abstracts retrieved by each term, we estimated the proportion of articles containing pertinent information and formulated two strings (one more specific, one more sensitive) for the detection of articles focusing on the etiology of HPD. These strings were then applied to retrieve articles identifying possible etiologic associations between HPD and three diseases (malaria, LHON and celiac disease) considered not associated with HPD, and to determine the number of abstracts needed to read (NNR) to find one potentially pertinent article. We propose two strings: one sensitive string derived from the combination of articles providing the largest literature coverage in the field and one specific string including combined terms retrieving ≥40% of potentially pertinent articles. NNR were 2.1 and 1.6 for malaria, 3.36 and 2.29 for celiac disease, 2.8 and 2.2 for LHON, respectively. For the first time, two reliable, readily applicable strings are proposed for the retrieval of medical literature assessing putative etiologic associations between HPD and other medical conditions of interest.
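
    The number needed to read is conventionally the reciprocal of precision, that is, the number of abstracts retrieved divided by the number that turn out to be pertinent. The following is a minimal Python sketch of that arithmetic; the function name and the counts are illustrative assumptions and are not taken from the study.

```python
def number_needed_to_read(pertinent: int, retrieved: int) -> float:
    """Return NNR = retrieved / pertinent, which equals 1 / precision."""
    if retrieved <= 0:
        raise ValueError("retrieved must be positive")
    if pertinent <= 0:
        raise ValueError("NNR is undefined when no pertinent abstract was found")
    if pertinent > retrieved:
        raise ValueError("pertinent cannot exceed retrieved")
    return retrieved / pertinent


# Illustrative counts only (hypothetical): 25 pertinent abstracts among 40 retrieved.
nnr = number_needed_to_read(pertinent=25, retrieved=40)
print(f"NNR = {nnr:.1f}")  # prints "NNR = 1.6"
```

    A lower NNR means fewer abstracts have to be screened per pertinent article, which is why a more specific string would be expected to yield a smaller NNR than a more sensitive one.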

  15. Improve Biomedical Information Retrieval using Modified Learning to Rank Methods.

    Science.gov (United States)

    Xu, Bo; Lin, Hongfei; Lin, Yuan; Ma, Yunlong; Yang, Liang; Wang, Jian; Yang, Zhihao

    2016-06-14

    In recent years, the number of biomedical articles has increased exponentially, making it difficult for biologists to capture all the needed information manually. Information retrieval technologies, as the core of search engines, can deal with the problem automatically, providing users with the needed information. However, it is a great challenge to apply these technologies directly to biomedical retrieval because of the abundance of domain-specific terminology. To enhance biomedical retrieval, we propose a novel framework based on learning to rank. Learning to rank is a family of state-of-the-art information retrieval techniques and has proved effective in many information retrieval tasks. In the proposed framework, we attempt to tackle the problem of abundant terminology by constructing ranking models that focus not only on retrieving the most relevant documents, but also on diversifying the search results to increase the completeness of the resulting list for a given query. For model training, we propose two novel document labeling strategies and combine several traditional retrieval models as learning features. We also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on TREC Genomics datasets demonstrate the effectiveness of our framework for biomedical information retrieval.

  16. Search strings for the study of putative occupational determinants of disease

    Science.gov (United States)

    Mattioli, Stefano; Zanardi, Francesca; Baldasseroni, Alberto; Schaafsma, Frederieke; Cooke, Robin MT; Mancini, Gianpiero; Fierro, Mauro; Santangelo, Chiara; Farioli, Andrea; Fucksia, Serenella; Curti, Stefania; Verbeek, Jos

    2010-01-01

    Objective To identify efficient PubMed search strategies to retrieve articles regarding putative occupational determinants of conditions not generally considered to be work related. Methods Based on MeSH definitions and expert knowledge, we selected as candidate search terms the four MeSH terms describing ‘occupational disease’, ‘occupational exposure’, ‘occupational health’ and ‘occupational medicine’ (DEHM) alongside 22 other promising terms. We first explored overlaps between the candidate terms in PubMed. Using random samples of abstracts retrieved by each term, we estimated the proportions of articles containing potentially pertinent information regarding occupational aetiology in order to formulate two search strategies (one more ‘specific’, one more ‘sensitive’). We applied these strategies to retrieve information on the possible occupational aetiology of meningioma, pancreatitis and atrial fibrillation. Results Only 20.3% of abstracts were retrieved by more than one DEHM term. The more ‘specific’ search string was based on the combination of terms that yielded the highest proportion (40%) of potentially pertinent abstracts. The more ‘sensitive’ string was based on the use of broader search fields and additional coverage provided by other search terms under study. Using the specific string, the numbers of abstracts needed to read to find one potentially pertinent article were 1.2 for meningioma, 1.9 for pancreatitis and 1.8 for atrial fibrillation. Using the sensitive strategy, the numbers needed to read were 4.4 for meningioma, 8.9 for pancreatitis and 10.5 for atrial fibrillation. Conclusions The proposed strings could help health care professionals explore putative occupational aetiology for diseases that are not generally thought to be work related. PMID:19819858

  17. Multimedia information retrieval theory and techniques

    CERN Document Server

    Raieli, Roberto

    2013-01-01

    Novel processing and searching tools for the management of new multimedia documents have been developed. Multimedia Information Retrieval (MMIR) is an organic system made up of Text Retrieval (TR); Visual Retrieval (VR); Video Retrieval (VDR); and Audio Retrieval (AR) systems. So that each type of digital document may be analysed and searched by the elements of language appropriate to its nature, search criteria must be extended. Such an approach is known as Content Based Information Retrieval (CBIR), and is the core of MMIR. This novel content-based concept of information handling needs to be integrated with more traditional semantics. Multimedia Information Retrieval focuses on the tools of processing and searching applicable to the content-based management of new multimedia documents. Translated from Italian by Giles Smith, the book is divided into two parts. Part one discusses MMIR and related theories, and puts forward new methodologies; part two reviews various experimental and operating MMIR systems, a...

  18. Design and implementation of Metta, a metasearch engine for biomedical literature retrieval intended for systematic reviewers.

    Science.gov (United States)

    Smalheiser, Neil R; Lin, Can; Jia, Lifeng; Jiang, Yu; Cohen, Aaron M; Yu, Clement; Davis, John M; Adams, Clive E; McDonagh, Marian S; Meng, Weiyi

    2014-01-01

    Individuals and groups who write systematic reviews and meta-analyses in evidence-based medicine regularly carry out literature searches across multiple search engines linked to different bibliographic databases, and thus have an urgent need for a suitable metasearch engine to save time spent on repeated searches and to remove duplicate publications from initial consideration. Unlike general users who generally carry out searches to find a few highly relevant (or highly recent) articles, systematic reviewers seek to obtain a comprehensive set of articles on a given topic, satisfying specific criteria. This creates special requirements and challenges for metasearch engine design and implementation. We created a federated search tool that is connected to five databases: PubMed, EMBASE, CINAHL, PsycINFO, and the Cochrane Central Register of Controlled Trials. Retrieved bibliographic records were shown online; optionally, results could be de-duplicated and exported in both BibTex and XML format. The query interface was extensively modified in response to feedback from users within our team. Besides a general search track and one focused on human-related articles, we also added search tracks optimized to identify case reports and systematic reviews. Although users could modify preset search options, they were rarely if ever altered in practice. Up to several thousand retrieved records could be exported within a few minutes. De-duplication of records returned from multiple databases was carried out in a prioritized fashion that favored retaining citations returned from PubMed. Systematic reviewers are used to formulating complex queries using strategies and search tags that are specific for individual databases. Metta offers a different approach that may save substantial time but which requires modification of current search strategies and better indexing of randomized controlled trial articles. We envision Metta as one piece of a multi-tool pipeline that will assist

  19. Searching PubMed for a broad subject area: how effective are palliative care clinicians in finding the evidence in their field?

    Science.gov (United States)

    Damarell, Raechel A; Tieman, Jennifer J

    2016-03-01

    Health professionals must be able to search competently for evidence to support practice. We sought to understand how palliative care clinicians construct searches for palliative care literature in PubMed, to quantify search efficacy in retrieving a set of relevant articles and to compare performance against a Palliative CareSearch Filter (PCSF). Included studies from palliative care systematic reviews formed a test set. Palliative care clinicians (n = 37) completed a search task using PubMed. Individual clinician searches were reconstructed in PubMed and combined with the test set to calculate retrieval sensitivity. PCSF performance in the test set was also determined. Many clinicians struggled to create useful searches. Twelve used a single search term, 17 narrowed the search inappropriately and 8 confused Boolean operators. The mean number of test set citations (n = 663) retrieved was 166 (SD = 188), or 25% although 76% of clinicians believed they would find more than 50% of the articles. Only 8 participants (22%) achieved this. Correlations between retrieval and PubMed confidence (r = 0.13) or frequency of use (r = -0.18) were weak. Many palliative care clinicians search PubMed ineffectively. Targeted skills training and PCSF promotion may improve evidence retrieval. © 2015 Health Libraries Group.

  20. New Search Strategies Successfully Optimize Retrieval of Clinically Sound Treatment Studies in EMBASE. A review of: Wong, Sharon S‐L, Nancy L. Wilczynski, and R. Brian Haynes. “Developing Optimal Search Strategies for Detecting Clinically Sound Treatment Studies in EMBASE.” Journal of the Medical Library Association 94.1 (Jan. 2006): 41‐47. 14 May 2007 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1324770.

    Directory of Open Access Journals (Sweden)

    John Loy

    2007-06-01

    Full Text Available Objective – To develop and test the sensitivity and specificity, precision and accuracy of search strategies to retrieve clinically sound treatment studies in the EMBASE database. Design – Analytical study. Setting – Methodologically sound studies of treatment from 55 journals indexed in EMBASE for the year 2000. Subjects – EMBASE and hand searches performed at the Health Information Research Unit of McMaster University, Ontario, Canada. Methods – The authors compare the results of EMBASE searches using their search strategies with the “gold standard” of articles retrieved by hand search. Research assistants initially hand searched each issue of 55 selected journals published in 2000 to identify articles detailing studies on healthcare treatment of humans. Subject coverage of the journals was wide ranging and included obstetrics and gynaecology, psychiatry, oncology, neurology, surgery and general practice. Studies were then assessed to ensure they met the qualifying criteria: random allocation of participants to groups, outcome assessment of at least 80% of participants who began the study, and analysis consistent with study design. Initially, 3850 articles on treatment were identified, of which 1256 (32.6%) were methodologically sound. To construct a comprehensive set of search terms, input was sought from librarians and researchers in the US and Canada. This initially produced a list of 5385 terms, of which 4843 were unique and 3524 produced hits. Individual search terms with sensitivity greater than 25% and specificity greater than 75% were incorporated into search strategies for use within the OVID interface for the EMBASE database to retrieve articles meeting the same criteria. These strategies were developed using all 27,769 articles published in the 55 journals in 2000. This all inclusive approach was used to test the search strategies’ ability to identify high quality treatment articles from a larger pool of material

  1. Search strings for the study of putative occupational determinants of disease

    NARCIS (Netherlands)

    Mattioli, Stefano; Zanardi, Francesca; Baldasseroni, Alberto; Schaafsma, Frederieke; Cooke, Robin M. T.; Mancini, Gianpiero; Fierro, Mauro; Santangelo, Chiara; Farioli, Andrea; Fucksia, Serenella; Curti, Stefania; Violante, Francesco S.; Verbeek, Jos

    2010-01-01

    To identify efficient PubMed search strategies to retrieve articles regarding putative occupational determinants of conditions not generally considered to be work related. Based on MeSH definitions and expert knowledge, we selected as candidate search terms the four MeSH terms describing

  2. A Two-Level Cache for Distributed Information Retrieval in Search Engines

    Directory of Open Access Journals (Sweden)

    Weizhe Zhang

    2013-01-01

    Full Text Available To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users’ logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache.

  3. A two-level cache for distributed information retrieval in search engines.

    Science.gov (United States)

    Zhang, Weizhe; He, Hui; Ye, Jianwei

    2013-01-01

    To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache.
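
    The abstract does not give implementation details, so the following Python sketch only illustrates the general idea of a two-level query cache. It assumes a static tier filled once with the most frequent queries from a log and a dynamic tier with least-recently-used eviction; the class name, parameters and backend function are hypothetical and this is not the authors' implementation.

```python
from collections import Counter, OrderedDict
from typing import Callable, Iterable


class TwoLevelCache:
    """Static tier for the most popular logged queries, LRU tier for the rest."""

    def __init__(self, query_log: Iterable[str], static_size: int,
                 dynamic_size: int, backend: Callable[[str], str]) -> None:
        self._backend = backend
        self._dynamic_size = dynamic_size
        # Static tier: results for the most frequent logged queries, never evicted.
        popular = [q for q, _ in Counter(query_log).most_common(static_size)]
        self._static = {q: backend(q) for q in popular}
        # Dynamic tier: insertion-ordered map used as a least-recently-used cache.
        self._dynamic = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, query: str) -> str:
        if query in self._static:
            self.hits += 1
            return self._static[query]
        if query in self._dynamic:
            self.hits += 1
            self._dynamic.move_to_end(query)  # mark as most recently used
            return self._dynamic[query]
        self.misses += 1
        result = self._backend(query)
        self._dynamic[query] = result
        if len(self._dynamic) > self._dynamic_size:
            self._dynamic.popitem(last=False)  # evict the least recently used entry
        return result


# Hypothetical query log and a stand-in backend that just upper-cases the query.
log = ["flu", "flu", "flu", "asthma", "asthma", "gout"]
cache = TwoLevelCache(log, static_size=1, dynamic_size=1, backend=lambda q: q.upper())
for q in ["flu", "gout", "gout", "asthma", "gout"]:
    cache.get(q)
print(cache.hits, cache.misses)  # prints "2 3"
```

    In this toy run the popular query is always served from the static tier, while the two rarer queries compete for the single dynamic slot, which is where a larger dynamic tier or a different distribution strategy would change the hit rate.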

  4. Search strings for the study of putative occupational determinants of disease

    NARCIS (Netherlands)

    Mattioli, S.; Zanardi, F.; Baldasseroni, A.; Schaafsma, F.; Cooke, R.M.T.; Mancini, G.; Fierro, M.; Santangelo, C.; Farioli, A.; Fucksia, S.; Curti, S.; Violante, F.S.; Verbeek, J.

    2010-01-01

    Objective To identify efficient PubMed search strategies to retrieve articles regarding putative occupational determinants of conditions not generally considered to be work related. Methods Based on MeSH definitions and expert knowledge, we selected as candidate search terms the four MeSH terms

  5. Optimal search strategies for detecting cost and economic studies in EMBASE

    Directory of Open Access Journals (Sweden)

    Haynes R Brian

    2006-06-01

    Full Text Available Abstract Background Economic evaluations in the medical literature compare competing diagnosis or treatment methods for their use of resources and their expected outcomes. The best evidence currently available from research regarding both cost and economic comparisons will continue to expand as this type of information becomes more important in today's clinical practice. Researchers and clinicians need quick, reliable ways to access this information. A key source of this type of information is large bibliographic databases such as EMBASE. The objective of this study was to develop search strategies that optimize the retrieval of health costs and economics studies from EMBASE. Methods We conducted an analytic survey, comparing hand searches of journals with retrievals from EMBASE for candidate search terms and combinations. Six research assistants read all issues of 55 journals indexed by EMBASE for the publishing year 2000. We rated all articles using purpose and quality indicators and categorized them into clinically relevant original studies, review articles, general papers, or case reports. The original and review articles were then categorized for purpose (i.e., cost and economics and other clinical topics) and, depending on the purpose, as 'pass' or 'fail' for methodologic rigor. Candidate search strategies were developed for economic and cost studies, then run in the 55 EMBASE journals, the retrievals being compared with the hand search data. The sensitivity, specificity, precision, and accuracy of the search strategies were calculated. Results Combinations of search terms for detecting both cost and economic studies attained levels of 100% sensitivity with specificity levels of 92.9% and 92.3% respectively. When maximizing for both sensitivity and specificity, the combination of terms for detecting cost studies increased sensitivity by 2.2% over the single term, with a slight decrease in specificity of 0.9%. The maximized combination of terms
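
    The four measures named in the abstract follow from a two-by-two comparison of the database search against the hand-search reference standard. The Python sketch below uses the standard definitions; the class and function names and the counts are illustrative assumptions, not data from the study.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    true_pos: int   # retrieved by the search and in the hand-search standard
    false_pos: int  # retrieved by the search but not in the standard
    false_neg: int  # in the standard but missed by the search
    true_neg: int   # neither retrieved nor in the standard


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def search_metrics(c: ConfusionCounts) -> Dict[str, float]:
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    return {
        "sensitivity": _ratio(c.true_pos, c.true_pos + c.false_neg),
        "specificity": _ratio(c.true_neg, c.true_neg + c.false_pos),
        "precision": _ratio(c.true_pos, c.true_pos + c.false_pos),
        "accuracy": _ratio(c.true_pos + c.true_neg, total),
    }


# Illustrative counts only (hypothetical): 1000 articles, 100 of them in the standard.
metrics = search_metrics(ConfusionCounts(true_pos=90, false_pos=30, false_neg=10, true_neg=870))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With these made-up counts the strategy finds 90 of the 100 target articles (sensitivity 0.900) while a quarter of what it retrieves is off target (precision 0.750), which shows how high sensitivity and high specificity can coexist with modest precision when target articles are rare.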

  6. Searching PubMed for molecular epidemiology studies: the case of chromosome aberrations

    DEFF Research Database (Denmark)

    Ugolini, Donatella; Neri, Monica; Knudsen, Lisbeth E

    2006-01-01

    to environmental pollutants. The search, done on the PubMed/MedLine database, was based on a strategy combining descriptors listed in the PubMed Medical Subject Headings (MeSH) Thesaurus and other available tools (free text or phrase search tools). 178 articles were retrieved by searching the period from January 1...

  7. Invariant spatial context is learned but not retrieved in gaze-contingent tunnel-view search.

    Science.gov (United States)

    Zang, Xuelian; Jia, Lina; Müller, Hermann J; Shi, Zhuanghua

    2015-05-01

    Our visual brain is remarkable in extracting invariant properties from the noisy environment, guiding selection of where to look and what to identify. However, how the brain achieves this is still poorly understood. Here we explore interactions of local context and global structure in the long-term learning and retrieval of invariant display properties. Participants searched for a target among distractors, without knowing that some "old" configurations were presented repeatedly (randomly inserted among "new" configurations). We simulated tunnel vision, limiting the visible region around fixation. Robust facilitation of performance for old versus new contexts was observed when the visible region was large but not when it was small. However, once the display was made fully visible during the subsequent transfer phase, facilitation did become manifest. Furthermore, when participants were given a brief preview of the total display layout prior to tunnel view search with 2 items visible, facilitation was already obtained during the learning phase. The eye movement results revealed contextual facilitation to be coupled with changes of saccadic planning, characterized by slightly extended gaze durations but a reduced number of fixations and shortened scan paths for old displays. Taken together, our findings show that invariant spatial display properties can be acquired based on scarce, para-/foveal information, while their effective retrieval for search guidance requires the availability (even if brief) of a certain extent of peripheral information. (c) 2015 APA, all rights reserved.

  8. Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

    Science.gov (United States)

    Alghoson, Abdullah

    2017-01-01

    Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…

  9. Explicit awareness supports conditional visual search in the retrieval guidance paradigm.

    Science.gov (United States)

    Buttaccio, Daniel R; Lange, Nicholas D; Hahn, Sowon; Thomas, Rick P

    2014-01-01

    In four experiments we explored whether participants would be able to use probabilistic prompts to simplify perceptually demanding visual search in a task we call the retrieval guidance paradigm. On each trial a memory prompt appeared prior to (and during) the search task and the diagnosticity of the prompt(s) was manipulated to provide complete, partial, or non-diagnostic information regarding the target's color on each trial (Experiments 1-3). In Experiment 1 we found that more diagnostic prompts were associated with faster visual search performance. However, similar visual search behavior was observed in Experiment 2 when the diagnosticity of the prompts was eliminated, suggesting that participants in Experiment 1 were merely relying on base rate information to guide search and were not utilizing the prompts. In Experiment 3 participants were informed of the relationship between the prompts and the color of the target and this was associated with faster search performance relative to Experiment 1, suggesting that the participants were using the prompts to guide search. Additionally, in Experiment 3 a knowledge test was implemented and performance in this task was associated with qualitative differences in search behavior such that participants who were able to name the color(s) most associated with the prompts were faster to find the target than participants who were unable to do so. However, in Experiments 1-3 diagnosticity of the memory prompt was manipulated via base rate information, making it possible that participants were merely relying on base rate information to inform search in Experiment 3. In Experiment 4 we manipulated diagnosticity of the prompts without manipulating base rate information and found a similar pattern of results as Experiment 3. Together, the results emphasize the importance of base rate and diagnosticity information in visual search behavior. In the General discussion section we explore how a recent computational model of

  10. Searching the scientific literature: implications for quantitative and qualitative reviews.

    Science.gov (United States)

    Wu, Yelena P; Aylward, Brandon S; Roberts, Michael C; Evans, Spencer C

    2012-08-01

    Literature reviews are an essential step in the research process and are included in all empirical and review articles. Electronic databases are commonly used to gather this literature. However, several factors can affect the extent to which relevant articles are retrieved, influencing future research and conclusions drawn. The current project examined articles obtained by comparable search strategies in two electronic archives using an exemplar search to illustrate factors that authors should consider when designing their own search strategies. Specifically, literature searches were conducted in PsycINFO and PubMed targeting review articles on two exemplar disorders (bipolar disorder and attention deficit/hyperactivity disorder) and issues of classification and/or differential diagnosis. Articles were coded for relevance and characteristics of article content. The two search engines yielded significantly different proportions of relevant articles overall and by disorder. Keywords differed across search engines for the relevant articles identified. Based on these results, it is recommended that when gathering literature for review papers, multiple search engines should be used, and search syntax and strategies be tailored to the unique capabilities of particular engines. For meta-analyses and systematic reviews, authors may consider reporting the extent to which different archives or sources yielded relevant articles for their particular review. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. Searching PubMed during a pandemic.

    Directory of Open Access Journals (Sweden)

    Ole Norgaard

    Full Text Available BACKGROUND: The 2009 influenza A(H1N1) pandemic has generated thousands of articles and news items. However, finding relevant scientific articles in such rapidly developing health crises is a major challenge which, in turn, can affect decision-makers' ability to utilise up-to-date findings and ultimately shape public health interventions. This study set out to show the impact that the inconsistent naming of the pandemic can have on retrieving relevant scientific articles in PubMed/MEDLINE. METHODOLOGY: We first formulated a PubMed search algorithm covering different names of the influenza pandemic and simulated the results that it would have retrieved from weekly searches for relevant new records during the first 10 weeks of the pandemic. To assess the impact of failing to include every term in this search, we then conducted the same searches but omitted in turn "h1n1," "swine," "influenza" and "flu" from the search string, and compared the results to those for the full string. PRINCIPAL FINDINGS: On average, our core search string identified 44.3 potentially relevant new records at the end of each week. Of these, we determined that an average of 27.8 records were relevant. When we excluded one term from the string, the percentage of records missed out of the total number of relevant records averaged 18.7% for omitting "h1n1," 13.6% for "swine," 17.5% for "influenza," and 20.6% for "flu." CONCLUSIONS: Due to inconsistent naming, while searching for scientific material about rapidly evolving situations such as the influenza A(H1N1) pandemic, there is a risk that one will miss relevant articles. To address this problem, the international scientific community should agree on nomenclature and the specific name to be used earlier, and the National Library of Medicine in the US could index potentially relevant materials faster and allow publishers to add alert tags to such materials.
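
    The leave-one-term-out comparison described in the methodology can be sketched in Python as follows. The sketch does not query PubMed; it assumes each term's matches are already available as a set of record IDs and that the full string behaves like a union of those sets, which is a simplification of the study's actual search algorithm. The function name and all record IDs are made up for illustration.

```python
from typing import Dict, Set


def percent_missed(term_hits: Dict[str, Set[int]], relevant: Set[int]) -> Dict[str, float]:
    """For each term, the percentage of relevant records lost when it is omitted."""
    # Relevant records found by the full string (union of all per-term hits).
    full = set().union(*term_hits.values()) & relevant
    if not full:
        raise ValueError("the full string retrieves no relevant records")
    missed = {}
    for omitted in term_hits:
        reduced: Set[int] = set()
        for term, hits in term_hits.items():
            if term != omitted:
                reduced |= hits
        missed[omitted] = 100.0 * len(full - reduced) / len(full)
    return missed


# Hypothetical record IDs matched by each term in one weekly search.
term_hits = {
    "h1n1": {1, 2, 3},
    "swine": {3, 4},
    "influenza": {4, 5, 6},
    "flu": {6, 7},
}
relevant = {1, 2, 3, 4, 5, 6, 7}

for term, pct in percent_missed(term_hits, relevant).items():
    print(f'omitting "{term}": {pct:.1f}% of relevant records missed')
```

    In this toy data, dropping a term costs nothing when every record it matches is also matched by another term, and costs the most when it is the only name used in several records, which is the effect of inconsistent naming that the study quantifies.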

  12. Automated Medical Literature Retrieval

    Directory of Open Access Journals (Sweden)

    David Hawking

    2012-09-01

    Full Text Available Background The constantly growing publication rate of medical research articles puts increasing pressure on medical specialists who need to be aware of the recent developments in their field. The currently used literature retrieval systems allow researchers to find specific papers; however, the search task is still repetitive and time-consuming. Aims In this paper we describe a system that retrieves medical publications by automatically generating queries based on data from an electronic patient record. This allows the doctor to focus on medical issues and provide an improved service to the patient, with higher confidence that it is underpinned by current research. Method Our research prototype automatically generates query terms based on the patient record and adds weight factors for each term. Currently the patient’s age is taken into account with a fuzzy logic derived weight, and terms describing blood-related anomalies are derived from recent blood test results. Conditionally selected homonyms are used for query expansion. The query retrieves matching records from a local index of PubMed publications and displays results in descending relevance for the given patient. Recent publications are clearly highlighted for instant recognition by the researcher. Results Nine medical specialists from the Royal Adelaide Hospital evaluated the system and submitted pre-trial and post-trial questionnaires. Throughout the study we received positive feedback: doctors felt the support provided by the prototype was useful and said they would like to use it in their daily routine. Conclusion By supporting the time-consuming task of query formulation and iterative modification as well as by presenting the search results in order of relevance for the specific patient, literature retrieval becomes part of the daily workflow of busy professionals.

  13. Analysis of queries sent to PubMed at the point of care: Observation of search behaviour in a medical teaching hospital

    Science.gov (United States)

    Hoogendam, Arjen; Stalenhoef, Anton FH; Robbé, Pieter F de Vries; Overbeke, A John PM

    2008-01-01

    Background The use of PubMed to answer daily medical care questions is limited because it is challenging to retrieve a small set of relevant articles and time is restricted. Knowing what aspects of queries are likely to retrieve relevant articles can increase the effectiveness of PubMed searches. The objectives of our study were to identify queries that are likely to retrieve relevant articles by relating PubMed search techniques and tools to the number of articles retrieved and the selection of articles for further reading. Methods This was a prospective observational study of queries regarding patient-related problems sent to PubMed by residents and internists in internal medicine working in an Academic Medical Centre. We analyzed queries, search results, query tools (MeSH, Limits, wildcards, operators), and the selection of abstracts and full text for further reading, using a portal that mimics PubMed. Results PubMed was used to solve 1121 patient-related problems, resulting in 3205 distinct queries. Abstracts were viewed in 999 (31%) of these queries, and in 126 (39%) of 321 queries using query tools. The average term count per query was 2.5. Abstracts were selected in more than 40% of queries using four or five terms, increasing to 63% if the use of four or five terms yielded 2–161 articles. Conclusion Queries sent to PubMed by physicians at our hospital during daily medical care contain fewer than three terms. Queries using four to five terms, retrieving fewer than 161 article titles, are most likely to result in abstract viewing. PubMed search tools are used infrequently by our population and are less effective than the use of four or five terms. Methods to facilitate the formulation of precise queries, using more relevant terms, should be the focus of education and research. PMID:18816391

  14. Sensitivity and predictive value of 15 PubMed search strategies to answer clinical questions rated against full systematic reviews.

    Science.gov (United States)

    Agoritsas, Thomas; Merglen, Arnaud; Courvoisier, Delphine S; Combescure, Christophe; Garin, Nicolas; Perrier, Arnaud; Perneger, Thomas V

    2012-06-12

    Clinicians perform searches in PubMed daily, but retrieving relevant studies is challenging due to the rapid expansion of medical knowledge. Little is known about the performance of search strategies when they are applied to answer specific clinical questions. To compare the performance of 15 PubMed search strategies in retrieving relevant clinical trials on therapeutic interventions. We used Cochrane systematic reviews to identify relevant trials for 30 clinical questions. Search terms were extracted from the abstract using a predefined procedure based on the population, interventions, comparison, outcomes (PICO) framework and combined into queries. We tested 15 search strategies that varied in their query (PIC or PICO), use of PubMed's Clinical Queries therapeutic filters (broad or narrow), search limits, and PubMed links to related articles. We assessed sensitivity (recall) and positive predictive value (precision) of each strategy on the first 2 PubMed pages (40 articles) and on the complete search output. The performance of the search strategies varied widely according to the clinical question. Unfiltered searches and those using the broad filter of Clinical Queries produced large outputs and retrieved few relevant articles within the first 2 pages, resulting in a median sensitivity of only 10%-25%. In contrast, all searches using the narrow filter performed significantly better, with a median sensitivity of about 50% (all P PubMed pages. These results can help clinicians apply effective strategies to answer their questions at the point of care.
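
    Evaluating a strategy both on the first two result pages and on the complete output amounts to computing sensitivity and positive predictive value with and without a rank cutoff. The Python sketch below illustrates this under the assumption that the ranked output contains no duplicate IDs; the function name and the example identifiers are hypothetical, and the cutoff of 40 in the study is replaced by a smaller value to keep the example short.

```python
import math
from typing import Optional, Sequence, Set, Tuple


def sensitivity_and_ppv(ranked_ids: Sequence[str], relevant_ids: Set[str],
                        cutoff: Optional[int] = None) -> Tuple[float, float]:
    """Sensitivity (recall) and positive predictive value (precision) of a ranked output.

    With cutoff=None the complete output is evaluated; otherwise only the first
    `cutoff` results are considered. Assumes ranked_ids has no duplicates.
    """
    considered = ranked_ids if cutoff is None else ranked_ids[:cutoff]
    found = set(considered) & relevant_ids
    sensitivity = len(found) / len(relevant_ids) if relevant_ids else math.nan
    ppv = len(found) / len(considered) if considered else math.nan
    return sensitivity, ppv


# Hypothetical search output and reference set of relevant trials.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "d", "f", "z"}

top4 = sensitivity_and_ppv(ranked, relevant, cutoff=4)
complete = sensitivity_and_ppv(ranked, relevant)
print(top4)      # prints "(0.5, 0.5)"
print(complete)  # prints "(0.75, 0.5)"
```

    The relevant record "z" is never retrieved, so sensitivity stays below 1 even on the complete output, while the cutoff lowers sensitivity further by hiding the relevant record ranked sixth.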

  15. The EBI search engine: EBI search as a service—making biological data accessible for all

    Science.gov (United States)

    Park, Young M.; Squizzato, Silvano; Buso, Nicola; Gur, Tamer

    2017-01-01

    Abstract We present an update of the EBI Search engine, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. The interconnectivity that exists between data resources at EMBL–EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types that include nucleotide and protein sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, as well as the life science literature. EBI Search provides a powerful RESTful API that enables its integration into third-party portals, thus providing ‘Search as a Service’ capabilities, which are the main topic of this article. PMID:28472374

  16. Analysis of queries sent to PubMed at the point of care: observation of search behaviour in a medical teaching hospital.

    NARCIS (Netherlands)

    Hoogendam, A.; Stalenhoef, A.F.H.; Vries Robbe, P.F. de; Overbeke, A.J.P.M.

    2008-01-01

    BACKGROUND: The use of PubMed to answer daily medical care questions is limited because it is challenging to retrieve a small set of relevant articles and time is restricted. Knowing what aspects of queries are likely to retrieve relevant articles can increase the effectiveness of PubMed searches.

  17. An exploratory analysis of PubMed's free full-text limit on citation retrieval for clinical questions.

    Science.gov (United States)

    Krieger, Mary M; Richter, Randy R; Austin, Tricia M

    2008-10-01

    The research sought to determine (1) how use of the PubMed free full-text (FFT) limit affects citation retrieval and (2) how use of the FFT limit impacts the types of articles and levels of evidence retrieved. Four clinical questions based on a research agenda for physical therapy were searched in PubMed both with and without the use of the FFT limit. Retrieved citations were examined for relevancy to each question. Abstracts of relevant citations were reviewed to determine the types of articles and levels of evidence. Descriptive analysis was used to compare the total number of citations, number of relevant citations, types of articles, and levels of evidence both with and without the use of the FFT limit. Across all 4 questions, the FFT limit reduced the number of citations to 11.1% of the total number of citations retrieved without the FFT limit. Additionally, high-quality evidence such as systematic reviews and randomized controlled trials were missed when the FFT limit was used. Health sciences librarians play a key role in educating users about the potential impact the FFT limit has on the number of citations, types of articles, and levels of evidence retrieved.

  18. INIS: Manual for online retrieval from the INIS Database on the Internet

    International Nuclear Information System (INIS)

    2000-01-01

    This manual demonstrates the different Search Forms available to retrieve relevant records using the INIS Database online retrieval system. Information on how to search, how to store, refine and retrieve searches, and how to update a literature search is given

  19. INIS: Manual for online retrieval from the INIS Database on the Internet

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-10-01

    This manual demonstrates the different Search Forms available to retrieve relevant records using the INIS Database online retrieval system. Information on how to search, how to store, refine and retrieve searches, and how to update a literature search is given.

  20. Developing a Test Collection for the Evaluation of Integrated Search

    DEFF Research Database (Denmark)

    Lykke, Marianne; Larsen, Birger; Lund, Haakon

    2010-01-01

    The poster discusses the characteristics needed in an information retrieval (IR) test collection to facilitate the evaluation of integrated search, i.e. search across a range of different sources but with one search box and one ranked result list, and describes and analyses a new test collection constructed for this purpose. The test collection consists of approx. 18,000 monographic records, 160,000 papers and journal articles in PDF and 275,000 abstracts with a varied set of metadata and vocabularies from the physics domain, 65 topics based on real work tasks and corresponding graded relevance assessments. The test collection may be used for systems- as well as user-oriented evaluation.

  1. Brazilian academic search filter: application to the scientific literature on physical activity.

    Science.gov (United States)

    Sanz-Valero, Javier; Ferreira, Marcos Santos; Castiel, Luis David; Wanden-Berghe, Carmina; Guilam, Maria Cristina Rodrigues

    2010-10-01

    To develop a search filter in order to retrieve scientific publications on physical activity from Brazilian academic institutions. The academic search filter consisted of the descriptor "exercise" associated through the term AND, to the names of the respective academic institutions, which were connected by the term OR. The MEDLINE search was performed with PubMed on 11/16/2008. The institutions were selected according to the classification from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for interuniversity agreements. A total of 407 references were retrieved, corresponding to about 0.9% of all articles about physical activity and 0.5% of the Brazilian academic publications indexed in MEDLINE on the search date. When compared with the manual search undertaken, the search filter (descriptor + institutional filter) showed a sensitivity of 99% and a specificity of 100%. The institutional search filter showed high sensitivity and specificity, and is applicable to other areas of knowledge in health sciences. It is desirable that every Brazilian academic institution establish its "standard name/brand" in order to efficiently retrieve their scientific literature.
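
    The structure of the filter described above (one descriptor joined by AND to a group of institution names joined by OR) and the two reported measures can be sketched as follows. This is only an illustration: the institution names and the confusion-matrix counts are hypothetical, the counts were chosen merely to reproduce values of 99% and 100%, and a real PubMed query would probably also need field tags, which the abstract does not specify.

```python
def build_institution_filter(descriptor, institutions):
    """Join one descriptor with OR-connected institution names using AND."""
    names = " OR ".join(f'"{name}"' for name in institutions)
    return f'"{descriptor}" AND ({names})'


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical institution names and counts, for illustration only.
query = build_institution_filter("exercise", ["University A", "University B"])
sens, spec = sensitivity_specificity(tp=99, fp=0, fn=1, tn=50)
```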

  2. DORS: DDC Online Retrieval System.

    Science.gov (United States)

    Liu, Songqiao; Svenonius, Elaine

    1991-01-01

    Describes the Dewey Online Retrieval System (DORS), which was developed at the University of California, Los Angeles (UCLA), to experiment with classification-based search strategies in online catalogs. Classification structures in automated information retrieval are discussed; and specifications for a classification retrieval interface are…

  3. PubMed search strings for the study of agricultural workers' diseases.

    Science.gov (United States)

    Mattioli, Stefano; Gori, Davide; Di Gregori, Valentina; Ricotta, Lara; Baldasseroni, Alberto; Farioli, Andrea; Zanardi, Francesca; Galletti, Stefania; Colosio, Claudio; Curti, Stefania; Violante, Francesco S

    2013-12-01

    Several optimized search strategies have been developed in Medicine, and more recently in Occupational Medicine. The aim of this study was to identify efficient PubMed search strategies to retrieve articles regarding putative occupational determinants of agricultural workers' diseases. We selected the Medical Subject Headings (MeSH) term agricultural workers' diseases and six MeSH terms describing farm work (agriculture, agrochemicals NOT pesticides, animal husbandry, pesticides, rural health, rural population) alongside 61 other promising terms. We estimated proportions of articles containing potentially pertinent information regarding occupational etiology to formulate two search strategies (one "more specific," one "more sensitive"). We applied these strategies to retrieve information on the possible occupational etiology among agricultural workers of kidney cancer, knee osteoarthritis, and multiple sclerosis. We evaluated the number needed to read (NNR), i.e. the number of abstracts that must be read to identify one potentially pertinent article, in the context of these pathologies. The "more specific" search string was based on the combination of terms that yielded the highest proportion (40%) of potentially pertinent abstracts. The "more sensitive" string was based on use of broader search fields and additional coverage provided by other search terms under study. Using the "more specific" string, the NNR to find one potentially pertinent article was 1.1 for kidney cancer, 1.4 for knee osteoarthritis, and 1.2 for multiple sclerosis. Using the sensitive strategy, the NNR was 1.4, 3.6, and 6.3, respectively. The proposed strings could help health care professionals explore putative occupational etiology for agricultural workers' diseases (even if not generally thought to be work related). © 2013 Wiley Periodicals, Inc.
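
    The number needed to read is the reciprocal of precision: the number of abstracts retrieved divided by the number that turn out to be pertinent. A minimal sketch follows; the counts are hypothetical and were picked only so that the result matches one of the NNR values quoted above.

```python
def number_needed_to_read(retrieved, pertinent):
    """NNR = retrieved / pertinent, i.e. 1 / precision."""
    if pertinent == 0:
        # No pertinent abstract was found, so no finite NNR exists.
        return float("inf")
    return retrieved / pertinent


# Hypothetical counts: 36 abstracts read, 10 of them pertinent.
nnr = number_needed_to_read(retrieved=36, pertinent=10)
```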

  4. Testing search strategies for systematic reviews in the Medline literature database through PubMed.

    Science.gov (United States)

    Volpato, Enilze S N; Betini, Marluci; El Dib, Regina

    2014-04-01

    A high-quality electronic search is essential in ensuring accuracy and completeness in retrieved records for the conducting of a systematic review. We analysed the available sample of search strategies to identify the best method for searching in Medline through PubMed, considering the use or not of parentheses, double quotation marks, truncation and use of a simple search or search history. In our cross-sectional study of search strategies, we selected and analysed the available searches performed during evidence-based medicine classes and in systematic reviews conducted in the Botucatu Medical School, UNESP, Brazil. We analysed 120 search strategies. With regard to the use of phrase searches with parentheses, there was no difference between the results with and without parentheses and simple searches or search history tools in 100% of the sample analysed (P = 1.0). The number of results retrieved by the searches analysed was smaller using double quotation marks and using truncation compared with the standard strategy (P = 0.04 and P = 0.08, respectively). There is no need to use phrase-searching parentheses to retrieve studies; however, we recommend the use of double quotation marks when an investigator attempts to retrieve articles in which a term appears to be exactly the same as what was proposed in the search form. Furthermore, we do not recommend the use of truncation in search strategies in Medline via PubMed. Although the results of simple searches or search history tools were the same, we recommend using the latter.

  5. The EBI search engine: EBI search as a service-making biological data accessible for all.

    Science.gov (United States)

    Park, Young M; Squizzato, Silvano; Buso, Nicola; Gur, Tamer; Lopez, Rodrigo

    2017-07-03

    We present an update of the EBI Search engine, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types that include nucleotide and protein sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, as well as the life science literature. EBI Search provides a powerful RESTful API that enables its integration into third-party portals, thus providing 'Search as a Service' capabilities, which are the main topic of this article. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Content Based Medical Image Retrieval for Histopathological, CT and MRI Images

    Directory of Open Access Journals (Sweden)

    Swarnambiga AYYACHAMY

    2013-09-01

    A content based approach is followed for medical images. The purpose of this study is to assess the stability of these methods for medical image retrieval. The methods used in color based retrieval for histopathological images are color co-occurrence matrix (CCM) and histogram with meta features. For texture based retrieval, GLCM (gray level co-occurrence matrix) and local binary pattern (LBP) were used. For shape based retrieval, Canny edge detection and Otsu's method with multivariable threshold were used. Texture and shape based retrieval were implemented using MRI (magnetic resonance images). The most remarkable characteristic of the article is its content based approach for each medical imaging modality. Our efforts were focused on the initial visual search. From our experiment, histogram with meta features in color based retrieval for histopathological images shows a precision of 60% and recall of 30%, whereas GLCM in texture based retrieval for MRI images shows a precision of 70% and recall of 20%. Shape based retrieval for MRI images shows a precision of 50% and recall of 25%. The retrieval results show that this simple approach is successful.

  7. Web information retrieval based on ontology

    Science.gov (United States)

    Zhang, Jian

    2013-03-01

    The purpose of Information Retrieval (IR) is to find a set of documents that are relevant to a user's specific information need. The traditional Information Retrieval model commonly used in commercial search engines is based on keyword indexing and Boolean logic queries. One big drawback of traditional information retrieval is that it typically retrieves information without an explicitly defined domain of interest to the user, so that a lot of irrelevant information is returned, which burdens the user with picking useful answers out of these irrelevant results. To tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of the Semantic Web is that it enhances search mechanisms through the use of ontologies. In this paper, we present our approach to personalizing a web search engine based on ontology. In addition, key techniques are also discussed. Compared to previous research, our work concentrates on semantic similarity and on the whole process, including query submission and information annotation.

  8. How to improve your PubMed/MEDLINE searches: 2. display settings, complex search queries and topic searching.

    Science.gov (United States)

    Fatehi, Farhad; Gray, Leonard C; Wootton, Richard

    2014-01-01

    The way that PubMed results are displayed can be changed using the Display Settings drop-down menu in the result screen. There are three groups of options: Format, Items per page and Sort by, which allow a good deal of control. The results from several searches can be temporarily stored on the Clipboard. Records of interest can be selected on the results page using check boxes and can then be combined, for example to form a reference list. The Related Citations is a valuable feature of PubMed that can provide a set of similar articles when you have identified a record of interest among the results. You can easily search for RCTs or reviews using the appropriate filters or field tags. If you are interested in clinical articles, rather than basic science or health service research, then the Clinical Queries tool on the PubMed home page can be used to retrieve them.

  9. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion?

    Science.gov (United States)

    Lawrence, D W

    2008-12-01

    To assess what is lost if only one literature database is searched for articles relevant to injury prevention and safety promotion (IPSP) topics. Serial textword (keyword, free-text) searches using multiple synonym terms for five key IPSP topics (bicycle-related brain injuries, ethanol-impaired driving, house fires, road rage, and suicidal behaviors among adolescents) were conducted in four of the bibliographic databases that are most used by IPSP professionals: EMBASE, MEDLINE, PsycINFO, and Web of Science. Through a systematic procedure, an inventory of articles on each topic in each database was conducted to identify the total unduplicated count of all articles on each topic, the number of articles unique to each database, and the articles available if only one database is searched. No single database included all of the relevant articles on any topic, and the database with the broadest coverage differed by topic. A search of only one literature database will return 16.7-81.5% (median 43.4%) of the available articles on any of five key IPSP topics. Each database contributed unique articles to the total bibliography for each topic. A literature search performed in only one database will, on average, lead to a loss of more than half of the available literature on a topic.
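
    The inventory described above (total unduplicated count, articles unique to each database, and the share available from a single database) maps naturally onto set operations. The sketch below is an assumption about how such a tally could be computed, not the author's procedure; the article identifiers are invented.

```python
def coverage_report(db_to_articles):
    """Per database: share of the unduplicated total and count of unique articles."""
    all_articles = set().union(*db_to_articles.values())
    report = {}
    for name, articles in db_to_articles.items():
        articles = set(articles)
        # Everything held by the other databases, used to find unique articles.
        others = set().union(
            *(a for n, a in db_to_articles.items() if n != name)
        )
        report[name] = {
            "coverage": len(articles) / len(all_articles),
            "unique": len(articles - others),
        }
    return report


# Hypothetical article identifiers for one topic.
holdings = {
    "EMBASE": {1, 2, 3, 4},
    "MEDLINE": {3, 4, 5},
    "PsycINFO": {5, 6},
    "Web of Science": {1, 6, 7, 8},
}
report = coverage_report(holdings)
```

    In this toy example no database covers more than half of the eight unduplicated articles, which mirrors the kind of loss the abstract reports for single-database searches.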

  10. Mobile medical image retrieval

    Science.gov (United States)

    Duc, Samuel; Depeursinge, Adrien; Eggel, Ivan; Müller, Henning

    2011-03-01

    Images are an integral part of medical practice for diagnosis, treatment planning and teaching. Image retrieval has gained in importance mainly as a research domain over the past 20 years. Both textual and visual retrieval of images are essential. In the process of mobile devices becoming reliable and having a functionality equaling that of formerly desktop clients, mobile computing has gained ground and many applications have been explored. This creates a new field of mobile information search & access and in this context images can play an important role as they often allow understanding complex scenarios much quicker and easier than free text. Mobile information retrieval in general has skyrocketed over the past year with many new applications and tools being developed and all sorts of interfaces being adapted to mobile clients. This article describes constraints of an information retrieval system including visual and textual information retrieval from the medical literature of BioMedCentral and of the RSNA journals Radiology and Radiographics. Solutions for mobile data access with an example on an iPhone in a web-based environment are presented as iPhones are frequently used and the operating system is bound to become the most frequent smartphone operating system in 2011. A web-based scenario was chosen to allow for a use by other smart phone platforms such as Android as well. Constraints of small screens and navigation with touch screens are taken into account in the development of the application. A hybrid choice had to be taken to allow for taking pictures with the cell phone camera and upload them for visual similarity search as most producers of smart phones block this functionality to web applications. Mobile information access and in particular access to images can be surprisingly efficient and effective on smaller screens. Images can be read on screen much faster and relevance of documents can be identified quickly through the use of images contained in

  11. Interactive information seeking, behaviour and retrieval

    CERN Document Server

    Ruthven, Ian

    2011-01-01

    Information retrieval (IR) is a complex human activity supported by sophisticated systems. This book covers the whole spectrum of information retrieval, including: history and background information; behaviour and seeking task-based information; searching and retrieval approaches to investigating information; and, evaluation interfaces for IR.

  12. Introduction to information retrieval

    CERN Document Server

    Manning, Christopher D; Schütze, Hinrich

    2008-01-01

    Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced un

  13. Efficient data retrieval method for similar plasma waveforms in EAST

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Ying, E-mail: liuying-ipp@szu.edu.cn [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Huang, Jianjun; Zhou, Huasheng; Wang, Fan [SZU-CASIPP Joint Laboratory for Applied Plasma, Shenzhen University, Shenzhen 518060 (China); Wang, Feng [Institute of Plasma Physics Chinese Academy of Sciences, Hefei 230031 (China)

    2016-11-15

    Highlights: • The proposed method is carried out by means of bounding envelope and angle distance. • It allows whole similar waveforms of any time length to be retrieved. • In addition, the proposed method can also retrieve subsequences. - Abstract: Fusion research relies heavily on data analysis due to its massive database. In the present work, we propose an efficient method for searching and retrieving similar plasma waveforms in the Experimental Advanced Superconducting Tokamak (EAST). Based on Piecewise Linear Aggregate Approximation (PLAA) for extracting feature values, the searching process is accomplished in two steps. The first one is coarse searching to narrow down the search space, which is carried out by means of bounding envelope. The second step is fine searching to retrieve similar waveforms, which is implemented by the angle distance. The proposed method is tested on EAST databases and turns out to perform well in retrieving similar waveforms.
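
    The two-step idea (a cheap coarse filter followed by a finer ranking) can be sketched as below. This is a simplified stand-in, not the authors' algorithm: the per-segment mean and least-squares slope are an assumed substitute for the PLAA features, the envelope is assumed to be a fixed tolerance band around the query's segment means, and the angle distance is assumed to be the mean absolute difference between segment slope angles. All names and data are hypothetical, and waveforms are assumed to have equal length.

```python
import math


def segment_features(wave, n_segments):
    """Return (mean, least-squares slope) for each equal-length segment."""
    size = len(wave) // n_segments
    feats = []
    for s in range(n_segments):
        seg = wave[s * size:(s + 1) * size]
        t_mean = (size - 1) / 2
        y_mean = sum(seg) / size
        num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
        den = sum((t - t_mean) ** 2 for t in range(size))
        feats.append((y_mean, num / den if den else 0.0))
    return feats


def inside_envelope(query_feats, cand_feats, tol):
    """Coarse step: every candidate segment mean lies within +/- tol of the query."""
    return all(abs(q[0] - c[0]) <= tol for q, c in zip(query_feats, cand_feats))


def angle_distance(query_feats, cand_feats):
    """Fine step: mean absolute difference between segment slope angles."""
    diffs = [abs(math.atan(q[1]) - math.atan(c[1]))
             for q, c in zip(query_feats, cand_feats)]
    return sum(diffs) / len(diffs)


def search(query, database, n_segments=4, tol=0.5, top=3):
    """Filter candidates by envelope, then rank survivors by angle distance."""
    qf = segment_features(query, n_segments)
    scored = []
    for name, wave in database.items():
        cf = segment_features(wave, n_segments)
        if inside_envelope(qf, cf, tol):
            scored.append((angle_distance(qf, cf), name))
    return [name for _, name in sorted(scored)[:top]]


# Hypothetical waveforms of equal length.
query = [0, 1, 2, 3, 4, 5, 6, 7]
database = {
    "shot_a": [0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1],
    "shot_b": [7, 6, 5, 4, 3, 2, 1, 0],
    "shot_c": [0, 1, 2, 3, 4, 5, 6, 6.5],
}
matches = search(query, database)
```

    Here the reversed waveform is rejected by the coarse envelope check before any angle distance is computed, which is the purpose of narrowing the search space first.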

  14. Statistical Language Models and Information Retrieval: Natural Language Processing Really Meets Retrieval

    NARCIS (Netherlands)

    Hiemstra, Djoerd; de Jong, Franciska M.G.

    2001-01-01

    Traditionally, natural language processing techniques for information retrieval have always been studied outside the framework of formal models of information retrieval. In this article, we introduce a new formal model of information retrieval based on the application of statistical language models.

  15. Summary report of project SIREN (Search, Intercept, Retrieve, Expulsion, Nuclear)

    International Nuclear Information System (INIS)

    Buden, D.

    1992-12-01

    Project SIREN (Search, Intercept, Retrieve, Expulsion, Nuclear) has evaluated the technologies and operational strategies needed to rendezvous with and capture aerospace radioactive materials (e.g., a distressed or spent space reactor core) before such materials can reenter the terrestrial atmosphere and to move these captured materials to a space destination for proper disposal. The use of systems external to a satellite allows multiple attempts to prevent the nuclear materials from reentering the atmosphere. SIREN also has investigated means to prevent the breakup of nuclear-powered systems already in space. The SIREN project has determined that external means can be used reliably to prevent nuclear materials from reentering the terrestrial environment, prepared a computer model that can be used to evaluate the means to dispose of radioactive materials, assessed the hazards from existing nuclear power systems in space, and in discussions with Russian Federation representatives determined interest in joint activities in this area

  16. Delve: A Data Set Retrieval and Document Analysis System

    KAUST Repository

    Akujuobi, Uchenna Thankgod; Zhang, Xiangliang

    2017-01-01

    Academic search engines (e.g., Google Scholar or Microsoft Academic) provide a medium for retrieving various information on scholarly documents. However, most of these popular scholarly search engines overlook the area of data set retrieval, which

  17. Delve: A Data Set Retrieval and Document Analysis System

    KAUST Repository

    Akujuobi, Uchenna Thankgod

    2017-12-29

    Academic search engines (e.g., Google Scholar or Microsoft Academic) provide a medium for retrieving various information on scholarly documents. However, most of these popular scholarly search engines overlook the area of data set retrieval, which should provide information on relevant data sets used for academic research. Due to the increasing volume of publications, it has become a challenging task to locate suitable data sets on a particular research area for benchmarking or evaluations. We propose Delve, a web-based system for data set retrieval and document analysis. This system is different from other scholarly search engines as it provides a medium for both data set retrieval and real time visual exploration and analysis of data sets and documents.

  18. Utah Text Retrieval Project

    Energy Technology Data Exchange (ETDEWEB)

    Hollaar, L A

    1983-10-01

    The Utah Text Retrieval project seeks well-engineered solutions to the implementation of large, inexpensive, rapid text information retrieval systems. The project has three major components. Perhaps the best known is the work on the specialized processors, particularly search engines, necessary to achieve the desired performance and cost. The other two concern the user interface to the system and the system's internal structure. The work on user interface development is not only concentrating on the syntax and semantics of the query language, but also on the overall environment the system presents to the user. Environmental enhancements include convenient ways to browse through retrieved documents, access to other information retrieval systems through gateways supporting a common command interface, and interfaces to word processing systems. The system's internal structure is based on a high-level data communications protocol linking the user interface, index processor, search processor, and other system modules. This allows them to be easily distributed in a multi- or specialized-processor configuration. It also allows new modules, such as a knowledge-based query reformulator, to be added. 15 references.

  19. Graph-Based Interactive Bibliographic Information Retrieval Systems

    Science.gov (United States)

    Zhu, Yongjun

    2017-01-01

    In the big data era, we have witnessed the explosion of scholarly literature. This explosion has imposed challenges to the retrieval of bibliographic information. Retrieval of intended bibliographic information has become challenging due to the overwhelming search results returned by bibliographic information retrieval systems for given input…

  20. Introducing PALETTE: an iterative method for conducting a literature search for a review in palliative care.

    Science.gov (United States)

    Zwakman, Marieke; Verberne, Lisa M; Kars, Marijke C; Hooft, Lotty; van Delden, Johannes J M; Spijker, René

    2018-06-02

    In the rapidly developing specialty of palliative care, literature reviews have become increasingly important to inform and improve the field. When applying widely used methods for literature reviews developed for intervention studies onto palliative care, challenges are encountered such as the heterogeneity of palliative care in practice (wide range of domains in patient characteristics, stages of illness and stakeholders), the explorative character of review questions, and the poorly defined keywords and concepts. To overcome the challenges and to provide guidance for researchers to conduct a literature search for a review in palliative care, Palliative cAre Literature rEview iTeraTive mEthod (PALETTE), a pragmatic framework, was developed. We assessed PALETTE with a detailed description. PALETTE consists of four phases: developing the review question, building the search strategy, validating the search strategy and performing the search. The framework incorporates different information retrieval techniques: contacting experts, pearl growing, citation tracking and Boolean searching in a transparent way to maximize the retrieval of literature relevant to the topic of interest. The different components and techniques are repeated until no new articles are qualified for inclusion. The phases within PALETTE are interconnected by a recurrent process of validation on 'golden bullets' (articles that undoubtedly should be part of the review), citation tracking and concept terminology reflecting the review question. To give insight in the value of PALETTE, we compared PALETTE with the recommended search method for reviews of intervention studies. By using PALETTE on two palliative care literature reviews, we were able to improve our review questions and search strategies. Moreover, in comparison with the recommended search for intervention reviews, the number of articles needed to be screened was decreased whereas more relevant articles were retrieved. Overall, PALETTE

  1. Improving biomedical information retrieval by linear combinations of different query expansion techniques.

    Science.gov (United States)

    Abdulla, Ahmed AbdoAziz Ahmed; Lin, Hongfei; Xu, Bo; Banbhrani, Santosh Kumar

    2016-07-25

    Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs scour unstructured materials such as text documents in large reserves of data that are usually stored on computers. IR is related to the representation, storage, and organization of information items, as well as to access. One of the main problems in IR is to determine which documents are relevant to the user's needs and which are not. At present, users cannot construct queries precisely enough to retrieve particular pieces of data from large reserves of data, and basic information retrieval systems produce low-quality search results. In this paper we present a new technique to refine Information Retrieval searches so that they better represent the user's information need. To enhance retrieval performance, we use different query expansion techniques and apply linear combinations of them, combining two expansion results at a time. Query expansion extends the search query, for example, by finding synonyms and reweighting original terms, and provides significantly more focused, particularized search results than basic search queries. Retrieval performance is measured by some variants of MAP (Mean Average Precision). According to our experimental results, the combination of the best query expansion results improves the retrieved documents and outperforms our baseline by 21.06%; it also outperforms a previous study by 7.12%. We propose several query expansion techniques and their linear combinations to make user queries more cognizable to search engines and to produce higher-quality search results.
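
    A linear combination of two expansion results and the average precision underlying MAP can be sketched as follows. The abstract does not give the combination weights or the exact MAP variants, so the weight of 0.5, the document identifiers and the scores are assumptions made for illustration; MAP is then the mean of the average precision over all queries.

```python
def combine_linear(scores_a, scores_b, weight=0.5):
    """Rank documents by weight * score_a + (1 - weight) * score_b."""
    docs = set(scores_a) | set(scores_b)
    combined = {
        d: weight * scores_a.get(d, 0.0) + (1 - weight) * scores_b.get(d, 0.0)
        for d in docs
    }
    # Highest combined score first; ties broken by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


def average_precision(ranking, relevant):
    """Mean of precision values at the rank of each relevant document."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical scores from two query expansion runs for one query.
scores_a = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
scores_b = {"d2": 0.8, "d3": 0.7, "d4": 0.2}
ranking = combine_linear(scores_a, scores_b, weight=0.5)
ap = average_precision(ranking, relevant={"d2", "d3"})
```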

  2. Contextual Bandits for Information Retrieval

    NARCIS (Netherlands)

    Hofmann, K.; Whiteson, S.; de Rijke, M.

    2011-01-01

    In this paper we give an overview of and outlook on research at the intersection of information retrieval (IR) and contextual bandit problems. A critical problem in information retrieval is online learning to rank, where a search engine strives to improve the quality of the ranked result lists it

  3. Vocabulary Control for Information Retrieval.

    Science.gov (United States)

    Lancaster, F. W.

    This book deals with properties of vocabularies for indexing and searching document collections; the construction, organization, display, and maintenance of these vocabularies; and the vocabulary as a factor affecting the performance of retrieval systems. Most of the text is concerned with vocabularies for post-coordinate retrieval systems, with…

  4. Multimodal medical information retrieval with unsupervised rank fusion.

    Science.gov (United States)

    Mourão, André; Martins, Flávio; Magalhães, João

    2015-01-01

    Modern medical information retrieval systems are paramount to manage the insurmountable quantities of clinical data. These systems empower health care experts in the diagnosis of patients and play an important role in the clinical decision process. However, the ever-growing heterogeneous information generated in medical environments poses several challenges for retrieval systems. We propose a medical information retrieval system with support for multimodal medical case-based retrieval. The system supports medical information discovery by providing multimodal search, through a novel data fusion algorithm, and term suggestions from a medical thesaurus. Our search system compared favorably to other systems in 2013 ImageCLEFMedical. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Iterative Filtering of Retrieved Information to Increase Relevance

    Directory of Open Access Journals (Sweden)

    Robert Zeidman

    2007-12-01

    Efforts have been underway for years to find more effective ways to retrieve information from large knowledge domains. This effort is now being driven particularly by the Internet and the vast amount of information that is available to unsophisticated users. In the early days of the Internet, some effort involved allowing users to enter Boolean equations of search terms into search engines, for example, rather than just a list of keywords. More recently, effort has focused on understanding a user's desires from past search histories in order to narrow searches. Also there has been much effort to improve the ranking of results based on some measure of relevancy. This paper discusses using iterative filtering of retrieved information to focus in on useful information. This work was done for finding source code correlation and the author extends his findings to Internet searching and e-commerce. The paper presents specific information about a particular filtering application and then generalizes it to other forms of information retrieval.

  6. Fusion and diversification in information retrieval

    NARCIS (Netherlands)

    Liang, S.

    2014-01-01

    Data fusion and search result diversification are two critical research topics in information retrieval. Data fusion approaches combine search result lists in order to produce a new and hopefully better ranking. We propose two data fusion models for microblog search that exploit temporal information

  7. Diversity, intent, and aggregated search

    NARCIS (Netherlands)

    de Rijke, M.

    2014-01-01

    Diversity, intent and aggregated search are three core retrieval concepts that receive significant attention. In search result diversification one typically considers the relevance of a document in light of other retrieved documents. The goal is to identify the probable "aspects" of an ambiguous

  8. Automated information retrieval system for radioactivation analysis

    International Nuclear Information System (INIS)

    Lambrev, V.G.; Bochkov, P.E.; Gorokhov, S.A.; Nekrasov, V.V.; Tolstikova, L.I.

    1981-01-01

    An automated information retrieval system for radioactivation analysis has been developed. An ES-1022 computer and the problem-oriented software "The description information search system" were used for the purpose. The main aspects and sources used to form the system's information fund and the characteristics of the system's information retrieval language are reported, and examples of question-answer dialogue are given. Two modes can be used: selective information distribution and retrospective search.

  9. Effects of Spaced Retrieval Training on Semantic Memory in Alzheimer's Disease: A Systematic Review

    Science.gov (United States)

    Oren, Shiri; Willerton, Charlene; Small, Jeff

    2014-01-01

    Purpose: This article reports on a systematic review and meta-analysis of the effects of spaced retrieval training (SRT) on semantic memory in people with Alzheimer's disease (AD) or related disorder. Method: An initial systematic database search identified 454 potential studies. After screening and de-duplication, 35 studies that used SRT…

  10. Method of and System for Information Retrieval

    DEFF Research Database (Denmark)

    2015-01-01

    This invention relates to a system for and a method (100) of searching a collection of digital information (150) comprising a number of digital documents (110), the method comprising receiving or obtaining (102) a search query, the query comprising a number of search terms, searching (103) an index (300) using the search terms thereby providing information (301) about which digital documents (110) of the collection of digital information (150) that contains a given search term and one or more search related metrics (302; 303; 304; 305; 306), ranking (105) at least a part of the search result... a method of and a system for information retrieval or searching is readily provided that enhances the searching quality (i.e. the number of relevant documents retrieved and such documents being ranked high) when (also) using queries containing many search terms.

  11. When is a search not a search? A comparison of searching the AMED complementary health database via EBSCOhost, OVID and DIALOG.

    Science.gov (United States)

    Younger, Paula; Boddy, Kate

    2009-06-01

    The researchers involved in this study work at Exeter Health library and at the Complementary Medicine Unit, Peninsula School of Medicine and Dentistry (PCMD). Within this collaborative environment it is possible to access the electronic resources of three institutions. This includes access to AMED and other databases using different interfaces. The aim of this study was to investigate whether searching different interfaces to the AMED allied health and complementary medicine database produced the same results when using identical search terms. The following Internet-based AMED interfaces were searched: DIALOG DataStar; EBSCOhost and OVID SP_UI01.00.02. Search results from all three databases were saved in an endnote database to facilitate analysis. A checklist was also compiled comparing interface features. In our initial search, DIALOG returned 29 hits, OVID 14 and Ebsco 8. If we assume that DIALOG returned 100% of potential hits, OVID initially returned only 48% of hits and EBSCOhost only 28%. In our search, a researcher using the Ebsco interface to carry out a simple search on AMED would miss over 70% of possible search hits. Subsequent EBSCOhost searches on different subjects failed to find between 21 and 86% of the hits retrieved using the same keywords via DIALOG DataStar. In two cases, the simple EBSCOhost search failed to find any of the results found via DIALOG DataStar. Depending on the interface, the number of hits retrieved from the same database with the same simple search can vary dramatically. Some simple searches fail to retrieve a substantial percentage of citations. This may result in an uninformed literature review, research funding application or treatment intervention. In addition to ensuring that keywords, spelling and medical subject headings (MeSH) accurately reflect the nature of the search, database users should include wildcards and truncation and adapt their search strategy substantially to retrieve the maximum number of appropriate
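
    The percentages quoted in this abstract follow from treating the DIALOG result count as the 100% reference. A short sketch of that arithmetic, using only the hit counts reported above (the variable names are illustrative):

```python
# Hit counts for the initial search, as reported in the abstract.
hits = {"DIALOG": 29, "OVID": 14, "EBSCOhost": 8}

# DIALOG is assumed to have returned 100% of the potential hits.
baseline = hits["DIALOG"]
relative = {name: count / baseline for name, count in hits.items()}

# Share of the DIALOG hits that a simple EBSCOhost search would miss.
missed_ebsco = 1 - relative["EBSCOhost"]

for name, share in relative.items():
    print(f"{name}: {share:.0%} of DIALOG hits")
print(f"EBSCOhost misses {missed_ebsco:.0%}")
```

    This is recall relative to another interface, not recall against an independently judged set of relevant citations.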

  12. Facilitating medical information search using Google Glass connected to a content-based medical image retrieval system.

    Science.gov (United States)

    Widmer, Antoine; Schaer, Roger; Markonis, Dimitrios; Muller, Henning

    2014-01-01

    Wearable computing devices are starting to change the way users interact with computers and the Internet. Among them, Google Glass includes a small screen located in front of the right eye, a camera filming in front of the user and a small computing unit. Google Glass has the advantage to provide online services while allowing the user to perform tasks with his/her hands. These augmented glasses uncover many useful applications, also in the medical domain. For example, Google Glass can easily provide video conference between medical doctors to discuss a live case. Using these glasses can also facilitate medical information search by allowing the access of a large amount of annotated medical cases during a consultation in a non-disruptive fashion for medical staff. In this paper, we developed a Google Glass application able to take a photo and send it to a medical image retrieval system along with keywords in order to retrieve similar cases. As a preliminary assessment of the usability of the application, we tested the application under three conditions (images of the skin; printed CT scans and MRI images; and CT and MRI images acquired directly from an LCD screen) to explore whether using Google Glass affects the accuracy of the results returned by the medical image retrieval system. The preliminary results show that despite minor problems due to the relative stability of the Google Glass, images can be sent to and processed by the medical image retrieval system and similar images are returned to the user, potentially helping in the decision making process.

  13. Foreign Body Retrieval

    Medline Plus

    ... tissues. What are some common uses of the procedure? Foreign body retrieval is used ... community, you can search the ACR-accredited facilities database. This website does not provide cost information. The ...

  14. Learning to rank for information retrieval

    CERN Document Server

    Liu, Tie-Yan

    2011-01-01

    Due to the fast growth of the Web and the difficulties in finding desired information, efficient and effective information retrieval systems have become more important than ever, and the search engine has become an essential tool for many people. The ranker, a central component in every search engine, is responsible for the matching between processed queries and indexed documents. Because of its central role, great attention has been paid to the research and development of ranking technologies. In addition, ranking is also pivotal for many other information retrieval applications, such as coll

  15. Content-based multimedia retrieval: indexing and diversification

    NARCIS (Netherlands)

    van Leuken, R.H.

    2009-01-01

    The demand for efficient systems that facilitate searching in multimedia databases and collections is vastly increasing. Application domains include criminology, musicology, trademark registration, medicine and image or video retrieval on the web. This thesis discusses content-based retrieval

  16. A Process Model for Goal-Based Information Retrieval

    Directory of Open Access Journals (Sweden)

    Harvey Hyman

    2014-12-01

    In this paper we examine the domain of information search and propose a "goal-based" approach to study search strategy. We describe "goal-based information search" using a framework of Knowledge Discovery. We identify two Information Retrieval (IR) goals using the constructs of Knowledge Acquisition (KA) and Knowledge Explanation (KE). We classify these constructs into two specific information problems: an exploration-exploitation problem and an implicit-explicit problem. Our proposed framework is an extension of prior work in this domain, applying an IR Process Model originally developed for Legal-IR and adapted to Medical-IR. The approach in this paper is guided by the recent ACM-SIG Medical Information Retrieval (MedIR) Workshop definition: "methodologies and technologies that seek to improve access to medical information archives via a process of information retrieval."

  17. Critical Assessment of Search Strategies in Systematic Reviews in Endodontics.

    Science.gov (United States)

    Yaylali, Ibrahim Ethem; Alaçam, Tayfun

    2016-06-01

    The aim of this study was to perform an overview of literature search strategies in systematic reviews (SRs) published in 2 endodontic journals, Journal of Endodontics and International Endodontic Journal. A search was done by using the MEDLINE (PubMed interface) database to retrieve the articles published between January 1, 2000 and December 31, 2015. The last search was on January 10, 2016. All the SRs published in the 2 journals were retrieved and screened. Eligible SRs were assessed by using 11 questions about search strategies in the SRs that were adapted from 2 guidelines (ie, AMSTAR checklist and the Cochrane Handbook). A total of 83 SRs were retrieved by electronic search. Of these, 55 were from the Journal of Endodontics, and 28 were from the International Endodontic Journal. After screening, 2 SRs were excluded, and 81 SRs were included in the study. Some issues, such as search of grey literature and contact with study authors, were not fully reported (30% and 25%, respectively). On the other hand, some issues, such as the use of index terms and key words and search in at least 2 databases, were reported in most of the SRs (97% and 95%, respectively). The overall quality of the search strategy in both journals was 61%. No significant difference was found between the 2 journals in terms of evaluation criteria (P > .05). There exist areas for improving the quality of reporting of search strategies in SRs; for example, grey literature should be searched for unpublished studies, no language limitation should be applied to databases, and authors should make an attempt to contact the authors of included studies to obtain further relevant information. Copyright © 2016 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  18. OvidSP Medline-to-PubMed search filter translation: a methodology for extending search filter range to include PubMed's unique content.

    Science.gov (United States)

    Damarell, Raechel A; Tieman, Jennifer J; Sladek, Ruth M

    2013-07-02

    PubMed translations of OvidSP Medline search filters offer searchers improved ease of access. They may also facilitate access to PubMed's unique content, including citations for the most recently published biomedical evidence. Retrieving this content requires a search strategy comprising natural language terms ('textwords'), rather than Medical Subject Headings (MeSH). We describe a reproducible methodology that uses a validated PubMed search filter translation to create a textword-only strategy to extend retrieval to PubMed's unique heart failure literature. We translated an OvidSP Medline heart failure search filter for PubMed and established version equivalence in terms of indexed literature retrieval. The PubMed version was then run within PubMed to identify citations retrieved by the filter's MeSH terms (Heart failure, Left ventricular dysfunction, and Cardiomyopathy). It was then rerun with the same MeSH terms restricted to searching on title and abstract fields (i.e. as 'textwords'). Citations retrieved by the MeSH search but not the textword search were isolated. Frequency analysis of their titles/abstracts identified natural language alternatives for those MeSH terms that performed less effectively as textwords. These terms were tested in combination to determine the best performing search string for reclaiming this 'lost set'. This string, restricted to searching on PubMed's unique content, was then combined with the validated PubMed translation to extend the filter's performance in this database. The PubMed heart failure filter retrieved 6829 citations. Of these, 834 (12%) failed to be retrieved when MeSH terms were converted to textwords. Frequency analysis of the 834 citations identified five high frequency natural language alternatives that could improve retrieval of this set (cardiac failure, cardiac resynchronization, left ventricular systolic dysfunction, left ventricular diastolic dysfunction, and LV dysfunction). Together these terms reclaimed
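
    The "lost set" described in this abstract is a set difference: citations retrieved by the MeSH version of the filter but not by the textword-only version. A minimal sketch, with hypothetical citation identifiers and a hypothetical helper name; only the counts 834 and 6829 come from the abstract:

```python
def lost_set(mesh_hits: set, textword_hits: set) -> set:
    """Citations found via MeSH terms but missed by the textword-only search."""
    return mesh_hits - textword_hits


# Hypothetical citation identifiers, for illustration only.
mesh_hits = {"c1", "c2", "c3", "c4", "c5"}
textword_hits = {"c1", "c2", "c4", "c6"}
missed = lost_set(mesh_hits, textword_hits)

# Share of the filter's retrieval that was lost, using the counts in the abstract.
lost_share = 834 / 6829
print(sorted(missed), f"{lost_share:.1%}")
```

    Titles and abstracts of the citations in `missed` are what the frequency analysis would then examine for natural language alternatives.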

  19. Profiles and Context for Structured Text Retrieval

    DEFF Research Database (Denmark)

    Koolen, Marijn; Bogers, Toine

    2017-01-01

    The combination of structured information retrieval with user profile information represents the scenario where systems search with an explicit statement of the information need—a search query—as well as a profile of a user, which can contain information about previous interactions, search histor...

  20. High Working Memory Capacity Predicts Less Retrieval Induced Forgetting

    NARCIS (Netherlands)

    Mall, Jonathan T.; Morey, Candice C.

    2013-01-01

    Background: Working Memory Capacity (WMC) is thought to be related to executive control and focused memory search abilities. These two hypotheses make contrasting predictions regarding the effects of retrieval on forgetting. Executive control during memory retrieval is believed to lead to retrieval

  1. Information visualization to user-friendly interface construction for information retrieval systems

    Directory of Open Access Journals (Sweden)

    Jessica Monique de Lira Vieira

    2011-10-01

    Information presented through visualization helps an Information Retrieval System (IRS) reach its main goal: to retrieve relevant information that meets the informational needs of its users. The objective of this article is to describe and analyze techniques proposed in the Information Visualization area and interface models discussed in the Information Science literature which, applied to graphical interface construction, would facilitate the appropriation of information by IRS users and help them search, browse and retrieve information. The methodology consists of a literature review focusing on the potential contribution of the visual representation of information to the development of user-friendly interfaces for IRS, as well as the identification and analysis of visualizations used as interfaces by IRS. The use of visualizations is of great importance in the communication between IRS and users, because information presented through visual representation is better understood by users and allows the discovery of new knowledge.

  2. Searching the ASRS Database Using QUORUM Keyword Search, Phrase Search, Phrase Generation, and Phrase Discovery

    Science.gov (United States)

    McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)

    2001-01-01

    To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.

  3. An Integrated Information Retrieval Support System for Campus Network

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    This paper presents a new integrated information retrieval support system (IIRSS) which can help Web search engines retrieve cross-lingual information from heterogeneous resources stored in multi-databases in Intranet. The IIRSS, with a three-layer architecture, can cooperate with other application servers running in Intranet. By using intelligent agents to collect information and to create indexes on-the-fly, using an access control strategy to confine a user to browsing those accessible documents for him/her through a single portal, and using a new cross-lingual translation tool to help the search engine retrieve documents, the new system provides controllable information access with different authorizations, personalized services, and real-time information retrieval.

  4. Clinician search behaviors may be influenced by search engine design.

    Science.gov (United States)

    Lau, Annie Y S; Coiera, Enrico; Zrimec, Tatjana; Compton, Paul

    2010-06-30

    Searching the Web for documents using information retrieval systems plays an important part in clinicians' practice of evidence-based medicine. While much research focuses on the design of methods to retrieve documents, there has been little examination of the way different search engine capabilities influence clinician search behaviors. Previous studies have shown that use of task-based search engines allows for faster searches with no loss of decision accuracy compared with resource-based engines. We hypothesized that changes in search behaviors may explain these differences. In all, 75 clinicians (44 doctors and 31 clinical nurse consultants) were randomized to use either a resource-based or a task-based version of a clinical information retrieval system to answer questions about 8 clinical scenarios in a controlled setting in a university computer laboratory. Clinicians using the resource-based system could select 1 of 6 resources, such as PubMed; clinicians using the task-based system could select 1 of 6 clinical tasks, such as diagnosis. Clinicians in both systems could reformulate search queries. System logs unobtrusively capturing clinicians' interactions with the systems were coded and analyzed for clinicians' search actions and query reformulation strategies. The most frequent search action of clinicians using the resource-based system was to explore a new resource with the same query, that is, these clinicians exhibited a "breadth-first" search behaviour. Of 1398 search actions, clinicians using the resource-based system conducted 401 (28.7%, 95% confidence interval [CI] 26.37-31.11) in this way. In contrast, the majority of clinicians using the task-based system exhibited a "depth-first" search behavior in which they reformulated query keywords while keeping to the same task profiles. Of 585 search actions conducted by clinicians using the task-based system, 379 (64.8%, 95% CI 60.83-68.55) were conducted in this way. This study provides evidence that
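
    The abstract does not say how its confidence intervals were computed. As an illustration only, the sketch below computes a Wilson score interval for a proportion; with z = 1.96 it appears to reproduce the reported figures when rounded, but that match is an observation, not a statement of the authors' method. The function name is an assumption.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# Counts taken from the abstract: breadth-first and depth-first search actions.
breadth = wilson_interval(401, 1398)
depth = wilson_interval(379, 585)
print([round(100 * x, 2) for x in breadth])
print([round(100 * x, 2) for x in depth])
```

    Search actions by the same clinician are unlikely to be independent, so an interval of this kind treats the counts more simply than a clustered analysis would.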

  5. Information Retrieval Evaluation

    CERN Document Server

    Harman, Donna

    2011-01-01

    Evaluation has always played a major role in information retrieval, with the early pioneers such as Cyril Cleverdon and Gerard Salton laying the foundations for most of the evaluation methodologies in use today. The retrieval community has been extremely fortunate to have such a well-grounded evaluation paradigm during a period when most of the human language technologies were just developing. This lecture has the goal of explaining where these evaluation methodologies came from and how they have continued to adapt to the vastly changed environment in the search engine world today. The lecture

  6. Ranked retrieval of Computational Biology models.

    Science.gov (United States)

    Henkel, Ron; Endler, Lukas; Peters, Andre; Le Novère, Nicolas; Waltemath, Dagmar

    2010-08-11

    The study of biological systems demands computational support. When targeting a biological problem, the reuse of existing computational models can save time and effort. Identifying potentially suitable models, however, becomes more challenging with the increasing number of computational models available, and even more so when considering the models' growing complexity. Firstly, among a set of potential model candidates it is difficult to decide on the model that best suits one's needs. Secondly, it is hard to grasp the nature of an unknown model listed in a search result set, and to judge how well it fits the particular problem one has in mind. Here we present an improved search approach for computational models of biological processes. It is based on existing retrieval and ranking methods from Information Retrieval. The approach incorporates annotations suggested by MIRIAM, and additional meta-information. It is now part of the search engine of BioModels Database, a standard repository for computational models. The introduced concept and implementation are, to our knowledge, the first application of Information Retrieval techniques to model search in Computational Systems Biology. Using the example of BioModels Database, it was shown that the approach is feasible and extends the current possibilities to search for relevant models. The advantages of our system over existing solutions are that we incorporate a rich set of meta-information, and that we provide the user with a relevance ranking of the models found for a query. Better search capabilities in model databases are expected to have a positive effect on the reuse of existing models.

  7. Supporting information retrieval from electronic health records: A report of University of Michigan's nine-year experience in developing and using the Electronic Medical Record Search Engine (EMERSE).

    Science.gov (United States)

    Hanauer, David A; Mei, Qiaozhu; Law, James; Khanna, Ritu; Zheng, Kai

    2015-06-01

    This paper describes the University of Michigan's nine-year experience in developing and using a full-text search engine designed to facilitate information retrieval (IR) from narrative documents stored in electronic health records (EHRs). The system, called the Electronic Medical Record Search Engine (EMERSE), functions similar to Google but is equipped with special functionalities for handling challenges unique to retrieving information from medical text. Key features that distinguish EMERSE from general-purpose search engines are discussed, with an emphasis on functions crucial to (1) improving medical IR performance and (2) assuring search quality and results consistency regardless of users' medical background, stage of training, or level of technical expertise. Since its initial deployment, EMERSE has been enthusiastically embraced by clinicians, administrators, and clinical and translational researchers. To date, the system has been used in supporting more than 750 research projects yielding 80 peer-reviewed publications. In several evaluation studies, EMERSE demonstrated very high levels of sensitivity and specificity in addition to greatly improved chart review efficiency. Increased availability of electronic data in healthcare does not automatically warrant increased availability of information. The success of EMERSE at our institution illustrates that free-text EHR search engines can be a valuable tool to help practitioners and researchers retrieve information from EHRs more effectively and efficiently, enabling critical tasks such as patient case synthesis and research data abstraction. EMERSE, available free of charge for academic use, represents a state-of-the-art medical IR tool with proven effectiveness and user acceptance. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Identification of risk conditions for the development of adrenal disorders: how optimized PubMed search strategies makes the difference.

    Science.gov (United States)

    Guaraldi, Federica; Parasiliti-Caprino, Mirko; Goggi, Riccardo; Beccuti, Guglielmo; Grottoli, Silvia; Arvat, Emanuela; Ghizzoni, Lucia; Ghigo, Ezio; Giordano, Roberta; Gori, Davide

    2014-12-01

    The exponential growth of scientific literature available through electronic databases (namely PubMed) has increased the chance of finding interesting articles. At the same time, searching has become more complicated and time consuming, with a greater risk of missing important information. Therefore, optimized strategies have to be adopted to maximize searching impact. The aim of this study was to formulate efficient strings to search PubMed for etiologic associations between adrenal disorders (ADs) and other conditions. A comprehensive list of terms identifying endogenous conditions primarily affecting adrenals was compiled. An ad hoc analysis was performed to find the best way to express each term in order to find the highest number of potentially pertinent articles in PubMed. A predefined number of retrieved abstracts were read to assess their association with ADs' etiology. A more sensitive string (providing the largest literature coverage) and a more specific string (including only those terms retrieving >40% of potentially pertinent articles) were formulated. Various searches were performed to assess the strings' ability to identify articles of interest in comparison with non-optimized literature searches. We formulated optimized, readily applicable tools for the identification of the literature assessing etiologic associations in the field of ADs using PubMed, and demonstrated the advantages deriving from their application. A detailed description of the methodological process is also provided, so that this work can easily be translated to other fields of practice.

  9. Comparative analysis of some search engines

    Directory of Open Access Journals (Sweden)

    Taiwo O. Edosomwan

    2010-10-01

    We compared the information retrieval performance of some popular search engines (namely, Google, Yahoo, AlltheWeb, Gigablast, Zworks, AltaVista and Bing/MSN) in response to a list of ten queries of varying complexity. These queries were run on each search engine and the precision and response time of the retrieved results were recorded. The first ten documents in each retrieval output were judged ‘relevant’ or ‘non-relevant’ to evaluate the search engine’s precision. To evaluate response time, normalised recall ratios were calculated at various cut-off points for each query and search engine. This study shows that Google appears to be the best search engine in terms of both average precision (70%) and average response time (2 s). Gigablast and AlltheWeb performed the worst overall in this study.
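
    Judging the first ten results of each query as relevant or non-relevant corresponds to what is commonly called precision at 10. A minimal sketch follows; the function names and the relevance judgments are hypothetical and are not data from the study.

```python
from typing import List


def precision_at_k(judgments: List[bool], k: int = 10) -> float:
    """Fraction of the top-k results judged relevant (missing results count as non-relevant)."""
    return sum(judgments[:k]) / k


def mean_precision_at_k(per_query: List[List[bool]], k: int = 10) -> float:
    """Average precision-at-k over a list of queries."""
    return sum(precision_at_k(j, k) for j in per_query) / len(per_query)


# Hypothetical judgments for two queries on one search engine.
query_1 = [True] * 7 + [False] * 3
query_2 = [True, False, True, True, False, True, True, False, True, True]
average = mean_precision_at_k([query_1, query_2])
```

    Dividing by `k` rather than by the number of results returned penalizes an engine that returns fewer than ten results, which is one common convention.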

  10. TRIP: An interactive retrieving-inferring data imputation approach

    KAUST Repository

    Li, Zhixu

    2016-06-25

    Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing ones from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand, issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.

  11. TRIP: An interactive retrieving-inferring data imputation approach

    KAUST Repository

    Li, Zhixu; Qin, Lu; Cheng, Hong; Zhang, Xiangliang; Zhou, Xiaofang

    2016-01-01

    Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing values from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.

  12. Development of a PubMed Based Search Tool for Identifying Sex and Gender Specific Health Literature.

    Science.gov (United States)

    Song, Michael M; Simonsen, Cheryl K; Wilson, Joanna D; Jenkins, Marjorie R

    2016-02-01

    An effective literature search strategy is critical to achieving the aims of Sex and Gender Specific Health (SGSH): to understand sex and gender differences through research and to effectively incorporate the new knowledge into the clinical decision making process to benefit both male and female patients. The goal of this project was to develop and validate an SGSH literature search tool that is readily and freely available to clinical researchers and practitioners. PubMed, a freely available search engine for the Medline database, was selected as the platform to build the SGSH literature search tool. Combinations of Medical Subject Heading terms, text words, and title words were evaluated for optimal specificity and sensitivity. The search tool was then validated against reference bases compiled for two disease states, diabetes and stroke. Key sex and gender terms and limits were bundled to create a search tool to facilitate PubMed SGSH literature searches. During validation, the search tool retrieved 50 of 94 (53.2%) stroke and 62 of 95 (65.3%) diabetes reference articles selected for validation. A general keyword search of stroke or diabetes combined with sex difference retrieved 33 of 94 (35.1%) stroke and 22 of 95 (23.2%) diabetes reference base articles, with lower sensitivity and specificity for SGSH content. The Texas Tech University Health Sciences Center SGSH PubMed Search Tool provides higher sensitivity and specificity to sex and gender specific health literature. The tool will facilitate research, clinical decision-making, and guideline development relevant to SGSH.
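
    The retrieval percentages quoted above are simple proportions of the reference articles that each search found. A short Python sketch reproduces the arithmetic from the counts given in the abstract; the function name is hypothetical, and treating these proportions as sensitivity is an assumption based on how the abstract uses the term.

```python
def percent_retrieved(retrieved, reference_total):
    """Share of reference articles retrieved, as a percentage with one decimal."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    return round(100 * retrieved / reference_total, 1)


# Counts taken from the abstract: (retrieved, size of reference base).
results = {
    "SGSH tool, stroke": (50, 94),
    "SGSH tool, diabetes": (62, 95),
    "keyword search, stroke": (33, 94),
    "keyword search, diabetes": (22, 95),
}

for label, (found, total) in results.items():
    print(f"{label}: {percent_retrieved(found, total)}%")
```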

  13. CIRQuL: Complex Information Retrieval Query Language

    NARCIS (Netherlands)

    Mihajlovic, V.; Hiemstra, Djoerd; Apers, Peter M.G.

    In this paper we will present a new framework for the retrieval of XML documents. We will describe the extension for existing query languages (XPath and XQuery) geared toward ranked information retrieval and full-text search in XML documents. Furthermore we will present language models for ranked

  14. Cue generation and memory construction in direct and generative autobiographical memory retrieval.

    Science.gov (United States)

    Harris, Celia B; O'Connor, Akira R; Sutton, John

    2015-05-01

    Theories of autobiographical memory emphasise effortful, generative search processes in memory retrieval. However recent research suggests that memories are often retrieved directly, without effortful search. We investigated whether direct and generative retrieval differed in the characteristics of memories recalled, or only in terms of retrieval latency. Participants recalled autobiographical memories in response to cue words. For each memory, they reported whether it was retrieved directly or generatively, rated its visuo-spatial perspective, and judged its accompanying recollective experience. Our results indicated that direct retrieval was commonly reported and was faster than generative retrieval, replicating recent findings. The characteristics of directly retrieved memories differed from generatively retrieved memories: directly retrieved memories had higher field perspective ratings and lower observer perspective ratings. However, retrieval mode did not influence recollective experience. We discuss our findings in terms of cue generation and content construction, and the implication for reconstructive models of autobiographical memory. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. An information search model for online social Networks - MOBIRSE

    Directory of Open Access Journals (Sweden)

    Miguel Angel Niño Zambrano

    2015-09-01

    Online Social Networks (OSNs) have been gaining great importance among Internet users in recent years. These are sites where it is possible to meet people, publish, and share content in a way that is both easy and free of charge. As a result, the volume of information contained in these websites has grown exponentially, and web search has consequently become an important tool for users to easily find information relevant to their social networking objectives. Making use of ontologies and user profiles can make these searches more effective. This article presents a model for Information Retrieval in OSNs (MOBIRSE) based on user profile and ontologies which aims to improve the relevance of retrieved information on these websites. The social network Facebook was chosen for a case study and as the instance for the proposed model. The model was validated using measures such as At-k Precision and Kappa statistics, to assess its efficiency.

  16. Data Discretization for Novel Relationship Discovery in Information Retrieval.

    Science.gov (United States)

    Benoit, G.

    2002-01-01

    Describes an information retrieval, visualization, and manipulation model which offers the user multiple ways to exploit the retrieval set, based on weighted query terms, via an interactive interface. Outlines the mathematical model and describes an information retrieval application built on the model to search structured and full-text files.…

  17. A robust pointer segmentation in biomedical images toward building a visual ontology for biomedical article retrieval

    Science.gov (United States)

    You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-01-01

    Pointers (arrows and symbols) are frequently used in biomedical images to highlight specific image regions of interest (ROIs) that are mentioned in figure captions and/or text discussion. Detection of pointers is the first step toward extracting relevant visual features from ROIs and combining them with textual descriptions for a multimodal (text and image) biomedical article retrieval system. Recently we developed a pointer recognition algorithm based on an edge-based pointer segmentation method, and subsequently reported improvements made on our initial approach involving the use of Active Shape Models (ASM) for pointer recognition and a region growing-based method for pointer segmentation. These methods contributed to improving the recall of pointer recognition but not much to the precision. The method discussed in this article is our recent effort to improve the precision rate. Evaluation performed on two datasets and compared with other pointer segmentation methods shows significantly improved precision and the highest F1 score.

  18. Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

    Science.gov (United States)

    Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

    2018-01-01

    The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie

  19. Refined repetitive sequence searches utilizing a fast hash function and cross species information retrievals

    Directory of Open Access Journals (Sweden)

    Reneker Jeff

    2005-05-01

    Background: Searching for small tandem/disperse repetitive DNA sequences streamlines many biomedical research processes. For instance, whole genomic array analysis in yeast has revealed 22 PHO-regulated genes. The promoter regions of all but one of them contain at least one of the two core Pho4p binding sites, CACGTG and CACGTT. In humans, microsatellites play a role in a number of rare neurodegenerative diseases such as spinocerebellar ataxia type 1 (SCA1). SCA1 is a hereditary neurodegenerative disease caused by an expanded CAG repeat in the coding sequence of the gene. In bacterial pathogens, microsatellites are proposed to regulate expression of some virulence factors. For example, bacteria commonly generate intra-strain diversity through phase variation which is strongly associated with virulence determinants. A recent analysis of the complete sequences of the Helicobacter pylori strains 26695 and J99 has identified 46 putative phase-variable genes among the two genomes through their association with homopolymeric tracts and dinucleotide repeats. Life scientists are increasingly interested in studying the function of small sequences of DNA. However, current search algorithms often generate thousands of matches – most of which are irrelevant to the researcher. Results: We present our hash function as well as our search algorithm to locate small sequences of DNA within multiple genomes. Our system applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. We discuss our incorporation of the Gene Ontology (GO) database into these algorithms. We conduct an exhaustive time analysis of our system for various repetitive sequence lengths. For instance, a search for eight bases of sequence within 3.224 GBases on 49 different chromosomes takes 1.147 seconds on average. To illustrate the relevance of the search results, we conduct a search with and without added annotation terms for the
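
    The abstract does not describe the authors' hash function, so the following Python sketch only illustrates the general idea of hash-based lookup of short DNA sequences. It assumes a common 2-bit-per-base encoding, which is not necessarily what the paper uses; all names and the sample sequence are hypothetical. The two motifs searched for are the Pho4p binding sites named in the abstract.

```python
from collections import defaultdict

# Assumed 2-bit code per base; for a fixed length k this maps each k-mer
# to a distinct integer, so there are no collisions within one index.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_hash(kmer):
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(sequence, k):
    """Map the hash of every length-k window to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        # Skip windows containing ambiguous bases such as N.
        if all(base in BASE_CODE for base in window):
            index[kmer_hash(window)].append(start)
    return index


def find(index, kmer):
    """Return start positions of `kmer`; it must have the length used for the index."""
    return index.get(kmer_hash(kmer), [])


# Hypothetical sequence containing both motifs mentioned in the abstract.
sequence = "TTCACGTGAACACGTTCACGTG"
index = build_index(sequence, k=6)
print(find(index, "CACGTG"), find(index, "CACGTT"))
```

    Building the index costs one pass over the sequence, after which each lookup is a single dictionary access; that trade-off is the usual motivation for hashing in this kind of search.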

  20. Information Retrieval Models

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Göker, Ayse; Davies, John

    2009-01-01

    Many applications that handle information on the internet would be completely inadequate without the support of information retrieval technology. How would we find information on the world wide web if there were no web search engines? How would we manage our email without spam filtering? Much of the

  1. Introduction to information retrieval

    CERN Document Server

    Manning, Christopher D; Schütze, Hinrich

    2008-01-01

    Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.

  2. An advanced search engine for patent analytics in medicinal chemistry.

    Science.gov (United States)

    Pasche, Emilie; Gobeill, Julien; Teodoro, Douglas; Gaudinat, Arnaud; Vishnykova, Dina; Lovis, Christian; Ruch, Patrick

    2012-01-01

    Patent collections contain an important amount of medical-related knowledge, but existing tools were reported to lack useful functionalities. We present here the development of TWINC, an advanced search engine dedicated to patent retrieval in the domain of health and life sciences. Our tool embeds two search modes: an ad hoc search to retrieve relevant patents given a short query and a related patent search to retrieve similar patents given a patent. Both search modes rely on tuning experiments performed during several patent retrieval competitions. Moreover, TWINC is enhanced with interactive modules, such as chemical query expansion, which is of primary importance to cope with various ways of naming biomedical entities. While the related patent search showed promising performance, the ad hoc search produced fairly contrasting results. Nonetheless, TWINC performed well during the Chemathlon task of the PatOlympics competition and experts appreciated its usability.

  3. Mobile object retrieval in server-based image databases

    Science.gov (United States)

    Manger, D.; Pagel, F.; Widak, H.

    2013-05-01

    The increasing number of mobile phones equipped with powerful cameras leads to huge collections of user-generated images. To utilize the information of the images on site, image retrieval systems are becoming more and more popular to search for similar objects in a user's own image database. As the computational performance and the memory capacity of mobile devices are constantly increasing, this search can often be performed on the device itself. This is feasible, for example, if the images are represented with global image features or if the search is done using EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing image database or if powerful content-based image retrieval methods with local features are required, a server-based image retrieval backend is needed. In this work, we present a content-based image retrieval system with a client-server architecture working with local features. On the server side, the scalability to large image databases is addressed with the popular bag-of-words model with state-of-the-art extensions. The client end of the system focuses on a lightweight user interface presenting the most similar images of the database and highlighting the visual information which is common with the query image. Additionally, new images can be added to the database, making it a powerful and interactive tool for mobile content-based image retrieval.

  4. Where to search top-K biomedical ontologies?

    Science.gov (United States)

    Oliveira, Daniela; Butt, Anila Sahar; Haller, Armin; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh

    2018-03-20

    Searching for precise terms and terminological definitions in the biomedical data space is problematic, as researchers find overlapping, closely related and even equivalent concepts in a single or multiple ontologies. Search engines that retrieve ontological resources often suggest an extensive list of search results for a given input term, which leads to the tedious task of selecting the best-fit ontological resource (class or property) for the input term and reduces user confidence in the retrieval engines. A systematic evaluation of these search engines is necessary to understand their strengths and weaknesses in different search requirements. We have implemented seven comparable Information Retrieval ranking algorithms to search through ontologies and compared them against four search engines for ontologies. Free-text queries have been performed, the outcomes have been judged by experts and the ranking algorithms and search engines have been evaluated against the expert-based ground truth (GT). In addition, we propose a probabilistic GT that is developed automatically to provide deeper insights and confidence to the expert-based GT as well as evaluating a broader range of search queries. The main outcome of this work is the identification of key search factors for biomedical ontologies together with search requirements and a set of recommendations that will help biomedical experts and ontology engineers to select the best-suited retrieval mechanism in their search scenarios. We expect that this evaluation will allow researchers and practitioners to apply the current search techniques more reliably and that it will help them to select the right solution for their daily work. The source code (of seven ranking algorithms), ground truths and experimental results are available at https://github.com/danielapoliveira/bioont-search-benchmark.

  5. Exploiting citation contexts for physics retrieval

    DEFF Research Database (Denmark)

    Dabrowska, Anna; Larsen, Birger

    2015-01-01

    The text surrounding citations within scientific papers may contain terms that usefully describe cited documents and can benefit retrieval. We present a preliminary study that investigates appending citation contexts from citing documents to cited documents in the iSearch test collection. We ex...... in a large collection of physics papers, paving the way for future research that exploits citation contexts for retrieval....

  6. Exploring personalized searches using tag-based user profiles and resource profiles in folksonomy.

    Science.gov (United States)

    Cai, Yi; Li, Qing; Xie, Haoran; Min, Huaqin

    2014-10-01

    With the increase in resource-sharing websites such as YouTube and Flickr, many shared resources have arisen on the Web. Personalized searches have become more important and challenging since users demand higher retrieval quality. To achieve this goal, personalized searches need to take users' personalized profiles and information needs into consideration. Collaborative tagging (also known as folksonomy) systems allow users to annotate resources with their own tags, which provides a simple but powerful way for organizing, retrieving and sharing different types of social resources. In this article, we examine the limitations of previous tag-based personalized searches. To handle these limitations, we propose a new method to model user profiles and resource profiles in collaborative tagging systems. We use a normalized term frequency to indicate the preference degree of a user on a tag. A novel search method using such profiles of users and resources is proposed to facilitate the desired personalization in resource searches. In our framework, instead of the keyword matching or similarity measurement used in previous works, the relevance measurement between a resource and a user query (termed the query relevance) is treated as a fuzzy satisfaction problem of a user's query requirements. We implement a prototype system called the Folksonomy-based Multimedia Retrieval System (FMRS). Experiments using the FMRS data set and the MovieLens data set show that our proposed method outperforms baseline methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
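
    The abstract mentions a normalized term frequency as the degree of a user's preference for a tag but does not give the formula. The Python sketch below shows one plausible reading, assumed here for illustration only: each tag's count divided by the user's total number of tag uses. The function name and sample tags are hypothetical.

```python
from collections import Counter


def tag_preferences(tags_used):
    """Normalized term frequency of each tag in one user's tagging history.

    Assumption: preference = uses of the tag / all tag uses by the user,
    so the values for one user sum to 1.
    """
    counts = Counter(tags_used)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical tagging history for a single user.
history = ["jazz", "jazz", "live", "piano"]
print(tag_preferences(history))
```

    The same computation applied to the tags attached to a resource would give a resource profile in the same vector space, which is what makes user and resource profiles comparable.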

  7. Online Patent Searching: The Realities.

    Science.gov (United States)

    Kaback, Stuart M.

    1983-01-01

    Considers patent subject searching capabilities of major online databases, noting patent claims, "deep-indexed" files, test searches, retrieval of related references, multi-database searching, improvements needed in indexing of chemical structures, full text searching, improvements needed in handling numerical data, and augmenting a…

  8. The Evolution of Web Searching.

    Science.gov (United States)

    Green, David

    2000-01-01

    Explores the interrelation between Web publishing and information retrieval technologies and lists new approaches to Web indexing and searching. Highlights include Web directories; search engines; portalisation; Internet service providers; browser providers; meta search engines; popularity based analysis; natural language searching; links-based…

  9. Best, Useful and Objective Precisions for Information Retrieval of Three Search Methods in PubMed and iPubMed

    Directory of Open Access Journals (Sweden)

    Somayyeh Nadi Ravandi

    2016-10-01

    MEDLINE is one of the valuable sources of medical information on the Internet. Among the different open access sites of MEDLINE, PubMed is the best-known site. In 2010, iPubMed was established with an interaction-fuzzy search method for MEDLINE access. In the present work, we aimed to compare the precision of the retrieved sources (Best, Useful and Objective precision) in PubMed and iPubMed using two search methods (simple and MeSH search) in PubMed and the interaction-fuzzy method in iPubMed. During our semi-empirical study period, we held training workshops for 61 students of higher education to teach them Simple Search, MeSH Search, and Fuzzy-Interaction Search methods. Then, the precision of 305 searches for each method prepared by the students was calculated on the basis of Best precision, Useful precision, and Objective precision formulas. Analyses were done in SPSS version 11.5 using the Friedman and Wilcoxon Test, and three precisions obtained with the three precision formulas were studied for the three search methods. The mean precision of the interaction-fuzzy Search method was higher than that of the simple search and MeSH search for all three types of precision, i.e., Best precision, Useful precision, and Objective precision, and the Simple search method was in the next rank, and their mean precisions were significantly different (P < 0.001). The precision of the interaction-fuzzy search method in iPubMed was investigated for the first time. Also for the first time, three types of precision were evaluated in PubMed and iPubMed. The results showed that the Interaction-Fuzzy search method is more precise than using the natural language search (simple search) and MeSH search, and users of this method found papers that were more related to their queries; even though search in PubMed is useful, it is important that users apply new search methods to obtain the best results.

  10. Dialog-based Interactive Image Retrieval

    OpenAIRE

    Guo, Xiaoxiao; Wu, Hui; Cheng, Yu; Rennie, Steven; Feris, Rogerio Schmidt

    2018-01-01

    Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search that enables users to provide feedback via natural language, allowing for more natural and effect...

  11. Federated Search in the Wild: the combined power of over a hundred search engines

    NARCIS (Netherlands)

    Nguyen, Dong-Phuong; Demeester, Thomas; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    2012-01-01

    Federated search has the potential of improving web search: the user becomes less dependent on a single search provider and parts of the deep web become available through a unified interface, leading to a wider variety in the retrieved search results. However, a publicly available dataset for

  12. Tales from the Field: Search Strategies Applied in Web Searching

    Directory of Open Access Journals (Sweden)

    Soohyung Joo

    2010-08-01

    In their web search processes users apply multiple types of search strategies, which consist of different search tactics. This paper identifies eight types of information search strategies with associated cases based on sequences of search tactics during the information search process. Thirty-one participants representing the general public were recruited for this study. Search logs and verbal protocols offered rich data for the identification of different types of search strategies. Based on the findings, the authors further discuss how to enhance web-based information retrieval (IR) systems to support each type of search strategy.

  13. Collaborative Video Search Combining Video Retrieval with Human-Based Visual Inspection

    NARCIS (Netherlands)

    Hudelist, M.A.; Cobârzan, C.; Beecks, C.; van de Werken, Rob; Kletz, S.; Hürst, W.O.; Schoeffmann, K.

    2016-01-01

    We propose a novel video browsing approach that aims at optimally integrating traditional, machine-based retrieval methods with an interface design optimized for human browsing performance. Advanced video retrieval and filtering (e.g., via color and motion signatures, and visual concepts) on a

  14. search.bioPreprint: a discovery tool for cutting edge, preprint biomedical research articles [version 2; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Carrie L. Iwema

    2016-07-01

    The time it takes for a completed manuscript to be published traditionally can be extremely lengthy. Article publication delay, which occurs in part due to constraints associated with peer review, can prevent the timely dissemination of critical and actionable data associated with new information on rare diseases or developing health concerns such as Zika virus. Preprint servers are open access online repositories housing preprint research articles that enable authors (1) to make their research immediately and freely available and (2) to receive commentary and peer review prior to journal submission. There is a growing movement of preprint advocates aiming to change the current journal publication and peer review system, proposing that preprints catalyze biomedical discovery, support career advancement, and improve scientific communication. While the number of articles submitted to and hosted by preprint servers is gradually increasing, there has been no simple way to identify biomedical research published in a preprint format, as they are not typically indexed and are only discoverable by directly searching the specific preprint server websites. To address this issue, we created a search engine that quickly compiles preprints from disparate host repositories and provides a one-stop search solution. Additionally, we developed a web application that bolsters the discovery of preprints by enabling each and every word or phrase appearing on any web site to be integrated with articles from preprint servers. This tool, search.bioPreprint, is publicly available at http://www.hsls.pitt.edu/resources/preprint.

  15. BIBLIO: A Computerized Retrieval System for Communication Education.

    Science.gov (United States)

    Williams, M. Lee; Edwards, Renee

    1983-01-01

    Describes BIBLIO, a computer program created for the storage and retrieval of articles in the 1970-80 issues of "Communication Education." Tells how articles were coded, method used to retrieve information, and advantages and uses of the system. (PD)

  16. Using the open Web as an information resource and scholarly Web search engines as retrieval tools for academic and research purposes

    Directory of Open Access Journals (Sweden)

    Filistea Naude

    2010-08-01

    This study provided insight into the significance of the open Web as an information resource and Web search engines as research tools amongst academics. The academic staff establishment of the University of South Africa (Unisa) was invited to participate in a questionnaire survey and included 1188 staff members from five colleges. This study culminated in a PhD dissertation in 2008. One hundred and eighty seven respondents participated in the survey which gave a response rate of 15.7%. The results of this study show that academics have indeed accepted the open Web as a useful information resource and Web search engines as retrieval tools when seeking information for academic and research work. The majority of respondents used the open Web and Web search engines on a daily or weekly basis to source academic and research information. The main obstacles presented by using the open Web and Web search engines included lack of time to search and browse the Web, information overload, poor network speed and the slow downloading speed of webpages.

  17. Using the open Web as an information resource and scholarly Web search engines as retrieval tools for academic and research purposes

    Directory of Open Access Journals (Sweden)

    Filistea Naude

    2010-12-01

    This study provided insight into the significance of the open Web as an information resource and Web search engines as research tools amongst academics. The academic staff establishment of the University of South Africa (Unisa) was invited to participate in a questionnaire survey and included 1188 staff members from five colleges. This study culminated in a PhD dissertation in 2008. One hundred and eighty seven respondents participated in the survey which gave a response rate of 15.7%. The results of this study show that academics have indeed accepted the open Web as a useful information resource and Web search engines as retrieval tools when seeking information for academic and research work. The majority of respondents used the open Web and Web search engines on a daily or weekly basis to source academic and research information. The main obstacles presented by using the open Web and Web search engines included lack of time to search and browse the Web, information overload, poor network speed and the slow downloading speed of webpages.

  18. Undergraduates Prefer Federated Searching to Searching Databases Individually. A Review of: Belliston, C. Jeffrey, Jared L. Howland, & Brian C. Roberts. “Undergraduate Use of Federated Searching: A Survey of Preferences and Perceptions of Value-Added Functionality.” College & Research Libraries 68.6 (Nov. 2007): 472-86.

    Directory of Open Access Journals (Sweden)

    Genevieve Gore

    2008-09-01

    Objective – To determine whether use of federated searching by undergraduates saves time, meets their information needs, is preferred over searching databases individually, and provides results of higher quality. Design – Crossover study. Setting – Three American universities, all members of the Consortium of Church Libraries & Archives (CCLA): BYU (Brigham Young University), a large research university; BYUH (Brigham Young University – Hawaii), a small baccalaureate college; and BYUI (Brigham Young University – Idaho), a large baccalaureate college. Subjects – Ninety-five participants recruited via e-mail invitations sent to a random sample of currently enrolled undergraduates at BYU, BYUH, and BYUI. Methods – Participants were given written directions to complete a literature search for journal articles on two biology-related topics using two search methods: 1. federated searching with WebFeat® (implemented in the same way for this study at the three universities) and 2. a hyperlinked list of databases to search individually. Both methods used the same set of seven databases. Each topic was assigned in random order to one of the two search methods, also assigned in random order, for a total of two searches per participant. The time to complete the searches was recorded. Students compiled their list of citations, which were later normalized and graded. To analyze the quality of the citations, one quantitative rubric was created by librarians and one qualitative rubric was approved by a faculty member at BYU. The librarian-created rubric included the journal impact factor (from ISI’s Journal Citation Reports®), the proportion of citations from peer-reviewed journals (determined from Ulrichsweb.com™) to total citations, and the timeliness of the articles. The faculty-approved rubric included three criteria: relevance to the topic, quality of the individual citations (good quality: primary research results, peer-reviewed sources, and

  19. High-speed data search

    Science.gov (United States)

    Driscoll, James N.

    1994-01-01

    The high-speed data search system developed for KSC incorporates existing and emerging information retrieval technology to help a user intelligently and rapidly locate information found in large textual databases. This technology includes: natural language input; statistical ranking of retrieved information; an artificial intelligence concept called semantics, where 'surface level' knowledge found in text is used to improve the ranking of retrieved information; and relevance feedback, where user judgements about viewed information are used to automatically modify the search for further information. Semantics and relevance feedback are features of the system which are not available commercially. The system further demonstrates a focus on paragraphs of information to decide relevance, and it can be used (without modification) to intelligently search all kinds of document collections, such as collections of legal documents, medical documents, news stories, patents, and so forth. The purpose of this paper is to demonstrate the usefulness of statistical ranking, our semantic improvement, and relevance feedback.

  20. BioTCM-SE: a semantic search engine for the information retrieval of modern biology and traditional Chinese medicine.

    Science.gov (United States)

    Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui

    2014-01-01

    Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.

  1. BioTCM-SE: A Semantic Search Engine for the Information Retrieval of Modern Biology and Traditional Chinese Medicine

    Directory of Open Access Journals (Sweden)

    Xi Chen

    2014-01-01

    Full Text Available Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.

  2. Robust retrieval of fine art paintings

    Science.gov (United States)

    Smolka, Bogdan; Lukac, Rastislav; Plataniotis, Konstantinos N.; Venetsanopoulos, Anastasios N.

    2003-10-01

    The rapid growth of image archives increases the need for efficient and fast tools that can retrieve and search through large amounts of visual data. In this paper we propose an efficient method of extracting the image color content, which serves as an image digital signature, allowing efficient indexing and retrieval of the content of large, heterogeneous multimedia databases. We apply the proposed method to the retrieval of images from the WEBMUSEUM Internet database, which contains a collection of fine art images, and show that the new method of image color representation is robust to image distortions caused by resizing and compression and can be incorporated into existing retrieval systems which exploit the information on color content in digital images.

  3. On the Performance of Medical Information Retrieval using MeSH Terms – A Survey

    Directory of Open Access Journals (Sweden)

    Swetha S

    2014-09-01

    Full Text Available The number of Internet users has increased everywhere, and searching for and retrieving documents is now commonplace. Retrieving related documents from search engines is a difficult task. To retrieve the correct documents, knowledge of the search topic is essential. Even though separate search engines exist for retrieving medical documents, users are not familiar with MeSH (Medical Subject Headings) terms. So, the search browser and the MeSH terms have to be integrated to make the search effective and efficient. To implement this integration, SimpleMed and MeSHMed were introduced. The MeSH terms have to be ranked to know how frequently each has been used and how important it is. To rank them, a semi-automated tool called MeSHy was developed. The terms were extracted, filtered, ranked and displayed to the user. Classifiers have to be constructed to label documents as health or non-health. Three strategies were used to classify them. The errors commonly made by users also have to be identified; they were calculated based on the queries presented by the user to the search browser.

  4. Unusual clustering of coefficients of variation in published articles from a medical biochemistry department in India.

    Science.gov (United States)

    Hudes, Mark L; McCann, Joyce C; Ames, Bruce N

    2009-03-01

    A simple statistical method is described to test whether data are consistent with minimum statistical variability expected in a biological experiment. The method is applied to data presented in data tables in a subset of 84 articles among more than 200 published by 3 investigators in a small medical biochemistry department at a major university in India and to 29 "control" articles selected by key word PubMed searches. Major conclusions include: 1) unusual clustering of coefficients of variation (CVs) was observed for data from the majority of articles analyzed that were published by the 3 investigators from 2000-2007; unusual clustering was not observed for data from any of their articles examined that were published between 1992 and 1999; and 2) among a group of 29 control articles retrieved by PubMed key word, title, or title/abstract searches, unusually clustered CVs were observed in 3 articles. Two of these articles were coauthored by 1 of the 3 investigators, and 1 was from the same university but a different department. We are unable to offer a statistical or biological explanation for the unusual clustering observed.

  5. Medical Content Searching, Retrieving, and Sharing Over the Internet: Lessons Learned From the mEducator Through a Scenario-Based Evaluation

    Science.gov (United States)

    Spachos, Dimitris; Mylläri, Jarkko; Giordano, Daniela; Dafli, Eleni; Mitsopoulou, Evangelia; Schizas, Christos N; Pattichis, Constantinos; Nikolaidou, Maria

    2015-01-01

    Background The mEducator Best Practice Network (BPN) implemented and extended standards and reference models in e-learning to develop innovative frameworks as well as solutions that enable specialized state-of-the-art medical educational content to be discovered, retrieved, shared, and re-purposed across European Institutions, targeting medical students, doctors, educators and health care professionals. Scenario-based evaluation for usability testing, complemented with data from online questionnaires and field notes of users’ performance, was designed and utilized for the evaluation of these solutions. Objective The objective of this work is twofold: (1) to describe one instantiation of the mEducator BPN solutions (mEducator3.0 - “MEdical Education LINnked Arena” MELINA+) with a focus on the metadata schema used, as well as on other aspects of the system that pertain to usability and acceptance, and (2) to present evaluation results on the suitability of the proposed metadata schema for searching, retrieving, and sharing of medical content and with respect to the overall usability and acceptance of the system from the target users. Methods A comprehensive evaluation methodology framework was developed and applied to four case studies, which were conducted in four different countries (ie, Greece, Cyprus, Bulgaria and Romania), with a total of 126 participants. In these case studies, scenarios referring to creating, sharing, and retrieving medical educational content using mEducator3.0 were used. The data were collected through two online questionnaires, consisting of 36 closed-ended questions and two open-ended questions that referred to mEducator 3.0 and through the use of field notes during scenario-based evaluations. Results The main findings of the study showed that even though the informational needs of the mEducator target groups were addressed to a satisfactory extent and the metadata schema supported content creation, sharing, and retrieval from an end

  6. Semantic memory retrieval circuit: role of pre-SMA, caudate, and thalamus.

    Science.gov (United States)

    Hart, John; Maguire, Mandy J; Motes, Michael; Mudar, Raksha Anand; Chiang, Hsueh-Sheng; Womack, Kyle B; Kraut, Michael A

    2013-07-01

    We propose that pre-supplementary motor area (pre-SMA)-thalamic interactions govern processes fundamental to semantic retrieval of an integrated object memory. At the onset of semantic retrieval, pre-SMA initiates electrical interactions between multiple cortical regions associated with semantic memory subsystems encodings as indexed by an increase in theta-band EEG power. This starts between 100 and 150 ms after stimulus presentation and is sustained throughout the task. We posit that this activity represents initiation of the object memory search, which continues in searching for an object memory. When the correct memory is retrieved, there is a high beta-band EEG power increase, which reflects communication between pre-SMA and thalamus, designates the end of the search process, and results in object retrieval from multiple semantic memory subsystems. This high beta signal is also detected in cortical regions. This circuit is modulated by the caudate nuclei to facilitate correct and suppress incorrect target memories. Copyright © 2012 Elsevier Inc. All rights reserved.

  7. Concept similarity and related categories in information retrieval using formal concept analysis

    Science.gov (United States)

    Eklund, P.; Ducrou, J.; Dau, F.

    2012-11-01

    The application of formal concept analysis to the problem of information retrieval has been shown useful but has lacked any real analysis of the idea of relevance ranking of search results. SearchSleuth is a program developed to experiment with the automated local analysis of Web search using formal concept analysis. SearchSleuth extends a standard search interface to include a conceptual neighbourhood centred on a formal concept derived from the initial query. This neighbourhood of the concept derived from the search terms is decorated with its upper and lower neighbours representing more general and special concepts, respectively. SearchSleuth is in many ways an archetype of search engines based on formal concept analysis with some novel features. In SearchSleuth, the notion of related categories - which are themselves formal concepts - is also introduced. This allows the retrieval focus to shift to a new formal concept called a sibling. This movement across the concept lattice needs to relate one formal concept to another in a principled way. This paper presents the issues concerning exploring, searching, and ordering the space of related categories. The focus is on understanding the use and meaning of proximity and semantic distance in the context of information retrieval using formal concept analysis.

  8. Better Search Through Query Expansion Using Controlled Vocabularies and Apache Solr

    Directory of Open Access Journals (Sweden)

    Scott Williams

    2013-04-01

    Full Text Available This article describes how the University of Pennsylvania Museum of Archaeology and Anthropology (Penn Museum) modified its Solr-based discovery interface to improve recall and enable end users to benefit from the power of their in-house controlled vocabularies. These modifications automatically expand the query generated by any search term that matches their controlled vocabulary to include all related alternate and narrower terms. For example, if a user enters Ohio, that search will retrieve the record for an arrowhead found in Cincinnati (a narrower term of Ohio) even if that record does not include the term Ohio.
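
    The expansion step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the Penn Museum's implementation: the `VOCABULARY` structure, the function names, and the choice to follow narrower terms transitively are all assumptions made for the example, and the output is only a generic Lucene-style OR string.

```python
# Hypothetical controlled vocabulary (assumption): each lower-cased term maps
# to its alternate terms and its narrower terms.
VOCABULARY = {
    "ohio": {"alternate": [], "narrower": ["Cincinnati", "Columbus"]},
    "cincinnati": {"alternate": ["Cincinnati, Ohio"], "narrower": []},
}


def expand_term(term, vocabulary):
    """Return the term plus every alternate and narrower term reachable from it."""
    expanded = []
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        key = current.lower()
        if key in seen:
            continue  # guard against cycles and repeated terms
        seen.add(key)
        expanded.append(current)
        entry = vocabulary.get(key)
        if entry:
            stack.extend(entry["alternate"])
            stack.extend(entry["narrower"])
    return expanded


def build_query(term, vocabulary):
    """Join the expanded terms into a quoted boolean OR query string."""
    return " OR ".join('"{}"'.format(t) for t in expand_term(term, vocabulary))


print(build_query("Ohio", VOCABULARY))
```

    A term that is absent from the vocabulary passes through unchanged, so the expansion only affects searches that match a controlled term, which is the behaviour the abstract describes.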

  9. FRBRization of a Library Catalog: Better Collocation of Records, Leading to Enhanced Search, Retrieval, and Display

    Directory of Open Access Journals (Sweden)

    Timothy J. Dickey

    2008-03-01

    Full Text Available The Functional Requirements for Bibliographic Records’ (FRBR’s) hierarchical system defines families of bibliographic relationship between records and collocates them better than most extant bibliographic systems. Certain library materials (especially audio-visual formats) pose notable challenges to search and retrieval; the first benefits of a FRBRized system would be felt in music libraries, but research already has proven its advantages for fine arts, theology, and literature—the bulk of the non-science, technology, and mathematics collections. This report will summarize the benefits of FRBR to next-generation library catalogs and OPACs, and will review the handful of ILS and catalog systems currently operating with its theoretical structure.

  10. Human memory search

    NARCIS (Netherlands)

    Davelaar, E.J.; Raaijmakers, J.G.W.; Hills, T.T.; Robbins, T.W.; Todd, P.M.

    2012-01-01

    The importance of understanding human memory search is hard to exaggerate: we build and live our lives based on what we remember. This chapter explores the characteristics of memory search, with special emphasis on the use of retrieval cues. We introduce the dependent measures that are obtained

  11. In Search of Signature Pedagogy for PDS Teacher Education: A Review of Articles Published in "School-University Partnerships"

    Science.gov (United States)

    Yendol-Hoppey, Diane; Franco, Yvonne

    2014-01-01

    "In Search of Signature Pedagogy for PDS Teacher Education" is a review of articles published in "School-University Partnerships" which emerged in response to Shulman's critique that we do not possess powerful, consistent models of practice that we can define and have deeply studied. To these ends, we searched for Signature Pedagogy…

  12. A Secured Cognitive Agent based Multi-strategic Intelligent Search System

    Directory of Open Access Journals (Sweden)

    Neha Gulati

    2018-04-01

    Full Text Available Search Engine (SE) is the most preferred information retrieval tool ubiquitously used. In spite of vast scale involvement of users in SE’s, their limited capabilities to understand the user/searcher context and emotions place high cognitive, perceptual and learning load on the user to maintain the search momentum. In this regard, the present work discusses a Cognitive Agent (CA) based approach to support the user in Web-based search process. The work suggests a framework called Secured Cognitive Agent based Multi-strategic Intelligent Search System (CAbMsISS) to assist the user in search process. It helps to reduce the contextual and emotional mismatch between the SE’s and user. After implementation of the proposed framework, performance analysis shows that CAbMsISS framework improves Query Retrieval Time (QRT) and effectiveness for retrieving relevant results as compared to Present Search Engine (PSE). Supplementary to this, it also provides search suggestions when user accesses a resource previously tagged with negative emotions. Overall, the goal of the system is to enhance the search experience for keeping the user motivated. The framework provides suggestions through the search log that tracks the queries searched, resources accessed and emotions experienced during the search. The implemented framework also considers user security. Keywords: BDI model, Cognitive Agent, Emotion, Information retrieval, Intelligent search, Search Engine

  13. Can electronic search engines optimize screening of search results in systematic reviews: an empirical study.

    Science.gov (United States)

    Sampson, Margaret; Barrowman, Nicholas J; Moher, David; Clifford, Tammy J; Platt, Robert W; Morrison, Andra; Klassen, Terry P; Zhang, Li

    2006-02-24

    Most electronic search efforts directed at identifying primary studies for inclusion in systematic reviews rely on the optimal Boolean search features of search interfaces such as DIALOG and Ovid. Our objective is to test the ability of an Ultraseek search engine to rank MEDLINE records of the included studies of Cochrane reviews within the top half of all the records retrieved by the Boolean MEDLINE search used by the reviewers. Collections were created using the MEDLINE bibliographic records of included and excluded studies listed in the review and all records retrieved by the MEDLINE search. Records were converted to individual HTML files. Collections of records were indexed and searched through a statistical search engine, Ultraseek, using review-specific search terms. Our data sources, systematic reviews published in the Cochrane library, were included if they reported using at least one phase of the Cochrane Highly Sensitive Search Strategy (HSSS), provided citations for both included and excluded studies and conducted a meta-analysis using a binary outcome measure. Reviews were selected if they yielded between 1000-6000 records when the MEDLINE search strategy was replicated. Nine Cochrane reviews were included. Included studies within the Cochrane reviews were found within the first 500 retrieved studies more often than would be expected by chance. Across all reviews, recall of included studies into the top 500 was 0.70. There was no statistically significant difference in ranking when comparing included studies with just the subset of excluded studies listed as excluded in the published review. The relevance ranking provided by the search engine was better than expected by chance and shows promise for the preliminary evaluation of large results from Boolean searches. A statistical search engine does not appear to be able to make fine discriminations concerning the relevance of bibliographic records that have been pre-screened by systematic reviewers.
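
    The recall figure reported above (the share of included studies ranked within the top 500 records) corresponds to a recall-at-k measure. The sketch below shows that calculation for a single ranked list; the function name and the toy record identifiers are hypothetical, and how the study pooled results across its nine reviews is not stated in the abstract.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant records that appear in the first k ranked records."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Toy data (assumption): a ranked result list and the studies a review included.
ranked = ["r3", "r7", "r1", "r9", "r4", "r2"]
included = {"r1", "r2", "r3"}

print(recall_at_k(ranked, included, k=3))
```

    With k set to 500 and the ranked list taken from the search engine's output for one review, the same function would give that review's recall into the top 500.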

  14. The internet and intelligent machines: search engines, agents and robots

    International Nuclear Information System (INIS)

    Achenbach, S.; Alfke, H.

    2000-01-01

    The internet plays an important role in a growing number of medical applications. Finding relevant information is not always easy as the amount of available information on the Web is rising quickly. Even the best Search Engines can only collect links to a fraction of all existing Web pages. In addition, many of these indexed documents have been changed or deleted. The vast majority of information on the Web is not searchable with conventional methods. New search strategies, technologies and standards are combined in Intelligent Search Agents (ISA) and Robots, which can retrieve desired information in a specific approach. Conclusion: The article describes differences between ISAs and conventional Search Engines and how communication between Agents improves their ability to find information. Examples of existing ISAs are given and the possible influences on the current and future work in radiology are discussed. (orig.) [de

  15. Multimedia Information Retrieval

    CERN Document Server

    Rueger, Stefan

    2009-01-01

    At its very core multimedia information retrieval means the process of searching for and finding multimedia documents; the corresponding research field is concerned with building the best possible multimedia search engines. The intriguing bit here is that the query itself can be a multimedia excerpt: For example, when you walk around in an unknown place and stumble across an interesting landmark, would it not be great if you could just take a picture with your mobile phone and send it to a service that finds a similar picture in a database and tells you more about the building -- and about its

  16. Top-k Keyword Search Over Graphs Based On Backward Search

    Directory of Open Access Journals (Sweden)

    Zeng Jia-Hui

    2017-01-01

    Full Text Available Keyword search is one of the most friendly and intuitive information retrieval methods. Using keyword search to obtain a connected subgraph has many applications in graph-based cognitive computation, and it is a basic technology. This paper focuses on top-k keyword searching over graphs. We implemented a keyword search algorithm which applies the backward search idea. The algorithm first locates the keyword vertices, and then applies backward search to find rooted trees that contain the query keywords. The experiment shows that query time is affected by the iteration number of the algorithm.
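
    The backward search idea can be sketched as below. The abstract does not give the paper's algorithm in detail, so this is a generic simplification under stated assumptions: the graph is directed and unweighted, `labels` (a hypothetical name) maps each vertex to its keywords, and a candidate root is scored by the sum of its shortest distances to each keyword, which only approximates the weight of the answer tree.

```python
from collections import deque
import heapq


def backward_distances(reverse_adj, sources):
    """Multi-source BFS over reversed edges: distance from each vertex to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in reverse_adj.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def top_k_roots(edges, labels, keywords, k):
    """Return up to k (score, root) pairs for roots that can reach every keyword."""
    reverse_adj = {}
    for u, v in edges:
        reverse_adj.setdefault(v, []).append(u)
    per_keyword = []
    for kw in keywords:
        # Step 1: locate the vertices that contain this keyword.
        sources = [v for v, words in labels.items() if kw in words]
        # Step 2: expand backward from them along incoming edges.
        per_keyword.append(backward_distances(reverse_adj, sources))
    if not per_keyword:
        return []
    # A root must have been reached by the backward expansion of every keyword.
    candidates = set(per_keyword[0])
    for dist in per_keyword[1:]:
        candidates &= set(dist)
    scored = [(sum(d[v] for d in per_keyword), v) for v in candidates]
    return heapq.nsmallest(k, scored)


# Toy graph (assumption): a -> b -> d and a -> c -> e.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "e")]
labels = {"a": set(), "b": {"search"}, "c": set(), "d": {"graph"}, "e": {"search"}}

print(top_k_roots(edges, labels, ["graph", "search"], k=2))
```

    Running one breadth-first expansion per keyword keeps the cost proportional to the number of keywords times the graph size, at the price of ignoring edges shared between paths when scoring.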

  17. Information retrieval for the Cochrane systematic reviews: the case of breast cancer surgery

    Directory of Open Access Journals (Sweden)

    Gaetana Cognetti

    2015-03-01

    Full Text Available Introduction. Systematic reviews are fundamental sources of knowledge on the state-of-the-art interventions for various clinical problems. One of the essential components in carrying out a systematic review is that of developing a comprehensive literature search. Materials and methods. Three Cochrane systematic reviews published in 2012 were retrieved using the MeSH descriptor breast neoplasms/surgery, and analyzed with respect to the information sources used and the search strategies adopted. In March 2014, an update of one of the reviews retrieved was also considered in the study. Results. The number of databases queried for each review ranged between three and seven. All the reviews reported the search strategies adopted, however some only partially. All the reviews explicitly claimed that the searches applied no language restriction although sources such as the free database Lilacs (in Spanish and Portuguese) was not consulted. Conclusion. To improve the quality it is necessary to apply standards in carrying out systematic reviews (as laid down in the MECIR project). To meet these standards concerning literature searching, professional information retrieval specialist staff should be involved. The peer review committee in charge of evaluating the publication of a systematic review should also include specialists in information retrieval for assessing the quality of the literature search.

  18. Information retrieval for the Cochrane systematic reviews: the case of breast cancer surgery.

    Science.gov (United States)

    Cognetti, Gaetana; Grossi, Laura; Lucon, Antonio; Solimini, Renata

    2015-01-01

    Systematic reviews are fundamental sources of knowledge on the state-of-the-art interventions for various clinical problems. One of the essential components in carrying out a systematic review is that of developing a comprehensive literature search. Three Cochrane systematic reviews published in 2012 were retrieved using the MeSH descriptor breast neoplasms/surgery, and analyzed with respect to the information sources used and the search strategies adopted. In March 2014, an update of one of the reviews retrieved was also considered in the study. The number of databases queried for each review ranged between three and seven. All the reviews reported the search strategies adopted, however some only partially. All the reviews explicitly claimed that the searches applied no language restriction although sources such as the free database Lilacs (in Spanish and Portuguese) was not consulted. To improve the quality it is necessary to apply standards in carrying out systematic reviews (as laid down in the MECIR project). To meet these standards concerning literature searching, professional information retrieval specialist staff should be involved. The peer review committee in charge of evaluating the publication of a systematic review should also include specialists in information retrieval for assessing the quality of the literature search.

  19. Survey the role of emotions in information retrieval

    Directory of Open Access Journals (Sweden)

    Hassan Behzadi

    2016-03-01

    Full Text Available The present study was conducted to identify users' emotions in the various stages of information retrieval, based on the information retrieval model in the web. From the methodological perspective, the present study is experimental, and the type of study is practical. The population comprised all MA students majoring in different humanistic science branches and studying at Imam Reza international university. The sample consisted of 30 participants. The sample size was determined through stratified random sampling via G*power software. Data collection was carried out by using a questionnaire on demographics and prior experience of using the internet, a post-search questionnaire, and recorded videos of users' faces. The findings of the study demonstrated that: 1) during the initial stages of searching, the emotion of apprehension, and during the link tracking stage, negative emotions in general (49/3 percent overall), were more frequent than the other emotions; in the browsing and differentiation stages, the emotion of happy was more frequent than the other emotions. 2) These variances resulted in significant relations among different emotions of the users throughout the four stages of information retrieval. 3) In simple search, the respondents displayed the emotion of happy most frequently and the emotion of aversion least frequently. On the other hand, in complicated search, apprehension and aversion were the most and the least frequently cited emotions, respectively. Overall, negative emotions were reported more frequently in complicated search than in simple search. This demonstrated that any change in the difficulty level of the search task would cause users to exhibit different types of emotions.

  20. 'Meatball searching' - The adversarial approach to online information retrieval

    Science.gov (United States)

    Jack, R. F.

    1985-01-01

    It is proposed that the different styles of online searching can be described as either formal (highly precise) or informal, with the needs of the client dictating which is most applicable at a particular moment. The background and personality of the searcher also come into play. Particular attention is focused on meatball searching, which is a form of online searching characterized by deliberate vagueness. It requires generally comprehensive searches, often on unusual topics and with tight deadlines. It is most likely to occur in search centers serving many different disciplines and levels of client information sophistication. Various information needs are outlined as well as the laws of meatball searching and the adversarial approach. Traits and characteristics important to successful searching include: (1) concept analysis, (2) flexibility of thinking, (3) ability to think in synonyms and (4) anticipation of variant word forms and spellings.

  1. Information retrieval in digital environments

    CERN Document Server

    Dinet, Jérôme

    2014-01-01

    Information retrieval is a central and essential activity. It is indeed difficult to find a human activity that does not need to retrieve information in an environment which is often increasingly digital: moving and navigating, learning, having fun, communicating, informing, making a decision, etc. Most human activities are intimately linked to our ability to search quickly and effectively for relevant information; the stakes are sometimes extremely important: passing an exam, voting, finding a job, remaining autonomous, being socially connected, developing a critical spirit, or simply surviv

  2. Searching in the Context of a Task: A Review of Methods and Tools

    Directory of Open Access Journals (Sweden)

    Ana Maguitman

    2018-04-01

    Full Text Available Contextual information extracted from the user task can help to better target retrieval to task-relevant content. In particular, topical context can be exploited to identify the subject of the information needs, contributing to reduce the information overload problem. A great number of methods exist to extract raw context data and contextual interaction patterns from the user task and to model this information using higher-level representations. Context can then be used as a source for automatic query generation, or as a means to refine or disambiguate user-generated queries. It can also be used to filter and rank results as well as to select domain-specific search engines with better capabilities to satisfy specific information requests. This article reviews methods that have been applied to deal with the problem of reflecting the current and long-term interests of a user in the search process. It discusses major difficulties encountered in the research area of context-based information retrieval and presents an overview of tools proposed since the mid-nineties to deal with the problem of context-based search.

  3. Nuclear expert web search and crawler algorithm

    International Nuclear Information System (INIS)

    Reis, Thiago; Barroso, Antonio C.O.; Baptista, Benedito Filho D.

    2013-01-01

    In this paper we present preliminary research on a web search and crawling algorithm applied specifically to nuclear-related web information. We designed a web-based nuclear-oriented expert system guided by a web crawler algorithm and a neural network able to search and retrieve nuclear-related hypertextual web information in an autonomous and massive fashion. Preliminary experimental results show a retrieval precision of 80% for web pages related to any nuclear theme and a retrieval precision of 72% for web pages related only to the nuclear power theme. (author)

  4. Nuclear expert web search and crawler algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Reis, Thiago; Barroso, Antonio C.O.; Baptista, Benedito Filho D., E-mail: thiagoreis@usp.br, E-mail: barroso@ipen.br, E-mail: bdbfilho@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil)

    2013-07-01

    In this paper we present preliminary research on a web search and crawling algorithm applied specifically to nuclear-related web information. We designed a web-based, nuclear-oriented expert system guided by a web crawler algorithm and a neural network, able to search and retrieve nuclear-related hypertextual web information in an autonomous and massive fashion. Preliminary experimental results show a retrieval precision of 80% for web pages related to any nuclear theme and a retrieval precision of 72% for web pages related only to the nuclear power theme. (author)

  5. Content-based video retrieval by example video clip

    Science.gov (United States)

    Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed

    1997-01-01

    This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.

  6. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  7. A Typed Text Retrieval Query Language for XML Documents.

    Science.gov (United States)

    Colazzo, Dario; Sartiani, Carlo; Albano, Antonio; Manghi, Paolo; Ghelli, Giorgio; Lini, Luca; Paoli, Michele

    2002-01-01

    Discussion of XML focuses on a description of Tequyla-TX, a typed text retrieval query language for XML documents that can search on both content and structures. Highlights include motivations; numerous examples; word-based and char-based searches; tag-dependent full-text searches; text normalization; query algebra; data models and term language;…

  8. Flexible patient information search and retrieval framework: pilot implementation

    Science.gov (United States)

    Erdal, Selnur; Catalyurek, Umit V.; Saltz, Joel; Kamal, Jyoti; Gurcan, Metin N.

    2007-03-01

    Medical centers collect and store a significant amount of valuable data pertaining to patients' visits in the form of medical free text. In addition, standardized diagnosis codes (International Classification of Diseases, Ninth Revision, Clinical Modification: ICD9-CM) related to those dictated reports are usually available. In this work, we have created a framework in which image searches can be initiated through a combination of free-text reports and ICD9 codes. This framework enables a more comprehensive, systematic search of existing large sets of patient data. The free-text search is enriched by computer-aided inclusion of additional search terms drawn from a thesaurus. This enriched search gives users access to a larger set of relevant results from a patient-centric PACS in a simpler way. Such a framework is therefore of particular use in tasks such as gathering images for desired patient populations, building disease models, and so on. As the motivating application of our framework, we implemented a search engine. This search engine processed two years of patient data from the OSU Medical Center's Information Warehouse and identified lung nodule location information using a combination of UMLS Meta-Thesaurus enhanced text report searches along with ICD9 code searches on patients that have been discharged. Five different queries with various ICD9 codes involving lung cancer were carried out on 172552 cases. Each search was completed in under a minute on average per ICD9 code, and the inclusion of the UMLS thesaurus increased the number of relevant cases by 45% on average.
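
    The combination of thesaurus-enriched free-text search and diagnosis-code filtering described in this record can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy thesaurus stands in for the UMLS Meta-Thesaurus, the three reports are invented, and the function names `expand` and `search_reports` are hypothetical rather than part of the published framework.

```python
def expand(terms, thesaurus):
    """Return the query terms plus any synonyms listed in the thesaurus."""
    expanded = set()
    for term in terms:
        key = term.lower()
        expanded.add(key)
        expanded.update(s.lower() for s in thesaurus.get(key, []))
    return expanded


def search_reports(reports, terms, icd9_codes, thesaurus=None):
    """Return ids of reports that carry a wanted code and mention a wanted term."""
    wanted_terms = expand(terms, thesaurus or {})
    wanted_codes = set(icd9_codes)
    hits = []
    for report in reports:
        text = report["text"].lower()
        has_code = bool(report["icd9"] & wanted_codes)
        has_term = any(term in text for term in wanted_terms)
        if has_code and has_term:
            hits.append(report["id"])
    return hits


# Invented toy data; the codes are treated as opaque strings.
REPORTS = [
    {"id": "r1", "text": "Nodule in right upper lobe", "icd9": {"162.9"}},
    {"id": "r2", "text": "Pulmonary lesion noted in left lung", "icd9": {"162.9"}},
    {"id": "r3", "text": "Small nodule seen", "icd9": {"486"}},
]
TOY_THESAURUS = {"nodule": ["lesion", "mass"]}

plain = search_reports(REPORTS, ["nodule"], ["162.9"])
enriched = search_reports(REPORTS, ["nodule"], ["162.9"], TOY_THESAURUS)
```

    In this toy setting the enriched query also reaches the report that says "lesion" instead of "nodule", which is the kind of gain in relevant cases the abstract attributes to thesaurus expansion.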

  9. Document image retrieval through word shape coding.

    Science.gov (United States)

    Lu, Shijian; Li, Linlin; Tan, Chew Lim

    2008-11-01

    This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.
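
    The idea of indexing words by shape rather than by recognized characters can be sketched as follows. This is a simplified illustration, not the authors' scheme: it derives the shape features from typed characters instead of word images, it uses only ascenders, descenders and holes (water reservoirs are omitted), and the character sets and function names are assumptions.

```python
from collections import defaultdict

# Assumed lowercase character classes for a typical Latin typeface.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
HOLES = set("abdegopq")


def char_shape(ch):
    """Encode one character as height class (a/d/x) plus 'o' if it has a hole."""
    ch = ch.lower()
    if ch in ASCENDERS:
        height = "a"
    elif ch in DESCENDERS:
        height = "d"
    else:
        height = "x"
    return height + ("o" if ch in HOLES else "")


def word_shape_code(word):
    """Join per-character shapes into a code for the whole word."""
    return "-".join(char_shape(ch) for ch in word if ch.isalpha())


def build_index(docs):
    """Map each word shape code to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word_shape_code(word)].add(doc_id)
    return index


def search(index, keyword):
    """Retrieve documents whose words share the keyword's shape code."""
    return sorted(index.get(word_shape_code(keyword), set()))


DOCS = {"d1": "the dog barked", "d2": "a quiet night"}
INDEX = build_index(DOCS)
```

    Distinct words can share a code (here "bad" and "dab" collide), so a shape-code index trades some precision for the ability to skip character recognition.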

  10. Features of the search profiles in the INIS-RS service

    International Nuclear Information System (INIS)

    Komatsubara, Yasutoshi

    1982-01-01

    A report is presented on the INIS-RS service, which has been provided to the nuclear community in Japan since 1979. Brief information on the INIS database and the retrieval system is given first. Analyses are made of the 322 items to reveal the composition and characteristics of the search profiles processed at JAERI. Results are shown on the execution ratios of preliminary search and of ranking retrieval with weighted descriptors. The numbers of search terms and of logical operators used in each query are described, together with the correlation between the number of terms and the number of AND(*) operators. Descriptions are also given of the relevance ratio of the retrieval and the number of documents retrieved. (author)

  11. Myanmar Language Search Engine

    OpenAIRE

    Pann Yu Mon; Yoshiki Mikami

    2011-01-01

    With the enormous growth of the World Wide Web, search engines play a critical role in retrieving information from the borderless Web. Although many search engines are available for the major languages, they are not very proficient for less computerized languages, including Myanmar. The main reason is that those search engines do not consider the specific features of those languages. A search engine which is capable of searching the Web documents written in those languages is highly n...

  12. Retrieval practice enhances the ability to evaluate complex physiology information.

    Science.gov (United States)

    Dobson, John; Linderholm, Tracy; Perez, Jose

    2018-05-01

    Many investigations have shown that retrieval practice enhances the recall of different types of information, including both medical and physiological, but the effects of the strategy on higher-order thinking, such as evaluation, are less clear. The primary aim of this study was to compare how effectively retrieval practice and repeated studying (i.e. reading) strategies facilitated the evaluation of two research articles that advocated dissimilar conclusions. A secondary aim was to determine if that comparison was affected by using those same strategies to first learn important contextual information about the articles. Participants were randomly assigned to learn three texts that provided background information about the research articles either by studying them four consecutive times (Text-S) or by studying and then retrieving them two consecutive times (Text-R). Half of both the Text-S and Text-R groups were then randomly assigned to learn two physiology research articles by studying them four consecutive times (Article-S) and the other half learned them by studying and then retrieving them two consecutive times (Article-R). Participants then completed two assessments: the first tested their ability to critique the research articles and the second tested their recall of the background texts. On the article critique assessment, the Article-R groups' mean scores of 33.7 ± 4.7% and 35.4 ± 4.5% (Text-R then Article-R group and Text-S then Article-R group, respectively) were both significantly (p Retrieval practice promoted superior critical evaluation of the research articles, and the results also indicated the strategy enhanced the recall of background information. © 2018 John Wiley & Sons Ltd and The Association for the Study of Medical Education.

  13. How Are Searching and Reading Intertwined during Retrieval from Hierarchically Structured Documents?

    DEFF Research Database (Denmark)

    Hertzum, Morten; Lalmas, M.; Frøkjær, Erik

    2001-01-01

    Effective use of information retrieval systems requires that users know when to – temporarily – cease searching to do some reading and where to start reading. In hierarchically structured documents, users can to some extent interchange searching and reading by entering the text at different levels...... information retrieval systems could exploit document structure to return the best points to support reading, rather than merely hits...

  14. Computer-Assisted Search Of Large Textual Data Bases

    Science.gov (United States)

    Driscoll, James R.

    1995-01-01

    "QA" denotes high-speed computer system for searching diverse collections of documents including (but not limited to) technical reference manuals, legal documents, medical documents, news releases, and patents. Incorporates previously available and emerging information-retrieval technology to help user intelligently and rapidly locate information found in large textual data bases. Technology includes provision for inquiries in natural language; statistical ranking of retrieved information; artificial-intelligence implementation of semantics, in which "surface level" knowledge found in text used to improve ranking of retrieved information; and relevance feedback, in which user's judgements of relevance of some retrieved documents used automatically to modify search for further information.

  15. Search of associative memory.

    NARCIS (Netherlands)

    Raaijmakers, J.G.W.; Shiffrin, R.M.

    1981-01-01

    Describes search of associative memory (SAM), a general theory of retrieval from long-term memory that combines features of associative network models and random search models. It posits cue-dependent probabilistic sampling and recovery from an associative network, but the network is specified as a

  16. Compact binary hashing for music retrieval

    Science.gov (United States)

    Seo, Jin S.

    2014-03-01

    With the huge volume of music clips available for protection, browsing, and indexing, increasing attention is being paid to retrieving the information content of music archives. Music-similarity computation is an essential building block for browsing, retrieval, and indexing of digital music archives. In practice, as the number of songs available for searching and indexing increases, the storage cost in retrieval systems is becoming a serious problem. This paper deals with the storage problem by extending the supervector concept with binary hashing. We utilize similarity-preserving binary embedding to generate a hash code from the supervector of each music clip. In particular, we compare the performance of various binary hashing methods for music retrieval tasks on the widely used genre dataset and an in-house singer dataset. Through the evaluation, we find an effective way of generating hash codes for music similarity estimation which improves the retrieval performance.
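
    One common way to obtain a similarity-preserving binary code is random-hyperplane hashing, sketched below. The abstract does not say which hashing methods the paper compares, so this technique is swapped in purely for illustration; the supervector is assumed to be a plain list of floats, and all function names are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Pack the sign of each projection into one integer, one bit per plane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, codes):
    """Order stored clip names by Hamming distance to the query code."""
    return sorted(codes, key=lambda name: hamming(query_code, codes[name]))


PLANES = make_hyperplanes(dim=4, n_bits=16)
CLIP = [0.5, -1.0, 2.0, 0.25]            # invented stand-in for a supervector
OPPOSITE = [-x for x in CLIP]
CODES = {
    "same": hash_code(CLIP, PLANES),
    "opposite": hash_code(OPPOSITE, PLANES),
}
```

    Each clip is then stored as a 16-bit integer instead of a float vector, and similarity search reduces to XOR and bit counting, which is where the storage saving mentioned in the abstract would come from.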

  17. The role of central attention in retrieval from visual short-term memory.

    Science.gov (United States)

    Magen, Hagit

    2017-04-01

    The role of central attention in visual short-term memory (VSTM) encoding and maintenance is well established, yet its role in retrieval has been largely unexplored. This study examined the involvement of central attention in retrieval from VSTM using a dual-task paradigm. Participants performed a color change-detection task. Set size varied between 1 and 3 items, and the memory sample was maintained for either a short or a long delay period. A secondary tone discrimination task was introduced at the end of the delay period, shortly before the appearance of a central probe, and occupied central attention while participants were searching within VSTM representations. Similarly to numerous previous studies, reaction time increased as a function of set size, reflecting the occurrence of a capacity-limited memory search. When the color targets were maintained over a short delay, memory was searched for the most part without the involvement of central attention. However, with a longer delay period, the search relied entirely on the operation of central attention. Taken together, this study demonstrates that central attention is involved in retrieval from VSTM, but the extent of its involvement depends on the duration of the delay period. Future studies will determine whether the type of memory search (parallel or serial) carried out during retrieval depends on the nature of the attentional mechanism involved in the task.

  18. Generating Concise Rules for Human Motion Retrieval

    Science.gov (United States)

    Mukai, Tomohiko; Wakisaka, Ken-Ichi; Kuriyama, Shigeru

    This paper proposes a method for retrieving human motion data with concise retrieval rules based on the spatio-temporal features of motion appearance. Our method first converts motion clip into a form of clausal language that represents geometrical relations between body parts and their temporal relationship. A retrieval rule is then learned from the set of manually classified examples using inductive logic programming (ILP). ILP automatically discovers the essential rule in the same clausal form with a user-defined hypothesis-testing procedure. All motions are indexed using this clausal language, and the desired clips are retrieved by subsequence matching using the rule. Such rule-based retrieval offers reasonable performance and the rule can be intuitively edited in the same language form. Consequently, our method enables efficient and flexible search from a large dataset with simple query language.

  19. The effectiveness of search dogs compared with humans in searching difficult terrain at turbine sites for bat fatalities

    Energy Technology Data Exchange (ETDEWEB)

    Mathews, Fiona

    2011-07-01

    Many wind farms in the UK and elsewhere in northern Europe are situated in habitat with dense tall vegetation such as arable fields and upland heaths. This makes surveying for bat fatalities extremely difficult. To facilitate a multi-centre study of the effects of wind turbines on British bats, we have therefore conducted controlled trials of the relative success of trained search dogs and ecologists in retrieving bat carcasses. Although dogs have been used previously in ecological surveys for bats, this is the first time they have been specifically trained for use in 'difficult to survey' habitats. Two ecologists and two Labrador dogs with handlers were each given the opportunity to retrieve up to 45 bat carcasses in a range of habitat types. Their efficiency in terms of overall search time, costs, and retrieval abilities was evaluated. Our results indicate that high rates of retrieval can be achieved by dogs, even in dense vegetation up to 75 cm high. Further, a typical 100 m² search area can be surveyed in less than half the time taken by humans. The limitations of using search dogs, and their ability to detect the presence of bats that have been scavenged, are also presented (presentation supported with video footage). (Author)

  20. WAIS Searching of the Current Contents Database

    Science.gov (United States)

    Banholzer, P.; Grabenstein, M. E.

    The Homer E. Newell Memorial Library of NASA's Goddard Space Flight Center is developing capabilities to permit Goddard personnel to access electronic resources of the Library via the Internet. The Library's support services contractor, Maxima Corporation, and their subcontractor, SANAD Support Technologies have recently developed a World Wide Web Home Page (http://www-library.gsfc.nasa.gov) to provide the primary means of access. The first searchable database to be made available through the HomePage to Goddard employees is Current Contents, from the Institute for Scientific Information (ISI). The initial implementation includes coverage of articles from the last few months of 1992 to present. These records are augmented with abstracts and references, and often are more robust than equivalent records in bibliographic databases that currently serve the astronomical community. Maxima/SANAD selected Wais Incorporated's WAIS product with which to build the interface to Current Contents. This system allows access from Macintosh, IBM PC, and Unix hosts, which is an important feature for Goddard's multiplatform environment. The forms interface is structured to allow both fielded (author, article title, journal name, id number, keyword, subject term, and citation) and unfielded WAIS searches. The system allows a user to: Retrieve individual journal article records. Retrieve Table of Contents of specific issues of journals. Connect to articles with similar subject terms or keywords. Connect to other issues of the same journal in the same year. Browse journal issues from an alphabetical list of indexed journal names.

  1. Supplementary searches of PubMed to improve currency of MEDLINE and MEDLINE In-Process searches via Ovid.

    Science.gov (United States)

    Duffy, Steven; de Kock, Shelley; Misso, Kate; Noake, Caro; Ross, Janine; Stirk, Lisa

    2016-10-01

    The research investigated whether conducting a supplementary search of PubMed in addition to the main MEDLINE (Ovid) search for a systematic review is worthwhile and to ascertain whether this PubMed search can be conducted quickly and if it retrieves unique, recently published, and ahead-of-print studies that are subsequently considered for inclusion in the final systematic review. Searches of PubMed were conducted after MEDLINE (Ovid) and MEDLINE In-Process (Ovid) searches had been completed for seven recent reviews. The searches were limited to records not in MEDLINE or MEDLINE In-Process (Ovid). Additional unique records were identified for all of the investigated reviews. Search strategies were adapted quickly to run in PubMed, and reviewer screening of the results was not time consuming. For each of the investigated reviews, studies were ordered for full screening; in six cases, studies retrieved from the supplementary PubMed searches were included in the final systematic review. Supplementary searching of PubMed for studies unavailable elsewhere is worthwhile and improves the currency of the systematic reviews.

  2. IMPROVING PERSONALIZED WEB SEARCH USING BOOKSHELF DATA STRUCTURE

    Directory of Open Access Journals (Sweden)

    S.K. Jayanthi

    2012-10-01

    Search engines play a vital role in retrieving relevant information for the web user. In this research work, a user-profile-based web search is proposed, so that web users from different domains may receive different sets of results. The main challenge is to provide relevant results at the right level of reading difficulty. Estimating user expertise and re-ranking the results are the main aspects of this paper. The retrieved results are arranged in a Bookshelf Data Structure for easy access. Better presentation of search results hence significantly increases the usability of web search engines in visual mode.

  3. Routes to the past: neural substrates of direct and generative autobiographical memory retrieval.

    Science.gov (United States)

    Addis, Donna Rose; Knapp, Katie; Roberts, Reece P; Schacter, Daniel L

    2012-02-01

    Models of autobiographical memory propose two routes to retrieval depending on cue specificity. When available cues are specific and personally-relevant, a memory can be directly accessed. However, when available cues are generic, one must engage a generative retrieval process to produce more specific cues to successfully access a relevant memory. The current study sought to characterize the neural bases of these retrieval processes. During functional magnetic resonance imaging (fMRI), participants were shown personally-relevant cues to elicit direct retrieval, or generic cues (nouns) to elicit generative retrieval. We used spatiotemporal partial least squares to characterize the spatial and temporal characteristics of the networks associated with direct and generative retrieval. Both retrieval tasks engaged regions comprising the autobiographical retrieval network, including hippocampus, and medial prefrontal and parietal cortices. However, some key neural differences emerged. Generative retrieval differentially recruited lateral prefrontal and temporal regions early on during the retrieval process, likely supporting the strategic search operations and initial recovery of generic autobiographical information. However, many regions were activated more strongly during direct versus generative retrieval, even when we time-locked the analysis to the successful recovery of events in both conditions. This result suggests that there may be fundamental differences between memories that are accessed directly and those that are recovered via the iterative search and retrieval process that characterizes generative retrieval. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Biomedical information retrieval across languages.

    Science.gov (United States)

    Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger

    2007-06-01

    This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.

  5. An ontology-based search engine for protein-protein interactions.

    Science.gov (United States)

    Park, Byungkyu; Han, Kyungsook

    2010-01-18

    Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
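
    The use of prime factorization to match annotations against more general ontology terms can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' modified Gödel numbering: each term simply receives its own prime, an annotation is encoded as the product of the primes of the term and all of its ancestors, and the three-level toy ontology and every function name are hypothetical.

```python
def primes():
    """Yield prime numbers in increasing order by trial division."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def ancestors(term, parents):
    """Return the term together with every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def encode(terms, parents, prime_of):
    """Multiply the primes of the annotated terms and all their ancestors."""
    closure = set()
    for term in terms:
        closure |= ancestors(term, parents)
    code = 1
    for term in closure:
        code *= prime_of[term]
    return code


def matches(code, query_terms, prime_of):
    """A code satisfies a query if every query term's prime divides it."""
    return all(code % prime_of[term] == 0 for term in query_terms)


# Invented toy ontology: child term -> list of parent terms.
PARENTS = {
    "kinase binding": ["protein binding"],
    "protein binding": ["binding"],
    "catalytic activity": [],
}
ALL_TERMS = set(PARENTS) | {p for ps in PARENTS.values() for p in ps}
PRIME_OF = dict(zip(sorted(ALL_TERMS), primes()))

protein_code = encode(["kinase binding"], PARENTS, PRIME_OF)
```

    A protein annotated only with the specific term "kinase binding" still answers a query for the broader "protein binding", because the broader term's prime is a factor of the protein's code; a keyword match on the annotation string would miss it.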

  6. A Survey: Framework of an Information Retrieval for Malay Translated Hadith Document

    Directory of Open Access Journals (Sweden)

    Zulkefli Nurul Syeilla Syazhween

    2017-01-01

    This paper reviews and analyses the limitations of the existing methods used in the IR process for retrieving Malay Translated Hadith documents related to a search request. Traditional Malay Translated Hadith retrieval systems have not focused on semantic extraction from text. The bag-of-words representation ignores the conceptual similarity of information in the query text and documents, which produces unsatisfactory retrieval results. Therefore, a more efficient IR framework is needed. This paper claims that significant information extraction and subject-related information are important because the clues from this information can be used to search for and find the documents relevant to a query. Also, unimportant information can be discarded when representing the document content. Semantic understanding of the query and the document is therefore necessary to improve the effectiveness and accuracy of retrieval results for this domain. Further research is needed and will be carried out in future work. It is hoped that it will help users to search for and find information relating to Malay Translated Hadith documents.

  7. Using Google Search Appliance (GSA) to search digital library collections: A Case Study of the INIS Collection Search

    International Nuclear Information System (INIS)

    Savic, Dobrica

    2014-01-01

    Google Search has established a new standard for information retrieval which did not exist with previous generations of library search facilities. The INIS hosts one of the world’s largest collections of published information on the peaceful uses of nuclear science and technology. It offers on-line access to a unique collection of 3.6 million bibliographic records and 483,000 full texts of non-conventional (grey) literature. This large digital library collection suffered from most of the well-known shortcomings of the classic library catalogue. Searching was complex and complicated, it required training in Boolean logic, full-text searching was not an option, and response time was slow. An opportune moment to improve the system came with the retirement of the previous catalogue software and the adoption of GSA as an organization-wide search engine standard. INIS was quick to realize the potential of using such a well-known application to replace its on-line catalogue. This paper presents the advantages and disadvantages encountered during three years of GSA use. Based on specific INIS-based practice and experience, this paper also offers some guidelines on ways to improve classic collections of millions of bibliographic and full-text documents, while reaping multiple benefits, such as increased use, accessibility, usability, expandability and improving user search and retrieval experiences. (author)

  8. Searching for Suicide Methods: Accessibility of Information About Helium as a Method of Suicide on the Internet.

    Science.gov (United States)

    Gunnell, David; Derges, Jane; Chang, Shu-Sen; Biddle, Lucy

    2015-01-01

    Helium gas suicides have increased in England and Wales; easy-to-access descriptions of this method on the Internet may have contributed to this rise. To investigate the availability of information on using helium as a method of suicide and trends in searching about this method on the Internet. We analyzed trends in (a) Google searching (2004-2014) and (b) hits on a Wikipedia article describing helium as a method of suicide (2013-2014). We also investigated the extent to which helium was described as a method of suicide on web pages and discussion forums identified via Google. We found no evidence of rises in Internet searching about suicide using helium. News stories about helium suicides were associated with increased search activity. The Wikipedia article may have been temporarily altered to increase awareness of suicide using helium around the time of a celebrity suicide. Approximately one third of the links retrieved using Google searches for suicide methods mentioned helium. Information about helium as a suicide method is readily available on the Internet; the Wikipedia article describing its use was highly accessed following celebrity suicides. Availability of online information about this method may contribute to rises in helium suicides.

  9. Information retrieval for systematic reviews in food and feed topics: A narrative review.

    Science.gov (United States)

    Wood, Hannah; O'Connor, Annette; Sargeant, Jan; Glanville, Julie

    2018-01-09

    Systematic review methods are now being used for reviews of food production, food safety and security, plant health, and animal health and welfare. Information retrieval methods in this context have been informed by human health-care approaches and ideally should be based on relevant research and experience. This narrative review seeks to identify and summarize current research-based evidence and experience on information retrieval for systematic reviews in food and feed topics. MEDLINE (Ovid), Science Citation Index (Web of Science), and ScienceDirect (http://www.sciencedirect.com/) were searched in 2012 and 2016. We also contacted topic experts and undertook citation searches. We selected and summarized studies reporting research on information retrieval, as well as published guidance and experience. There is little published evidence on the most efficient way to conduct searches for food and feed topics. There are few available study design search filters, and their use may be problematic given poor or inconsistent reporting of study methods. Food and feed research makes use of a wide range of study designs so it might be best to focus strategy development on capturing study populations, although this also has challenges. There is limited guidance on which resources should be searched and whether publication bias in disciplines relevant to food and feed necessitates extensive searching of the gray literature. There is some limited evidence on information retrieval approaches, but more research is required to inform effective and efficient approaches to searching to populate food and feed reviews. Copyright © 2018 John Wiley & Sons, Ltd.

  10. Search and Retrieval of Foreign Objects for the Steam Generator of Wolsung NPP Unit 1

    Energy Technology Data Exchange (ETDEWEB)

    Jeong, Woo-Tae; Lee, Kyung-Ho [KHNP CRI, Daejeon (Korea, Republic of)

    2016-10-15

    We developed a foreign object search and retrieval (FOSAR) system for Wolsung NPP unit 1 steam generators. The steam generators of Wolsung NPP unit 1 have one 2.5 inch hand hole and two 4 inch hand holes. The FOSAR system was designed to be installed through the 4 inch hand holes. Using a permanent magnet, the FOSAR system was firmly attached to the vertical annulus wall of the steam generator. We successfully developed the FOSAR system for Wolsung NPP unit 1. Using the developed FOSAR system, technicians successfully found and removed various foreign objects. Most of the foreign objects we found were made of carbon steel sheet, so the magnet tool was the most useful for removing them. The alligator tool was sometimes used. Based on the experience gained during the FOSAR activities, we are developing a lancing system for Wolsung NPP unit 1. It will be designed and manufactured by November 2016.

  11. Search and Retrieval of Foreign Objects for the Steam Generator of Wolsung NPP Unit 1

    International Nuclear Information System (INIS)

    Jeong, Woo-Tae; Lee, Kyung-Ho

    2016-01-01

    We developed a foreign object search and retrieval (FOSAR) system for the steam generators of Wolsung NPP unit 1. These steam generators have one 2.5 inch hand hole and two 4 inch hand holes, and the FOSAR system was designed to be installed through the 4 inch hand holes. A permanent magnet holds the system firmly against the vertical annulus wall of the steam generator. Using the FOSAR system, technicians successfully found and removed various foreign objects. Most of the objects found were made of carbon steel sheet, so the magnet tool was the most useful for removing them; the alligator tool was sometimes used. Based on the experience gained during the FOSAR activities, we are developing a lancing system for Wolsung NPP unit 1, to be designed and manufactured by November 2016.

  12. Usability Testing of a Large, Multidisciplinary Library Database: Basic Search and Visual Search

    Directory of Open Access Journals (Sweden)

    Jody Condit Fagan

    2006-09-01

    Visual search interfaces have been shown by researchers to assist users with information search and retrieval. Recently, several major library vendors have added visual search interfaces or functions to their products. For public service librarians, perhaps the most critical area of interest is the extent to which visual search interfaces and text-based search interfaces support research. This study presents the results of eight full-scale usability tests of both the EBSCOhost Basic Search and Visual Search in the context of a large liberal arts university.

  13. Experiences with automated categorization in e-government information retrieval

    DEFF Research Database (Denmark)

    Jonasen, Tanja Svarre; Lykke, Marianne

    2014-01-01

    High-precision search results are essential for supporting e-government employees’ information tasks. Prior studies have shown that existing features of e-government retrieval systems need improvement in terms of search facilities (e.g., Goh et al. 2008), navigation (e.g., de Jong and Lentz 2006)...

  14. Usability evaluation of an experimental text summarization system and three search engines: implications for the reengineering of health care interfaces.

    Science.gov (United States)

    Kushniruk, Andre W; Kan, Min-Yem; McKeown, Kathleen; Klavans, Judith; Jordan, Desmond; LaFlamme, Mark; Patel, Vimla L

    2002-01-01

    This paper describes the comparative evaluation of an experimental automated text summarization system, Centrifuser and three conventional search engines - Google, Yahoo and About.com. Centrifuser provides information to patients and families relevant to their questions about specific health conditions. It then produces a multidocument summary of articles retrieved by a standard search engine, tailored to the user's question. Subjects, consisting of friends or family of hospitalized patients, were asked to "think aloud" as they interacted with the four systems. The evaluation involved audio- and video recording of subject interactions with the interfaces in situ at a hospital. Results of the evaluation show that subjects found Centrifuser's summarization capability useful and easy to understand. In comparing Centrifuser to the three search engines, subjects' ratings varied; however, specific interface features were deemed useful across interfaces. We conclude with a discussion of the implications for engineering Web-based retrieval systems.

  15. Evolution of a search: the use of dynamic Twitter searches during Superstorm Sandy.

    Science.gov (United States)

    Harris Smith, Sara; Bennett, Kelly J; Livinski, Alicia A

    2014-09-26

    Twitter has emerged as a critical source of free and openly available information during emergency response operations, providing an unmatched level of on-the-ground situational awareness in real time. Responders and survivors turn to Twitter to share information and resources within communities, conduct rumor control, and provide a "boots on the ground" understanding of the disaster. However, the ability to tune out background "noise" is essential to effectively utilizing Twitter to identify important and useful information during an emergency response. This article highlights a two-pronged strategy in which the use of a Twitter list paired with subject-specific Boolean searches provided increased situational awareness and early event detection during the United States Department of Health and Human Services (HHS), Office of the Assistant Secretary for Preparedness and Response (ASPR) response to Superstorm Sandy in 2012. To maximize the amount of relevant information that was retrieved, the Twitter list and Boolean searches were dynamic and responsive to real-time developments, evolving health threats, and the informational needs of decision-makers. The use of a Twitter list combined with Boolean searches led to enhanced situational awareness throughout the HHS response. Incorporating a dynamic search strategy over the course of the HHS Sandy response made it possible to account for over-tweeted information, changes in event-related conversation, and decreases in the return of relevant information.

  16. Hospital nurses' information retrieval behaviours in relation to evidence based nursing: a literature review.

    Science.gov (United States)

    Alving, Berit Elisabeth; Christensen, Janne Buck; Thrysøe, Lars

    2018-03-01

    The purpose of this literature review is to provide an overview of the information retrieval behaviour of clinical nurses, in terms of the use of databases and other information resources and their frequency of use. Systematic searches carried out in five databases and handsearching were used to identify the studies from 2010 to 2016, with a population, exposure and outcome (PEO) search strategy, focusing on the question: In which databases or other information resources do hospital nurses search for evidence based information, and how often? Of 5272 titles retrieved based on the search strategy, only nine studies fulfilled the criteria for inclusion. The studies are from the United States, Canada, Taiwan and Nigeria. The results show that hospital nurses' primary choice of source for evidence based information is Google and peers, while bibliographic databases such as PubMed are secondary choices. Data on frequency are only included in four of the studies, and data are heterogeneous. The reasons for choosing Google and peers are primarily lack of time; lack of information; lack of retrieval skills; or lack of training in database searching. Only a few studies are published on clinical nurses' retrieval behaviours, and more studies are needed from Europe and Australia. © 2018 Health Libraries Group.

  17. DisArticle: a web server for SVM-based discrimination of articles on traditional medicine.

    Science.gov (United States)

    Kim, Sang-Kyun; Nam, SeJin; Kim, SangHyun

    2017-01-28

    Much research has been done in Northeast Asia to show the efficacy of traditional medicine. While MEDLINE contains many biomedical articles including those on traditional medicine, it does not categorize those articles by specific research area. The aim of this study was to provide a method that searches for articles only on traditional medicine in Northeast Asia, including traditional Chinese medicine, from among the articles in MEDLINE. This research established an SVM-based classifier model to identify articles on traditional medicine. The TAK + HM classifier, trained with the features of title, abstract, keywords, herbal data, and MeSH, has a precision of 0.954 and a recall of 0.902. In particular, the feature of herbal data significantly increased the performance of the classifier. By using the TAK + HM classifier, a total of about 108,000 articles were discriminated as articles on traditional medicine from among all articles in MEDLINE. We also built a web server called DisArticle ( http://informatics.kiom.re.kr/disarticle ), in which users can search for the articles and obtain statistical data. Because much evidence-based research on traditional medicine has been published in recent years, it has become necessary to search for articles on traditional medicine exclusively in literature databases. DisArticle can help users to search for and analyze the research trends in traditional medicine.
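
    As a worked illustration of how the two reported figures combine, the harmonic mean of precision and recall (the F1 score) can be computed from the values above. The F1 value is not reported in the abstract; it is derived here only from the stated precision (0.954) and recall (0.902).

```latex
\[
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.954 \times 0.902}{0.954 + 0.902}
    = \frac{1.721016}{1.856}
    \approx 0.927
\]
```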

  18. Subject Retrieval from Full-Text Databases in the Humanities

    Science.gov (United States)

    East, John W.

    2007-01-01

    This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…

  19. Support Vector Machines: Relevance Feedback and Information Retrieval.

    Science.gov (United States)

    Drucker, Harris; Shahrary, Behzad; Gibbon, David C.

    2002-01-01

    Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…
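
    For readers unfamiliar with the Rocchio baseline named above, a minimal sketch of one round of Rocchio relevance feedback follows. It is the textbook formulation, not code from the paper; the function name `rocchio`, the sparse dict representation, and the weights `alpha`, `beta`, and `gamma` are assumptions chosen for illustration.

```python
from collections import Counter


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector (term -> weight) after one feedback round.

    query, and each document in relevant/nonrelevant, is a dict of term weights.
    alpha, beta, gamma are illustrative values, not taken from the paper.
    """
    updated = Counter()
    # Keep the original query, scaled by alpha.
    for term, weight in query.items():
        updated[term] += alpha * weight
    # Move toward the centroid of documents judged relevant.
    for doc in relevant:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant)
    # Move away from the centroid of documents judged nonrelevant.
    for doc in nonrelevant:
        for term, weight in doc.items():
            updated[term] -= gamma * weight / len(nonrelevant)
    # Terms whose weight is not positive are dropped, a common convention.
    return {term: weight for term, weight in updated.items() if weight > 0}
```

    The Ide variants mentioned in the abstract differ mainly in how the judged documents are weighted; they are not sketched here.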

  20. Relationship between Usefulness Assessments and Perceptions of Work Task Complexity and Search Topic Specificity: An Exploratory Study

    DEFF Research Database (Denmark)

    Ingwersen, Peter; Wang, Peiling

    2012-01-01

    This research investigates the relations between the usefulness assessments of retrieved documents and the perceptions of task complexity and search topic specificity. Twenty-three academic researchers submitted 65 real task-based information search topics. These task topics were searched in an integrated document collection consisting of full text research articles in PDFs, abstracts, and bibliographic records (the iSearch Test Collection in Physics). The search results were provided to the researchers who, as task performers, made assessments of usefulness using a four-point scale (highly, fairly, marginally, or not useful). In addition, they also assessed the perceived task complexity (highly, fairly, and routine/low) and the perceived specificity of the search topic (highly, fairly, and generic/low). It is found that highly specific topics associate with all degrees of task complexity, whereas...

  1. Image retrieval

    DEFF Research Database (Denmark)

    Ørnager, Susanne

    1997-01-01

    The paper touches upon indexing and retrieval for effective searches of digitized images. Different conceptions of what subject indexing means are described as a basis for defining an operational subject indexing strategy for images. The methodology is based on the art historian Erwin Panofsky......), special knowledge about image codes, and special knowledge about history of ideas. The semiologist Roland Barthes has established a semiology for pictorial expressions based on advertising photos. Barthes uses the concepts denotation/connotation where denotations can be explained as the sober expression...

  2. Efficient Similarity Retrieval in Music Databases

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Jensen, Christian Søndergaard

    2006-01-01

    Audio music is increasingly becoming available in digital form, and the digital music collections of individuals continue to grow. Addressing the need for effective means of retrieving music from such collections, this paper proposes new techniques for content-based similarity search. Each music...

  3. A literature search tool for intelligent extraction of disease-associated genes.

    Science.gov (United States)

    Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2014-01-01

    To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.

  4. [Application of spaced retrieval training on patients with dementia].

    Science.gov (United States)

    Wu, Hua-Shan; Lin, Li-Chan

    2012-10-01

    Dementia causes semantic and episodic memory impairments that limit patients' activities of daily living (ADL) and increase caregiver burden. Spaced retrieval training uses repetitive retrieval to strengthen cognitive and motor skills intuitively in mild / moderate dementia patients who retain preserved implicit / non-declarative memory. This article describes and discusses the operative mechanism, influencing variables, and practical applications of spaced retrieval training. We hope this article increases professional understanding and application of this training approach to improve dementia patient ADL and improve quality of life for both caregivers and patients.

  5. CDAPubMed: a browser extension to retrieve EHR-based biomedical literature

    Directory of Open Access Journals (Sweden)

    Perez-Rey David

    2012-04-01

    Background: Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. Results: We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. Conclusions: CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard

  6. A Survey in Indexing and Searching XML Documents.

    Science.gov (United States)

    Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James

    2002-01-01

    Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…

  7. Care episode retrieval: distributional semantic models for information retrieval in the clinical domain.

    Science.gov (United States)

    Moen, Hans; Ginter, Filip; Marsi, Erwin; Peltonen, Laura-Maria; Salakoski, Tapio; Salanterä, Sanna

    2015-01-01

    Patients' health related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a--possibly unfinished--care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the art search engine (Lucene) on the retrieval task.

  8. Development and evaluation of a biomedical search engine using a predicate-based vector space model.

    Science.gov (United States)

    Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

    2013-10-01

    Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex data structures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf and boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher (p […] searching than keywords, laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Using the fuzzy modeling for the retrieval algorithms

    International Nuclear Information System (INIS)

    Mohamed, A.H

    2010-01-01

    Rapid growth in the number and size of images in databases and on the World Wide Web (WWW) has created a strong need for more efficient search and retrieval systems to exploit this large amount of information. However, the collection of this information is now based on image technology. Because of the limitations of current image analysis techniques, most image retrieval systems use some form of text description provided by users as the basis for indexing and retrieving images. To overcome this problem, the proposed system uses fuzzy modeling to describe images in terms that capture linguistic ambiguity. The proposed system can also include vague or fuzzy terms when modeling queries, so that they match the image descriptions in the retrieval process. This can facilitate indexing and retrieval, increase their performance, and decrease their computational time. Therefore, the proposed system can improve the performance of traditional image retrieval algorithms.

  10. Human memory retrieval as Lévy foraging

    Science.gov (United States)

    Rhodes, Theo; Turvey, Michael T.

    2007-11-01

    When people attempt to recall as many words as possible from a specific category (e.g., animal names) their retrievals occur sporadically over an extended temporal period. Retrievals decline as recall progresses, but short retrieval bursts can occur even after tens of minutes of performing the task. To date, efforts to gain insight into the nature of retrieval from this fundamental phenomenon of semantic memory have focused primarily upon the exponential growth rate of cumulative recall. Here we focus upon the time intervals between retrievals. We expected and found that, for each participant in our experiment, these intervals conformed to a Lévy distribution suggesting that the Lévy flight dynamics that characterize foraging behavior may also characterize retrieval from semantic memory. The closer the exponent on the inverse square power-law distribution of retrieval intervals approximated the optimal foraging value of 2, the more efficient was the retrieval. At an abstract dynamical level, foraging for particular foods in one's niche and searching for particular words in one's memory must be similar processes if particular foods and particular words are randomly and sparsely located in their respective spaces at sites that are not known a priori. We discuss whether Lévy dynamics imply that memory processes, like foraging, are optimized in an ecological way.
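
    The exponent discussed above can be estimated from a list of inter-retrieval intervals. The sketch below is one common way to do this, a maximum-likelihood estimate for a continuous power-law tail; it is not the authors' fitting procedure, and the function names, the cutoff `x_min`, and the synthetic data are assumptions for illustration only.

```python
import math
import random


def sample_power_law(n, exponent=2.0, x_min=1.0, seed=0):
    """Draw n synthetic intervals with density proportional to x**(-exponent), x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling: the survival function is (x / x_min)**(-(exponent - 1)).
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (exponent - 1.0)) for _ in range(n)]


def estimate_exponent(intervals, x_min=1.0):
    """Maximum-likelihood estimate of the power-law exponent for values >= x_min."""
    tail = [x for x in intervals if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic stand-in for inter-retrieval intervals; real data would come from recall timestamps.
intervals = sample_power_law(20000, exponent=2.0)
estimated = estimate_exponent(intervals)
```

    With real recall data, the choice of `x_min` matters and would need to be justified separately; here it is fixed at the value used to generate the samples.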

  11. Semi-automating the manual literature search for systematic reviews increases efficiency.

    Science.gov (United States)

    Chapman, Andrea L; Morgan, Laura C; Gartlehner, Gerald

    2010-03-01

    To minimise retrieval bias, manual literature searches are a key part of the search process of any systematic review. Considering the need to have accurate information, valid results of the manual literature search are essential to ensure scientific standards; likewise, efficient approaches that minimise the amount of personnel time required to conduct a manual literature search are of great interest. The objective of this project was to determine the validity and efficiency of a new manual search method that uses the Scopus database. We used the traditional manual search approach as the gold standard to determine the validity and efficiency of the proposed Scopus method. Outcome measures included completeness of article detection and personnel time involved. Using both methods independently, we compared the results in terms of validity (accuracy of the results) and efficiency (time spent conducting the search). Regarding accuracy, the Scopus method identified the same studies as the traditional approach, indicating its validity. In terms of efficiency, using Scopus led to a time saving of 62.5% compared with the traditional approach (3 h versus 8 h). The Scopus method can significantly improve the efficiency of manual searches and thus of systematic reviews.
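
    The reported time saving follows directly from the two search times given in the abstract (3 h versus 8 h):

```latex
\[
\text{time saving} = \frac{8\,\mathrm{h} - 3\,\mathrm{h}}{8\,\mathrm{h}} = 0.625 = 62.5\%
\]
```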

  12. A semantic medical multimedia retrieval approach using ontology information hiding.

    Science.gov (United States)

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching for useful information in unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach that can reflect the users' query intent. First, semantic annotations are given to the multimedia documents in the medical multimedia database. Second, the ontology that represents the semantic information is hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.

  13. Video Retrieval Berdasarkan Teks dan Gambar [Video Retrieval Based on Text and Images]

    Directory of Open Access Journals (Sweden)

    Rahmi Hidayati

    2013-01-01

    Abstract: Video retrieval is used to search for videos based on a user's query, which may be text, an image, or both. Such a system can improve searching during video browsing and is expected to reduce video retrieval time. The purpose of this research was to design and build a software application for video retrieval based on the text and images in a video. The text indexing process consists of tokenizing and filtering (stopword removal and stemming); the stemming results are saved in the text index table. The image indexing process creates a color histogram for each image and computes the mean and standard deviation of each primary color, red, green, and blue (RGB); the results of this feature extraction are stored in the image table. Video retrieval accepts a text query, an image query, or both. For a text query, the system looks up the query text in the text index table and, if it is found, displays information on the videos that match the text. For an image query, the system extracts six features from the query image (the means of red, green, and blue and the standard deviations of red, green, and blue) and, if these six values are found in the image index table, displays information on the videos that match the image. For a combined text and image query, the system displays video information only if the text and the image are related, that is, if both belong to the same film title.   Keywords: video, index, retrieval, text, image
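
    The six-value image feature described above (mean and standard deviation of each RGB channel) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' software: the function names, the use of the population standard deviation, and the `tolerance` parameter for matching are all assumptions.

```python
from statistics import mean, pstdev


def color_features(pixels):
    """Return [mean_r, mean_g, mean_b, std_r, std_g, std_b] for a list of (r, g, b) pixels."""
    # Regroup the pixels into one tuple of values per color channel.
    channels = list(zip(*pixels))
    # Population standard deviation is an assumption; the abstract does not say which form is used.
    return [mean(c) for c in channels] + [pstdev(c) for c in channels]


def matches(query_features, index_features, tolerance=0.0):
    """Compare two feature vectors; tolerance=0.0 means an exact match on all six values."""
    return all(abs(q - i) <= tolerance for q, i in zip(query_features, index_features))


# A hypothetical two-pixel image: one black pixel and one pure red pixel.
features = color_features([(0, 0, 0), (255, 0, 0)])
```

    A nonzero `tolerance` would let near-identical frames match, which may be more practical than an exact lookup, though the abstract does not say how close the values must be.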

  14. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…

  15. Externalities and article citations: experience of a national public health journal (Gaceta Sanitaria).

    Science.gov (United States)

    Ruano-Ravina, Alberto; Álvarez-Dardet, Carlos; Domínguez-Berjón, M Felicitas; Fernández, Esteve; García, Ana M; Borrell, Carme

    2016-01-01

    The purpose of the study was to analyze the determinants of citations such as publication year, article type, article topic, article selected for a press release, number of articles previously published by the corresponding author, and publication language in a Spanish journal of public health. Observational study including all articles published in Gaceta Sanitaria during 2007-2011. We retrieved the number of citations from the ISI Web of Knowledge database in June 2013 and also information on other variables such as number of articles published by the corresponding author in the previous 5 years (searched through PubMed), selection for a press release, publication language, article type and topic, and others. We included 542 articles. Of these, 62.5% were cited in the period considered. We observed an increased odds ratio of citations for articles selected for a press release and also with the number of articles published previously by the corresponding author. Articles published in English do not seem to increase their citations. Certain externalities such as number of articles published by the corresponding author and being selected for a press release seem to influence the number of citations in national journals. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Search Engine For Ebook Portal

    Directory of Open Access Journals (Sweden)

    Prashant Kanade

    2017-05-01

    The purpose of this paper is to describe the textual analytics involved in developing a search engine for an ebook portal. We extracted our dataset from Project Gutenberg using a robot harvester. Textual analytics is used for efficient search retrieval. The entire dataset is represented using the Vector Space Model, in which each document is a vector in the vector space. For computational purposes, we further represent the dataset as a Term Frequency-Inverse Document Frequency (tf-idf) matrix. The first step involves obtaining the most coherent sequence of words from the search query entered. The query is processed using front-end algorithms, which include spell checking, text segmentation, and language modeling. Back-end processing includes similarity modeling, clustering, indexing, and retrieval. The relationship between documents and words is established using cosine similarity measured between the documents and words in the vector space. Clustering is used to suggest books that are similar to the search query entered by the user. Lastly, the Lucene-based Elasticsearch engine is used to index the documents, which allows faster retrieval of data. Elasticsearch returns a dictionary and creates a tf-idf matrix. The processed query is compared with the dictionary obtained, and the tf-idf matrix is used to calculate the score for each match to give the most relevant result.
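
    The vector space representation and cosine scoring described above can be illustrated with a small, self-contained sketch. It uses a plain tf-idf weighting (relative term frequency times the log of inverse document frequency) and does not reproduce the paper's pipeline or Elasticsearch's scoring; the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists -> list of sparse {term: tf-idf weight} dicts."""
    n = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus standing in for tokenized book texts.
docs = [["whale", "sea", "ship"], ["whale", "sea", "captain"], ["garden", "rose"]]
vectors = tfidf_vectors(docs)
```

    In this toy corpus the first two documents share terms and so score above zero against each other, while the third shares none with the first and scores exactly zero.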

  17. State-of-the-Art Review on Relevance of Genetic Algorithm to Internet Web Search

    Directory of Open Access Journals (Sweden)

    Kehinde Agbele

    2012-01-01

    People use search engines to find the information they desire, with the aim that their information needs will be met. Information retrieval (IR) is a field concerned primarily with searching for and retrieving information in documents, as well as with searching search engines, online databases, and the Internet. Genetic algorithms (GAs) are robust and efficient optimization methods for a wide range of search problems, motivated by Darwin’s principles of natural selection and survival of the fittest. This paper describes the components of information retrieval systems (IRS). It looks at how GAs can be applied in the field of IR, and specifically at the relevance of genetic algorithms to Internet web search. Finally, the proposals surveyed show that GAs are applied to diverse problem fields of Internet web search.

  18. Mammogram retrieval through machine learning within BI-RADS standards.

    Science.gov (United States)

    Wei, Chia-Hung; Li, Yue; Huang, Pai Jung

    2011-08-01

    A content-based mammogram retrieval system can support the usual comparisons physicians make on images by answering similarity queries over images stored in the database. The importance of searching for similar mammograms lies in the fact that physicians usually try to recall similar cases by seeking images that are pathologically similar to a given image. This paper presents a content-based mammogram retrieval system, which employs a query example to search for similar mammograms in the database. In this system the mammographic lesions are interpreted based on their medical characteristics specified in the Breast Imaging Reporting and Data System (BI-RADS) standards. A hierarchical similarity measurement scheme based on a distance weighting function is proposed to model the user's perception and maximize the effectiveness of each feature in a mammographic descriptor. A machine learning approach based on support vector machines and the user's relevance feedback is also proposed to analyze the user's information need in order to retrieve target images more accurately. Experimental results demonstrate that the proposed machine learning approach with a Radial Basis Function (RBF) kernel achieves the best performance among all tested ones. Furthermore, the results also show that the proposed learning approach can improve retrieval performance when applied to retrieve mammograms with similar mass and calcification lesions, respectively. Copyright © 2011 Elsevier Inc. All rights reserved.

  19. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Full Text Available Background: Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE) and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever-increasing number of published protein structures in this post-genomic era.
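
    The general idea of linear encoding, turning backbone dihedral angles into a text string that sequence tools can align, can be sketched in Python as follows. This is a deliberate simplification and not SARST itself: the abstract describes a Ramachandran map organized by nearest-neighbor clustering, whereas this sketch assumes a uniform 6 x 6 grid of 60-degree cells. The alphabet, the cell size, and the function name `encode` are all hypothetical.

```python
import string

# Assumed alphabet: 36 symbols, one per cell of a 6 x 6 grid over the map.
ALPHABET = string.ascii_uppercase + string.digits
BIN_DEGREES = 60   # assumed cell size; SARST uses clustering, not a grid
CELLS_PER_AXIS = 360 // BIN_DEGREES

def encode(angles):
    """Map (phi, psi) pairs in degrees to one character per residue."""
    symbols = []
    for phi, psi in angles:
        # Shift from [-180, 180) to [0, 360) and wrap around at the edge.
        row = int((phi + 180) % 360 // BIN_DEGREES)
        col = int((psi + 180) % 360 // BIN_DEGREES)
        symbols.append(ALPHABET[row * CELLS_PER_AXIS + col])
    return "".join(symbols)

# Three helix-like residues followed by one strand-like residue.
structure_string = encode([(-60, -45), (-60, -45), (-60, -45), (-120, 130)])
```

    Strings produced this way could then be passed to an ordinary sequence alignment tool, which is the step the abstract refers to as applying classical sequence similarity search.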

  20. [Advanced online search techniques and dedicated search engines for physicians].

    Science.gov (United States)

    Nahum, Yoav

    2008-02-01

    In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.

  1. Evaluating search effectiveness of some selected search engines ...

    African Journals Online (AJOL)

    With the advancement of technology, many individuals are becoming familiar with the internet, and many users seek information on the World Wide Web (WWW) using a variety of search engines. This research work evaluates the retrieval effectiveness of Google, Yahoo, Bing, AOL and Baidu. Precision, relative recall and response ...

  2. Determining the relative importance of figures in journal articles to find representative images

    Science.gov (United States)

    Müller, Henning; Foncubierta-Rodríguez, Antonio; Lin, Chang; Eggel, Ivan

    2013-03-01

    When physicians search for articles in the medical literature, images of the articles can help determine the relevance of the article content for a specific information need. The visual image representation can be an advantage in effectiveness (quality of found articles) and also in efficiency (speed of determining relevance or irrelevance), as many articles can likely be excluded much more quickly by looking at a few representative images. In domains such as medical information retrieval, determining relevance quickly and accurately is an important criterion. This becomes even more important when small interfaces are used, as is frequently the case on mobile phones and tablets, to access scientific data whenever information needs arise. Scientific articles use many figures, and particularly in the biomedical literature only a subset may be relevant for determining the relevance of a specific article to an information need. In many cases clinical images can be seen as more important for visual appearance than graphs or histograms that require looking at the context for interpretation. To get a clearer idea of image relevance in articles, a user test was performed with a physician who classified images of biomedical research articles into categories of importance that can subsequently be used to evaluate algorithms that automatically select images as representative examples. The manual sorting by importance of images from 50 BioMedCentral journal articles, each containing more than 8 figures, also allows several rules to be derived that determine how to choose images and how to develop algorithms for choosing the most representative images of specific texts. This article describes the user tests and can be a first important step to evaluate automatic tools to select representative images for representing articles and potentially also images in other contexts, for example when representing patient records or other medical concepts when selecting

  3. The Use of QBIC Content-Based Image Retrieval System

    Directory of Open Access Journals (Sweden)

    Ching-Yi Wu

    2004-03-01

    Full Text Available The rapid increase in digital images has drawn increasing attention to the development of image retrieval technologies. Content-based image retrieval (CBIR) has become an important approach in retrieving image data from a large collection. This article reports our results on the use of a CBIR system and a study of its users. Thirty-eight students majoring in art and design were invited to use IBM’s QBIC (Query by Image Content) system through the Internet. Data on their information needs, behaviors, and retrieval strategies were collected through in-depth interviews, observation, and a self-described think-aloud process. Important conclusions are: (1) There are four types of information needs for image data: implicit, inspirational, ever-changing, and purposive. The types of needs may change during the retrieval process. (2) CBIR is suitable for the example-type query, text retrieval is suitable for the scenario-type query, and image browsing is suitable for the symbolic query. (3) Unlike in text retrieval, a detailed description of the query condition may lead to retrieval failure more easily. (4) CBIR is suitable for domain-specific image collections, not for images on the World Wide Web. [Article content in Chinese]

  4. A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.

    Science.gov (United States)

    Sarrouti, Mourad; Ouatik El Alaoui, Said

    2017-04-01

    Passage retrieval, the identification of top-ranked passages that may contain the answer for a given biomedical question, is a crucial component for any biomedical question answering (QA) system. Passage retrieval in open-domain QA is a longstanding challenge widely studied over the last decades. However, it still requires further efforts in biomedical QA. In this paper, we present a new biomedical passage retrieval method based on Stanford CoreNLP sentence/passage length, a probabilistic information retrieval (IR) model and UMLS concepts. In the proposed method, we first use our document retrieval system, based on the PubMed search engine and UMLS similarity, to retrieve documents relevant to a given biomedical question. We then take the abstracts from the retrieved documents and use the Stanford CoreNLP sentence splitter to make a set of sentences, i.e., candidate passages. Using stemmed words and UMLS concepts as features for the BM25 model, we finally compute the similarity scores between the biomedical question and each of the candidate passages and keep the N top-ranked ones. Experimental evaluations performed on large standard datasets, provided by the BioASQ challenge, show that the proposed method achieves good performance compared with the current state-of-the-art methods. The proposed method significantly outperforms the current state-of-the-art methods by an average of 6.84% in terms of mean average precision (MAP). We have proposed an efficient passage retrieval method which can be used to retrieve relevant passages in biomedical QA systems with high mean average precision. Copyright © 2017 Elsevier Inc. All rights reserved.
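
    The final ranking step, scoring candidate passages against the question with BM25 and keeping the N top-ranked ones, can be sketched in Python as below. This is a generic Okapi BM25 sketch under stated assumptions, not the authors' system: it uses lowercase whitespace tokens in place of stemmed words and UMLS concepts, a common non-negative idf variant, and the conventional defaults k1 = 1.5 and b = 0.75. The function names are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(question, passages, k1=1.5, b=0.75):
    """Score each passage against the question with Okapi BM25."""
    docs = [p.lower().split() for p in passages]  # assumed: no stemming, no UMLS
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of passages containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in question.lower().split():
            if term not in tf:
                continue
            # Assumed idf variant that never goes negative.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = 1 - b + b * len(d) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores

def top_n(question, passages, n=2):
    """Return the n passages with the highest BM25 scores."""
    scores = bm25_scores(question, passages)
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order[:n]]

passages = [
    "aspirin reduces fever",
    "statins lower cholesterol",
    "fever is a symptom of infection",
]
scores = bm25_scores("aspirin fever", passages)
best = top_n("aspirin fever", passages, n=1)
```

    The length normalization controlled by `b` is why the short passage that matches both question terms outranks the longer passage that matches only one.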

  5. The JPL Library information retrieval system

    Science.gov (United States)

    Walsh, J.

    1975-01-01

    The development, capabilities, and products of the computer-based retrieval system of the Jet Propulsion Laboratory Library are described. The system handles books and documents, produces a book catalog, and provides a machine search capability. Programs and documentation are available to the public through NASA's computer software dissemination program.

  6. A Bayesian Approach to Interactive Retrieval

    Science.gov (United States)

    Tague, Jean M.

    1973-01-01

    A probabilistic model for interactive retrieval is presented. Bayesian statistical decision theory principles are applied: use of prior and sample information about the relationship of document descriptions to query relevance; maximization of expected value of a utility function, to the problem of optimally restructuring search strategies in an…

  7. Selective Document Retrieval from Encrypted Database

    NARCIS (Netherlands)

    Bösch, C.T.; Tang, Qiang; Hartel, Pieter H.; Jonker, Willem

    We propose the concept of selective document retrieval (SDR) from an encrypted database which allows a client to store encrypted data on a third-party server and perform efficient search remotely. We propose a new SDR scheme based on the recent advances in fully homomorphic encryption schemes. The

  8. IntegromeDB: an integrated system and biological search engine.

    Science.gov (United States)

    Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

    2012-01-19

    With the growth of biological data in volume and heterogeneity, web search engines have become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.

  9. Interactive Information Retrieval:

    DEFF Research Database (Denmark)

    Borlund, Pia

    IIR from the perspective of search dedication and task load in order to also include everyday life information seeking? With this presentation, the IIR community is invited to an exchange of ideas and is encouraged to engage in collaborations with the solving of these (and other) issues to our joint......This presentation addresses methodological issues of interactive information retrieval (IIR) evaluation in terms of what it entails to study users' use and interaction with IR systems, as well as their satisfaction with retrieved information. In particular, the presentation focuses on test design...... of the users to ensure a complete and realistic picture to enhance our understanding of IIR. The presentation also reflects on whether a re-thinking of the concept on an information need is necessary. One may ask whether it still makes sense to talk about types of information needs. Or should we rather study...

  10. Search Analytics for Your Site

    CERN Document Server

    Rosenfeld, Louis

    2011-01-01

    Any organization that has a searchable web site or intranet is sitting on top of hugely valuable and usually under-exploited data: logs that capture what users are searching for, how often each query was searched, and how many results each query retrieved. Search queries are gold: they are real data that show us exactly what users are searching for in their own words. This book shows you how to use search analytics to carry on a conversation with your customers: listen to and understand their needs, and improve your content, navigation and search performance to meet those needs.

  11. Review article: herbal and dietary supplement hepatotoxicity.

    Science.gov (United States)

    Bunchorntavakul, C; Reddy, K R

    2013-01-01

    Herbal and dietary supplements are commonly used throughout the world. There is a tendency for underreporting their ingestion by patients and the magnitude of their use is underrecognised by physicians. Herbal hepatotoxicity is not uncommonly encountered, but the precise incidence and manifestations have not been well characterised. To review the epidemiology, presentation and diagnosis of herbal hepatotoxicity. This review will mainly discuss single ingredients and complex mixtures of herbs marketed under a single label. A Medline search was undertaken to identify relevant literature using search terms including 'herbal', 'herbs', 'dietary supplement', 'liver injury', 'hepatitis' and 'hepatotoxicity'. Furthermore, we scanned the reference lists of the primary and review articles to identify publications not retrieved by electronic searches. The incidence rates of herbal hepatotoxicity are largely unknown. The clinical presentation and severity can be highly variable, ranging from mild hepatitis to acute hepatic failure requiring transplantation. Scoring systems for the causality assessment of drug-induced liver injury may be helpful, but have not been validated for herbal hepatotoxicity. Hepatotoxicity features of commonly used herbal products, such as Ayurvedic and Chinese herbs, black cohosh, chaparral, germander, greater celandine, green tea, Herbalife, Hydroxycut, kava, pennyroyal, pyrrolizidine alkaloids, skullcap, and usnic acid, have been individually reviewed. Furthermore, clinically significant herb-drug interactions are also discussed. A number of herbal medicinal products are associated with a spectrum of hepatotoxicity events. Advances in the understanding of the pathogenesis and the risks involved are needed to improve herbal medicine safety. © 2012 Blackwell Publishing Ltd.

  12. Research Article Special Issue

    African Journals Online (AJOL)

    pc

    2018-03-07

    Mar 7, 2018 ... ethical issues behind the retrieval of organs (commitment to ... The article reviews the practices of European countries and attitude of the EU citizens to the .... knowledge that the purchase and sale of donor organs is prohibited ...

  13. Semantic association ranking schemes for information retrieval ...

    Indian Academy of Sciences (India)

    retrieval applications using term association graph representation ... Department of Computer Science and Engineering, Government College of ... Introduction ... leads to poor precision, e.g., model, python, and chip. ...... The approaches proposed in this paper focuses on the query-centric re-ranking of search results.

  14. Citation searching: a systematic review case study of multiple risk behaviour interventions.

    Science.gov (United States)

    Wright, Kath; Golder, Su; Rodriguez-Lopez, Rocio

    2014-06-03

    The value of citation searches as part of the systematic review process is currently unknown. While the major guides to conducting systematic reviews state that citation searching should be carried out in addition to searching bibliographic databases, there are still few studies in the literature that support this view. Rather than using a predefined search strategy to retrieve studies, citation searching uses known relevant papers to identify further papers. We describe a case study about the effectiveness of using the citation sources Google Scholar, Scopus, Web of Science and OVIDSP MEDLINE to identify records for inclusion in a systematic review. We used the 40 included studies identified by traditional database searches from one systematic review of interventions for multiple risk behaviours. We searched for each of the included studies in the four citation sources to retrieve the details of all papers that have cited these studies. We carried out two analyses; the first was to examine the overlap between the four citation sources to identify which citation tool was the most useful; the second was to investigate whether the citation searches identified any relevant records in addition to those retrieved by the original database searches. The highest number of citations was retrieved from Google Scholar (1680), followed by Scopus (1173), then Web of Science (1095) and lastly OVIDSP (213). To retrieve all the records identified by the citation tracking, searching all four resources was required. Google Scholar identified the highest number of unique citations. The citation tracking identified 9 studies that met the review's inclusion criteria. Eight of these had already been identified by the traditional database searches and identified in the screening process, while the ninth was not available in any of the databases when the original searches were carried out. It would, however, have been identified by two of the database search strategies if searches had been

  15. Google Scholar as replacement for systematic literature searches: good relative recall and precision are not enough.

    Science.gov (United States)

    Boeker, Martin; Vach, Werner; Motschall, Edith

    2013-10-26

    Recent research indicates a high recall in Google Scholar searches for systematic reviews. These reports raised high expectations of Google Scholar as a unified and easy to use search interface. However, studies on the coverage of Google Scholar rarely used the search interface in a realistic approach but instead merely checked for the existence of gold standard references. In addition, the severe limitations of the Google Search interface must be taken into consideration when comparing with professional literature retrieval tools. The objectives of this work are to measure the relative recall and precision of searches with Google Scholar under conditions which are derived from structured search procedures conventional in scientific literature retrieval; and to provide an overview of current advantages and disadvantages of the Google Scholar search interface in scientific literature retrieval. General and MEDLINE-specific search strategies were retrieved from 14 Cochrane systematic reviews. Cochrane systematic review search strategies were translated to Google Scholar search expressions as faithfully as possible, taking the original search semantics into consideration. The references of the included studies from the Cochrane reviews were checked for their inclusion in the result sets of the Google Scholar searches. Relative recall and precision were calculated. We investigated Cochrane reviews with a number of included references between 11 and 70 with a total of 396 references. The Google Scholar searches resulted in sets between 4,320 and 67,800 and a total of 291,190 hits. The relative recall of the Google Scholar searches had a minimum of 76.2% and a maximum of 100% (7 searches). The precision of the Google Scholar searches had a minimum of 0.05% and a maximum of 0.92%. The overall relative recall for all searches was 92.9%, the overall precision was 0.13%. The reported relative recall must be interpreted with care. It is a quality indicator of Google Scholar confined to
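
    The two measures reported in this abstract can be made concrete with a short Python sketch. The totals of 396 gold-standard references and 291,190 hits come from the abstract; the count of 368 references found is not stated there and is an assumption, chosen because it is consistent with the reported overall figures when the searches are pooled. The function names are hypothetical.

```python
def relative_recall(found, gold_total):
    """Share of gold-standard references that appear in the result sets."""
    return found / gold_total

def precision(found, hits_total):
    """Share of retrieved hits that are gold-standard references."""
    return found / hits_total

GOLD_TOTAL = 396      # from the abstract: included references across 14 reviews
HITS_TOTAL = 291_190  # from the abstract: total Google Scholar hits
FOUND = 368           # assumed: inferred from the reported 92.9%, not stated

recall_pct = round(100 * relative_recall(FOUND, GOLD_TOTAL), 1)
precision_pct = round(100 * precision(FOUND, HITS_TOTAL), 2)
```

    Under that assumption the sketch yields a relative recall of 92.9% and a precision of 0.13%, so nearly every gold-standard reference is retrievable but each one is buried among roughly 800 hits.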

  16. Leveraging search and content exploration by exploiting context in folksonomy systems

    Science.gov (United States)

    Abel, Fabian; Baldoni, Matteo; Baroglio, Cristina; Henze, Nicola; Kawase, Ricardo; Krause, Daniel; Patti, Viviana

    2010-04-01

    With the advent of Web 2.0, tagging became a popular feature in social media systems. People tag diverse kinds of content, e.g. products at Amazon, music at Last.fm, images at Flickr, etc. In recent years several researchers have analyzed the impact of tags on information retrieval. Most works focused on tags only and ignored context information. In this article we present context-aware approaches for learning semantics and improving personalized information retrieval in tagging systems. We investigate how explorative search, initialized by clicking on tags, can be enhanced with automatically produced context information so that search results better fit the actual information needs of the users. We introduce the SocialHITS algorithm and present an experiment where we compare different algorithms for ranking users, tags, and resources in a contextualized way. We showcase our approaches in the domain of images and present the TagMe! system that enables users to explore and tag Flickr pictures. In TagMe! we further demonstrate how advanced context information can easily be generated: TagMe! allows users to attach tag assignments to a specific area within an image and to categorize tag assignments. In our corresponding evaluation we show that those additional facets of tag assignments gain valuable semantics, which can be applied to improve existing search and ranking algorithms significantly.

  17. Comparing the Precision of Information Retrieval of MeSH-Controlled Vocabulary Search Method and a Visual Method in the Medline Medical Database.

    Science.gov (United States)

    Hariri, Nadjla; Ravandi, Somayyeh Nadi

    2014-01-01

    Medline is one of the most important databases in the biomedical field. One of the most important hosts for Medline is Elton B. Stephens CO. (EBSCO), which has presented different search methods that can be used based on the needs of the users. Visual search and MeSH-controlled search methods are among the most common methods. The goal of this research was to compare the precision of the retrieved sources in the EBSCO Medline base using MeSH-controlled and visual search methods. This research was a semi-empirical study. By holding training workshops, 70 students of higher education in different educational departments of Kashan University of Medical Sciences were taught MeSH-Controlled and visual search methods in 2012. Then, the precision of 300 searches made by these students was calculated based on Best Precision, Useful Precision, and Objective Precision formulas and analyzed in SPSS software using the independent sample T Test, and three precisions obtained with the three precision formulas were studied for the two search methods. The mean precision of the visual method was greater than that of the MeSH-Controlled search for all three types of precision, i.e. Best Precision, Useful Precision, and Objective Precision, and their mean precisions were significantly different (P searches. Fifty-three percent of the participants in the research also mentioned that the use of the combination of the two methods produced better results. For users, it is more appropriate to use a natural, language-based method, such as the visual method, in the EBSCO Medline host than to use the controlled method, which requires users to use special keywords. The potential reason for their preference was that the visual method allowed them more freedom of action.

  18. PMD2HD--a web tool aligning a PubMed search results page with the local German Cancer Research Centre library collection.

    Science.gov (United States)

    Bohne-Lang, Andreas; Lang, Elke; Taube, Anke

    2005-06-27

    Web-based searching is the accepted contemporary mode of retrieving relevant literature, and retrieving as many full text articles as possible is a typical prerequisite for research success. In most cases only a proportion of references will be directly accessible as digital reprints through displayed links. A large number of references, however, have to be verified in library catalogues and, depending on their availability, are accessible as print holdings or by interlibrary loan request. The problem of verifying local print holdings from an initial retrieval set of citations can be solved using Z39.50, an ANSI protocol for interactively querying library information systems. Numerous systems include Z39.50 interfaces and therefore can process Z39.50 interactive requests. However, the programmed query interaction command structure is non-intuitive and inaccessible to the average biomedical researcher. For the typical user, it is necessary to implement the protocol within a tool that hides and handles Z39.50 syntax, presenting a comfortable user interface. PMD2HD is a web tool implementing Z39.50 to provide an appropriately functional and usable interface to integrate into the typical workflow that follows an initial PubMed literature search, providing users with an immediate asset to assist in the most tedious step in literature retrieval, checking for subscription holdings against a local online catalogue. PMD2HD can facilitate literature access considerably with respect to the time and cost of manual comparisons of search results with local catalogue holdings. The example presented in this article is related to the library system and collections of the German Cancer Research Centre. However, the PMD2HD software architecture and use of common Z39.50 protocol commands allow for transfer to a broad range of scientific libraries using Z39.50-compatible library information systems.

  19. An Approach to Retrieval of OCR Degraded Text

    Directory of Open Access Journals (Sweden)

    Yuen-Hsien Tseng

    1998-12-01

    Full Text Available The major problem with retrieval of OCR text is the unpredictable distortion of characters due to recognition errors. Because users have no idea of such distortion, the terms they query can hardly match the terms stored in the OCR text exactly. Thus retrieval effectiveness is significantly reduced, especially for low-quality input. To reduce the losses from retrieving such noisy OCR text, a fault-tolerant retrieval strategy based on automatic keyword extraction and fuzzy matching is proposed. In this strategy, terms, correct or not, and their term frequencies are extracted from the noisy text and presented for browsing and selection in response to users' initial queries. With an understanding of the real terms stored in the noisy text and of their estimated frequency distributions, users may then choose appropriate terms for more effective searching. A text retrieval system based on this strategy has been built. Examples showing its effectiveness are demonstrated. Finally, some OCR issues for further enhancing retrieval effectiveness are discussed.
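
    The strategy of extracting terms with their frequencies from noisy text and offering near matches to the user's query can be sketched in Python. This is an illustrative stand-in rather than the author's method: it assumes whitespace tokenization and uses the standard library's `difflib.get_close_matches` as the fuzzy matcher, with an arbitrary cutoff of 0.7. The function names are hypothetical.

```python
import difflib
from collections import Counter

def term_frequencies(noisy_text):
    """Count every term in the OCR output, whether correctly recognized or not."""
    return Counter(noisy_text.lower().split())

def suggest_terms(query_term, freqs, n=5, cutoff=0.7):
    """List stored terms that resemble the query, with their frequencies."""
    matches = difflib.get_close_matches(
        query_term.lower(), list(freqs), n=n, cutoff=cutoff
    )
    return [(term, freqs[term]) for term in matches]

# "retrieva1" and "retrleval" stand in for OCR misreadings of "retrieval".
ocr_text = "retrieval of 0cr text retrieva1 is hard retrieval retrleval"
freqs = term_frequencies(ocr_text)
suggestions = suggest_terms("retrieval", freqs)
```

    Presenting the distorted variants alongside the exact term lets a user add them to the query, which is the browsing-and-selection step the abstract describes.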

  20. Content Based Searching for INIS

    International Nuclear Information System (INIS)

    Jain, V.; Jain, R.K.

    2016-01-01

    Full text: Whatever a user wants is available on the internet, but to retrieve the information efficiently, a multilingual search engine that returns the most relevant documents is a must. Most current search engines are word based or pattern based. They do not consider the meaning of the query posed to them: they rely purely on the keywords of the query, do not support multilingual queries, and do not dismiss nonrelevant results. Current information-retrieval techniques either rely on an encoding process, using a certain perspective or classification scheme, to describe a given item, or perform a full-text analysis, searching for user-specified words. Neither case guarantees content matching because an encoded description might reflect only part of the content and the mere occurrence of a word does not necessarily reflect the document’s content. For general documents, there doesn’t yet seem to be a much better option than lazy full-text analysis, by manually going through those endless results pages. In contrast, a new search engine should extract the meaning of the query and then perform the search based on this extracted meaning. It should also employ Interlingua-based machine translation technology to present information in the user's language of choice. (author)

  1. PolySearch2: a significantly improved text-mining system for discovering associations between human diseases, genes, drugs, metabolites, toxins and more.

    Science.gov (United States)

    Liu, Yifeng; Liang, Yongjie; Wishart, David

    2015-07-01

    PolySearch2 (http://polysearch.ca) is an online text-mining system for identifying relationships between biomedical entities such as human diseases, genes, SNPs, proteins, drugs, metabolites, toxins, metabolic pathways, organs, tissues, subcellular organelles, positive health effects, negative health effects, drug actions, Gene Ontology terms, MeSH terms, ICD-10 medical codes, biological taxonomies and chemical taxonomies. PolySearch2 supports a generalized 'Given X, find all associated Ys' query, where X and Y can be selected from the aforementioned biomedical entities. An example query might be: 'Find all diseases associated with Bisphenol A'. To find its answers, PolySearch2 searches for associations against comprehensive free-text collections, including local versions of MEDLINE abstracts, PubMed Central full-text articles, Wikipedia full-text articles and US Patent application abstracts. PolySearch2 also searches 14 widely used, text-rich biological databases such as UniProt, DrugBank and Human Metabolome Database to improve its accuracy and coverage. PolySearch2 maintains an extensive thesaurus of biological terms and exploits the latest search engine technology to rapidly retrieve relevant articles and database records. PolySearch2 also generates, ranks and annotates associative candidates and presents results with relevancy statistics and highlighted key sentences to facilitate user interpretation. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Information Retrieval and Graph Analysis Approaches for Book Recommendation

    Directory of Open Access Journals (Sweden)

    Chahinez Benkoussas

    2015-01-01

    Full Text Available A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models, the probabilistic InL2 (a Divergence from Randomness model) and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to a network of related documents connected by social links. We call the network constructed from the documents and the social information provided by each of them a Directed Graph of Documents (DGD). Specifically, this work tackles the problem of book recommendation in the context of the INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.

  3. Information Retrieval and Graph Analysis Approaches for Book Recommendation.

    Science.gov (United States)

    Benkoussas, Chahinez; Bellot, Patrice

    2015-01-01

    A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models, the probabilistic InL2 (a Divergence from Randomness model) and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to a network of related documents connected by social links. We call the network constructed from the documents and the social information provided by each of them a Directed Graph of Documents (DGD). Specifically, this work tackles the problem of book recommendation in the context of the INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.
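
    The interpolated combination of retrieval models mentioned in this abstract can be pictured as a linear mix of per-document scores. The sketch below is illustrative only: the min-max normalisation, the weight `alpha`, the function names and the example scores are assumptions, not details taken from the paper.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate(scores_a, scores_b, alpha=0.5):
    """Linearly combine two score lists: alpha * a + (1 - alpha) * b.

    Documents missing from one list contribute 0 for that list.
    """
    a, b = min_max(scores_a), min_max(scores_b)
    docs = set(a) | set(b)
    return {d: alpha * a.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0) for d in docs}


def rank(scores):
    """Return doc ids sorted by descending score (ties broken by id)."""
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical scores from two retrieval models for three books.
inl2 = {"book1": 12.0, "book2": 9.0, "book3": 3.0}
lm = {"book1": -7.0, "book2": -4.0, "book3": -6.0}
combined = interpolate(inl2, lm, alpha=0.5)
print(rank(combined))  # ['book2', 'book1', 'book3']
```

    Normalising before mixing keeps one model's score scale from dominating the other; in practice `alpha` would be tuned on held-out queries.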

  4. Quantifying retrieval bias in Web archive search

    NARCIS (Netherlands)

    Samar, Thaer; Traub, Myriam C.; van Ossenbruggen, Jacco; Hardman, Lynda; de Vries, Arjen P.

    2018-01-01

    A Web archive usually contains multiple versions of documents crawled from the Web at different points in time. One possible way for users to access a Web archive is through full-text search systems. However, previous studies have shown that these systems can induce a bias, known as the

  5. Biomedical image representation approach using visualness and spatial information in a concept feature space for interactive region-of-interest-based retrieval.

    Science.gov (United States)

    Rahman, Md Mahmudur; Antani, Sameer K; Demner-Fushman, Dina; Thoma, George R

    2015-10-01

    This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term "concept" refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature.
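
    As a rough illustration of the entropy measure described in this abstract, the sketch below computes the Shannon entropy of the pixel values in a patch. Representing a patch as a flat list of grey levels is an assumption made for brevity, and the way the paper uses the value to refine the feature vector is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the empirical pixel-value distribution."""
    counts = Counter(pixels)
    total = len(pixels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


flat_patch = [128] * 16             # uniform grey patch
busy_patch = [0, 64, 128, 255] * 4  # four equally frequent grey levels
print(shannon_entropy(flat_patch))  # 0.0
print(shannon_entropy(busy_patch))  # 2.0
```

    A uniform patch carries no information and scores zero, while a patch with more varied pixel values scores higher.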

  6. Survey of keyword adjustment of published articles medical subject headings in journal of mazandaran university of medical sciences (2009-2010).

    Science.gov (United States)

    Kabirzadeh, Azar; Siamian, Hasan; Abadi, Ebrahim Bagherian Farah; Saravi, Benyamin Mohseni

    2013-01-01

    17 articles could be retrieved if the search words are selected from the MeSH. In this case the expected 100% of published articles titles at this university the validity of exchange of research projects which is something noteworthy. The lack of correlation between the number of authors and the matching of keywords with MeSH may mean that not all of a paper's authors took part in writing it; it is understood that only one author wrote the paper.

  7. Using a Google Search Appliance (GSA) to search digital library collections: a case study of the INIS Collection Search

    OpenAIRE

    Savic, Dobrica

    2014-01-01

    Libraries are facing many challenges today. In addition to diminishing funding and increased user expectations, the use of classic library catalogues is becoming an additional challenge. Library users require fast and easy access to information resources regardless of whether the format used is paper or electronic. Google search, with its speed and simplicity, set a new standard for information retrieval which is hard to achieve with the previous generation of library search facilities. Put i...

  8. Personalizing Web Search based on User Profile

    OpenAIRE

    Utage, Sharyu; Ahire, Vijaya

    2016-01-01

    Web search engines are the most widely used tools for retrieving information from the World Wide Web. They help users find the most useful information. When different users search for the same information, a search engine returns the same results without considering who submitted the query. Personalized web search is a search technique for providing useful results. This paper models user preferences as hierarchical user profiles. A framework called UPS is proposed. It generalizes profile and m...

  9. Exploring Database Improvements for GPM Constellation Precipitation Retrievals

    Science.gov (United States)

    Ringerud, S.; Kidd, C.; Skofronick Jackson, G.

    2017-12-01

    The Global Precipitation Measurement Mission (GPM) offers an unprecedented opportunity for understanding and mapping of liquid and frozen precipitation on a global scale. GPM mission development of physically based retrieval algorithms, for application consistently across the constellation radiometers, relies on combined active-passive retrievals from the GPM core satellite as a transfer standard. Radiative transfer modeling is then utilized to compute a priori databases at the frequency and footprint geometry of each individual radiometer. The Goddard Profiling Algorithm (GPROF) performs constellation retrievals across the GPM databases in a Bayesian framework, constraining searches using model data on a pixel-by-pixel basis. This work explores how the retrieval might be enhanced with additional information available within the brightness temperature observations themselves. In order to better exploit available information content, model water vapor is replaced with retrieved water vapor. Rather than treating each footprint as a 1D profile alone in space, information regarding Tb variability in the horizontal is added as well as variability in the time dimension. This additional information is tested and evaluated for retrieval improvement in the context of the Bayesian retrieval scheme. Retrieval differences are presented as a function of precipitation and surface type for evaluation of where the added information proves most effective.

  10. Bibliographic Information Retrieval Systems: Increasing Cognitive Compatibility.

    Science.gov (United States)

    Smith, Philip J.; And Others

    1987-01-01

    Discusses the impact of research in artificial intelligence and human computer interaction on the design of bibliographic information retrieval systems, and presents design principles of a prototype system that uses semantically based searches and a knowledge base consisting of conceptual frames. (10 references) (CLB)

  11. PcapDB: Search Optimized Packet Capture, Version 0.1.0.0

    Energy Technology Data Exchange (ETDEWEB)

    2016-11-04

    PcapDB is a packet capture system designed to optimize the captured data for fast search in the typical (network incident response) use case. The technology involved in this software has been submitted via the IDEAS system and has been filed as a provisional patent. It includes the following primary components: capture: The capture component utilizes existing capture libraries to retrieve packets from network interfaces. Once retrieved, the packets are passed to additional threads for sorting into flows and indexing. The sorted flows and indexes are passed to other threads so that they can be written to disk. These components are written in the C programming language. search: The search components provide a means to find relevant flows and the associated packets. A search query is parsed and represented as a search tree. Various search commands, written in C, are then used to resolve this tree into a set of search results. The tree generation and search execution management components are written in Python. interface: The PcapDB web interface is written in Python on the Django framework. It provides a series of pages, APIs, and asynchronous tasks that allow the user to manage the capture system, perform searches, and retrieve results. Web page components are written in HTML, CSS and JavaScript.
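
    The idea of representing a query as a search tree and resolving it into a result set can be sketched as follows. This is not PcapDB's code or API: the node classes, the `resolve` function and the toy index are hypothetical names chosen for illustration, and only AND/OR over indexed field values is assumed.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Term:
    """Leaf node: one indexed field/value pair, e.g. ("dst_port", "443")."""
    key: Tuple[str, str]


@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[Term, And, Or]


def resolve(node: Node, index: Dict[Tuple[str, str], Set[int]]) -> Set[int]:
    """Resolve a search tree into the set of matching flow ids."""
    if isinstance(node, Term):
        return set(index.get(node.key, set()))
    left = resolve(node.left, index)
    right = resolve(node.right, index)
    return left & right if isinstance(node, And) else left | right


# Hypothetical index: (field, value) -> ids of flows containing it.
index = {
    ("src_ip", "10.0.0.5"): {1, 2, 3},
    ("dst_port", "443"): {2, 3, 4},
    ("dst_port", "22"): {1, 5},
}
tree = And(Term(("src_ip", "10.0.0.5")),
           Or(Term(("dst_port", "443")), Term(("dst_port", "22"))))
print(sorted(resolve(tree, index)))  # [1, 2, 3]
```

    Resolving leaves against an index first and combining sets on the way up means packets need to be fetched only for flows that survive the whole tree.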

  12. Intelligent image retrieval based on radiology reports

    Energy Technology Data Exchange (ETDEWEB)

    Gerstmair, Axel; Langer, Mathias; Kotter, Elmar [University Medical Center Freiburg, Department of Diagnostic Radiology, Freiburg (Germany)]; Daumke, Philipp; Simon, Kai [Averbis GmbH, Freiburg (Germany)]

    2012-12-15

    To create an advanced image retrieval and data-mining system based on in-house radiology reports. Radiology reports are semantically analysed using natural language processing (NLP) techniques and stored in a state-of-the-art search engine. Images referenced by sequence and image number in the reports are retrieved from the picture archiving and communication system (PACS) and stored for later viewing. A web-based front end is used as an interface to query for images and show the results with the retrieved images and report text. Using a comprehensive radiological lexicon for the underlying terminology, the search algorithm also finds results for synonyms, abbreviations and related topics. The test set was 108 manually annotated reports analysed by different system configurations. Best results were achieved using full syntactic and semantic analysis with a precision of 0.929 and recall of 0.952. Operating successfully since October 2010, 258,824 reports have been indexed and a total of 405,146 preview images are stored in the database. Data-mining and NLP techniques provide quick access to a vast repository of images and radiology reports with both high precision and recall values. Consequently, the system has become a valuable tool in daily clinical routine, education and research. (orig.)

  13. Intelligent image retrieval based on radiology reports

    International Nuclear Information System (INIS)

    Gerstmair, Axel; Langer, Mathias; Kotter, Elmar; Daumke, Philipp; Simon, Kai

    2012-01-01

    To create an advanced image retrieval and data-mining system based on in-house radiology reports. Radiology reports are semantically analysed using natural language processing (NLP) techniques and stored in a state-of-the-art search engine. Images referenced by sequence and image number in the reports are retrieved from the picture archiving and communication system (PACS) and stored for later viewing. A web-based front end is used as an interface to query for images and show the results with the retrieved images and report text. Using a comprehensive radiological lexicon for the underlying terminology, the search algorithm also finds results for synonyms, abbreviations and related topics. The test set was 108 manually annotated reports analysed by different system configurations. Best results were achieved using full syntactic and semantic analysis with a precision of 0.929 and recall of 0.952. Operating successfully since October 2010, 258,824 reports have been indexed and a total of 405,146 preview images are stored in the database. Data-mining and NLP techniques provide quick access to a vast repository of images and radiology reports with both high precision and recall values. Consequently, the system has become a valuable tool in daily clinical routine, education and research. (orig.)

  14. Comparing the Scale of Web Subject Directories Precision in Technical-Engineering Information Retrieval

    Directory of Open Access Journals (Sweden)

    Mehrdokht Wazirpour Keshmiri

    2012-07-01

    Full Text Available The main purpose of this research was to compare the precision of web subject directories in retrieving technical-engineering information. Data were gathered by documentary and webometric methods. Keywords were chosen from twenty different technical-engineering subjects, drawn from IEEE (Institute of Electrical and Electronics Engineers) and from engineering journals on the ScienceDirect site. These keywords were searched in five heavily used web subject directories: Yahoo, Google, Infomine, Intute and Dmoz. Because the first results returned by search tools are usually the most closely related to the search keywords, the first ten results of every search were evaluated. The assessments comprised the precision, the error rate, and the ratio of items retrieved in technical-engineering categories to all items retrieved. The criteria used to determine precision, which followed widely used standards from the literature, comprised the presence of the keywords in the title, the appearance of the keywords in the body of the retrieved pages, keyword adjacency, the page URL, the page description and the subject categories. The data were analysed with the Kruskal-Wallis test and Fisher's LSD. The results revealed a significant difference in the precision of web subject directories in retrieving technical-engineering information, so the hypothesis was confirmed. Ranked by precision, the web subject directories were Google, Yahoo, Intute, Dmoz and Infomine. The error rate observed in the first results was another criterion used to compare the directories: Yahoo had the lowest error rate and Infomine the highest. This research also compared the number of items retrieved across all categories of the web subject directories with the number retrieved in technical-engineering categories; the results revealed a significant difference between them.
And

  15. Millennial Students' Mental Models of Information Retrieval

    Science.gov (United States)

    Holman, Lucy

    2009-01-01

    This qualitative study examines first-year college students' online search habits in order to identify patterns in millennials' mental models of information retrieval. The study employed a combination of modified contextual inquiry and concept mapping methodologies to elicit students' mental models. The researcher confirmed previously observed…

  16. The Development of Relevance in Information Retrieval

    Directory of Open Access Journals (Sweden)

    Mu-hsuan Huang

    1997-12-01

    Full Text Available This article investigates the notion of relevance in information retrieval. It discusses various definitions of relevance from a historical viewpoint and the characteristics of relevance judgments. It also introduces empirical results from important related research. [Article content in Chinese]

  17. Pathfinder: multiresolution region-based searching of pathology images using IRM.

    OpenAIRE

    Wang, J. Z.

    2000-01-01

    The fast growth of digitized pathology slides has created great challenges in research on image database retrieval. The prevalent retrieval technique involves human-supplied text annotations to describe slide contents. These pathology images typically have very high resolution, making it difficult to search based on image content. In this paper, we present Pathfinder, an efficient multiresolution region-based searching system for high-resolution pathology image libraries. The system uses wave...

  18. Indexing Bibliographic Database Content Using MariaDB and Sphinx Search Server

    Directory of Open Access Journals (Sweden)

    Arie Nugraha

    2014-07-01

    Full Text Available Fast retrieval of digital content has become mandatory for library and archive information systems. Many software applications have emerged to handle the indexing of digital content, from low-level ones such as Apache Lucene to more RESTful and web-services-ready ones such as Apache Solr and ElasticSearch. Solr’s popularity among library software developers makes it the “de facto” standard software for indexing digital content. For content (full-text content or bibliographic description) already stored inside a relational DBMS such as MariaDB (a fork of MySQL) or PostgreSQL, Sphinx Search Server (Sphinx) is a suitable alternative. This article gives an introduction to using Sphinx with MariaDB databases to index database content, as well as some examples of Sphinx API usage.

  19. Research of image retrieval technology based on color feature

    Science.gov (United States)

    Fu, Yanjun; Jiang, Guangyu; Chen, Fengying

    2009-10-01

    Recently, with the development of communication and computer technology and improvements in storage technology and in the capability of digital imaging equipment, more image resources are available than ever before. A way to locate the right image quickly and accurately is therefore needed. The early method was to assign keywords to images and search for them in the database, but this has become very difficult as the number of pictures to be searched has grown. To overcome the limitations of this traditional search method, content-based image retrieval technology was developed, and it is now a hot research subject. Color image retrieval is an important part of it, and color is the most important feature for color image retrieval. Three key questions on how to make use of color characteristics are discussed in the paper: the representation of color, the extraction of color characteristics and the measurement of similarity based on color. On this basis, the extraction of the color histogram feature is discussed in particular. Considering the advantages and disadvantages of the overall histogram and the partition histogram, a new method based on the partition-overall histogram is proposed. The basic idea is to divide the image space according to a certain strategy and then calculate the color histogram of each block as the color feature of that block. Users choose the blocks that contain important spatial information and confirm their weight values. The system calculates the distance between the corresponding blocks that the users chose. The other blocks are merged into a partial overall histogram, and its distance is calculated as well. All the distances are then accumulated as the actual distance between two pictures.
The partition-overall histogram combines the advantages of the two methods above: choosing blocks makes the feature contain more spatial information, which can improve performance; the distances between partition-overall histogram
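
    The partition-overall histogram idea described in this abstract can be sketched roughly as below. Several details are assumptions made for illustration and are not taken from the paper: images are small grids of already quantised colour indices, the partition is a fixed 2 x 2 grid, the distance is L1 between normalised histograms, and all function names are hypothetical.

```python
def histogram(pixels, bins):
    """Normalised histogram of quantised colour indices in [0, bins)."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    return [c / total for c in counts]


def split_blocks(image, rows, cols):
    """Split a 2-D grid of colour indices into rows x cols blocks of pixels."""
    h, w = len(image), len(image[0])
    bh, bw = h // rows, w // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            blocks.append([image[y][x]
                           for y in range(r * bh, (r + 1) * bh)
                           for x in range(c * bw, (c + 1) * bw)])
    return blocks


def l1(h1, h2):
    """L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def partition_overall_distance(img_a, img_b, chosen, bins=4, rows=2, cols=2):
    """chosen maps block index -> weight for the user-selected blocks."""
    blocks_a = split_blocks(img_a, rows, cols)
    blocks_b = split_blocks(img_b, rows, cols)
    total = 0.0
    rest_a, rest_b = [], []
    for i, (ba, bb) in enumerate(zip(blocks_a, blocks_b)):
        if i in chosen:
            # Selected blocks are compared one by one, keeping spatial information.
            total += chosen[i] * l1(histogram(ba, bins), histogram(bb, bins))
        else:
            # Unselected blocks are pooled into one partial overall histogram.
            rest_a.extend(ba)
            rest_b.extend(bb)
    if rest_a:
        total += l1(histogram(rest_a, bins), histogram(rest_b, bins))
    return total


# Two toy images with identical overall histograms but swapped top blocks.
img_a = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
img_b = [[1, 1, 0, 0], [1, 1, 0, 0], [2, 2, 3, 3], [2, 2, 3, 3]]
print(partition_overall_distance(img_a, img_b, chosen={}))        # 0.0
print(partition_overall_distance(img_a, img_b, chosen={0: 1.0}))
```

    With no block selected the measure reduces to an overall histogram comparison and cannot tell the two toy images apart; selecting the top-left block exposes the spatial difference.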

  20. A cloud-based framework for large-scale traditional Chinese medical record retrieval.

    Science.gov (United States)

    Liu, Lijun; Liu, Li; Fu, Xiaodong; Huang, Qingsong; Zhang, Xianwen; Zhang, Yin

    2018-01-01

    Electronic medical records are increasingly common in medical practice. The secondary use of medical records has become increasingly important. It relies on the ability to retrieve the complete information about desired patient populations. How to effectively and accurately retrieve relevant medical records from large-scale medical big data is becoming a big challenge. Therefore, we propose an efficient and robust cloud-based framework for large-scale Traditional Chinese Medical Records (TCMRs) retrieval. First, we propose a parallel index building method and build a distributed search cluster; the former is used to improve the performance of index building, and the latter is used to provide highly concurrent online TCMRs retrieval. Then, a real-time multi-indexing model is proposed to ensure the latest relevant TCMRs are indexed and retrieved in real-time, and a semantics-based query expansion method and a multi-factor ranking model are proposed to improve retrieval quality. Third, we implement a template-based visualization method for displaying medical reports. The proposed parallel indexing method and distributed search cluster can improve the performance of index building and provide highly concurrent online TCMRs retrieval. The multi-indexing model can ensure the latest relevant TCMRs are indexed and retrieved in real-time. The semantics expansion method and the multi-factor ranking model can enhance retrieval quality. The template-based visualization method can enhance the availability and universality, where the medical reports are displayed via a friendly web interface. In conclusion, compared with the current medical record retrieval systems, our system provides some advantages that are useful in improving the secondary use of large-scale traditional Chinese medical records in a cloud environment. The proposed system is more easily integrated with existing clinical systems and can be used in various scenarios. Copyright © 2017. Published by Elsevier Inc.

  1. Personalizing Information Retrieval Using Interaction Behaviors in Search Sessions in Different Types of Tasks

    Science.gov (United States)

    Liu, Chang

    2012-01-01

    When using information retrieval (IR) systems, users often pose short and ambiguous query terms. It is critical for IR systems to obtain more accurate representation of users' information need, their document preferences, and the context they are working in, and then incorporate them into the design of the systems to tailor retrieval to…

  2. Three-Dimensional Model Retrieval Using Dynamic Multi-Descriptor Fusion

    Institute of Scientific and Technical Information of China (English)

    Jau-Ling Shi; Chang-Hsing Lee; Yao-Wen Hou; Po-Ting Yeh

    2017-01-01

    In this paper, we propose a dynamic multi-descriptor fusion (DMDF) approach to improving the retrieval accuracy of 3-dimensional (3D) model retrieval systems. First, an independent retrieval list is generated by using each individual descriptor. Second, we propose an automatic relevant/irrelevant models selection (ARMS) approach to selecting the relevant and irrelevant 3D models automatically without any user interaction. A weighted distance, in which the weight associated with each individual descriptor is learnt by using the selected relevant and irrelevant models, is used to measure the similarity between two 3D models. Furthermore, a descriptor-dependent adaptive query point movement (AQPM) approach is employed to update every feature vector. This set of new feature vectors is used to index 3D models in the next search process. Four 3D model databases are used to compare the retrieval accuracy of our proposed DMDF approach with several descriptors as well as some well-known information fusion methods. Experimental results have shown that our proposed DMDF approach provides a promising retrieval result and always yields the best retrieval accuracy.
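
    The weighted distance over several descriptors mentioned in this abstract can be illustrated with a minimal sketch. The descriptor names, the Euclidean per-descriptor distance and the fixed weights below are assumptions; the paper learns its weights from automatically selected relevant and irrelevant models, which is not reproduced here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def fused_distance(query, model, weights):
    """Weighted sum of per-descriptor distances.

    query and model map descriptor name -> feature vector;
    weights maps descriptor name -> non-negative weight.
    """
    return sum(w * euclidean(query[name], model[name])
               for name, w in weights.items())


# Hypothetical descriptors and weights (not taken from the paper).
query = {"shape": [0.0, 0.0], "depth": [1.0]}
models = {
    "chair": {"shape": [3.0, 4.0], "depth": [1.0]},
    "table": {"shape": [0.0, 1.0], "depth": [5.0]},
}
weights = {"shape": 0.8, "depth": 0.2}
ranking = sorted(models, key=lambda m: fused_distance(query, models[m], weights))
print(ranking)  # ['table', 'chair']
```

    Raising the weight of a descriptor makes disagreement on that descriptor count more, which is how a learnt weighting can adapt the ranking to each query.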

  3. Finding electronic information for health policy advocacy: a guide to improving search results.

    Science.gov (United States)

    Olsan, Tobie H; Bianchi, Carolanne; White, Pamela; Glessner, Theresa; Mapstone, Pamela L

    2011-12-01

    The success of advanced practice registered nurses' (APRNs') health policy advocacy depends on staying well informed about key issues. Searching for high-quality health policy information, however, can be frustrating and time consuming. Busy clinicians need strategies and tips to reduce information overload and to access synthesized research for evidence-based health policy. This article therefore offers APRNs practical guidelines and resources for searching electronic health policy information. Scholarly databases and Internet sites. Electronic health policy information is generated by a wide variety of public and private organizations and disseminated in hundreds of journals and Web pages. Specialty search tools are needed to retrieve the unindexed gray literature, which includes government documents, agency reports, fact sheets, standards, and statistics not produced by commercial publishers. Further, Internet users need to examine search results with a critical eye for information quality. Expertise in searching electronic health policy information is a prerequisite for developing APRNs' leadership in political arenas to influence health policy and the delivery of healthcare services. ©2011 The Author(s) Journal compilation ©2011 American Academy of Nurse Practitioners.

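  4. PENGEMBANGAN SISTEM TEMU KEMBALI INFORMASI DIGITAL FULLTEXT ARTIKEL JURNAL DI PDII – LIPI [Development of a Digital Full-Text Information Retrieval System for Journal Articles at PDII – LIPI]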

    Directory of Open Access Journals (Sweden)

    Sjaeful Afandi

    2016-03-01

    Full Text Available One of the tasks of the Center for Scientific Documentation and Information – Indonesian Institute of Sciences (PDII - LIPI) is to disseminate the results of research conducted in Indonesia. Research results can take the form of books or journal articles. Currently, journal articles are retrieved with a traditional retrieval system that disregards the relevance of the search results: results are ordered only by the order of data entry. Full-text retrieval of digital articles therefore needs to be developed as an alternative. The information retrieval system was developed using Sphinx Search software. The data are 1000 files converted from Portable Document Format (PDF) into XML. The converted data are then processed through tokenization and indexing using Sphinx Search. The retrieval system was tested with predetermined queries, and the retrieval results were evaluated using the standard eleven recall levels, taking relevance and accuracy into account. The retrieval system produces relevant and accurate search results, with an average precision (AVP) of 79%.
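
    Evaluation at eleven standard recall levels is commonly computed as 11-point interpolated average precision. The sketch below shows that textbook calculation for a single query; it is an assumption that this matches the exact procedure used in the study, and the document ids are invented.

```python
def eleven_point_average_precision(ranked, relevant):
    """11-point interpolated average precision for one query.

    ranked: list of document ids in retrieval order.
    relevant: set of ids judged relevant.
    """
    if not relevant:
        return 0.0
    points = []  # (recall, precision) after each relevant hit
    hits = 0
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / i))
    total = 0.0
    for level in range(11):
        r = level / 10
        # Interpolated precision: best precision at any recall >= r.
        candidates = [p for rec, p in points if rec >= r]
        total += max(candidates, default=0.0)
    return total / 11


# Hypothetical ranking in which d1 and d3 are the relevant documents.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
print(round(eleven_point_average_precision(ranked, relevant), 4))  # 0.8485
```

    Averaging this value over all test queries gives a single figure comparable to the average precision reported above.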

  5. An information retrieval system using weighted descriptors generated by automatic frequency counting

    International Nuclear Information System (INIS)

    Komatsubara, Yasutoshi

    1979-01-01

    An information retrieval system with improved relevance is described, in which a weighted descriptor file, generated by feedback of the requester's relevance judgement on pretest results, is used. This method does not need modification of search formulas, works better merely by setting weight thresholds, and can alleviate searcher duties, as examples show. Index word weighting and retrieval word weighting are compared, and some problems to be encountered when retrieval word weighting is combined with operational systems are pointed out. (author)

  6. High-performance information search filters for CKD content in PubMed, Ovid MEDLINE, and EMBASE.

    Science.gov (United States)

    Iansavichus, Arthur V; Hildebrand, Ainslie M; Haynes, R Brian; Wilczynski, Nancy L; Levin, Adeera; Hemmelgarn, Brenda R; Tu, Karen; Nesrallah, Gihad E; Nash, Danielle M; Garg, Amit X

    2015-01-01

    Finding relevant articles in large bibliographic databases such as PubMed, Ovid MEDLINE, and EMBASE to inform care and future research is challenging. Articles relevant to chronic kidney disease (CKD) are particularly difficult to find because they are often published under different terminology and are found across a wide range of journal types. We used computer automation within a diagnostic test assessment framework to develop and validate information search filters to identify CKD articles in large bibliographic databases. 22,992 full-text articles in PubMed, Ovid MEDLINE, or EMBASE. 1,374,148 unique search filters. We established the reference standard of article relevance to CKD by manual review of all full-text articles using prespecified criteria to determine whether each article contained CKD content or not. We then assessed filter performance by calculating sensitivity, specificity, and positive predictive value for the retrieval of CKD articles. Filters with high sensitivity and specificity for the identification of CKD articles in the development phase (two-thirds of the sample) were then retested in the validation phase (remaining one-third of the sample). We developed and validated high-performance CKD search filters for each bibliographic database. Filters optimized for sensitivity reached at least 99% sensitivity, and filters optimized for specificity reached at least 97% specificity. The filters were complex; for example, one PubMed filter included more than 89 terms used in combination, including "chronic kidney disease," "renal insufficiency," and "renal fibrosis." In proof-of-concept searches, physicians found more articles relevant to the topic of CKD with the use of these filters. As knowledge of the pathogenesis of CKD grows and definitions change, these filters will need to be updated to incorporate new terminology used to index relevant articles. PubMed, Ovid MEDLINE, and EMBASE can be filtered reliably for articles relevant to CKD. 
These
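
    The three performance measures named in this abstract follow directly from the confusion matrix of a filter against the reference standard. A minimal sketch, assuming articles are identified by ids held in Python sets (the function name and the toy data are illustrative, and the denominators are assumed to be non-zero):

```python
def filter_performance(retrieved, relevant, all_articles):
    """Sensitivity, specificity and positive predictive value of a filter."""
    tp = len(retrieved & relevant)                   # relevant and retrieved
    fp = len(retrieved - relevant)                   # retrieved but not relevant
    fn = len(relevant - retrieved)                   # relevant but missed
    tn = len(all_articles - retrieved - relevant)    # correctly left out
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy collection of ten articles, four of which are truly relevant.
all_articles = set(range(10))
relevant = {0, 1, 2, 3}
retrieved = {0, 1, 2, 7}
print(filter_performance(retrieved, relevant, all_articles))
```

    A filter tuned for sensitivity accepts more false positives to miss fewer relevant articles, while one tuned for specificity does the opposite.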

  7. Search and Recommendation

    DEFF Research Database (Denmark)

    Bogers, Toine

    2014-01-01

    In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide...

  8. Feature hashing for fast image retrieval

    Science.gov (United States)

    Yan, Lingyu; Fu, Jiarun; Zhang, Hongxin; Yuan, Lu; Xu, Hui

    2018-03-01

    Currently, research on content-based image retrieval mainly focuses on robust feature extraction. However, due to the exponential growth of online images, it is necessary to consider searching among large-scale image collections, which is very time-consuming and unscalable. Hence, we need to pay much attention to the efficiency of image retrieval. In this paper, we propose a feature hashing method for image retrieval which not only generates a compact fingerprint for image representation, but also prevents huge semantic loss during the process of hashing. To generate the fingerprint, an objective function of semantic loss is constructed and minimized, which combines the influence of both the neighborhood structure of the feature data and the mapping error. Since machine-learning-based hashing effectively preserves the neighborhood structure of the data, it yields visual words with strong discriminability. Furthermore, the generated binary codes make building the image representation a low-complexity task, making it efficient and scalable to large-scale databases. Experimental results show good performance of our approach.
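
    Once images are reduced to compact binary fingerprints, retrieval becomes a comparison of bit strings. The sketch below ranks images by Hamming distance; it does not implement the paper's learned hash function, and the 8-bit codes and file names are invented for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nearest(query_code, database, k=2):
    """Return the k image ids whose codes are closest to the query."""
    ordered = sorted(database,
                     key=lambda img: (hamming(query_code, database[img]), img))
    return ordered[:k]


# Hypothetical 8-bit fingerprints.
database = {"cat.jpg": 0b10110010, "dog.jpg": 0b10110110, "car.jpg": 0b01001101}
print(nearest(0b10110011, database))  # ['cat.jpg', 'dog.jpg']
```

    An XOR followed by a bit count is far cheaper than comparing high-dimensional feature vectors, which is the main reason binary codes scale to large image collections.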

  9. Overview of the INEX 2014 Social Book Search Track

    DEFF Research Database (Denmark)

    Koolen, Marijn; Bogers, Toine; Kazai, Gabriella

    2014-01-01

    The goal of the INEX 2014 Social Book Search Track is to evaluate approaches for supporting users in searching collections of books based on book metadata and associated user-generated content. The track investigates the complex nature of relevance in book search and the role of traditional...... and user-generated book metadata in retrieval. We extended last year’s investigation into the nature of book suggestions from the LibraryThing forums and how they compare to book relevance judgements. Participants were encouraged to incorporate rich user profiles of both topic creators and other LibraryThing users to explore the relative value of recommendation and retrieval paradigms for book search. We found further support that such suggestions are a valuable alternative to traditional test collections that are based on top-k pooling and editorial relevance judgements....

  10. Diversification of visual media retrieval results using saliency detection

    Science.gov (United States)

    Muratov, Oleg; Boato, Giulia; De Natale, Francesco G. B.

    2013-03-01

    Diversification of retrieval results allows for better and faster search. Recently, different methods have been proposed for diversifying image retrieval results, mainly utilizing text information and techniques imported from the natural language processing domain. However, images contain visual information that is impossible to describe in text, so the use of visual features is inevitable. Visual saliency is information about the main object of an image, implicitly included by humans when creating visual content. For this reason, it is natural to exploit this information for the task of diversifying the content. In this work we study whether visual saliency can be used for the task of diversification and propose a method for re-ranking image retrieval results using saliency. The evaluation has shown that the use of saliency information results in higher diversity of retrieval results.

  11. Order effect in interactive information retrieval evaluation

    DEFF Research Database (Denmark)

    Clemmensen, Melanie Landvad; Borlund, Pia

    2016-01-01

    , and the good-subject effect shed light on how and why order effect may affect test participants’ IR system interaction and search behaviour. Research limitations/implications – Insight about order effect has implications for test design of IIR studies and hence the knowledge base generated on the basis...... of such studies. Due to the limited sample of 20 test participants (Library and Information Science (LIS) students) inference statistics is not applicable; hence conclusions can be drawn from this sample of test participants only. Originality/value – Only few studies in LIS focus on order effect and none from...... the perspective of IIR. Keywords Evaluation, Research methods, Information retrieval, User studies, Searching, Information searches...

  12. Identifying quality improvement intervention publications - A comparison of electronic search strategies

    Directory of Open Access Journals (Sweden)

    Rubenstein Lisa V

    2011-08-01

    Background: The evidence base for quality improvement (QI) interventions is expanding rapidly. The diversity of the initiatives and the inconsistency in labeling these as QI interventions make it challenging for researchers, policymakers, and QI practitioners to access the literature systematically and to identify relevant publications. Methods: We evaluated search strategies developed for MEDLINE (Ovid) and PubMed based on free text words, Medical Subject Headings (MeSH), QI intervention components, continuous quality improvement (CQI) methods, and combinations of the strategies. Three sets of pertinent QI intervention publications were used for validation. Two independent expert reviewers screened publications for relevance. We compared the yield, recall rate, and precision of the search strategies for the identification of QI publications and for a subset of empirical studies on effects of QI interventions. Results: The search yields ranged from 2,221 to 216,167 publications. Mean recall rates for reference publications ranged from 5% to 53% for strategies with yields of 50,000 publications or fewer. The 'best case' strategy, a simple text word search with high face validity ('quality' AND 'improv*' AND 'intervention*'), identified 44%, 24%, and 62% of influential intervention articles selected by Agency for Healthcare Research and Quality (AHRQ) experts, a set of exemplar articles provided by members of the Standards for Quality Improvement Reporting Excellence (SQUIRE) group, and a sample from the Cochrane Effective Practice and Organization of Care Group (EPOC) register of studies, respectively. We applied the search strategy to a PubMed search for articles published in 10 pertinent journals in a three-year period, which retrieved 183 publications. Among these, 67% were deemed relevant to QI by at least one of two independent raters. Forty percent were classified as empirical studies reporting on a QI intervention. 
Conclusions: The presented
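
    To illustrate how a truncated text-word strategy such as 'quality' AND 'improv*' AND 'intervention*' behaves, and how recall and precision are derived from its yield, here is a minimal sketch. It is a toy matcher over invented titles, not PubMed's or Ovid's search engine; every record and function name is an assumption.

```python
import re


def term_to_regex(term):
    """Turn 'improv*' into a word-prefix pattern and 'quality' into a whole-word pattern."""
    if term.endswith("*"):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def matches_all(text, terms):
    """Boolean AND: every term must occur somewhere in the text."""
    return all(term_to_regex(t).search(text) for t in terms)


def recall_precision(retrieved, relevant):
    """Recall = hits / relevant; precision = hits / retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else float("nan")
    precision = hits / len(retrieved) if retrieved else float("nan")
    return recall, precision


# Invented records; IDs 1, 2 and 4 are the (hypothetical) reference set.
records = {
    1: "A quality improvement intervention for diabetes care",
    2: "Improving quality through audit and feedback interventions",
    3: "Quality of life after surgery",
    4: "Reducing readmissions with a nurse-led programme",
    5: "Improved image quality with a software intervention in MRI",
}
strategy = ["quality", "improv*", "intervention*"]
retrieved = {rid for rid, title in records.items() if matches_all(title, strategy)}
recall, precision = recall_precision(retrieved, relevant={1, 2, 4})
print(sorted(retrieved), recall, precision)
```

    Record 4 shows why recall suffers when relevant work is not labeled with the expected words, and record 5 shows how a high-face-validity strategy still retrieves off-topic items.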

  13. Integrated optimization of location assignment and sequencing in multi-shuttle automated storage and retrieval systems under modified 2n-command cycle pattern

    Science.gov (United States)

    Yang, Peng; Peng, Yongfei; Ye, Bin; Miao, Lixin

    2017-09-01

    This article explores the integrated optimization problem of location assignment and sequencing in multi-shuttle automated storage/retrieval systems under the modified 2n-command cycle pattern. The decision of storage and retrieval (S/R) location assignment and S/R request sequencing are jointly considered. An integer quadratic programming model is formulated to describe this integrated optimization problem. The optimal travel cycles for multi-shuttle S/R machines can be obtained to process S/R requests in the storage and retrieval request order lists by solving the model. The small-sized instances are optimally solved using CPLEX. For large-sized problems, two tabu search algorithms are proposed, in which the first come, first served and nearest neighbour are used to generate initial solutions. Various numerical experiments are conducted to examine the heuristics' performance and the sensitivity of algorithm parameters. Furthermore, the experimental results are analysed from the viewpoint of practical application, and a parameter list for applying the proposed heuristics is recommended under different real-life scenarios.
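
    The two rules mentioned for generating initial solutions (first come, first served and nearest neighbour) can be sketched on a toy rack. This deliberately ignores the multi-shuttle 2n-command cycle and the location-assignment decision; the Chebyshev travel measure, the coordinates, and the function names are all assumptions for illustration.

```python
def travel(a, b):
    """Assumed travel measure: Chebyshev distance between two rack positions."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def nearest_neighbour_order(requests, start=(0, 0)):
    """Greedy sequence: always serve the closest remaining request next."""
    remaining = list(requests)
    order, current = [], start
    while remaining:
        nxt = min(remaining, key=lambda p: travel(current, p))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order


def route_length(order, start=(0, 0)):
    """Total travel for a sequence, starting and ending at the I/O point."""
    total, current = 0, start
    for p in order:
        total += travel(current, p)
        current = p
    return total + travel(current, start)


requests = [(5, 5), (1, 1), (4, 4), (2, 2)]   # arrival order = FCFS sequence
fcfs_length = route_length(requests)
nn_order = nearest_neighbour_order(requests)
nn_length = route_length(nn_order)
print(fcfs_length, nn_order, nn_length)
```

    Either sequence could then seed a tabu search, which would explore swaps of neighbouring requests while forbidding recently reversed moves.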

  14. Sampling criteria in multicollection searching.

    Science.gov (United States)

    Gilio, A.; Scozzafava, R.; Marchetti, P. G.

    In the first stage of the document retrieval process, no information concerning relevance of a particular document is available. On the other hand, computer implementation requires that the analysis be made only for a sample of retrieved documents. This paper addresses the significance and suitability of two different sampling criteria for a multicollection online search facility. The inevitability of resorting to a logarithmic criterion in order to achieve a "spread of representativeness" from the multicollection is demonstrated.

  15. Intelligent Search Optimization using Artificial Fuzzy Logics

    OpenAIRE

    Manral, Jai

    2015-01-01

    Information on the web is prodigious; finding relevant information is difficult, so web users rely on search engines to find it. Search engines index and categorize web pages according to their contents using crawlers and rank them accordingly. For a given user query, they retrieve millions of web pages and display them to users according to page rank. Every search engine has its own algorithms, based on certain parameters, for ranking web pages. Searc...

  16. The 25 most cited articles in arthroscopic orthopaedic surgery.

    Science.gov (United States)

    Cassar Gheiti, Adrian J; Downey, Richard E; Byrne, Damien P; Molony, Diarmuid C; Mulhall, Kevin J

    2012-04-01

    The purpose of this study was to use Web of Knowledge to determine which published arthroscopic surgery-related articles have been cited most frequently by other authors by ranking the 25 most cited articles. We furthermore wished to determine whether there is any difference between a categorical "journal-by-journal" analysis and an "all-database" analysis in arthroscopic surgery and whether such a search methodology would alter the results of previously published lists of "citation classics" in the field. We analyzed the characteristics of these articles to determine what qualities make an article important to this subspecialty of orthopaedic surgery. Web of Knowledge was searched on March 7, 2011, using the term "arthroscopy" for citations to articles related to arthroscopy in 61 orthopaedic journals and using the all-database function. Each of the 61 orthopaedic journals was searched separately for arthroscopy-related articles to determine the 25 most cited articles. An all-database search for arthroscopy-related articles was carried out and compared with a journal-by-journal search. Each article was reviewed for basic information including the type of article, authorship, institution, country, publishing journal, and year published. The number of citations ranged from 189 to 567 in a journal-by-journal search and from 214 to 1,869 in an all-database search. The 25 most cited articles on arthroscopic surgery were published in 11 journals: 8 orthopaedic journals and 3 journals from other specialties. The most cited article in arthroscopic orthopaedic surgery was published in The New England Journal of Medicine, which was not previously identified by a journal-by-journal search. An all-database search in Web of Knowledge gives a more in-depth methodology of determining the true citation ranking of articles. Among the top 25 most cited articles, autologous chondrocyte implantation/transplantation is currently the most cited and most popular topic in arthroscopic

  17. Monitoring User-System Performance in Interactive Retrieval Tasks

    NARCIS (Netherlands)

    Boldareva, L.; de Vries, A.P.; Hiemstra, Djoerd

    Monitoring user-system performance in interactive search is a challenging task. Traditional measures of retrieval evaluation, based on recall and precision, are not of any use in real time, for they require a priori knowledge of relevant documents. This paper shows how a Shannon entropy-based

  18. TX-Kw: An Effective Temporal XML Keyword Search

    OpenAIRE

    Rasha Bin-Thalab; Neamat El-Tazi; Mohamed E. El-Sharkawi

    2013-01-01

    Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has emerged recently. Existing methods cannot resolve the challenges raised by keyword search in temporal XML documents. We propose a way to evaluate temporal keyword search queries over temporal XML documents. Moreover, we propose a new ranking method, based on time-aware IR ranking methods, to rank the results of temporal keyword search queries. Extensive experiments have been ...

  19. The Application of Similar Image Retrieval in Electronic Commerce

    Science.gov (United States)

    Hu, YuPing; Yin, Hua; Han, Dezhi; Yu, Fei

    2014-01-01

    Traditional online shopping platform (OSP), which searches product information by keywords, faces three problems: indirect search mode, large search space, and inaccuracy in search results. For solving these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming at improving the network customers' experience and providing merchants with the accuracy of advertising, we design a reasonable and extensive electronic commerce application system, which includes three subsystems: image search display subsystem, image search subsystem, and product information collecting subsystem. This system can provide seamless connection between information platform and OSP, on which consumers can automatically and directly search similar images according to the pictures from information platform. At the same time, it can be used to provide accuracy of internet marketing for enterprises. The experiment shows the efficiency of constructing the system. PMID:24883411

  20. The Application of Similar Image Retrieval in Electronic Commerce

    Directory of Open Access Journals (Sweden)

    YuPing Hu

    2014-01-01

    Traditional online shopping platform (OSP), which searches product information by keywords, faces three problems: indirect search mode, large search space, and inaccuracy in search results. For solving these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming at improving the network customers’ experience and providing merchants with the accuracy of advertising, we design a reasonable and extensive electronic commerce application system, which includes three subsystems: image search display subsystem, image search subsystem, and product information collecting subsystem. This system can provide seamless connection between information platform and OSP, on which consumers can automatically and directly search similar images according to the pictures from information platform. At the same time, it can be used to provide accuracy of internet marketing for enterprises. The experiment shows the efficiency of constructing the system.

  1. The application of similar image retrieval in electronic commerce.

    Science.gov (United States)

    Hu, YuPing; Yin, Hua; Han, Dezhi; Yu, Fei

    2014-01-01

    Traditional online shopping platform (OSP), which searches product information by keywords, faces three problems: indirect search mode, large search space, and inaccuracy in search results. For solving these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming at improving the network customers' experience and providing merchants with the accuracy of advertising, we design a reasonable and extensive electronic commerce application system, which includes three subsystems: image search display subsystem, image search subsystem, and product information collecting subsystem. This system can provide seamless connection between information platform and OSP, on which consumers can automatically and directly search similar images according to the pictures from information platform. At the same time, it can be used to provide accuracy of internet marketing for enterprises. The experiment shows the efficiency of constructing the system.

  2. Improving performance of content-based image retrieval schemes in searching for similar breast mass regions: an assessment

    International Nuclear Information System (INIS)

    Wang Xiaohui; Park, Sang Cheol; Zheng Bin

    2009-01-01

    This study aims to assess three methods commonly used in content-based image retrieval (CBIR) schemes and investigate the approaches to improve scheme performance. A reference database involving 3000 regions of interest (ROIs) was established. Among them, 400 ROIs were randomly selected to form a testing dataset. Three methods, namely mutual information, Pearson's correlation and a multi-feature-based k-nearest neighbor (KNN) algorithm, were applied to search for the 15 'most similar' reference ROIs to each testing ROI. The clinical relevance and visual similarity of searching results were evaluated using the areas under receiver operating characteristic (ROC) curves (A_z) and average mean square difference (MSD) of the mass boundary spiculation level ratings between testing and selected ROIs, respectively. The results showed that the A_z values were 0.893 ± 0.009, 0.606 ± 0.021 and 0.699 ± 0.026 for the use of KNN, mutual information and Pearson's correlation, respectively. The A_z values increased to 0.724 ± 0.017 and 0.787 ± 0.016 for mutual information and Pearson's correlation when using ROIs with the size adaptively adjusted based on actual mass size. The corresponding MSD values were 2.107 ± 0.718, 2.301 ± 0.733 and 2.298 ± 0.743. The study demonstrates that due to the diversity of medical images, CBIR schemes using multiple image features and mass size-based ROIs can achieve significantly improved performance.
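
    Of the three similarity methods named above, Pearson's correlation is the simplest to show. The sketch below ranks reference items by their correlation with a query and returns the top k; the short lists stand in for flattened ROI pixel values, and the names and data are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant sequence has no defined correlation; treat as 0
    return cov / (sx * sy)


def most_similar(query, references, k):
    """Names of the k reference items most correlated with the query."""
    ranked = sorted(references.items(),
                    key=lambda item: pearson(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


query = [1, 2, 3, 4]                      # toy "pixel" values of a test ROI
references = {
    "a": [2, 4, 6, 8],                    # perfectly correlated
    "b": [4, 3, 2, 1],                    # perfectly anti-correlated
    "c": [1, 3, 2, 4],                    # partly correlated (r = 0.8)
}
top2 = most_similar(query, references, k=2)
print(top2)
```

    In the study the analogous call would request the 15 most similar reference ROIs for each testing ROI.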

  3. A comparative study of two neural networks for document retrieval

    International Nuclear Information System (INIS)

    Hui, S.C.; Goh, A.

    1997-01-01

    In recent years there has been specific interest in adopting advanced computer techniques in the field of document retrieval. This interest is generated by the fact that classical methods such as the Boolean search, the vector space model or even probabilistic retrieval cannot handle the increasing demands of end-users in satisfying their needs. The most recent attempt is the application of the neural network paradigm as a means of providing end-users with a more powerful retrieval mechanism. Neural networks are not only good pattern matchers but also highly versatile and adaptable. In this paper, we demonstrate how to apply two neural networks, namely Adaptive Resonance Theory and Fuzzy Kohonen Neural Network, for document retrieval. In addition, a comparison of these two neural networks based on performance is also given.

  4. 06491 Summary -- Digital Historical Corpora- Architecture, Annotation, and Retrieval

    OpenAIRE

    Burnard, Lou; Dobreva, Milena; Fuhr, Norbert; Lüdeling, Anke

    2007-01-01

    The seminar "Digital Historical Corpora" brought together scholars from (historical) linguistics, (historical) philology, computational linguistics and computer science who work with collections of historical texts. The issues that were discussed include digitization, corpus design, corpus architecture, annotation, search, and retrieval.

  5. A Novel Technique for Shape Feature Extraction Using Content Based Image Retrieval

    Directory of Open Access Journals (Sweden)

    Dhanoa Jaspreet Singh

    2016-01-01

    With the advance of technology and multimedia information, the number of digital images is increasing very quickly. Various techniques are being developed to retrieve or search the digital information contained in images. Traditional text-based image retrieval is inadequate: it is time-consuming because it requires manual image annotation, and annotations differ from person to person. An alternative is the Content Based Image Retrieval (CBIR) system, which retrieves or searches for images using their contents rather than text, keywords, etc. A great deal of research has been carried out on CBIR with various feature extraction techniques. Shape is a significant image feature because it reflects human perception. Moreover, shape is quite simple for the user to employ when defining an object in an image, compared with other features such as color and texture. In addition, no descriptor gives fruitful results if applied alone; by combining a descriptor with an improved classifier, one can use the positive features of both. Therefore, an attempt will be made to establish an algorithm for accurate feature (shape) extraction in CBIR. The main objectives of this project are: (a) to propose an algorithm for shape feature extraction using CBIR, (b) to evaluate the performance of the proposed algorithm, and (c) to compare the proposed algorithm with state-of-the-art techniques.

  6. Strategies to optimize MEDLINE and EMBASE search strategies for anesthesiology systematic reviews. An experimental study.

    Science.gov (United States)

    Volpato, Enilze de Souza Nogueira; Betini, Marluci; Puga, Maria Eduarda; Agarwal, Arnav; Cataneo, Antônio José Maria; Oliveira, Luciane Dias de; Bazan, Rodrigo; Braz, Leandro Gobbo; Pereira, José Eduardo Guimarães; El Dib, Regina

    2018-01-15

    A high-quality electronic search is essential for ensuring accuracy and comprehensiveness among the records retrieved when conducting systematic reviews. Therefore, we aimed to identify the most efficient method for searching in both MEDLINE (through PubMed) and EMBASE, covering search terms with variant spellings, direct and indirect orders, and associations with MeSH and EMTREE terms (or lack thereof). Experimental study. UNESP, Brazil. We selected and analyzed 37 search strategies that had specifically been developed for the field of anesthesiology. These search strategies were adapted in order to cover all potentially relevant search terms, with regard to variant spellings and direct and indirect orders, in the most efficient manner. When the strategies included variant spellings and direct and indirect orders, these adapted versions of the search strategies selected retrieved the same number of search results in MEDLINE (mean of 61.3%) and a higher number in EMBASE (mean of 63.9%) in the sample analyzed. The numbers of results retrieved through the searches analyzed here were not identical with and without associated use of MeSH and EMTREE terms. However, association of these terms from both controlled vocabularies retrieved a larger number of records than did the use of either one of them. In view of these results, we recommend that the search terms used should include both preferred and non-preferred terms (i.e. variant spellings and direct/indirect order of the same term) and associated MeSH and EMTREE terms, in order to develop highly-sensitive search strategies for systematic reviews.

  7. Mobile Visual Search Based on Histogram Matching and Zone Weight Learning

    Science.gov (United States)

    Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong

    2018-01-01

    In this paper, we propose a novel image retrieval algorithm for mobile visual search. First, a short visual codebook is generated based on the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging the tf-idf weighted histogram matching and the weighting strategy in compact descriptors for visual search (CDVS). Finally, the global descriptor matching score and the local descriptor similarity score are summed to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.

  8. What do reviewers look for in an original research article?

    Science.gov (United States)

    Shankar, P R

    2012-01-01

    This article describes common errors committed by authors, especially those whose first language is not English, while writing an original research article. Avoiding common errors and improving the chances of publication are also covered. This article may resemble instructions to authors; however, the tips are given from a reviewer's perspective. The abstract is the section of the paper most commonly read, and care should be taken while writing it. Keywords are used to retrieve articles following searches, and use of words from the MeSH database is recommended. The introduction describes work already conducted in the particular area and briefly mentions how the manuscript will add to existing knowledge. The methods section describes how the study was conducted, is written in the past tense, and is often the first part of the paper to be written. The results section describes what was found in the study and is usually written after the methods section. The discussion compares the study with the literature and helps to put the study findings in context. The conclusions should be based on the results of the study. The references should be written strictly according to the journal format. Language should be simple, the active voice should be used, and jargon avoided. Avoid directly quoting from reference articles; paraphrase them in your own words to avoid plagiarism.

  9. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    Science.gov (United States)

    Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem

    2016-01-01

    Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest to improve the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have been focusing on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for using batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
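
    The search-space reduction described above (compare the query with one representative per cluster, then search only inside the best-matching clusters) can be sketched generically. Real profile-HMM scoring relies on alignment algorithms; here a 2-mer Jaccard similarity over short strings is substituted purely as a placeholder, and the clusters, sequences, and function names are invented.

```python
def kmers(seq, k=2):
    """Set of overlapping k-mers of a string."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b):
    """Placeholder score: Jaccard similarity of 2-mer sets (not an HMM alignment score)."""
    ka, kb = kmers(a), kmers(b)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def clustered_search(query, clusters, top_clusters=1):
    """clusters maps each representative to its members (representative included).

    Returns the best-scoring member and the number of comparisons performed.
    """
    comparisons = 0
    scored_reps = []
    for rep in clusters:                       # stage 1: representatives only
        scored_reps.append((similarity(query, rep), rep))
        comparisons += 1
    scored_reps.sort(reverse=True)

    best, best_score = None, -1.0
    for _, rep in scored_reps[:top_clusters]:  # stage 2: members of the best clusters
        for member in clusters[rep]:
            score = similarity(query, member)
            comparisons += 1
            if score > best_score:
                best, best_score = member, score
    return best, comparisons


clusters = {
    "AAAAGG": ["AAAAGG", "AAAAGC", "AAAGGG"],
    "TTTTCC": ["TTTTCC", "TTTCCC", "TTTTCA"],
}
best, comparisons = clustered_search("AAAAGC", clusters)
print(best, comparisons)
```

    On this tiny example the saving is small (5 comparisons instead of 6), but the representative stage grows with the number of clusters rather than the number of profiles. Raising `top_clusters` trades some of that speed for recall.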

  10. Automated Text Markup for Information Retrieval from an Electronic Textbook of Infectious Disease

    Science.gov (United States)

    Berrios, Daniel C.; Kehler, Andrew; Kim, David K.; Yu, Victor L.; Fagan, Lawrence M.

    1998-01-01

    The information needs of practicing clinicians frequently require textbook or journal searches. Making these sources available in electronic form improves the speed of these searches, but precision (i.e., the fraction of relevant to total documents retrieved) remains low. Improving the traditional keyword search by transforming search terms into canonical concepts does not improve search precision greatly. Kim et al. have designed and built a prototype system (MYCIN II) for computer-based information retrieval from a forthcoming electronic textbook of infectious disease. The system requires manual indexing by experts in the form of complex text markup. However, this mark-up process is time consuming (about 3 person-hours to generate, review, and transcribe the index for each of 218 chapters). We have designed and implemented a system to semiautomate the markup process. The system, information extraction for semiautomated indexing of documents (ISAID), uses query models and existing information-extraction tools to provide support for any user, including the author of the source material, to mark up tertiary information sources quickly and accurately.

  11. [A retrieval method of drug molecules based on graph collapsing].

    Science.gov (United States)

    Qu, J W; Lv, X Q; Liu, Z M; Liao, Y; Sun, P H; Wang, B; Tang, Z

    2018-04-18

    To establish a compact and efficient hypergraph representation and a graph-similarity-based retrieval method of molecules to achieve effective and efficient medicine information retrieval. Chemical structural formula (CSF) was a primary search target as a unique and precise identifier for each compound at the molecular level in the research field of medicine information retrieval. To retrieve medicine information effectively and efficiently, a complete workflow of the graph-based CSF retrieval system was introduced. This system accepted the photos taken from smartphones and the sketches drawn on tablet personal computers as CSF inputs, and formalized the CSFs with the corresponding graphs. Then this paper proposed a compact and efficient hypergraph representation for molecules on the basis of analyzing factors that directly affected the efficiency of graph matching. According to the characteristics of CSFs, a hierarchical collapsing method combining graph isomorphism and frequent subgraph mining was adopted. There was yet a fundamental challenge, subgraph overlapping during the collapsing procedure, which hindered the method from establishing the correct compact hypergraph of an original CSF graph. Therefore, a graph-isomorphism-based algorithm was proposed to select dominant acyclic subgraphs on the basis of overlapping analysis. Finally, the spatial similarity among graphical CSFs was evaluated by multi-dimensional measures of similarity. To evaluate the performance of the proposed method, the proposed system was firstly compared with Wikipedia Chemical Structure Explorer (WCSE), the state-of-the-art system that allowed CSF similarity searching within Wikipedia molecules dataset, on retrieval accuracy. The system achieved higher values on mean average precision, discounted cumulative gain, rank-biased precision, and expected reciprocal rank than WCSE from the top-2 to the top-10 retrieved results. Specifically, the system achieved 10%, 1.41, 6.42%, and 1
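
    Two of the ranking measures named in this abstract have short textbook definitions, sketched below for a single ranked result list. Which exact variants the authors used is not stated here, so the normalisation choices (average precision divided by the number of relevant items in the list, log base 2 discount for DCG) are assumptions.

```python
from math import log2


def average_precision(relevance):
    """Mean of precision@rank taken at each rank holding a relevant item.

    relevance: list of 0/1 judgements in ranked order. Normalised by the number
    of relevant items in the list (an assumed convention).
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / log2(rank + 1) for rank, g in enumerate(gains, start=1))


ranking = [1, 0, 1, 0]          # toy judgements for the top-4 results
ap = average_precision(ranking)
gain = dcg(ranking)
print(ap, gain)
```

    Mean average precision is then the mean of `average_precision` over all queries.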

  12. Scalability of Findability: Decentralized Search and Retrieval in Large Information Networks

    Science.gov (United States)

    Ke, Weimao

    2010-01-01

    Amid the rapid growth of information today is the increasing challenge for people to survive and navigate its magnitude. Dynamics and heterogeneity of large information spaces such as the Web challenge information retrieval in these environments. Collection of information in advance and centralization of IR operations are hardly possible because…

  13. Comparing the Influence of Title and URL in Information Retrieval Relevance in Search Engines Results between Human Science and Agriculture Science

    Directory of Open Access Journals (Sweden)

    Parisa Allami

    2012-12-01

    Because the World Wide Web gives scientists suitable means of producing and publishing information, it has become a medium for publishing information. This environment comprises billions of web pages, each with its own title, content, address, and purpose. Search engines provide a variety of facilities for limiting search results in order to raise the likelihood of relevant retrieval results. One such facility restricts keywords and search terms to the title or URL, which can significantly increase the likelihood of relevant results. Search engines claim that results limited to the title and URL are the most relevant. This study compared, from the users' point of view, the relevance of results limited to the title with that of results limited to the URL in the agricultural sciences and the humanities. It also compared the presence of keywords in the title and URL between the two areas and examined the relationship between the number of search keywords and the matching of keywords in titles and their URLs. For this purpose, 30 master's students in each area who were working on their theses were selected. The results were significantly relevant when the students limited their information needs to the title and URL. URL results were significantly relevant in the agricultural area, but there was no significant difference between title and URL results in the humanities. To compare the number of keywords in the title and URL in the two areas, 30 keywords in each area were chosen. There was no significant difference between the number of keywords in the title and URL of websites in the two areas. To show the relationship between the number of search keywords and the matching of title and URL, 45 keywords in each area were chosen. They were divided into three groups (one keyword, two keywords, and three keywords). It was determined that the fewer the search keywords, the greater the match between title and URL, and if the matching

  14. Do family physicians retrieve synopses of clinical research previously read as email alerts?

    Science.gov (United States)

    Grad, Roland; Pluye, Pierre; Johnson-Lafleur, Janique; Granikov, Vera; Shulha, Michael; Bartlett, Gillian; Marlow, Bernard

    2011-11-30

    A synopsis of new clinical research highlights important aspects of one study in a brief structured format. When delivered as email alerts, synopses enable clinicians to become aware of new developments relevant for practice. Once read, a synopsis can become a known item of clinical information. In time-pressured situations, remembering a known item may facilitate information retrieval by the clinician. However, exactly how synopses first delivered as email alerts influence retrieval at some later time is not known. We examined searches for clinical information in which a synopsis previously read as an email alert was retrieved (defined as a dyad). Our study objectives were to (1) examine whether family physicians retrieved synopses they previously read as email alerts and then to (2) explore whether family physicians purposefully retrieved these synopses. We conducted a mixed-methods study in which a qualitative multiple case study explored the retrieval of email alerts within a prospective longitudinal cohort of practicing family physicians. Reading of research-based synopses was tracked in two contexts: (1) push, meaning to read on email and (2) pull, meaning to read after retrieval from one electronic knowledge resource. Dyads, defined as synopses first read as email alerts and subsequently retrieved in a search of a knowledge resource, were prospectively identified. Participants were interviewed about all of their dyads. Outcomes were the total number of dyads and their type. Over a period of 341 days, 194 unique synopses delivered to 41 participants resulted in 4937 synopsis readings. In all, 1205 synopses were retrieved over an average of 320 days. Of the 1205 retrieved synopses, 21 (1.7%) were dyads made by 17 family physicians. Of the 1205 retrieved synopses, 6 (0.5%) were known item type dyads. However, dyads also occurred serendipitously. In the single knowledge resource we studied, email alerts containing research-based synopses were rarely retrieved

  15. Simplified automatic on-line document searching

    International Nuclear Information System (INIS)

    Ebinuma, Yukio

    1983-01-01

    The author proposes a searching method for users who do not need comprehensive retrieval; it automatically provides a flexible number of related documents. A group of technical terms is used as search terms to express an inquiry. Logical sums of the terms, taken in ascending order of frequency of usage, are prepared sequentially and automatically, and the search formulas q_m and q_(m-1) that meet certain threshold values are then also selected automatically. Users judge the precision of the search output on up to 20 items retrieved by the formula q_m. If a user wishes a recall ratio of more than 30%, the search result should be output by q_m; if a recall ratio of less than 30% is wanted, it should be output by q_(m-1). A search by this method using one year's volume of the INIS Database (76,600 items) and five inquiries resulted in an average recall ratio of 32% and an average precision ratio of 36% in the case of q_m. The terminal connection time was within 15 minutes per inquiry. The method was more efficient than an inexperienced searcher. It can be applied to on-line searching systems for databases in which natural language only, or natural language and controlled vocabulary, are used. (author)

  16. Two Search Techniques within a Human Pedigree Database

    OpenAIRE

    Gersting, J. M.; Conneally, P. M.; Rogers, K.

    1982-01-01

    This paper presents the basic features of two search techniques from MEGADATS-2 (MEdical Genetics Acquisition and DAta Transfer System), a system for collecting, storing, retrieving and plotting human family pedigrees. The individual search provides a quick method for locating an individual in the pedigree database. This search uses a modified soundex coding and an inverted file structure based on a composite key. The navigational search uses a set of pedigree traversal operations (individual...

  17. Video Stream Retrieval of Unseen Queries using Semantic Memory

    NARCIS (Netherlands)

    Cappallo, S.; Mensink, T.; Snoek, C.G.M.; Wilson, R.C.; Hancock, E.R.; Smith, W.A.P.

    2016-01-01

    Retrieval of live, user-broadcast video streams is an under-addressed and increasingly relevant challenge. The on-line nature of the problem requires temporal evaluation and the unforeseeable scope of potential queries motivates an approach which can accommodate arbitrary search queries. To account

  18. Predicting the performance of fingerprint similarity searching.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
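
    The comparison of feature distributions described above can be illustrated with a minimal sketch. It assumes, for simplicity, that each fingerprint bit is an independent Bernoulli feature and sums the per-bit Kullback-Leibler divergences; the abstract does not give the authors' exact formulation, and the function names, the smoothing pseudocount and the toy fingerprints below are all hypothetical.

```python
import math

def bit_frequencies(fingerprints, pseudocount=1.0):
    """Per-bit 'on' frequency, smoothed so no frequency is exactly 0 or 1."""
    n = len(fingerprints)
    length = len(fingerprints[0])
    return [
        (sum(fp[i] for fp in fingerprints) + pseudocount) / (n + 2 * pseudocount)
        for i in range(length)
    ]

def kl_divergence_bits(p, q):
    """Sum over bits of KL(Bernoulli(p_i) || Bernoulli(q_i)), in nats."""
    total = 0.0
    for pi, qi in zip(p, q):
        total += pi * math.log(pi / qi)
        total += (1 - pi) * math.log((1 - pi) / (1 - qi))
    return total

# Hypothetical toy data: 4-bit fingerprints of reference (active) compounds
# and of a screening library.
reference = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0]]
library = [[0, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [0, 1, 0, 1]]

p = bit_frequencies(reference)
q = bit_frequencies(library)
divergence = kl_divergence_bits(p, q)
print(f"KL(reference || library) = {divergence:.3f} nats")
```

    A divergence near zero corresponds to the case in the abstract where active compounds disappear in the "background"; relating the value to recovery rates would additionally require benchmark data that is not reproduced here.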

  19. Application of discriminative models for interactive query refinement in video retrieval

    Science.gov (United States)

    Srivastava, Amit; Khanwalkar, Saurabh; Kumar, Anoop

    2013-12-01

    The ability to quickly search large volumes of video for specific actions or events can provide a dramatic new capability to intelligence agencies. Example-based queries from video are a form of content-based information retrieval (CBIR) where the objective is to retrieve clips from a video corpus, or stream, using a representative query sample to find more like this. Often, the accuracy of video retrieval is largely limited by the gap between the available video descriptors and the underlying query concept, and such exemplar queries return many irrelevant results along with relevant ones. In this paper, we present an Interactive Query Refinement (IQR) system which acts as a powerful tool to leverage human feedback and allows intelligence analysts to iteratively refine search queries for improved precision in the retrieved results. In our approach to IQR, we leverage discriminative models that operate on high-dimensional features derived from low-level video descriptors in an iterative framework. Our IQR model solicits relevance feedback on examples selected from the region of uncertainty and updates the discriminating boundary to produce a relevance-ranked results list. We achieved a 358% relative improvement in Mean Average Precision (MAP) over the initial retrieval list at a rank cutoff of 100 over 4 iterations. We compare our discriminative IQR model approach to a naïve IQR and show that our model-based approach yields a 49% relative improvement over the no-model naïve system.
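
    To make the reported metric concrete, the following sketch computes Mean Average Precision at a rank cutoff and the relative improvement between two result lists. It is illustrative only: the function names and toy relevance lists are hypothetical, and average precision is normalized here by the number of relevant items found within the cutoff, which is one common variant and not necessarily the one the authors used.

```python
def average_precision(ranked_relevance, cutoff=100):
    """Average precision over the top `cutoff` results.

    `ranked_relevance` is a list of 0/1 flags in ranked order.
    Assumption: normalize by the relevant items found within the cutoff.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance[:cutoff], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(runs, cutoff=100):
    """Mean of the per-query average precision values."""
    return sum(average_precision(run, cutoff) for run in runs) / len(runs)

def relative_improvement(baseline, improved):
    """Relative improvement in percent, e.g. 0.10 -> 0.458 is about 358%."""
    return 100.0 * (improved - baseline) / baseline

# Hypothetical relevance flags for two queries, before and after refinement.
initial = [[0, 0, 1, 0, 1], [0, 1, 0, 0, 0]]
refined = [[1, 1, 0, 0, 0], [1, 0, 0, 0, 0]]

map_initial = mean_average_precision(initial)
map_refined = mean_average_precision(refined)
print(f"relative MAP improvement: {relative_improvement(map_initial, map_refined):.1f}%")
```

    Read this way, a 358% relative improvement means the refined MAP is about 4.58 times the initial MAP.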

  20. A Context Maintenance and Retrieval Model of Organizational Processes in Free Recall

    Science.gov (United States)

    Polyn, Sean M.; Norman, Kenneth A.; Kahana, Michael J.

    2009-01-01

    The authors present the context maintenance and retrieval (CMR) model of memory search, a generalized version of the temporal context model of M. W. Howard and M. J. Kahana (2002a), which proposes that memory search is driven by an internally maintained context representation composed of stimulus-related and source-related features. In the CMR…

  1. Development and tuning of an original search engine for patent libraries in medicinal chemistry.

    Science.gov (United States)

    Pasche, Emilie; Gobeill, Julien; Kreim, Olivier; Oezdemir-Zaech, Fatma; Vachon, Therese; Lovis, Christian; Ruch, Patrick

    2014-01-01

    The large increase in the size of patent collections has led to the need for efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which intensively uses patent libraries for competitive intelligence and drug development. We describe here the development of an advanced retrieval engine to search information in patent collections in the field of medicinal chemistry. We investigate and combine different strategies and evaluate their respective impact on the performance of the search engine applied to various search tasks, which cover the putatively most frequent search behaviours of intellectual property officers in medical chemistry: 1) a prior art search task; 2) a technical survey task; and 3) a variant of the technical survey task, sometimes called known-item search task, where a single patent is targeted. The optimal tuning of our engine resulted in a top-precision of 6.76% for the prior art search task, 23.28% for the technical survey task and 46.02% for the variant of the technical survey task. We observed that co-citation boosting was an appropriate strategy to improve prior art search tasks, while IPC classification of queries improved retrieval effectiveness for technical survey tasks. Surprisingly, the use of the full body of the patent was always detrimental to search effectiveness. It was also observed that normalizing biomedical entities using curated dictionaries had simply no impact on the search tasks we evaluated. The search engine was finally implemented as a web application within Novartis Pharma. The application is briefly described in the report. We have presented the development of a search engine dedicated to patent search, based on state-of-the-art methods applied to patent corpora. We have shown that a proper tuning of the system to adapt to the various search tasks

  2. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals

    OpenAIRE

    Russell-Rose, Tony; Chamberlain, Jon

    2017-01-01

    Background Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. Objective The aim of this study was to investigate the search behavior of healthcare inform...

  3. Automated Literature Searches for Longitudinal Tracking of Cancer Research Training Program Graduates.

    Science.gov (United States)

    Padilla, Luz A; Desmond, Renee A; Brooks, C Michael; Waterbor, John W

    2018-06-01

    A key outcome measure of cancer research training programs is the number of cancer-related peer-reviewed publications after training. Because program graduates do not routinely report their publications, staff must periodically conduct electronic literature searches on each graduate. The purpose of this study is to compare findings of an innovative computer-based automated search program versus repeated manual literature searches to identify post-training peer-reviewed publications. In late 2014, manual searches for publications by former R25 students identified 232 cancer-related articles published by 112 of 543 program graduates. In 2016, a research assistant was instructed in performing Scopus literature searches for comparison with individual PubMed searches on our 543 program graduates. Through 2014, Scopus found 304 cancer publications, 220 of which had been retrieved manually, plus an additional 84 papers. However, Scopus missed 12 publications found manually. Together, both methods found 316 publications. The automated method found 96.2% of the 316 publications while individual searches found only 73.4%. An automated search method such as using the Scopus database is a key tool for conducting comprehensive literature searches, but it must be supplemented with periodic manual searches to find the initial publications of program graduates. A time-saving feature of Scopus is the periodic automatic alerts of new publications. Although a training period is needed and initial costs can be high, an automated search method is worthwhile due to its high sensitivity and efficiency in the long term.
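
    The percentages above follow directly from the counts given in the abstract. The short check below reproduces them; only the variable names are introduced here for illustration.

```python
# Counts taken from the abstract.
scopus_total = 304    # publications found by the automated Scopus search
manual_total = 232    # publications found by manual searches
found_by_both = 220   # Scopus hits that had also been retrieved manually

scopus_only = scopus_total - found_by_both   # additional papers found by Scopus
manual_only = manual_total - found_by_both   # papers Scopus missed
union = found_by_both + scopus_only + manual_only

scopus_sensitivity = 100 * scopus_total / union
manual_sensitivity = 100 * manual_total / union

print(scopus_only, manual_only, union)
print(round(scopus_sensitivity, 1), round(manual_sensitivity, 1))
```

    The combined set of both methods serves as the reference standard, so each method's sensitivity is its count divided by 316.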

  4. Semantic reasoning in zero example video event retrieval

    NARCIS (Netherlands)

    Boer, M.H.T. de; Lu, Y.J.; Zhang, H.; Schutte, K.; Ngo, C.W.; Kraaij, W.

    2017-01-01

    Searching in digital video data for high-level events, such as a parade or a car accident, is challenging when the query is textual and lacks visual example images or videos. Current research in deep neural networks is highly beneficial for the retrieval of high-level events using visual examples,

  5. Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...
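
    The rules in these search tips can be sketched as a small query evaluator. This is not AJOL's implementation: the stop-word list, the operator precedence (implicit AND binding tighter than OR), the treatment of OR as an uppercase-only operator, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical list of "common words" that are ignored in queries.
STOP_WORDS = {"the", "a", "an", "of", "and", "in"}

def tokenize(query):
    """Split a query into parentheses and words."""
    return re.findall(r"\(|\)|[^\s()]+", query)

def parse(tokens):
    """Grammar: expr := term (OR term)* ; term := factor+ ; factor := WORD | '(' expr ')'."""
    pos = 0

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            node = ("or", node, term())
        return node

    def term():
        nonlocal pos
        node = None
        while pos < len(tokens) and tokens[pos] not in ("OR", ")"):
            nxt = factor()
            if nxt is not None:  # ignored common words contribute nothing
                node = nxt if node is None else ("and", node, nxt)
        return node

    def factor():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1  # skip the closing ')'
            return node
        word = tok.lower()  # search terms are case-insensitive
        return None if word in STOP_WORDS else ("word", word)

    return expr()

def matches(node, words):
    """Evaluate a parsed query against the set of words in one article."""
    if node is None:
        return True
    if node[0] == "word":
        return node[1] in words
    if node[0] == "and":
        return matches(node[1], words) and matches(node[2], words)
    return matches(node[1], words) or matches(node[2], words)

def search(query, articles):
    """Return the sorted ids of the articles that satisfy the query."""
    tree = parse(tokenize(query))
    return sorted(
        key for key, text in articles.items()
        if matches(tree, set(re.findall(r"\w+", text.lower())))
    )

# Hypothetical article titles.
ARTICLES = {
    "a1": "Education policy in Kenya",
    "a2": "Research methods",
    "a3": "Health research and education",
}

print(search("education research", ARTICLES))              # AND is implied
print(search("education OR research", ARTICLES))           # either term
print(search("(education OR health) kenya", ARTICLES))     # parentheses group terms
```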

  6. Literature searches on Ayurveda: An update.

    Science.gov (United States)

    Aggithaya, Madhur G; Narahari, Saravu R

    2015-01-01

    The journals that publish on Ayurveda have increasingly been indexed by popular medical databases in recent years. However, many Eastern journals are not indexed in biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, the authors identified 46 databases that can be used for systematic search of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. The aim was to provide an update on Ayurveda literature searching and on strategies to retrieve the maximum number of publications. The authors used psoriasis as an example to search the previously listed databases and identify new ones. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of previous databases were assessed. Eight search strategies were developed. One hundred and five journals, both biomedical and Ayurveda, which publish on Ayurveda, were identified. Variability in databases was explored to identify bias in journal citation. Five among 46 databases are now relevant - AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows complex search strategy. "The Researches in Ayurveda" and "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and four other databases because all five carry citations from different groups of journals. The hand

  7. Vague element selection and query rewriting for XML retrieval

    NARCIS (Netherlands)

    Mihajlovic, V.; Hiemstra, Djoerd; Blok, H.E.; de Jong, Franciska M.G.; Kraaij, W.

    In this paper we present the extension of our prototype three-level database system (TIJAH) developed for structured information retrieval. The extension is aimed at modeling vague search on XML elements. All three levels (conceptual, logical, and physical) of the TIJAH system are enhanced to

  8. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor.

    Science.gov (United States)

    Dadlez, Eva M.

    1984-01-01

    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big…

  9. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    Directory of Open Access Journals (Sweden)

    Ahmad Tamimi

    Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest in improving the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for batch query searching, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.

  10. Searching to Translate and Translating to Search: When Information Retrieval Meets Machine Translation

    Science.gov (United States)

    Ture, Ferhan

    2013-01-01

    With the adoption of web services in daily life, people have access to tremendous amounts of information, beyond any human's reading and comprehension capabilities. As a result, search technologies have become a fundamental tool for accessing information. Furthermore, the web contains information in multiple languages, introducing another barrier…

  11. Computer Use of a Medical Dictionary to Select Search Words.

    Science.gov (United States)

    O'Connor, John

    1986-01-01

    Explains an experiment in text-searching retrieval for cancer questions which developed and used computer procedures (via human simulation) to select search words from medical dictionaries. This study is based on an earlier one in which search words were humanly selected, and the recall results of the two studies are compared. (Author/LRW)

  12. Validation of search filters for identifying pediatric studies in PubMed

    NARCIS (Netherlands)

    Leclercq, Edith; Leeflang, Mariska M. G.; van Dalen, Elvira C.; Kremer, Leontien C. M.

    2013-01-01

    To identify and validate PubMed search filters for retrieving studies including children and to develop a new pediatric search filter for PubMed. We developed 2 different datasets of studies to evaluate the performance of the identified pediatric search filters, expressed in terms of sensitivity,

  13. Supervised learning of tools for content-based search of image databases

    Science.gov (United States)

    Delanoy, Richard L.

    1996-03-01

    A computer environment, called the Toolkit for Image Mining (TIM), is being developed with the goal of enabling users with diverse interests and varied computer skills to create search tools for content-based image retrieval and other pattern matching tasks. Search tools are generated using a simple paradigm of supervised learning that is based on the user pointing at mistakes of classification made by the current search tool. As mistakes are identified, a learning algorithm uses the identified mistakes to build up a model of the user's intentions, construct a new search tool, apply the search tool to a test image, display the match results as feedback to the user, and accept new inputs from the user. Search tools are constructed in the form of functional templates, which are generalized matched filters capable of knowledge-based image processing. The ability of this system to learn the user's intentions from experience contrasts with other existing approaches to content-based image retrieval that base searches on the characteristics of a single input example or on a predefined and semantically constrained textual query. Currently, TIM is capable of learning spectral and textural patterns, but should be adaptable to the learning of shapes, as well. Possible applications of TIM include not only content-based image retrieval, but also quantitative image analysis, the generation of metadata for annotating images, data prioritization or data reduction in bandwidth-limited situations, and the construction of components for larger, more complex computer vision algorithms.

  14. LIVIVO - the Vertical Search Engine for Life Sciences.

    Science.gov (United States)

    Müller, Bernd; Poley, Christoph; Pössel, Jana; Hagelstein, Alexandra; Gübitz, Thomas

    2017-01-01

    The explosive growth of literature and data in the life sciences challenges researchers to keep track of current advancements in their disciplines. Novel approaches in the life sciences, like the One Health paradigm, require integrated methodologies in order to link and connect heterogeneous information from databases and literature resources. Current publications in the life sciences are increasingly characterized by the employment of trans-disciplinary methodologies comprising molecular and cell biology, genetics, genomic, epigenomic, transcriptional and proteomic high throughput technologies with data from humans, plants, and animals. The literature search engine LIVIVO empowers retrieval functionality by incorporating various literature resources from medicine, health, environment, agriculture and nutrition. LIVIVO is developed in-house by ZB MED - Information Centre for Life Sciences. It provides a user-friendly and usability-tested search interface with a corpus of 55 million citations derived from 50 databases. Standardized application programming interfaces are available for data export and high throughput retrieval. The search functions allow for semantic retrieval with filtering options based on life science entities. The service-oriented architecture of LIVIVO uses four different implementation layers to deliver search services. A Knowledge Environment is developed by ZB MED to deal with the heterogeneity of data as an integrative approach to model, store, and link semantic concepts within literature resources and databases. Future work will focus on the exploitation of life science ontologies and on the employment of NLP technologies in order to improve query expansion, filters in faceted search, and concept-based relevancy rankings in LIVIVO.

  15. The top 50 cited articles on chordomas.

    Science.gov (United States)

    Ikpeze, Tochukwu; Mesfin, Addisu

    2018-03-01

    Chordomas are rare malignant primary tumors of the spine. In the mobile spine and sacrum an en-bloc resection is associated with decreased rates of recurrence. Our objective was to identify the top cited articles in chordoma research and to further analyze characteristics of these articles. In March 2017, we used ISI Web of Science (v5.11, Thomson Reuters, Philadelphia, Pennsylvania, USA) to search for the following keyword: "chordoma". Articles were searched from 1900 to 2017. Articles were ranked based on number of citations. The results were evaluated to determine articles most clinically relevant to the management of chordomas. The top 50 articles that met the search criteria were further characterized on the basis of: title, author, citation density, journal of publication, year (and decade) of publication, institution and country of origin and paper topic. A total of 1,043 articles matched the search criteria. The most influential 50 articles were cited 65 to 290 times. The articles were published between 1926 and 2012, and all articles were published in English. Thirty-three publications (66%) originated from the United States and seven (14%) from Italy. Cancer was the most frequent destination journal (n=9), followed by Journal of Bone and Joint Surgery (n=4). A total of 41 institutions contributed to the top 50 articles. The most common article types were: clinical 44% (n=22), papers that combined clinical and pathology findings 18% (n=9) and basic science research 14% (n=7). The top 50 cited articles on chordomas are predominantly clinical papers, arising from the United States and most frequently published in Cancer and Journal of Bone and Joint Surgery.

  16. Learning Object Retrieval and Aggregation Based on Learning Styles

    Science.gov (United States)

    Ramirez-Arellano, Aldo; Bory-Reyes, Juan; Hernández-Simón, Luis Manuel

    2017-01-01

    The main goal of this article is to develop a Management System for Merging Learning Objects (msMLO), which offers an approach that retrieves learning objects (LOs) based on students' learning styles and term-based queries, which produces a new outcome with a better score. The msMLO faces the task of retrieving LOs via two steps: The first step…

  17. An architecture for diversity-aware search for medical web content.

    Science.gov (United States)

    Denecke, K

    2012-01-01

    The Web provides a huge source of information, including on medical and health-related issues. In particular, the content of medical social media data can be diverse due to the background of an author, the source or the topic. Diversity in this context means that a document covers different aspects of a topic or that a topic is described in different ways. In this paper, we introduce an approach that makes it possible to consider the diverse aspects of a search query when providing retrieval results to a user. We introduce a system architecture for a diversity-aware search engine that allows retrieving medical information from the web. The diversity of retrieval results is assessed by calculating diversity measures that rely upon semantic information derived from a mapping to concepts of a medical terminology. Considering these measures, the result set is diversified by ranking more diverse texts higher. The methods and system architecture are implemented in a retrieval engine for medical web content. The diversity measures reflect the diversity of aspects considered in a text and its type of information content. They are used for result presentation, filtering and ranking. In a user evaluation we assess the user satisfaction with an ordering of retrieval results that considers the diversity measures. The evaluation shows that diversity-aware retrieval that considers diversity measures in ranking could increase user satisfaction with retrieval results.

  18. Simultenious binary hash and features learning for image retrieval

    Science.gov (United States)

    Frantc, V. A.; Makov, S. V.; Voronin, V. V.; Marchuk, V. I.; Semenishchev, E. A.; Egiazarian, K. O.; Agaian, S.

    2016-05-01

    Content-based image retrieval systems have plenty of applications in the modern world. The most important one is image search by query image or by semantic description. Approaches to this problem are employed in personal photo-collection management systems, web-scale image search engines, medical systems, etc. Automatic analysis of large unlabeled image datasets is virtually impossible without a satisfactory image-retrieval technique. This is the main reason why this kind of automatic image processing has attracted so much attention in recent years. Despite considerable progress in the field, semantically meaningful image retrieval remains a challenging task. The main issue is the demand to provide reliable results in a short amount of time. This paper addresses the problem with a novel technique for simultaneous learning of global image features and binary hash codes. Our approach maps a pixel-based image representation to hash-value space while trying to preserve as much of the semantic image content as possible. We use deep learning methodology to generate image descriptions with the properties of similarity preservation and statistical independence. The main advantage of our approach over existing ones is the ability to fine-tune the retrieval procedure for a very specific application, which allows us to provide better results than general techniques. The framework for data-dependent image hashing presented in the paper is based on two different kinds of neural networks: convolutional neural networks for image description and an autoencoder for feature-to-hash-space mapping. Experimental results show that our approach is promising in comparison with other state-of-the-art methods.

  19. Information Retrieval in Virtual Universities

    Science.gov (United States)

    Puustjärvi, Juha; Pöyry, Päivi

    2006-01-01

    Information retrieval in the context of virtual universities deals with the representation, organization, and access to learning objects. The representation and organization of learning objects should provide the learner with an easy access to the learning objects. In this article, we give an overview of the ONES system, and analyze the relevance…

  20. Search and Hyperlinking Task at MediaEval 2012

    NARCIS (Netherlands)

    Eskevich, Maria; Jones, Gareth J.F.; Chen, Shu; Aly, Robin; Ordelman, Roeland J.F.; Larson, Martha

    2012-01-01

    The Search and Hyperlinking Task was one of the Brave New Tasks at MediaEval 2012. The Task consisted of two sub-tasks which focused on search and linking in retrieval from a collection of semi-professional video content. These tasks followed up on research carried out within the MediaEval 2011

  1. A Statistical Ontology-Based Approach to Ranking for Multiword Search

    Science.gov (United States)

    Kim, Jinwoo

    2013-01-01

    Keyword search is a prominent data retrieval method for the Web, largely because the simple and efficient nature of keyword processing allows a large amount of information to be searched with fast response. However, keyword search approaches do not formally capture the clear meaning of a keyword query and fail to address the semantic relationships…

  2. How Adolescents Search for and Appraise Online Health Information: A Systematic Review.

    Science.gov (United States)

    Freeman, Jaimie L; Caldwell, Patrina H Y; Bennett, Patricia A; Scott, Karen M

    2018-04-01

    To conduct a systematic review of the evidence concerning whether and how adolescents search for online health information and the extent to which they appraise the credibility of information they retrieve. A systematic search of online databases (MEDLINE, EMBASE, PsycINFO, ERIC) was performed. Reference lists of included papers were searched manually for additional articles. Included were studies on whether and how adolescents searched for and appraised online health information, where adolescent participants were aged 13-18 years. Thematic analysis was used to synthesize the findings. Thirty-four studies met the inclusion criteria. In line with the research questions, 2 key concepts were identified within the papers: whether and how adolescents search for online health information, and the extent to which adolescents appraise online health information. Four themes were identified regarding whether and how adolescents search for online health information: use of search engines, difficulties in selecting appropriate search strings, barriers to searching, and absence of searching. Four themes emerged concerning the extent to which adolescents appraise the credibility of online health information: evaluation based on Web site name and reputation, evaluation based on first impression of Web site, evaluation of Web site content, and absence of a sophisticated appraisal strategy. Adolescents are aware of the varying quality of online health information. Strategies used by individuals for searching and appraising online health information differ in their sophistication. It is important to develop resources to enhance search and appraisal skills and to collaborate with adolescents to ensure that such resources are appropriate for them. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. The MediaMill TRECVID 2010 semantic video search engine

    NARCIS (Netherlands)

    Snoek, C.G.M.; van de Sande, K.E.A.; de Rooij, O.; Huurnink, B.; Gavves, E.; Odijk, D.; de Rijke, M.; Gevers, T.; Worring, M.; Koelma, D.C.; Smeulders, A.W.M.

    2010-01-01

    In this paper we describe our TRECVID 2010 video retrieval experiments. The MediaMill team participated in three tasks: semantic indexing, known-item search, and instance search. The starting point for the MediaMill concept detection approach is our top-performing bag-of-words system of TRECVID

  4. Distributed Web-Scale Infrastructure For Crawling, Indexing And Search With Semantic Support

    Directory of Open Access Journals (Sweden)

    Stefan Dlugolinsky

    2012-01-01

    In this paper, we describe our work in progress in the scope of web-scale information extraction and information retrieval utilizing distributed computing. We present a distributed architecture built on top of the MapReduce paradigm for information retrieval, information processing and intelligent search supported by spatial capabilities. The proposed architecture is focused on crawling documents in several different formats, information extraction, lightweight semantic annotation of the extracted information, indexing of extracted information and finally on indexing of documents based on the geo-spatial information found in a document. We demonstrate the architecture on two use cases, where the first is search in job offers retrieved from the LinkedIn portal and the second is search in BBC news feeds, and discuss several problems we had to face during the implementation. We also discuss spatial search applications for both cases because both LinkedIn job offer pages and BBC news feeds contain a lot of spatial information to extract and process.

  5. List search hardware for interpretive software

    CERN Document Server

    Altaber, Jacques; Mears, B; Rausch, R

    1979-01-01

    Interpreted languages, e.g. BASIC, are simple to learn, easy to use, quick to modify and in general 'user-friendly'. However, a critically time consuming process during interpretation is that of list searching. A special microprogrammed device for fast list searching has therefore been developed at the SPS Division of CERN. It uses bit-sliced hardware. Fast algorithms perform search, insert and delete of a six-character name and its value in a list of up to 1000 pairs. The prototype shows retrieval times of the order of 10-30 microseconds. (11 refs).

  6. Searching for qualitative research for inclusion in systematic reviews: a structured methodological review.

    Science.gov (United States)

    Booth, Andrew

    2016-05-04

    Qualitative systematic reviews or qualitative evidence syntheses (QES) are increasingly recognised as a way to enhance the value of systematic reviews (SRs) of clinical trials. They can explain the mechanisms by which interventions, evaluated within trials, might achieve their effect. They can investigate differences in effects between different population groups. They can identify which outcomes are most important to patients, carers, health professionals and other stakeholders. QES can explore the impact of acceptance, feasibility, meaningfulness and implementation-related factors within a real world setting and thus contribute to the design and further refinement of future interventions. To produce valid, reliable and meaningful QES requires systematic identification of relevant qualitative evidence. Although the methodologies of QES, including methods for information retrieval, are well-documented, little empirical evidence exists to inform their conduct and reporting. This structured methodological overview examines papers on searching for qualitative research identified from the Cochrane Qualitative and Implementation Methods Group Methodology Register and from citation searches of 15 key papers. A single reviewer reviewed 1299 references. Papers reporting methodological guidance, use of innovative methodologies or empirical studies of retrieval methods were categorised under eight topical headings: overviews and methodological guidance, sampling, sources, structured questions, search procedures, search strategies and filters, supplementary strategies and standards. This structured overview presents a contemporaneous view of information retrieval for qualitative research and identifies a future research agenda. This review concludes that poor empirical evidence underpins current information practice in information retrieval of qualitative research. A trend towards improved transparency of search methods and further evaluation of key search procedures offers

  7. Retrieval monitoring is influenced by information value: the interplay between importance and confidence on false memory.

    Science.gov (United States)

    McDonough, Ian M; Bui, Dung C; Friedman, Michael C; Castel, Alan D

    2015-10-01

    The perceived value of information can influence one's motivation to successfully remember that information. This study investigated how information value can affect memory search and evaluation processes (i.e., retrieval monitoring). In Experiment 1, participants studied unrelated words associated with low, medium, or high values. Subsequent memory tests required participants to selectively monitor retrieval for different values. False memory effects were smaller when searching memory for high-value than low-value words, suggesting that people more effectively monitored more important information. In Experiment 2, participants studied semantically-related words, and the need for retrieval monitoring was reduced at test by using inclusion instructions (i.e., endorsement of any word related to the studied words) compared with standard instructions. Inclusion instructions led to increases in false recognition for low-value, but not for high-value words, suggesting that under standard-instruction conditions retrieval monitoring was less likely to occur for important information. Experiment 3 showed that words retrieved with lower confidence were associated with more effective retrieval monitoring, suggesting that the quality of the retrieved memory influenced the degree and effectiveness of monitoring processes. Ironically, unless encouraged to do so, people were less likely to carefully monitor important information, even though people want to remember important memories most accurately. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. CBP Customs Rulings Online Search System (CROSS)

    Data.gov (United States)

    Department of Homeland Security — CROSS is a searchable database of CBP rulings that can be retrieved based on simple or complex search characteristics using keywords and Boolean operators. CROSS has...

  9. Asteroid retrieval missions enabled by invariant manifold dynamics

    Science.gov (United States)

    Sánchez, Joan Pau; García Yárnoz, Daniel

    2016-10-01

    Near Earth Asteroids are attractive targets for new space missions; firstly, because of their scientific importance, but also because of their impact threat and prospective resources. The asteroid retrieval mission concept has thus arisen as a synergistic approach to tackle these three facets of interest in one single mission. This paper reviews the methodology used by the authors (2013) in a previous search for objects that could be transported from accessible heliocentric orbits into the Earth's neighbourhood at affordable costs (or Easily Retrievable Objects, a.k.a. EROs). This methodology consisted of a heuristic pruning and an impulsive manoeuvre trajectory optimisation. Low thrust propulsion, on the other hand, clearly enables the transportation of much larger objects due to its higher specific impulse. Hence, in this paper, low thrust retrieval transfers are sought using impulsive trajectories as first guesses to solve the optimal control problem. GPOPS-II is used to transcribe the continuous-time optimal control problem to a nonlinear programming problem (NLP). The latter is solved by IPOPT, an open source software package for large-scale NLPs. Finally, a natural continuation procedure that increases the asteroid mass makes it possible to find the largest objects that could be retrieved from a given asteroid orbit. If this retrievable mass is larger than the actual mass of the asteroid, the asteroid retrieval mission for this particular object is said to be feasible. The paper concludes with an updated list of 17 EROs, as of April 2016, with their maximum retrievable masses by means of low thrust propulsion. This ranges from 2000 tons for the easiest object to be retrieved to 300 tons for the least accessible of them.

  10. Search and retrieval of medical images for improved diagnosis of neurodegenerative diseases

    Science.gov (United States)

    Ekin, Ahmet; Jasinschi, Radu; Turan, Erman; Engbers, Rene; van der Grond, Jeroen; van Buchem, Mark A.

    2007-01-01

    In the medical world, the accuracy of diagnosis is mainly affected by either a lack of sufficient understanding of some diseases or the inter- and/or intra-observer variability of the diagnoses. The former requires understanding the progress of diseases at much earlier stages, extracting important information from ever-growing amounts of data, and finally finding correlations with certain features and complications that will illuminate the disease progression. The latter (inter- and intra-observer variability) is caused by differences in the experience levels of different medical experts (inter-observer variability) or by the mental and physical tiredness of one expert (intra-observer variability). We believe that the use of large databases can help improve the current status of disease understanding and decision making. By comparing a large number of patients, some otherwise hidden relations can be revealed, resulting in better understanding; patients with similar complications can be found; and diagnoses and treatments can be compared so that the medical expert can make a better diagnosis. To this effect, this paper introduces a search and retrieval system for brain MR databases and shows that brain iron accumulation shape provides additional information to the shape-insensitive features, such as the total brain iron load, that are commonly used in clinics. We propose to use Kendall's correlation value to automatically compare various returns to a query. We also describe a fully automated and fast brain MR image analysis system to detect degenerative iron accumulation in the brain, as is the case in Alzheimer's and Parkinson's. The system is composed of several novel image processing algorithms and has so far been extensively tested at Leiden University Medical Center on more than 600 patients.
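
    As a rough illustration of comparing two ranked returns with Kendall's correlation, the sketch below computes the tau-a variant, which assumes no tied ranks. The abstract does not say which variant the authors use, so this choice, the function name `kendall_tau`, and the example rankings are assumptions made for illustration.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("both rankings need the same length, at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # the pair is ordered the same way in both rankings
        elif sign < 0:
            discordant += 1   # the pair is ordered in opposite ways
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical ranks assigned to the same four items by two retrieval runs.
run_a = [1, 2, 3, 4]
run_b = [1, 3, 2, 4]
print(kendall_tau(run_a, run_b))
```

    A value near 1 means the two returns order the items almost identically, a value near -1 means the orderings are nearly reversed.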

  11. The Potential of User Feedback Through the Iterative Refining of Queries in an Image Retrieval System

    NARCIS (Netherlands)

    Ben Moussa, Maher; Pasch, Marco; Hiemstra, Djoerd; van der Vet, P.E.; Huibers, Theo W.C.; Marchand-Maillet, Stephane; Bruno, Eric; Nürnberger, Andreas; Detyniecki, Marcin

    2007-01-01

    Inaccurate or ambiguous expressions in queries lead to poor results in information retrieval. We assume that iterative user feedback can improve the quality of queries. To this end we developed a system for image retrieval that utilizes user feedback to refine the user’s search query. This is done

  12. Intelligent methods for data retrieval in fusion databases

    International Nuclear Information System (INIS)

    Vega, J.

    2008-01-01

    The plasma behaviour is identified through the recognition of patterns inside signals. The search for patterns is usually a manual and tedious procedure in which signals need to be examined individually. A breakthrough in data retrieval for fusion databases is the development of intelligent methods to search for patterns. A pattern (in the broadest sense) could be a single segment of a waveform, a set of pixels within an image or even a heterogeneous set of features made up of waveforms, images and any kind of experimental data. Intelligent methods will allow searching for data according to technical, scientific and structural criteria instead of an identifiable time interval or pulse number. Such search algorithms should be intelligent enough to avoid passing over the entire database. Benefits of such access methods are discussed and several available techniques are reviewed. In addition, the applicability of the methods from general purpose searching systems to ad hoc developments is covered.

  13. Development of dog-like retrieving capability in a ground robot

    Science.gov (United States)

    MacKenzie, Douglas C.; Ashok, Rahul; Rehg, James M.; Witus, Gary

    2013-01-01

    This paper presents the Mobile Intelligence Team's approach to addressing the CANINE outdoor ground robot competition. The competition required developing a robot that provided retrieving capabilities similar to a dog, while operating fully autonomously in unstructured environments. The vision team consisted of Mobile Intelligence, the Georgia Institute of Technology, and Wayne State University. Important computer vision aspects of the project were the ability to quickly learn the distinguishing characteristics of novel objects, searching images for the object as the robot drove a search pattern, identifying people near the robot for safe operations, correctly identifying the object among distractors, and localizing the object for retrieval. The classifier used to identify the objects will be discussed, including an analysis of its performance, and an overview of the entire system architecture presented. A discussion of the robot's performance in the competition will demonstrate the system's successes in real-world testing.

  14. Joint Textual And Visual Cues For Retrieving Images Using Latent Semantic Indexing

    OpenAIRE

    Pecenovic, Zoran; Ayer, Serge; Vetterli, Martin

    2001-01-01

    In this article we present a novel approach to integrating textual and visual descriptors of images in a unified retrieval structure. The methodology, inspired by text retrieval and information filtering, is based on Latent Semantic Indexing (LSI).

  15. LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC.

    Science.gov (United States)

    Allot, Alexis; Peng, Yifan; Wei, Chih-Hsuan; Lee, Kyubum; Phan, Lon; Lu, Zhiyong

    2018-05-14

    The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and high cost of expert curation, curated variant information in existing databases are often incomplete and out-of-date. In addition, the same genetic variant can be mentioned in publications with various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.

  16. Interactive Information Retrieval: An Introduction

    Directory of Open Access Journals (Sweden)

    Borlund, Pia

    2013-09-01

    The paper introduces the research area of interactive information retrieval (IIR) from a historical point of view. Further, the focus here is on evaluation, because much research in IR deals with IR evaluation methodology due to the core research interest in IR performance, system interaction and satisfaction with retrieved information. In order to position IIR evaluation, the Cranfield model and the series of tests that led to the Cranfield model are outlined. Three iconic user-oriented studies and projects that all have contributed to how IIR is perceived and understood today are presented: The MEDLARS test, the Book House fiction retrieval system, and the OKAPI project. On this basis the call for alternative IIR evaluation approaches motivated by the three revolutions (the cognitive, the relevance, and the interactive revolutions) put forward by Robertson & Hancock-Beaulieu (1992) is presented. As a response to this call the 'IIR evaluation model' by Borlund (e.g., 2003a) is introduced. The objective of the IIR evaluation model is to facilitate IIR evaluation as close as possible to actual information searching and IR processes, though still in a relatively controlled evaluation environment, in which the test instrument of a simulated work task situation plays a central part.

  17. A study of Consistency in the Selection of Search Terms and Search Concepts: A Case Study in National Taiwan University

    Directory of Open Access Journals (Sweden)

    Mu-hsuan Huang

    2001-12-01

    This article analyzes the consistency in the selection of search terms and search concepts among college and graduate students at National Taiwan University when using the PsycLIT CD-ROM database. Thirty-one students conducted pre-assigned searches, performing 59 searches that generated 609 search terms. The study finds that the consistency in the selection of search terms is 22.14% at the first level and 35% at the second level. These results are similar to those of other studies. As for consistency in search concepts, both the overlap of retrieved articles and the overlap of articles judged relevant are lower than in other studies. [Article content in Chinese]

  18. SEOM's Sentinel-3/OLCI' project CAWA: advanced GRASP aerosol retrieval

    Science.gov (United States)

    Dubovik, Oleg; Litvinov, Pavel; Huang, Xin; Aspetsberger, Michael; Fuertes, David; Brockmann, Carsten; Fischer, Jürgen; Bojkov, Bojan

    2016-04-01

    The CAWA "Advanced Clouds, Aerosols and WAter vapour products for Sentinel-3/OLCI" ESA-SEOM project aims at the development of advanced atmospheric retrieval algorithms for the Sentinel-3/OLCI mission, and is prepared using Envisat/MERIS and Aqua/MODIS datasets. This presentation discusses mainly CAWA aerosol product developments and results. CAWA aerosol retrieval uses the recently developed GRASP (Generalized Retrieval of Aerosol and Surface Properties) algorithm described by Dubovik et al. (2014). GRASP derives an extended set of atmospheric parameters using the multi-pixel concept: a simultaneous fitting of a large group of pixels under additional a priori constraints limiting the time variability of surface properties and the spatial variability of aerosol properties. Over land, GRASP simultaneously retrieves properties of both the aerosol and the underlying surface, even over bright surfaces. GRASP doesn't use traditional look-up tables and performs retrieval as a search in a continuous space of solutions. All radiative transfer calculations are performed as part of the retrieval. The results of comprehensive sensitivity tests, as well as results obtained from real Envisat/MERIS data, will be presented. The tests analyze various aspects of aerosol and surface reflectance retrieval accuracy. In addition, the possibilities of retrieval improvement by means of implementing synergetic inversion of a combination of OLCI data with observations by SLSTR are explored. Both the results of numerical tests and the results of processing several years of Envisat/MERIS data demonstrate reliable retrieval of AOD (Aerosol Optical Depth) and surface BRDF. Observed retrieval issues and advancements will be discussed. For example, for some situations we illustrate possibilities of retrieving aerosol absorption, a property that is hardly accessible from satellite observations with no multi-angular and polarimetric capabilities.

  19. Personal information search on mobile devices

    OpenAIRE

    Akbas, Mehmet.

    2007-01-01

    Today's mobile devices, especially mobile phones, are comparable in computing capability and storage to the desktop computers of a few years ago. The volume and diversity of the information kept on mobile devices has continually increased and users have taken advantage of this. Since information is being stored on multiple devices, searching for and retrieving the desired information has become an important function. This thesis focuses on search with regard to Personal Information Manag...

  20. Web information retrieval for health professionals.

    Science.gov (United States)

    Ting, S L; See-To, Eric W K; Tse, Y K

    2013-06-01

    This paper presents a Web Information Retrieval System (WebIRS), which is designed to assist healthcare professionals in obtaining up-to-date medical knowledge and information via the World Wide Web (WWW). The system leverages document classification and text summarization techniques to deliver highly correlated medical information to physicians. The system architecture of the proposed WebIRS is first discussed, and then a case study on an application of the proposed system in a Hong Kong medical organization is presented to illustrate the adoption process, and a questionnaire is administered to collect feedback on the operation and performance of WebIRS in comparison with conventional information retrieval on the WWW. A prototype system has been constructed and implemented on a trial basis in a medical organization. It has proven to be of benefit to healthcare professionals through its automatic functions in classifying and summarizing the medical information that physicians need and are interested in. The results of the case study show that with the use of the proposed WebIRS, a significant reduction of searching time and effort, with retrieval of highly relevant materials, can be attained.

  1. Development of a search system of NRDF on WWW

    International Nuclear Information System (INIS)

    Masui, Hiroshi; Ohbayashi, Yoshihide; Aoyama, Shigeyoshi; Ohnishi, Akira; Kato, Kiyoshi; Chiba, Masaki

    2000-01-01

    We develop a data search system and a data entry system for the Nuclear Reaction Data File (NRDF), which is one of the charged-particle reaction databases compiled by the Japan Charged Particle Reaction Group (JCPRG). Using a WWW browser, we can easily search, retrieve and utilize the data of NRDF. (author)

  2. MinHash-Based Fuzzy Keyword Search of Encrypted Data across Multiple Cloud Servers

    Directory of Open Access Journals (Sweden)

    Jingsha He

    2018-05-01

    To enhance the efficiency of data searching, most data owners store their data files in different cloud servers in the form of cipher-text. Thus, efficient search using fuzzy keywords becomes a critical issue in such a cloud computing environment. This paper proposes a method that aims at improving the efficiency of cipher-text retrieval and lowering storage overhead for fuzzy keyword search. In contrast to traditional approaches, the proposed method can reduce the complexity of Min-Hash-based fuzzy keyword search by using Min-Hash fingerprints to avoid the need to construct the fuzzy keyword set. The method utilizes Jaccard similarity to rank the results of retrieval, thus reducing the amount of calculation for similarity and saving considerable time and space overhead. The method also takes multiple user queries into consideration through re-encryption technology and updates user permissions dynamically. Security analysis demonstrates that the method can provide better privacy preservation, and experimental results show that the proposed method can improve cipher-text retrieval time and lower storage overhead as well.
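
    The core idea of comparing keywords through Min-Hash fingerprints and ranking by Jaccard similarity can be sketched as below. This is a plaintext illustration only: it omits the encryption, re-encryption, and multi-server aspects of the paper entirely, and the use of character bigrams, seeded SHA-256 hashing, 128 hash functions, and all function names are assumptions rather than the authors' construction.

```python
import hashlib
from typing import List, Set


def shingles(word: str, n: int = 2) -> Set[str]:
    """Split a keyword into its set of character n-grams (bigrams by default)."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)} or {word}


def _hash(seed: int, token: str) -> int:
    """One member of a seeded hash family, built on SHA-256 for determinism."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(tokens: Set[str], num_hashes: int = 128) -> List[int]:
    """Fingerprint: the minimum hash value of the set under each hash function."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def estimate_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """The fraction of matching fingerprint positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Exact Jaccard similarity, shown for comparison with the estimate."""
    return len(a & b) / len(a | b)


# A misspelled query keyword compared against a correctly spelled index keyword.
query, keyword = shingles("retreival"), shingles("retrieval")
print(jaccard(query, keyword), estimate_jaccard(minhash(query), minhash(keyword)))
```

    Because the fingerprint has a fixed length regardless of the keyword, similar keywords can be matched without enumerating every possible misspelling, which is the storage saving the abstract refers to.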

  3. Large Scale Hierarchical K-Means Based Image Retrieval With MapReduce

    Science.gov (United States)

    2014-03-27

    flat vocabulary on MapReduce. In 2013, Moise and Shestakov [32, 40], have been researching large scale indexing and search with MapReduce. They ... time will be greatly reduced, however image retrieval performance will almost certainly suffer. Moise and Shestakov ran tests with 100M images on 108 ... 43–72, 2005. [32] Diana Moise, Denis Shestakov, Gylfi Gudmundsson, and Laurent Amsaleg. Indexing and searching 100m images with map-reduce. In

  4. Retrieval Interference in Syntactic Processing: The Case of Reflexive Binding in English.

    Science.gov (United States)

    Patil, Umesh; Vasishth, Shravan; Lewis, Richard L

    2016-01-01

    It has been proposed that in online sentence comprehension the dependency between a reflexive pronoun such as himself/herself and its antecedent is resolved using exclusively syntactic constraints. Under this strictly syntactic search account, Principle A of the binding theory, which requires that the antecedent c-command the reflexive within the same clause that the reflexive occurs in, constrains the parser's search for an antecedent. The parser thus ignores candidate antecedents that might match agreement features of the reflexive (e.g., gender) but are ineligible as potential antecedents because they are in structurally illicit positions. An alternative possibility accords no special status to structural constraints: in addition to using Principle A, the parser also uses non-structural cues such as gender to access the antecedent. According to cue-based retrieval theories of memory (e.g., Lewis and Vasishth, 2005), the use of non-structural cues should result in increased retrieval times and occasional errors when candidates partially match the cues, even if the candidates are in structurally illicit positions. In this paper, we first show how the retrieval processes that underlie the reflexive binding are naturally realized in the Lewis and Vasishth (2005) model. We present the predictions of the model under the assumption that both structural and non-structural cues are used during retrieval, and provide a critical analysis of previous empirical studies that failed to find evidence for the use of non-structural cues, suggesting that these failures may be Type II errors. We use this analysis and the results of further modeling to motivate a new empirical design that we use in an eye tracking study. The results of this study confirm the key predictions of the model concerning the use of non-structural cues, and are inconsistent with the strictly syntactic search account. These results present a challenge for theories advocating the infallibility of the human

  5. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    Science.gov (United States)

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  6. Innovation: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  7. Kiswahili: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  8. The effect of cue content on retrieval from autobiographical memory.

    Science.gov (United States)

    Uzer, Tugba; Brown, Norman R

    2017-01-01

    It has long been argued that personal memories are usually generated in an effortful search process in word-cueing studies. However, recent research (Uzer, Lee, & Brown, 2012) shows that direct retrieval of autobiographical memories, in response to word cues, is common. This invites the question of whether the direct retrieval phenomenon is generalizable beyond the standard laboratory paradigm. Here we investigated the prevalence of direct retrieval of autobiographical memories cued by specific and individuated cues versus generic cues. In Experiment 1, participants retrieved memories in response to cues from their own life (e.g., the names of friends) and generic words (e.g., chair). In Experiment 2, participants provided their personal cues two or three months prior to coming to the lab (min: 75 days; max: 100 days). In each experiment, RT was measured and participants reported whether memories were directly retrieved or generated on each trial. Results showed that personal cues elicited a high rate of direct retrieval. Personal cues were more likely to elicit direct retrieval than generic word cues, and as a consequence, participants responded faster, on average, to the former than to the latter. These results challenge the constructive view of autobiographical memory and suggest that autobiographical memories consist of pre-stored event representations, which are largely governed by associative mechanisms. These demonstrations raise theoretically interesting questions, such as why we are not overwhelmed with directly retrieved memories cued by everyday familiar surroundings. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Development of the Fast Inversion Algorithm for Building an Inverted File for Text Retrieval with Large-Scale Data

    Directory of Open Access Journals (Sweden)

    Derwin Suhartono

    2012-06-01

    The rapid development of information systems generates new needs for indexing and retrieving various kinds of media, and demand for multimedia documents is increasing. Storing and retrieving such documents has therefore become a primary problem. The most commonly used media type is text, which is the main option in search engines such as Yahoo and Google. Searchers want not only results but also an efficient process. For indexing and retrieval, an inverted file is used to provide faster results. However, building an inverted file becomes a problem when the amount of data is large. This study describes an algorithm called Fast Inversion, a development of the basic inverted file construction method that addresses the needs raised by large amounts of data.
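
    To make the notion of an inverted file concrete, the following is a minimal in-memory sketch in Python. It is not the Fast Inversion algorithm described in the record, whose details the abstract does not give; it only shows the basic structure that such an algorithm builds. The function names and the tiny document set are illustrative assumptions.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to the sorted list of document IDs that contain it.

    `docs` is a dict {doc_id: text}. Tokenization is a naive lowercase
    whitespace split, chosen only to keep the sketch short.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all_terms(index, query):
    """Return IDs of documents that contain every query term (implicit AND)."""
    term_sets = [set(index.get(term, ())) for term in query.lower().split()]
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))


# Hypothetical three-document collection, used only for illustration.
docs = {
    1: "fast inversion of large text",
    2: "text retrieval with inverted file",
    3: "large inverted file",
}
index = build_inverted_file(docs)
```

    A query is answered by intersecting posting lists rather than scanning every document, which is why the structure speeds up retrieval; the cost moves to building the index, the step that becomes difficult at large scale.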

  10. Architecture for knowledge-based and federated search of online clinical evidence.

    Science.gov (United States)

    Coiera, Enrico; Walther, Martin; Nguyen, Ken; Lovell, Nigel H

    2005-10-24

    It is increasingly difficult for clinicians to keep up-to-date with the rapidly growing biomedical literature. Online evidence retrieval methods are now seen as a core tool to support evidence-based health practice. However, standard search engine technology is not designed to manage the many different types of evidence sources that are available or to handle the very different information needs of various clinical groups, who often work in widely different settings. The objectives of this paper are (1) to describe the design considerations and system architecture of a wrapper-mediator approach to federated search system design, including the use of knowledge-based, meta-search filters, and (2) to analyze the implications of system design choices on performance measurements. A trial was performed to evaluate the technical performance of a federated evidence retrieval system, which provided access to eight distinct online resources, including e-journals, PubMed, and electronic guidelines. The Quick Clinical system architecture utilized a universal query language to reformulate queries internally and utilized meta-search filters to optimize search strategies across resources. We recruited 227 family physicians from across Australia who used the system to retrieve evidence in a routine clinical setting over a 4-week period. The total search time for a query was recorded, along with the duration of individual queries sent to different online resources. Clinicians performed 1662 searches over the trial. The average search duration was 4.9 +/- 3.2 s (N = 1662 searches). Mean search duration to the individual sources was between 0.05 s and 4.55 s. Average system time (i.e., system overhead) was 0.12 s. The relatively small system overhead compared to the average time it takes to perform a search for an individual source shows that the system achieves a good trade-off between performance and reliability.
Furthermore, despite the additional effort required to incorporate the

  11. [Method of traditional Chinese medicine formula design based on 3D-database pharmacophore search and patent retrieval].

    Science.gov (United States)

    He, Yu-su; Sun, Zhi-yi; Zhang, Yan-ling

    2014-11-01

    Using the pharmacophore model of mineralocorticoid receptor antagonists as a starting point, the experiment studies a method of traditional Chinese medicine formula design for anti-hypertensive use. Pharmacophore models were generated by the 3D-QSAR pharmacophore (Hypogen) program of DS3.5, based on a training set composed of 33 mineralocorticoid receptor antagonists. The best pharmacophore model consisted of two hydrogen-bond acceptors, three hydrophobic features and four excluded volumes. Its correlation coefficient of training set and test set, N, and CAI value were 0.9534, 0.6748, 2.878, and 1.119. According to the database screening, 1700 active compounds from 86 source plants were obtained. Because traditional theory lacks an available anti-hypertensive medication strategy, this article takes advantage of patent retrieval in a world traditional medicine patent database in order to design drug formulae. Finally, two formulae were obtained for anti-hypertensive use.

  12. Improving information retrieval with multiple health terminologies in a quality-controlled gateway.

    Science.gov (United States)

    Soualmia, Lina F; Sakji, Saoussen; Letord, Catherine; Rollin, Laetitia; Massari, Philippe; Darmoni, Stéfan J

    2013-01-01

    The Catalog and Index of French-language Health Internet resources (CISMeF) is a quality-controlled health gateway, primarily for Web resources in French (n=89,751). Recently, we achieved a major improvement in the structure of the catalogue by setting up multiple terminologies, based on twelve health terminologies available in French, to overcome the potential weakness of the MeSH thesaurus, which is the main and pivotal terminology we have used for indexing and retrieval since 1995. The main aim of this study was to estimate the added value of exploiting several terminologies and their semantic relationships to improve Web resource indexing and retrieval in CISMeF, in order to provide additional health resources that meet users' expectations. Twelve terminologies were integrated into the CISMeF information system to set up multiple-terminologies indexing and retrieval. The same set of thirty queries was run: (i) by exploiting the hierarchical structure of the MeSH, and (ii) by exploiting the additional twelve terminologies and their semantic links. The two search modes were evaluated and compared. The overall coverage of the multiple-terminologies search mode was improved compared with the coverage of using the MeSH alone (16,283 vs. 14,159) (+15%). These additional findings were estimated at 56.6% relevant results, 24.7% intermediate results and 18.7% irrelevant. The multiple-terminologies approach improved information retrieval. These results suggest that integrating additional health terminologies was able to improve recall. Since the study was performed, 21 other terminologies have been added, which should enable us to make broader studies in multiple-terminologies information retrieval.

  13. HTTP-based Search and Ordering Using ECHO's REST-based and OpenSearch APIs

    Science.gov (United States)

    Baynes, K.; Newman, D. J.; Pilone, D.

    2012-12-01

    Metadata is an important entity in the process of cataloging, discovering, and describing Earth science data. NASA's Earth Observing System (EOS) ClearingHOuse (ECHO) acts as the core metadata repository for EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. By supporting both the ESIP's Federated Search API and its own search and ordering interfaces, ECHO provides multiple capabilities that facilitate ease of discovery and access to its ever-increasing holdings. Users are able to search and export metadata in a variety of formats including ISO 19115, JSON, and ECHO10. This presentation aims to inform technically savvy clients interested in automating search and ordering of ECHO's metadata catalog. The audience will be introduced to practical and applicable examples of end-to-end workflows that demonstrate finding, sub-setting and ordering data that is bound by keyword, temporal and spatial constraints. Interaction with the ESIP OpenSearch Interface will be highlighted, as will ECHO's own REST-based API.

  14. A Hybrid Model Ranking Search Result for Research Paper Searching on Social Bookmarking

    Directory of Open Access Journals (Sweden)

    Pijitra Jomsri

    2015-11-01

    Social bookmarking and publication sharing systems are essential tools for web resource discovery. The performance and capabilities of search results from a research paper bookmarking system are vital. Many researchers use social bookmarking to search for papers related to their topics of interest. This paper proposes a combination of similarity-based indexing ("tag, title and abstract") and static ranking to improve search results. In this particular study, the year of publication and the type of research paper publication are combined with similarity ranking in a scheme called HybridRank. Different weighting scores are employed. The retrieval performance of these weighted combination rankings is evaluated using mean values of NDCG. The results suggest that HybridRank and similarity rank with weight 75:25 has the highest NDCG scores. In this preliminary experiment, the combination ranking technique provides more relevant research paper search results. Furthermore, the chosen heuristic ranking can improve the efficiency of research paper searching on social bookmarking websites.
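
    As a rough illustration of a weighted combination ranking evaluated with NDCG, the sketch below blends a similarity score with a static score and computes NDCG over the resulting order. It is not the authors' HybridRank implementation: the linear blend, the assumption that the 0.75 weight applies to similarity, the gain / log2(rank + 1) form of DCG, and the sample scores are all illustrative choices made here.

```python
import math


def hybrid_score(similarity, static, w_sim=0.75):
    """Linear blend of a similarity score and a static score, both in [0, 1]."""
    return w_sim * similarity + (1.0 - w_sim) * static


def dcg(gains):
    """Discounted cumulative gain using gain / log2(rank + 1), rank from 1."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(gains):
    """DCG of the given order divided by DCG of the ideal (sorted) order."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Made-up records: (paper id, similarity score, static score, graded relevance).
papers = [("p1", 0.90, 0.20, 1), ("p2", 0.70, 0.90, 3), ("p3", 0.40, 0.10, 0)]

by_similarity = sorted(papers, key=lambda p: p[1], reverse=True)
by_hybrid = sorted(papers, key=lambda p: hybrid_score(p[1], p[2]), reverse=True)

ndcg_similarity = ndcg([p[3] for p in by_similarity])
ndcg_hybrid = ndcg([p[3] for p in by_hybrid])
```

    In this toy data the static score promotes the most relevant paper to the top, so the blended order scores a higher NDCG than the similarity-only order; with real data the best weighting has to be found empirically, as the record describes.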

  15. Image Retrieval based on Integration between Color and Geometric Moment Features

    International Nuclear Information System (INIS)

    Saad, M.H.; Saleh, H.I.; Konbor, H.; Ashour, M.

    2012-01-01

    Content-based image retrieval (CBIR) is the retrieval of images based on visual features such as colour, texture and shape. Current approaches to CBIR differ in terms of which image features are extracted; recent work deals with combining distances or scores from different and usually independent representations in an attempt to induce high-level semantics from the low-level descriptors of the images. Content-based image retrieval has many application areas, such as education, commerce, military, searching, biomedicine and Web image classification. This paper proposes a new image retrieval system, which uses color and geometric moment features to form the feature vectors. Bhattacharyya distance and histogram intersection are used to perform feature matching. This framework integrates the color histogram, which represents the global feature, with the geometric moment as a local descriptor to enhance the retrieval results. The proposed technique is suitable for precisely retrieving images even under deformations such as geometric deformations and noise. It is tested on a standard dataset, and the results show that a combination of our approach as a local image descriptor with other global descriptors outperforms other approaches.
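
    The two matching measures named in this record can be written in a few lines. The sketch below assumes histograms are plain lists of non-negative bin counts and uses the -ln(BC) form of the Bhattacharyya distance, where BC is the Bhattacharyya coefficient; other conventions exist, and the abstract does not say which one the authors used.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one positive bin")
    return [h / total for h in hist]


def histogram_intersection(p, q):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """-ln of the Bhattacharyya coefficient; 0 for identical histograms."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient) if coefficient > 0 else math.inf


# Hypothetical 4-bin color histograms for a query image and a candidate.
query = normalize([10, 20, 30, 40])
candidate = normalize([12, 18, 35, 35])
similarity = histogram_intersection(query, candidate)
distance = bhattacharyya_distance(query, candidate)
```

    Histogram intersection is a similarity (higher is better) while the Bhattacharyya distance is a dissimilarity (lower is better), so a system that combines them has to bring the two onto a common scale before ranking.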

  16. A unified architecture for biomedical search engines based on semantic web technologies.

    Science.gov (United States)

    Jalali, Vahid; Matash Borujerdi, Mohammad Reza

    2011-04-01

    There has been huge growth in the volume of published biomedical research in recent years. Many medical search engines have been designed and developed to address the ever-growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of a common architecture for the ontologies used and for the overall retrieval process hampers the evaluation of different search engines, and interoperability between them, under unified conditions. In this paper, a unified architecture for medical search engines is introduced. The proposed model contains standard schemas declared in semantic web languages for ontologies and documents used by search engines. Unified models for annotation and retrieval processes are other parts of the introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture. The search engine is evaluated using two test collections, and results are reported in terms of precision vs. recall and mean average precision for the different approaches used by this search engine.

  17. Web-page Prediction for Domain Specific Web-search using Boolean Bit Mask

    OpenAIRE

    Sinha, Sukanta; Duttagupta, Rana; Mukhopadhyay, Debajyoti

    2012-01-01

    A search engine is a Web-page retrieval tool. Nowadays, Web searchers save time by using an efficient search engine. To improve the performance of the search engine, we introduce a unique mechanism that gives Web searchers more prominent search results. In this paper, we discuss a domain-specific Web search prototype that generates the predicted Web-page list for a user-given search string using a Boolean bit mask.

  18. Safety and safeguards aspects on retrievability: A German study

    International Nuclear Information System (INIS)

    Biurrun, E.; Engelmann, H.-J.; Brennecke, P.; Kranz, H.

    2000-01-01

    The article briefly addresses the definition of the term 'retrievability' and presents two different retrieval scenarios with their advantages and disadvantages. The second part lists the Safeguards aspects of retrievability, gives a short outlook on the present German Safeguards Reference Concept for the post-closure phase of a repository in a salt dome, and reports the results of German studies concerning some proposed Safeguards methods. Furthermore, planned investigations on Safeguards in the post-closure phase of a repository are mentioned. The third and main part describes the results of the German Retrievability Study, which was prepared in the mid-1990s by DBE on behalf of the German Federal Ministry of Education, Science, Research and Technology (BMBF) under an R&D contract. (author)

  19. Semantics-Based Intelligent Indexing and Retrieval of Digital Images - A Case Study

    Science.gov (United States)

    Osman, Taha; Thakker, Dhavalkumar; Schaefer, Gerald

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they typically rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this chapter we present a semantically enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries, and we present our initial thoughts on exploiting lexical databases for explicit semantic-based query expansion.

  20. Indexing, learning and content-based retrieval for special purpose image databases

    NARCIS (Netherlands)

    M.J. Huiskes (Mark); E.J. Pauwels (Eric)

    2005-01-01

    This chapter deals with content-based image retrieval in special purpose image databases. As image data is amassed ever more effortlessly, building efficient systems for searching and browsing of image databases becomes increasingly urgent. We provide an overview of the current

  1. Combining textual and visual information for image retrieval in the medical domain.

    Science.gov (United States)

    Gkoufas, Yiannis; Morou, Anna; Kalamboukis, Theodore

    2011-01-01

    In this article we summarize the experience gained from our participation in the ImageCLEF evaluation task over the past two years. We explored linear combinations of the visual and textual sources of images for image retrieval. From our experiments we conclude that a mixed retrieval technique that applies textual and visual retrieval alternately and repeatedly improves performance while overcoming the scalability limitations of visual retrieval. In particular, the mean average precision (MAP) has increased from 0.01 to 0.15 and 0.087 for 2009 and 2010 data, respectively, when content-based image retrieval (CBIR) is performed on the top 1000 results from textual retrieval based on natural language processing (NLP).
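
    The idea of running visual retrieval only on the top results of a textual search can be sketched as a two-stage re-ranking. The code below is an illustrative assumption, not the authors' system: the function name, the single linear fusion step with weight `alpha`, and the toy scores are invented for the example, and the abstract's repeated alternation between the two modalities is not modeled.

```python
def rerank_top_k(text_scores, visual_score, k=1000, alpha=0.5):
    """Re-rank the k best textual results with a linear text/visual fusion.

    text_scores:  dict {doc_id: textual relevance score}
    visual_score: callable doc_id -> visual similarity to the query image
    alpha:        weight of the textual score in the fused score
    """
    # Stage 1: cheap textual retrieval narrows the collection to k candidates.
    top = sorted(text_scores, key=text_scores.get, reverse=True)[:k]
    # Stage 2: the costly visual comparison runs on those candidates only.
    fused = {
        doc: alpha * text_scores[doc] + (1.0 - alpha) * visual_score(doc)
        for doc in top
    }
    return sorted(fused, key=fused.get, reverse=True)


# Made-up scores for three images; "c" is cut before the visual stage.
text_scores = {"a": 0.9, "b": 0.8, "c": 0.1}
visual_scores = {"a": 0.1, "b": 0.9, "c": 1.0}
ranking = rerank_top_k(text_scores, visual_scores.get, k=2, alpha=0.5)
```

    Because the visual comparison touches at most k images per query, its cost no longer grows with the size of the collection, which is one way to read the scalability benefit the record reports.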

  2. Smart Images Search based on Visual Features Fusion

    International Nuclear Information System (INIS)

    Saad, M.H.

    2013-01-01

    Image search engines attempt to give fast and accurate access to the huge number of images available on the Internet. There have been a number of efforts to build search engines based on image content to enhance search results. Content-Based Image Retrieval (CBIR) systems have attracted great interest since multimedia files, such as images and videos, have dramatically entered our lives throughout the last decade. CBIR allows target images to be extracted automatically according to the objective visual contents of the image itself, for example its shapes, colors and textures, to provide more accurate ranking of the results. Recent approaches to CBIR differ in terms of which image features are extracted to be used as image descriptors for the matching process. This thesis proposes improvements to the efficiency and accuracy of CBIR systems by integrating different types of image features. This framework addresses efficient retrieval of images in large image collections. A comparative study of recent CBIR techniques is provided. According to this study, image features need to be integrated to provide a more accurate description of image content and better image retrieval accuracy. In this context, this thesis presents new image retrieval approaches that provide higher retrieval accuracy than previous approaches. The first proposed image retrieval system uses color, texture and shape descriptors to form the global features vector. This approach integrates the YCbCr color histogram as a color descriptor, the modified Fourier descriptor as a shape descriptor and the modified Edge Histogram as a texture descriptor in order to enhance the retrieval results. The second proposed approach integrates the global features vector, which is used in the first approach, with the SURF salient point technique as a local feature. The nearest neighbor matching algorithm with a proposed similarity measure is applied to determine the final image rank.
The second approach

  3. A Novel Integrated Algorithm for Wind Vector Retrieval from Conically Scanning Scatterometers

    Directory of Open Access Journals (Sweden)

    Xuetong Xie

    2013-11-01

    Due to the lower efficiency and the larger wind direction error of traditional algorithms, a novel integrated wind retrieval algorithm is proposed for conically scanning scatterometers. The proposed algorithm has the dual advantages of lower computational cost and higher wind direction retrieval accuracy, achieved by integrating the wind speed standard deviation (WSSD) algorithm and the wind direction interval retrieval (DIR) algorithm. It adopts wind speed standard deviation as a criterion for searching possible wind vector solutions and retrieving a potential wind direction interval based on the change rate of the wind speed standard deviation. Moreover, a modified three-step ambiguity removal method is designed to let more wind directions be selected in the process of nudging and filtering. The performance of the new algorithm is illustrated by retrieval experiments using 300 orbits of SeaWinds/QuikSCAT L2A data (backscatter coefficients at 25 km resolution) and co-located buoy data. Experimental results indicate that the new algorithm can evidently enhance the wind direction retrieval accuracy, especially in the nadir region. In comparison with the SeaWinds L2B Version 2 25 km selected wind product (retrieved wind fields), an improvement of 5.1° in wind direction retrieval can be made by the new algorithm for that region.

  4. Elsevier’s approach to the bioCADDIE 2016 Dataset Retrieval Challenge

    Science.gov (United States)

    Scerri, Antony; Kuriakose, John; Deshmane, Amit Ajit; Stanger, Mark; Moore, Rebekah; Naik, Raj; de Waard, Anita

    2017-01-01

    We developed a two-stream, Apache Solr-based information retrieval system in response to the bioCADDIE 2016 Dataset Retrieval Challenge. One stream was based on the principle of word embeddings, the other was rooted in ontology based indexing. Despite encountering several issues in the data, the evaluation procedure and the technologies used, the system performed quite well. We provide some pointers towards future work: in particular, we suggest that more work in query expansion could benefit future biomedical search engines. Database URL: https://data.mendeley.com/datasets/zd9dxpyybg/1 PMID:29220454

  5. Exploring Contextual Models in Chemical Patent Search

    Science.gov (United States)

    Urbain, Jay; Frieder, Ophir

    We explore the development of probabilistic retrieval models for integrating term statistics with entity search using multiple levels of document context to improve the performance of chemical patent search. A distributed indexing model was developed to enable efficient named entity search and aggregation of term statistics at multiple levels of patent structure including individual words, sentences, claims, descriptions, abstracts, and titles. The system can be scaled to an arbitrary number of compute instances in a cloud computing environment to support concurrent indexing and query processing operations on large patent collections.

  6. Assessors' Search Result Satisfaction Associated with Relevance in a Scientific Domain

    DEFF Research Database (Denmark)

    Ingwersen, Peter; Lykke, Marianne; Bogers, Toine

    2010-01-01

    genuine information tasks. Ease of assessment and search satisfaction are cross tabulated with retrieval performance measured by Normalized Discounted Cumulated Gain. Results show that when assessors find small numbers of relevant documents they tend to regard the search results with dissatisfaction and...

  7. PIE the search: searching PubMed literature for protein interaction information.

    Science.gov (United States)

    Kim, Sun; Kwon, Dongseop; Shin, Soo-Yong; Wilbur, W John

    2012-02-15

    Finding protein-protein interaction (PPI) information in the literature is a challenging but important task. However, keyword search in PubMed® is often time consuming because it requires a series of actions that refine keywords and browse search results until a goal is reached. Due to the rapid growth of biomedical literature, it has become more difficult for biologists and curators to locate PPI information quickly. Therefore, a tool for prioritizing PPI-informative articles can be a useful assistant for finding this PPI-relevant information. PIE (Protein Interaction information Extraction) the search is a web service implementing a competition-winning approach utilizing word and syntactic analyses by machine learning techniques. For easy user access, PIE the search provides a PubMed-like search environment, but the output is the list of articles prioritized by PPI confidence scores. By obtaining PPI-related articles at high rank, researchers can more easily find up-to-date PPI information that cannot be found in manually curated PPI databases. http://www.ncbi.nlm.nih.gov/IRET/PIE/.

  8. Logistics and safety of extracorporeal membrane oxygenation in medical retrieval.

    Science.gov (United States)

    Burns, Brian J; Habig, Karel; Reid, Cliff; Kernick, Paul; Wilkinson, Chris; Tall, Gary; Coombes, Sarah; Manning, Ron

    2011-01-01

    This article reviews the logistics and safety of extracorporeal membrane oxygenation (ECMO) medical retrieval in New South Wales, Australia. We describe the logistics involved in ECMO road and rotary-wing retrieval by a multidisciplinary team during the H1N1 influenza epidemic in winter 2009 (i.e., June 1 to August 31, 2009). Basic patient demographics and key retrieval time lines were analyzed. There were 17 patients retrieved on ECMO, with their ages ranging from 22 to 55 years. The median weight was 110 kg. Four critical events were recorded during retrieval, with no adverse outcomes. The retrieval distance varied from 20.8 to 430 km. There were delays in times from retrieval booking to both retrieval tasking and retrieval team departure in 88% of retrievals. The most common reasons cited were "patient not ready" 23.5% (4/17); "vehicle not available," 23.5% (4/17); and "complex retrieval," 41.2% (7/17). The median time (hours:minutes) from booking with the medical retrieval unit (MRU) to tasking was 4:35 (interquartile range [IQR] 3:27-6:15). The median time lag from tasking to departure was 1:00 (IQR 00:10-2:20). The median stabilization time was 1:30 (IQR 1:20-1:55). The median retrieval duration was 7:35 (IQR 5:50-10:15). The process of development of ECMO retrieval was enabled by the preexistence of a high-volume experienced medical retrieval service. Although ECMO retrieval is not a new concept, we describe an entire process for ECMO retrieval that we believe will benefit other retrieval service providers. The increased workload of ECMO retrieval during the swine flu pandemic has led to refinement in the system and process for the future.

  9. Needle Custom Search: Recall-oriented search on the Web using semantic annotations

    NARCIS (Netherlands)

    Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon; de Rijke, Maarten; Kenter, Tom; de Vries, A.P.; Zhai, Chen Xiang; de Jong, Franciska M.G.; Radinsky, Kira; Hofmann, Katja

    Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increase the efficiency

  10. Needle Custom Search : Recall-oriented search on the web using semantic annotations

    NARCIS (Netherlands)

    Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon L.

    2014-01-01

    Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increase the efficiency

  11. Assessing the performance of methodological search filters to improve the efficiency of evidence information retrieval: five literature reviews and a qualitative study.

    Science.gov (United States)

    Lefebvre, Carol; Glanville, Julie; Beale, Sophie; Boachie, Charles; Duffy, Steven; Fraser, Cynthia; Harbour, Jenny; McCool, Rachael; Smith, Lynne

    2017-11-01

    predominantly used to reduce large numbers of retrieved records and to introduce focus. The Inter Technology Appraisal Support Collaboration (InterTASC) Information Specialists' Sub-Group (ISSG) Search Filters Resource was most frequently mentioned by both groups as the resource consulted to select a filter. Randomised controlled trial (RCT) and systematic review filters, in particular the Cochrane RCT and the McMaster Hedges filters, were most frequently mentioned. The majority indicated that they used different filters depending on the requirement for sensitivity or precision. Over half of the respondents used the filters available in databases. Interviewees used various approaches when using and adapting search filters. Respondents suggested that the main factors that would make choosing a filter easier were the availability of critical appraisals and more detailed performance information. Provenance and having the filter available in a central storage location were also important. The questionnaire could have been shorter and could have included more multiple choice questions, and the reviews of filter performance focused on only four study designs. Search filter studies should use a representative reference standard and explicitly report methods and results. Performance measures should be presented systematically and clearly. Searchers find filters useful in certain circumstances but expressed a need for more user-friendly performance information to aid filter choice. We suggest approaches to use, adapt and report search filter performance. Future work could include research around search filters and performance measures for study designs not addressed here, exploration of alternative methods of displaying performance results and numerical synthesis of performance comparison results. 
The National Institute for Health Research (NIHR) Health Technology Assessment programme and Medical Research Council-NIHR Methodology Research Programme (grant number G0901496).

  12. Sex effects on spatial learning but not on spatial memory retrieval in healthy young adults.

    Science.gov (United States)

    Piber, Dominique; Nowacki, Jan; Mueller, Sven C; Wingenfeld, Katja; Otte, Christian

    2018-01-15

    Sex differences have been found in spatial learning and spatial memory, with several studies indicating that males outperform females. Using the virtual Morris Water Maze (vMWM) task, we tested whether sex differences in spatial cognitive processes are attributable to differences in spatial learning or spatial memory retrieval in a large student sample. We tested 90 healthy students (45 women and 45 men) with a mean age of 23.5 years (SD=3.5). Spatial learning and spatial memory retrieval were measured by using the vMWM task, during which participants had to search a virtual pool for a hidden platform, facilitated by visual cues surrounding the pool. Several learning trials assessed spatial learning, while a separate probe trial assessed spatial memory retrieval. We found a significant sex effect during spatial learning, with males showing shorter latency and shorter path length, as compared to females (all p […] retrieval (p=0.615). Furthermore, post-hoc analyses revealed significant sex differences in spatial search strategies (p […] retrieval. Our study raises the question of whether men and women use different learning strategies, which nevertheless result in equal performances of spatial memory retrieval. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Evolutionary Computing Methods for Spectral Retrieval

    Science.gov (United States)

    Terrile, Richard; Fink, Wolfgang; Huntsberger, Terrance; Lee, Seugwon; Tisdale, Edwin; VonAllmen, Paul; Tinetti, Geivanna

    2009-01-01

    A methodology for processing spectral images to retrieve information on underlying physical, chemical, and/or biological phenomena is based on evolutionary and related computational methods implemented in software. In a typical case, the solution (the information that one seeks to retrieve) consists of parameters of a mathematical model that represents one or more of the phenomena of interest. The methodology was developed for the initial purpose of retrieving the desired information from spectral image data acquired by remote-sensing instruments aimed at planets (including the Earth). Examples of information desired in such applications include trace gas concentrations, temperature profiles, surface types, day/night fractions, cloud/aerosol fractions, seasons, and viewing angles. The methodology is also potentially useful for retrieving information on chemical and/or biological hazards in terrestrial settings. In this methodology, one utilizes an iterative process that minimizes a fitness function indicative of the degree of dissimilarity between observed and synthetic spectral and angular data. The evolutionary computing methods that lie at the heart of this process yield a population of solutions (sets of the desired parameters) within an accuracy represented by a fitness-function value specified by the user. The evolutionary computing methods (ECM) used in this methodology are Genetic Algorithms and Simulated Annealing, both of which are well-established optimization techniques and have also been described in previous NASA Tech Briefs articles. These are embedded in a conceptual framework, represented in the architecture of the implementing software, that enables automatic retrieval of spectral and angular data and analysis of the retrieved solutions for uniqueness.
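
    The iterative loop described in this record, minimizing a fitness function that measures the dissimilarity between observed and synthetic spectra, can be illustrated with a small simulated annealing sketch. Everything specific here is an assumption made for the example and not the NASA software: the two-parameter absorption-line forward model, the sum-of-squares fitness, the linear cooling schedule, and the step size.

```python
import math
import random


def synthetic_spectrum(params, wavelengths):
    """Toy forward model: a single Gaussian absorption line (depth, center)."""
    depth, center = params
    return [1.0 - depth * math.exp(-((w - center) ** 2)) for w in wavelengths]


def fitness(params, wavelengths, observed):
    """Sum of squared differences between synthetic and observed spectra."""
    model = synthetic_spectrum(params, wavelengths)
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def anneal(start, wavelengths, observed, steps=5000, t0=1.0, seed=0):
    """Simulated annealing; returns the best parameters seen and their fitness."""
    rng = random.Random(seed)
    current = list(start)
    current_fit = fitness(current, wavelengths, observed)
    best, best_fit = list(current), current_fit
    for step in range(steps):
        # Linear cooling; the small offset avoids division by zero at the end.
        temperature = t0 * (1.0 - step / steps) + 1e-9
        candidate = [p + rng.gauss(0.0, 0.1) for p in current]
        candidate_fit = fitness(candidate, wavelengths, observed)
        delta = candidate_fit - current_fit
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_fit = candidate, candidate_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
    return best, best_fit


# "Observed" data generated from known parameters so the sketch is checkable.
wavelengths = [i * 0.1 for i in range(50)]
observed = synthetic_spectrum((0.6, 2.5), wavelengths)
start = (0.1, 1.0)
best, best_fit = anneal(start, wavelengths, observed)
```

    Accepting some worse candidates while the temperature is high is what lets the search escape local minima of the fitness function; a genetic algorithm, the other method the record names, pursues the same goal by evolving a population of parameter sets instead of a single one.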

  14. Associative conceptual space-based information retrieval systems

    NARCIS (Netherlands)

    M.J. Schuemie (Martijn); J.H. van den Berg (Jan)

    1998-01-01

    In this "Information Era" with the availability of large collections of books, articles, journals, CD-ROMs, video films and so on, there exists an increasing need for intelligent information retrieval systems that enable users to find the information desired easily. Many attempts have

  15. Bat-Inspired Algorithm Based Query Expansion for Medical Web Information Retrieval.

    Science.gov (United States)

    Khennak, Ilyes; Drias, Habiba

    2017-02-01

    With the increasing amount of medical data available on the Web, looking for health information has become one of the most widely searched topics on the Internet. Patients and people of several backgrounds are now using Web search engines to acquire medical information, including information about a specific disease, medical treatment or professional advice. Nonetheless, due to a lack of medical knowledge, many laypeople have difficulty forming appropriate queries to articulate their inquiries, and their search queries are imprecise due to the use of unclear keywords. The use of these ambiguous and vague queries to describe the patients' needs has resulted in a failure of Web search engines to retrieve accurate and relevant information. One of the most natural and promising methods to overcome this drawback is Query Expansion. In this paper, an original approach based on the Bat Algorithm is proposed to improve the retrieval effectiveness of query expansion in the medical field. In contrast to the existing literature, the proposed approach uses the Bat Algorithm to find the best expanded query among a set of expanded query candidates, while maintaining low computational complexity. Moreover, this new approach allows the length of the expanded query to be determined empirically. Numerical results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient compared to the baseline.

  16. Comparison of the efficacy of three PubMed search filters in finding randomized controlled trials to answer clinical questions.

    Science.gov (United States)

    Yousefi-Nooraie, Reza; Irani, Shirin; Mortaz-Hedjri, Soroush; Shakiba, Behnam

    2013-10-01

    The aim of this study was to compare the performance of three search methods in the retrieval of relevant clinical trials from PubMed to answer specific clinical questions. The included studies of a sample of 100 Cochrane reviews that were recorded in PubMed served as the reference standard. The search queries were formulated based on the systematic review titles. Precision, recall and number of retrieved records were compared for limiting the results to the clinical trial publication type and for using the sensitive and specific clinical queries filters. The number of keywords and the presence of specific names of intervention and syndrome in the search keywords were used in a model to predict the recalls and precisions. The Clinical queries-sensitive search strategy retrieved the largest number of records (33) and had the highest recall (41.6%) and lowest precision (4.8%). The presence of a specific intervention name was the only significant predictor of all recalls and precisions (P = 0.016). The recall and precision of the combination of simple clinical search queries and methodological search filters to find clinical trials in various subjects were considerably low. The limit field strategy yielded higher precision, fewer retrieved records and approximately similar recall, compared with the clinical queries-sensitive strategy. The presence of a specific intervention name in the search keywords increased both recall and precision. © 2010 John Wiley & Sons Ltd.
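
    For readers unfamiliar with the two measures compared in this abstract, the following minimal sketch shows how precision and recall are commonly computed for a single search against a reference standard. The function name and the record identifiers are hypothetical and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers returned by the search
    relevant:  identifiers in the reference standard
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved records that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant records that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Usage with made-up identifiers: 10 records retrieved, 4 relevant, 2 in common.
precision, recall = precision_recall(range(1, 11), [2, 4, 11, 12])
```

    In this made-up case precision is 0.2 and recall is 0.5, which illustrates why a sensitive filter can show high recall together with low precision.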

  17. Evaluation of an automatic weighting in a Boolean retrieval system with thesaurus, 2

    International Nuclear Information System (INIS)

    Ebinuma, Yukio

    1980-01-01

    Ranking performances created by the weighting of search terms are manifested with the same search topics and data files as in the test for the weighting of index terms. Weighted looser queries evince favourable retrieval performances on average, though these are slightly inferior to conventional man prepared medium and tighter queries. The weight generation sample taken from a data file for over two months can be applied roughly to weighting in another data file. Degrees of improvement in precision and loss in recall at each ranked score position in other weight applied outputs are estimated from rank-precision curves for some weight generation samples. This relevance ranking scheme results in a considerable benefit in most retrievals but is inferior to the weighting of index terms in view of applicability and ranking performances. (author)

  18. How doctors apply semantic components to specify search in work-related information retrieval

    DEFF Research Database (Denmark)

    Lykke, Marianne; Price, Susan L.; Delcambre, Lois L. M.

    2012-01-01

    Workplace searching is often context-specific and targets a “right answer” within some domain-specific aspect of the search topic. We have developed the semantic component (SC) model that allows searchers to specify a search within context-specific aspects of the main topic of documents. The goal...

  19. Intelligence as the efficiency of cue-driven retrieval from secondary memory.

    Science.gov (United States)

    Liesefeld, Heinrich René; Hoffmann, Eugenia; Wentura, Dirk

    2016-01-01

    Complex-span (working-memory-capacity) tasks are among the most successful predictors of intelligence. One important contributor to this relationship is the ability to efficiently employ cues for the retrieval from secondary memory. Presumably, intelligent individuals can considerably restrict their memory search sets by using such cues and can thereby improve recall performance. We here test this assumption by experimentally manipulating the validity of retrieval cues. When memoranda are drawn from the same semantic category on two successive trials of a verbal complex-span task, the category is a very strong retrieval cue on its first occurrence (strong-cue trial) but loses some of its validity on its second occurrence (weak-cue trial). If intelligent individuals make better use of semantic categories as retrieval cues, their recall accuracy suffers more from this loss of cue validity. Accordingly, our results show that less variance in intelligence is explained by recall accuracy on weak-cue compared with strong-cue trials.

  20. Evaluation of gastroenterology and hepatology articles on Wikipedia: are they suitable as learning resources for medical students?

    Science.gov (United States)

    Azer, Samy A

    2014-02-01

    With the changes introduced to medical curricula, medical students use learning resources on the Internet such as Wikipedia. However, the credibility of the medical content of Wikipedia has been questioned and there is no evidence to respond to these concerns. The aim of this paper was to critically evaluate the accuracy and reliability of the gastroenterology and hepatology information that medical students retrieve from Wikipedia. The Wikipedia website was searched for articles on gastroenterology and hepatology on 28 May 2013. Copies of these articles were evaluated by three assessors independently using an appraisal form modified from the DISCERN instrument. The articles were scored for accuracy of content, readability, frequency of updating, and quality of references. A total of 39 articles were evaluated. Although the articles appeared to be well cited and reviewed regularly, several problems were identified with regard to depth of discussion of mechanisms and pathogenesis of diseases, as well as poor elaboration on different investigations. Analysis of the content showed a score ranging from 15.6±0.6 to 43.6±3.2 (mean±SD). The total number of references in all articles was 1233, and the number of references varied from 4 to 144 (mean±SD, 31.6±27.3). The number of citations from peer-reviewed journals published in the last 5 years was 242 (28%); however, several problems were identified in the list of references and citations made. The readability of articles was in the range of -8.0±55.7 to 44.4±1.4; for all articles the readability was 26±9.0 (mean±SD). The concordance between the assessors on applying the criteria had mean κ scores in the range of 0.61 to 0.79. Wikipedia is not a reliable source of information for medical students searching for gastroenterology and hepatology articles. Several limitations, deficiencies, and scientific errors have been identified in the articles examined.

  1. Composition-based statistics and translated nucleotide searches: Improving the TBLASTN module of BLAST

    Directory of Open Access Journals (Sweden)

    Schäffer Alejandro A

    2006-12-01

    Background: TBLASTN is a mode of operation for BLAST that aligns protein sequences to a nucleotide database translated in all six frames. We present the first description of the modern implementation of TBLASTN, focusing on new techniques that were used to implement composition-based statistics for translated nucleotide searches. Composition-based statistics use the composition of the sequences being aligned to generate more accurate E-values, which allows for a more accurate distinction between true and false matches. Until recently, composition-based statistics were available only for protein-protein searches. They are now available as a command line option for recent versions of TBLASTN and as an option for TBLASTN on the NCBI BLAST web server. Results: We evaluate the statistical and retrieval accuracy of the E-values reported by a baseline version of TBLASTN and by two variants that use different types of composition-based statistics. To test the statistical accuracy of TBLASTN, we ran 1000 searches using scrambled proteins from the mouse genome and a database of human chromosomes. To test retrieval accuracy, we modernize and adapt to translated searches a test set previously used to evaluate the retrieval accuracy of protein-protein searches. We show that composition-based statistics greatly improve the statistical accuracy of TBLASTN, at a small cost to the retrieval accuracy. Conclusion: TBLASTN is widely used, as it is common to wish to compare proteins to chromosomes or to libraries of mRNAs. Composition-based statistics improve the statistical accuracy, and therefore the reliability, of TBLASTN results. The algorithms used by TBLASTN are not widely known, and some of the most important are reported here. The data used to test TBLASTN are available for download and may be useful in other studies of translated search algorithms.

  2. A survey on visual information search behavior and requirements of radiologists.

    Science.gov (United States)

    Markonis, D; Holzer, M; Dungs, S; Vargas, A; Langs, G; Kriewel, S; Müller, H

    2012-01-01

    The main objective of this study is to learn more about the image use and search requirements of radiologists. These requirements will then be taken into account to develop a new search system for images and associated metadata in the Khresmoi project. Observations of the radiology workflow, case discussions and a literature review were performed to construct a survey form that was given online and in paper form to radiologists. Eye tracking was performed on a radiology viewing station to analyze typical tasks and to complement the survey. In total 34 radiologists answered the survey online or on paper. Image search was mentioned as a frequent and common task, particularly for finding cases of interest for differential diagnosis. Sources of information besides the Internet are books and discussions with colleagues. Search for images is unsuccessful in around 25% of the cases, with the search being stopped after around 10 minutes. The most common reason for failure is that target images are considered rare. Important additions for search requested in the survey are filtering by pathology and modality, as well as search for visually similar images and cases. Few radiologists are familiar with visual retrieval, but they desire the option to upload images to search for similar ones. Image search is common in radiology but few radiologists are fully aware of visual information retrieval. Taking into account the many unsuccessful searches and the time spent on them, a good image search could improve the situation and help in clinical practice.

  3. Automated information retrieval using CLIPS

    Science.gov (United States)

    Raines, Rodney Doyle, III; Beug, James Lewis

    1991-01-01

    Expert systems have considerable potential to assist computer users in managing the large volume of information available to them. One possible use of an expert system is to model the information retrieval interests of a human user and then make recommendations to the user as to articles of interest. At Cal Poly, a prototype expert system written in the C Language Integrated Production System (CLIPS) serves as an Automated Information Retrieval System (AIRS). AIRS monitors a user's reading preferences, develops a profile of the user, and then evaluates items returned from the information base. When prompted by the user, AIRS returns a list of items of interest to the user. In order to minimize the impact on system resources, AIRS is designed to run in the background during periods of light system use.

  4. Affective Music Information Retrieval

    OpenAIRE

    Wang, Ju-Chiang; Yang, Yi-Hsuan; Wang, Hsin-Min

    2015-01-01

    Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. In consequence, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimension model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG)...

  5. Interactions among emotional attention, encoding, and retrieval of ambiguous information: An eye-tracking study.

    Science.gov (United States)

    Everaert, Jonas; Koster, Ernst H W

    2015-10-01

    Emotional biases in attention modulate encoding of emotional material into long-term memory, but little is known about the role of such attentional biases during emotional memory retrieval. The present study investigated how emotional biases in memory are related to attentional allocation during retrieval. Forty-nine individuals encoded emotionally positive and negative meanings derived from ambiguous information and then searched their memory for encoded meanings in response to a set of retrieval cues. The remember/know/new procedure was used to classify memories as recollection-based or familiarity-based, and gaze behavior was monitored throughout the task to measure attentional allocation. We found that a bias in sustained attention during recollection-based, but not familiarity-based, retrieval predicted subsequent memory bias toward positive versus negative material following encoding. Thus, during emotional memory retrieval, attention affects controlled forms of retrieval (i.e., recollection) but does not modulate relatively automatic, familiarity-based retrieval. These findings enhance understanding of how distinct components of attention regulate the emotional content of memories. Implications for theoretical models and emotion regulation are discussed. (c) 2015 APA, all rights reserved.

  6. Formal Concept Analysis for Information Retrieval

    OpenAIRE

    Qadi, Abderrahim El; Aboutajedine, Driss; Ennouary, Yassine

    2010-01-01

    In this paper we describe a mechanism to improve Information Retrieval (IR) on the web. The method is based on Formal Concept Analysis (FCA), which builds semantic relations during the queries and allows the answers provided by a search engine to be reorganized in the shape of a lattice of concepts. We propose for IR an incremental algorithm based on the Galois lattice. This algorithm allows a formal clustering of the data sources, and the results which it returns are classified by ...

  7. Guenter Tulip Filter Retrieval Experience: Predictors of Successful Retrieval

    International Nuclear Information System (INIS)

    Turba, Ulku Cenk; Arslan, Bulent; Meuse, Michael; Sabri, Saher; Macik, Barbara Gail; Hagspiel, Klaus D.; Matsumoto, Alan H.; Angle, John F.

    2010-01-01

    We report our experience with Guenter Tulip filter placement indications, retrievals, and procedural problems, with emphasis on alternative retrieval techniques. We have identified 92 consecutive patients in whom a Guenter Tulip filter was placed and filter removal attempted. We recorded patient demographic information, filter placement and retrieval indications, procedures, standard and nonstandard filter retrieval techniques, complications, and clinical outcomes. The mean time to retrieval for those who experienced filter strut penetration was statistically significant [F(1,90) = 8.55, p = 0.004]. Filter strut(s) IVC penetration and successful retrieval were found to be statistically significant (p = 0.043). The filter hook-IVC relationship correlated with successful retrieval. A modified guidewire loop technique was applied in 8 of 10 cases where the hook appeared to penetrate the IVC wall and could not be engaged with a loop snare catheter, providing additional technical success in 6 of 8 (75%). Therefore, the total filter retrieval success increased from 88 to 95%. In conclusion, the Guenter Tulip filter has high successful retrieval rates with low rates of complication. Additional maneuvers such as a guidewire loop method can be used to improve retrieval success rates when the filter hook is endothelialized.

  8. An Improved Forensic Science Information Search.

    Science.gov (United States)

    Teitelbaum, J

    2015-01-01

    Although thousands of search engines and databases are available online, finding answers to specific forensic science questions can be a challenge even to experienced Internet users. Because there is no central repository for forensic science information, and because of the sheer number of disciplines under the forensic science umbrella, forensic scientists are often unable to locate material that is relevant to their needs. The author contends that using six publicly accessible search engines and databases can produce high-quality search results. The six resources are Google, PubMed, Google Scholar, Google Books, WorldCat, and the National Criminal Justice Reference Service. Carefully selected keywords and keyword combinations, designating a keyword phrase so that the search engine will search on the phrase and not individual keywords, and prompting search engines to retrieve PDF files are among the techniques discussed. Copyright © 2015 Central Police University.

  9. Interactions among emotional attention, encoding, and retrieval of ambiguous information: an eye-tracking study

    OpenAIRE

    Everaert, Jonas; Koster, Ernst

    2015-01-01

    Emotional biases in attention modulate encoding of emotional material into long-term memory, but little is known about the role of such attentional biases during emotional memory retrieval. The present study investigated how emotional biases in memory are related to attentional allocation during retrieval. Forty-nine individuals encoded emotionally positive and negative meanings derived from ambiguous information and then searched their memory for encoded meanings in response to a set of retr...

  10. Ontology lexicalization: Relationship between content and meaning in the context of Information Retrieval

    Directory of Open Access Journals (Sweden)

    Marcelo SCHIESSL

    The proposal presented in this study seeks to properly represent natural language to ontologies and vice-versa. Therefore, the semi-automatic creation of a lexical database in Brazilian Portuguese containing morphological, syntactic, and semantic information that can be read by machines was proposed, allowing the link between structured and unstructured data and its integration into an information retrieval model to improve precision. The results obtained demonstrated that the methodology can be used in the risco financeiro (financial risk) domain in Portuguese for the construction of an ontology and the lexical-semantic database and the proposal of a semantic information retrieval model. In order to evaluate the performance of the proposed model, documents containing the main definitions of the financial risk domain were selected and indexed with and without semantic annotation. To enable the comparison between the approaches, two databases were created. The first one represents the traditional search, and the second contains the index built from the texts with the semantic annotations to represent the semantic search. The evaluation of the proposal was based on recall and precision. The queries submitted to the model showed that the semantic search outperforms the traditional search and validates the methodology used. Although more complex, the procedure proposed can be used in all kinds of domains.

  11. EM-21 Retrieval Knowledge Center: Waste Retrieval Challenges

    Energy Technology Data Exchange (ETDEWEB)

    Fellinger, Andrew P.; Rinker, Michael W.; Berglin, Eric J.; Minichan, Richard L.; Poirier, Micheal R.; Gauglitz, Phillip A.; Martin, Bruce A.; Hatchell, Brian K.; Saldivar, Eloy; Mullen, O Dennis; Chapman, Noel F.; Wells, Beric E.; Gibbons, Peter W.

    2009-04-10

    EM-21 is the Waste Processing Division of the Office of Engineering and Technology, within the U.S. Department of Energy’s (DOE) Office of Environmental Management (EM). In August of 2008, EM-21 began an initiative to develop a Retrieval Knowledge Center (RKC) to provide the DOE, high level waste retrieval operators, and technology developers with a centralized and focused location to share knowledge and expertise that will be used to address retrieval challenges across the DOE complex. The RKC is also designed to facilitate information sharing across the DOE Waste Site Complex through workshops and a searchable database of waste retrieval technology information. The database may be used to research effective technology approaches for specific retrieval tasks and to take advantage of the lessons learned from previous operations. It is also expected to be effective for remaining current with the state of the art of retrieval technologies and ongoing development within the DOE Complex. To encourage collaboration of DOE sites with waste retrieval issues, the RKC team is co-led by the Savannah River National Laboratory (SRNL) and the Pacific Northwest National Laboratory (PNNL). Two RKC workshops were held in the Fall of 2008. The purpose of these workshops was to define top level waste retrieval functional areas, exchange lessons learned, and develop a path forward to support a strategic business plan focused on technology needs for retrieval. The primary participants involved in these workshops included retrieval personnel and laboratory staff that are associated with Hanford and Savannah River Sites since the majority of remaining DOE waste tanks are located at these sites. This report summarizes and documents the results of the initial RKC workshops. Technology challenges identified from these workshops and presented here are expected to be a key component to defining future RKC-directed tasks designed to facilitate tank waste retrieval solutions.

  12. EM-21 Retrieval Knowledge Center: Waste Retrieval Challenges

    International Nuclear Information System (INIS)

    Fellinger, Andrew P.; Rinker, Michael W.; Berglin, Eric J.; Minichan, Richard L.; Poirier, Micheal R.; Gauglitz, Phillip A.; Martin, Bruce A.; Hatchell, Brian K.; Saldivar, Eloy; Mullen, O Dennis; Chapman, Noel F.; Wells, Beric E.; Gibbons, Peter W.

    2009-01-01

    EM-21 is the Waste Processing Division of the Office of Engineering and Technology, within the U.S. Department of Energy's (DOE) Office of Environmental Management (EM). In August of 2008, EM-21 began an initiative to develop a Retrieval Knowledge Center (RKC) to provide the DOE, high level waste retrieval operators, and technology developers with a centralized and focused location to share knowledge and expertise that will be used to address retrieval challenges across the DOE complex. The RKC is also designed to facilitate information sharing across the DOE Waste Site Complex through workshops and a searchable database of waste retrieval technology information. The database may be used to research effective technology approaches for specific retrieval tasks and to take advantage of the lessons learned from previous operations. It is also expected to be effective for remaining current with the state of the art of retrieval technologies and ongoing development within the DOE Complex. To encourage collaboration of DOE sites with waste retrieval issues, the RKC team is co-led by the Savannah River National Laboratory (SRNL) and the Pacific Northwest National Laboratory (PNNL). Two RKC workshops were held in the Fall of 2008. The purpose of these workshops was to define top level waste retrieval functional areas, exchange lessons learned, and develop a path forward to support a strategic business plan focused on technology needs for retrieval. The primary participants involved in these workshops included retrieval personnel and laboratory staff that are associated with Hanford and Savannah River Sites since the majority of remaining DOE waste tanks are located at these sites. This report summarizes and documents the results of the initial RKC workshops. Technology challenges identified from these workshops and presented here are expected to be a key component to defining future RKC-directed tasks designed to facilitate tank waste retrieval solutions.

  13. Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine.

    Science.gov (United States)

    Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai

    2017-03-01

    The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.

  14. Field performance of the waste retrieval end effectors in the Oak Ridge gunite tanks

    International Nuclear Information System (INIS)

    Mullen, O.D.

    1997-09-01

    Waterjet-based tank waste retrieval end effectors have been developed by Retrieval Process Development and Enhancements through several generations of test articles targeted at deployment in Hanford underground storage tanks with a large robotic arm. The basic technology has demonstrated effectiveness for retrieval of simulants bounding a wide range of waste properties and compatibility with foreseen deployment systems. The Oak Ridge National Laboratory (ORNL) selected the waterjet scarifying end effector, the jet pump conveyance system, and the Modified Light Duty Utility Arm and Houdini Remotely Operated Vehicle deployment and manipulator systems for evaluation in the Gunite and Associated Tanks Treatability Study (GAAT-TS). The Retrieval Process Development and Enhancements (RPD&E) team was tasked with developing a version of the retrieval end effector tailored to the Oak Ridge tanks, waste, and deployment platforms. The conceptual design was done by the University of Missouri-Rolla in FY 1995-96. The university researchers conducted separate effects tests of the component concepts, scaled the basic design features, and constructed a full-scale test article incorporating their findings in early FY 1996. The test article was extensively evaluated in the Hanford Hydraulic Testbed and the design features were further refined. Detail design of the prototype item was started at Waterjet Technology, Inc. before the development testing was finished, and two of the three main subassemblies were substantially complete before final design of the waterjet manifold was determined from the Hanford hydraulic testbed (HTB) testing. The manifold on the first prototype was optimized for sludge retrieval; assembled with that manifold, the end effector is termed the Sludge Retrieval End Effector (SREE).

  15. Corporate Author Entry Records Retrieved by Use of Derived Truncated Search Keys

    Directory of Open Access Journals (Sweden)

    Alan L. Landgraf

    1973-09-01

    An experiment was conducted to design a corporate author index to a large bibliographic file. The nature of corporate entries necessitates a different search key construction from that of personal names or titles. Derivation of a search key to select distinct corporate entry records is discussed.

  16. International patent applications for non-injectable naloxone for opioid overdose reversal: Exploratory search and retrieve analysis of the PatentScope database.

    Science.gov (United States)

    McDonald, Rebecca; Danielsson Glende, Øyvind; Dale, Ola; Strang, John

    2018-02-01

    Drug Alcohol Rev 2017;00:000-000. © 2017 Australasian Professional Society on Alcohol and other Drugs.

  17. Representation and alignment of sung queries for music information retrieval

    Science.gov (United States)

    Adams, Norman H.; Wakefield, Gregory H.

    2005-09-01

    The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a "smooth" pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
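
    The quadratic alignment cost mentioned in this abstract can be seen in a minimal dynamic time warping (DTW) sketch. The abstract does not name the alignment algorithm, so DTW is an assumption here, as are the function name and the use of MIDI-style note numbers as pitch values.

```python
def dtw_distance(query, target):
    """Align two pitch sequences with dynamic time warping.

    Fills an (n+1) x (m+1) table, so time and memory grow with n * m.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - target[j - 1])
            # Extend the cheapest of: stretch the query, stretch the target, or step both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Usage: the same melody sung at half tempo aligns with zero cost,
# while a melody shifted up by one semitone costs one unit per note.
same_tune = dtw_distance([60, 62, 64], [60, 60, 62, 62, 64])
shifted = dtw_distance([60, 62, 64], [61, 63, 65])
```

    Because the table has one cell per pair of query and target frames, halving the length of both sequences cuts the work to roughly a quarter, which is why a low-dimensional representation matters for rapid retrieval.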

  18. Solving Large Clustering Problems with Meta-Heuristic Search

    DEFF Research Database (Denmark)

    Turkensteen, Marcel; Andersen, Kim Allan; Bang-Jensen, Jørgen

    In Clustering Problems, groups of similar subjects are to be retrieved from data sets. In this paper, Clustering Problems with the frequently used Minimum Sum-of-Squares Criterion are solved using meta-heuristic search. Tabu search has proved to be a successful methodology for solving optimization problems, but applications to large clustering problems are rare. The simulated annealing heuristic has mainly been applied to relatively small instances. In this paper, we implement tabu search and simulated annealing approaches and compare them to the commonly used k-means approach. We find that the meta-heuristic...
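
    As a rough illustration of the Minimum Sum-of-Squares Criterion named above, the sketch below evaluates the objective for a given assignment of points to clusters. It is not the authors' implementation; the function name and data layout are hypothetical. Any of the compared methods (k-means, tabu search, simulated annealing) would be searching for the assignment that minimizes this value.

```python
def sum_of_squares(points, labels):
    """Sum of squared distances from each point to its cluster centroid.

    points: list of equal-length coordinate tuples (hypothetical layout)
    labels: cluster id for each point, in the same order
    """
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # The centroid is the coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

points = [(0, 0), (2, 0), (10, 0), (12, 0)]
print(sum_of_squares(points, [0, 0, 1, 1]))  # tight clusters, low objective
print(sum_of_squares(points, [0, 1, 0, 1]))  # mixed clusters, high objective
```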

  19. Zede Journal: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...
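
    The query rules in these search tips can be sketched as a small matcher. This is only an illustration of the stated rules, not the AJOL implementation: the stop-word list, the choice that implied AND binds tighter than OR, and the requirement that OR be written in upper case are all assumptions.

```python
import re

# Hypothetical stop-word list; the tips only say "common words are ignored".
STOP_WORDS = {"the", "of", "and", "a", "in"}

def matches(query, text):
    """Return True if text satisfies query under the assumed rules."""
    words = set(re.findall(r"\w+", text.lower()))    # case-insensitive terms
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = True
        # Adjacent terms are combined with an implied AND.
        while pos < len(tokens) and tokens[pos] not in ("OR", ")"):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                value = parse_or()
                if pos < len(tokens) and tokens[pos] == ")":
                    pos += 1
            elif tok.lower() in STOP_WORDS:
                continue                             # common words are ignored
            else:
                value = tok.lower() in words
            result = result and value
        return result

    return parse_or()

print(matches("education OR research", "Research methods"))
print(matches("(education OR research) methods", "Research methods"))
```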

  20. Philosophical Papers: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  1. Agro-Science: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  2. Sciences & Nature: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  3. Vulture News: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  4. Agronomie Africaine: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  5. African Environment: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  6. Mathematics Connection: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  7. Kenya Veterinarian: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  8. Ergonomics SA: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  9. Critical Arts: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  10. Africa Insight: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  11. Counsellor (The): Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  12. Nigerian Libraries: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  13. African Zoology: Advanced Search

    African Journals Online (AJOL)

    Search tips: Search terms are case-insensitive; Common words are ignored; By default only articles containing all terms in the query are returned (i.e., AND is implied); Combine multiple words with OR to find articles containing either term; e.g., education OR research; Use parentheses to create more complex queries; e.g., ...

  14. NASA Indexing Benchmarks: Evaluating Text Search Engines

    Science.gov (United States)

    Esler, Sandra L.; Nelson, Michael L.

    1997-01-01

    The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner. This study develops criteria for analytically comparing the index and search engines and presents results for a number of freely available search engines. A product of this research is a toolkit capable of automatically indexing, searching, and extracting performance statistics from each of the focused search engines. This toolkit is highly configurable and has the ability to run these benchmark tests against other engines as well. Results demonstrate that the tested search engines can be grouped into two levels. Level one engines are efficient on small to medium sized data collections, but show weaknesses when used for collections 100MB or larger. Level two search engines are recommended for data collections up to and beyond 100MB.

  15. T253. THE CORRELATION ANALYSIS BETWEEN RENAMING SCHIZOPHRENIA AND VISITING FREQUENCY OF MENTAL HEALTH SERVICES BY BIG DATA ANALYSIS (INTERNET SEARCHES AND NEWSPAPER ARTICLES) IN SOUTH KOREA

    Science.gov (United States)

    Lee, Sang Yup; Hong, Kyung Sue; Joo, Yeon Ho; Koike, Shinsuke; Lee, Yu Sang; Kwon, Jun Soo

    2018-01-01

    Background: The Korean Neuropsychiatric Association changed the Korean term for schizophrenia from ‘split-mind disorder’ to ‘attunement disorder’ in 2012, to dispel the stigma associated with the name, and to promote early detection and treatment. Information on the internet affects the public awareness and attitude toward schizophrenia. The main purpose of this study was to investigate the correlation between renaming schizophrenia and the pattern of mental health services utilization by big data analysis of internet (newspaper articles and internet searches) in Korea. Methods: From January 2016 to September 2017, newspaper articles on “attunement disorder” and “split-mind disorder” available on the internet were classified as related with negative images like crime and helpful or positive in dispelling the stigma. The relationship between the number of anti-stigma newspaper articles and newspaper articles of schizophrenia containing both positive and negative images was examined. In addition, using Naver, a major internet search engine in Korea, we investigated the total number of internet searches of both old and new name of schizophrenia by gender differences. Finally, the frequency of the visits of mental health services of patients with schizophrenia was measured using the Korean Healthcare Bigdata Hub (http://opendata.hira.or.kr/home.do#none) for 14 months and the correlation between the frequency of the visits and the above big data was examined. The data were analyzed using the SPSS/WIN 24.0. Pearson correlation coefficients were used to analyze correlations. Results: The amounts of newspaper articles containing anti-stigma of schizophrenia were correlated with the amounts of newspaper articles containing negative images like crime of the new name (attunement disorder) of schizophrenia (r=0.528, p0.05) in next month was larger than the correlation of “split-mind disorder” searches with mental health services utilization (r = 0.082, p>0

  16. The 100 top-cited articles in orthodontics from 1975 to 2011.

    Science.gov (United States)

    Hui, Jifang; Han, Zongkai; Geng, Guannan; Yan, Weijun; Shao, Ping

    2013-05-01

    To identify the 100 top-cited articles published in orthodontics journals and to analyze their characteristics to investigate the achievement and development of orthodontics research in past decades. The Institute for Scientific Information Web of Knowledge Database and the 2011 Journal Citation Report Science Editions were used to retrieve the 100 top-cited articles published in orthodontics journals since 1975. Some basic information was collected by the Analyze Tool on the Web of Science, including citation time, publication title, journal name, publication year, and country and institution of origin. A further study was then performed to determine authorship, article type, field of study, study design, and level of evidence. The 100 target articles were retrieved from three journals: American Journal of Orthodontics and Dentofacial Orthopedics (n = 74), The Angle Orthodontist (n = 15), and European Journal of Orthodontics (n = 11). Since 1975, the articles cited 89 to 545 times mainly originated from the United States, and the overwhelming majority of articles were clinical. The most common study design was case series; 40 articles were classified as level IV and 12 as level V evidence. The 100 top-cited articles in orthodontics are generally old articles, rarely possessing high-level evidence.

  17. Similarity search of business process models

    NARCIS (Netherlands)

    Dumas, M.; García-Bañuelos, L.; Dijkman, R.M.

    2009-01-01

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the

  18. Searching for Significance in Unstructured Data: Text Mining with Leximancer

    Science.gov (United States)

    Thomas, David A.

    2014-01-01

    Scholars in many knowledge domains rely on sophisticated information technologies to search for and retrieve records and publications pertinent to their research interests. But what is a scholar to do when a search identifies hundreds of documents, any of which might be vital or irrelevant to his or her work? The problem is further complicated by…

  19. BEST: Next-Generation Biomedical Entity Search Tool for Knowledge Discovery from Biomedical Literature.

    Directory of Open Access Journals (Sweden)

    Sunwon Lee

    As the volume of publications rapidly increases, searching for relevant information from the literature becomes more challenging. To complement standard search engines such as PubMed, it is desirable to have an advanced search tool that directly returns relevant biomedical entities such as targets, drugs, and mutations rather than a long list of articles. Some existing tools submit a query to PubMed and process retrieved abstracts to extract information at query time, resulting in a slow response time and limited coverage of only a fraction of the PubMed corpus. Other tools preprocess the PubMed corpus to speed up the response time; however, they are not constantly updated, and thus produce outdated results. Further, most existing tools cannot process sophisticated queries such as searches for mutations that co-occur with query terms in the literature. To address these problems, we introduce BEST, a biomedical entity search tool. BEST returns, as a result, a list of 10 different types of biomedical entities including genes, diseases, drugs, targets, transcription factors, miRNAs, and mutations that are relevant to a user's query. To the best of our knowledge, BEST is the only system that processes free text queries and returns up-to-date results in real time including mutation information in the results. BEST is freely accessible at http://best.korea.ac.kr.

  20. W-transform method for feature-oriented multiresolution image retrieval

    Energy Technology Data Exchange (ETDEWEB)

    Kwong, M.K.; Lin, B. [Argonne National Lab., IL (United States). Mathematics and Computer Science Div.

    1995-07-01

    Image database management is important in the development of multimedia technology. Since an enormous amount of digital images is likely to be generated within the next few decades in order to integrate computers, television, VCR, cables, telephone and various imaging devices, effective image indexing and retrieval systems are urgently needed so that images can be easily organized, searched, transmitted, and presented. Here, the authors present a local-feature-oriented image indexing and retrieval method based on Kwong and Tang's W-transform. Multiresolution histogram comparison is an effective method for content-based image indexing and retrieval. However, most recent approaches perform multiresolution analysis for whole images but do not exploit the local features present in the images. Since the W-transform is characterized by its ability to handle images of arbitrary size, with no periodicity assumptions, it provides a natural tool for analyzing local image features and building indexing systems based on such features. In this approach, the histograms of the local features of images are used in the indexing system. The system not only can retrieve images that are similar or identical to the query images but also can retrieve images that contain features specified in the query images, even if the retrieved images as a whole might be very different from the query images. The local-feature-oriented method also provides a speed advantage over the global multiresolution histogram comparison method. The feature-oriented approach is expected to be applicable in managing large-scale image systems such as video databases and medical image databases.

  1. Guenther tulip retrievable filter: why, when and how?

    International Nuclear Information System (INIS)

    Millward, S.F.

    2001-01-01

    Nonpermanent inferior vena cava (IVC) filters can be subdivided into temporary and retrievable filters. Temporary filters are attached to a catheter or guide wire. They have been extensively used in Europe, mainly for the prevention of pulmonary embolism (PE) during thrombolytic treatment for lower extremity deep vein thrombosis (DVT). However, the reported rate of recurrent PE in patients protected with a temporary filter appears to be similar, or even higher, than that seen in North American studies of thrombolytic treatment of DVT when no filter was used. Results obtained with temporary filters in other clinical situations have also been somewhat discouraging. Consequently, research efforts in North America with nonpermanent filters appear to have shifted predominantly to the study of retrievable filters. Retrievable filters, such as the Guenther Tulip filter (William Cook Europe, Bjaeverskov, Denmark) (Fig. 1), are permanent filters with a design feature (usually a hook that can be snared) to permit retrieval. They have an advantage over temporary filters in that they can be either left in place permanently or retrieved, whichever is most appropriate for a given patient. The Guenther Tulip filter is currently the only approved device from this versatile new class of retrievable filters. In this article, I hope to offer some practical points on why, when and how the device should be used. (author)

  2. Concise quantum associative memories with nonlinear search algorithm

    International Nuclear Information System (INIS)

    Tchapet Njafa, J.P.; Nana Engo, S.G.

    2016-01-01

    The model of Quantum Associative Memories (QAM) we propose here consists in simplifying and generalizing that of Rigui Zhou et al. [1] which uses the quantum matrix with the binary decision diagram put forth by David Rosenbaum [2] and the Abrams and Lloyd's nonlinear search algorithm [3]. Our model gives the possibility to retrieve one of the sought states in multi-values retrieving scheme when a measurement is done on the first register in O(c-r) time complexity. It is better than Grover's algorithm and its modified form which need O(√(2^n/m)) steps when they are used as the retrieval algorithm. Here n is the number of qubits of the first register and m the number of x values for which f(x) = 1. As the nonlinearity makes the system highly susceptible to the noise, an analysis of the influence of the single qubit noise channels on the Nonlinear Search Algorithm of our model of QAM shows a fidelity of about 0.7 whatever the number of qubits existing in the first register, thus demonstrating the robustness of our model. (copyright 2016 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)

  3. Retrievable Inferior Vena Cava Filters: Factors that Affect Retrieval Success

    Energy Technology Data Exchange (ETDEWEB)

    Geisbuesch, Philipp, E-mail: philippgeisbuesch@gmx.de; Benenati, James F.; Pena, Constantino S.; Couvillon, Joseph; Powell, Alex; Gandhi, Ripal; Samuels, Shaun; Uthoff, Heiko [Baptist Cardiac and Vascular Institute, Division of Vascular and Interventional Radiology (United States)

    2012-10-15

    Purpose: To report and analyze the indications, procedural success, and complications of retrievable inferior vena cava filters (rIVCF) placement and to identify parameters that influence retrieval attempt and failure. Methods: Between January 2005 and December 2010, a total of 200 patients (80 men, median age 67 years, range 11-95 years) received a rIVCF with the clinical possibility that it could be removed. All patients with rIVCF were prospectively entered into a database and followed until retrieval or a decision not to retrieve the filter was made. A retrospective analysis of this database was performed. Results: Sixty-one percent of patients had an accepted indication for filter placement; 39% of patients had a relative indication. There was a tendency toward a higher retrieval rate in patients with relative indications (40% vs. 55%, P = 0.076). Filter placement was technically successful in all patients, with no procedure-related mortality. The retrieval rate was 53%. Patient age of >80 years (odds ratio [OR] 0.056, P < 0.0001) and presence of malignancy (OR 0.303, P = 0.003) were associated with a significantly reduced probability for attempted retrieval. Retrieval failure occurred in 7% (6 of 91) of all retrieval attempts. A time interval of >90 days between implantation and attempted retrieval was associated with retrieval failure (OR 19.8, P = 0.009). Conclusions: Patient age >80 years and a history of malignancy are predictors of a reduced probability for retrieval attempt. The rate of retrieval failure is low and seems to be associated with a time interval of >90 days between filter placement and retrieval.

  4. Retrievable Inferior Vena Cava Filters: Factors that Affect Retrieval Success

    International Nuclear Information System (INIS)

    Geisbüsch, Philipp; Benenati, James F.; Peña, Constantino S.; Couvillon, Joseph; Powell, Alex; Gandhi, Ripal; Samuels, Shaun; Uthoff, Heiko

    2012-01-01

    Purpose: To report and analyze the indications, procedural success, and complications of retrievable inferior vena cava filters (rIVCF) placement and to identify parameters that influence retrieval attempt and failure. Methods: Between January 2005 and December 2010, a total of 200 patients (80 men, median age 67 years, range 11–95 years) received a rIVCF with the clinical possibility that it could be removed. All patients with rIVCF were prospectively entered into a database and followed until retrieval or a decision not to retrieve the filter was made. A retrospective analysis of this database was performed. Results: Sixty-one percent of patients had an accepted indication for filter placement; 39% of patients had a relative indication. There was a tendency toward a higher retrieval rate in patients with relative indications (40% vs. 55%, P = 0.076). Filter placement was technically successful in all patients, with no procedure-related mortality. The retrieval rate was 53%. Patient age of >80 years (odds ratio [OR] 0.056, P < 0.0001) and presence of malignancy (OR 0.303, P = 0.003) were associated with a significantly reduced probability for attempted retrieval. Retrieval failure occurred in 7% (6 of 91) of all retrieval attempts. A time interval of >90 days between implantation and attempted retrieval was associated with retrieval failure (OR 19.8, P = 0.009). Conclusions: Patient age >80 years and a history of malignancy are predictors of a reduced probability for retrieval attempt. The rate of retrieval failure is low and seems to be associated with a time interval of >90 days between filter placement and retrieval.

  5. Enriching PubMed Related Article Search with Sentence Level Co-citations

    Science.gov (United States)

    Tran, Nam; Alves, Pedro; Ma, Shuangge

    2009-01-01

    PubMed related article links identify closely related articles and enhance our ability to navigate the biomedical literature. They are derived by calculating the word similarity between two articles, relating articles with overlapping word content. In this paper, we propose to enrich PubMed with a new type of related article link based on citations within a single sentence (i.e. sentence level co-citations or SLCs). Using different similarity metrics, we demonstrated that articles linked by SLCs are highly related. We also showed that only half of SLCs are found among PubMed related article links. Additionally, we discuss how the citing sentence of an SLC explains the connection between two articles. PMID:20351935
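
    The notion of a sentence level co-citation can be sketched in a few lines: two references form an SLC when they are cited within the same sentence. The sketch below is an illustration only, not the authors' pipeline; it assumes numeric citation markers of the form [n] and a naive sentence splitter, and the function name is hypothetical.

```python
import re
from itertools import combinations

def sentence_level_cocitations(text):
    """Return the set of reference pairs cited together in one sentence."""
    pairs = set()
    # Assumption: sentences end with ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # Assumption: citations look like [12]; sort so each pair is ordered.
        cited = sorted(set(re.findall(r"\[(\d+)\]", sentence)), key=int)
        pairs.update(combinations(cited, 2))
    return pairs

text = "Method A [1] outperforms method B [2]. Later work [3] differs."
print(sentence_level_cocitations(text))
```

    References 1 and 2 share a sentence and therefore form a pair, while reference 3 is cited alone and contributes none; the citing sentence itself is what would explain the connection between the paired articles.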

  6. Hybrid ontology for semantic information retrieval model using keyword matching indexing system.

    Science.gov (United States)

    Uthayan, K R; Mala, G S Anandha

    2015-01-01

    Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology.

  7. The Electronic Data and Retrieval of the Secret History of the Mongols

    Directory of Open Access Journals (Sweden)

    Di Jiang

    2007-07-01

    This paper discusses the principle of electronic data and retrieval methods for the Secret History of the Mongols, which is a great classical historical work written in the 13th century with Chinese characters transliterated from Mongol. This handwritten work contains rather rich text information, which should be the contents of forming an electronic database. There are in the original book multi-types of information, including layouts, volumes, chapters, characters, interlinear translation, segments, and Chinese translation, each format of which has been approached in detail and divided separately with markers. On the basis of analysis, our project builds up a complete electronic retrieval system for this great book, which resolves the return to the original shape of the archaic handwriting form with three lines representing one content. The sorting methods of the system are also designed according to the original text formats, namely concordance technology, which can print out retrieved objects with their contexts, retrieve with statistical data, and freely browse search.

  8. Proceedings of the ECIR 2012 Workshop on Task-Based and Aggregated Search (TBAS2012)

    DEFF Research Database (Denmark)

    2012-01-01

    Task-based search aims to understand the user's current task and desired outcomes, and how this may provide useful context for the Information Retrieval (IR) process. An example of task-based search is situations where additional user information on e.g. the purpose of the search or what the user...

  9. Considerations for the development of task-based search engines

    DEFF Research Database (Denmark)

    Petcu, Paula; Dragusin, Radu

    2013-01-01

    Based on previous experience from working on a task-based search engine, we present a list of suggestions and ideas for an Information Retrieval (IR) framework that could inform the development of next generation professional search systems. The specific task that we start from is the clinicians' information need in finding rare disease diagnostic hypotheses at the time and place where medical decisions are made. Our experience from the development of a search engine focused on supporting clinicians in completing this task has provided us valuable insights in what aspects should be considered by the developers of vertical search engines.

  10. "Free full text articles": where to search for them?

    Science.gov (United States)

    Singh, Ashish; Singh, Manish; Singh, Ajai Kumar; Singh, Deepti; Singh, Pratibha; Sharma, Abhishek

    2011-07-01

    References form the backbone of any medical literature. Presently, because of high inflation, it is very difficult for any library/organization/college to purchase all journals. The condition is even worse for an individual person, such as private practitioners. The solution lies in the free availability of full-text articles. Here, the authors share their experiences about the accessibility of free full-text articles.

  11. Search and the Aging Mind: The Promise and Limits of the Cognitive Control Hypothesis of Age Differences in Search.

    Science.gov (United States)

    Mata, Rui; von Helversen, Bettina

    2015-07-01

    Search is a prerequisite for successful performance in a broad range of tasks ranging from making decisions between consumer goods to memory retrieval. How does aging impact search processes in such disparate situations? Aging is associated with structural and neuromodulatory brain changes that underlie cognitive control processes, which in turn have been proposed as a domain-general mechanism controlling search in external environments as well as memory. We review the aging literature to evaluate the cognitive control hypothesis that suggests that age-related change in cognitive control underlies age differences in both external and internal search. We also consider the limits of the cognitive control hypothesis and propose additional mechanisms such as changes in strategy use and affect that may be necessary to understand how aging affects search. Copyright © 2015 Cognitive Science Society, Inc.

  12. Functional-anatomic study of episodic retrieval using fMRI. I. Retrieval effort versus retrieval success.

    Science.gov (United States)

    Buckner, R L; Koutstaal, W; Schacter, D L; Wagner, A D; Rosen, B R

    1998-04-01

    A number of recent functional imaging studies have identified brain areas activated during tasks involving episodic memory retrieval. The identification of such areas provides a foundation for targeted hypotheses regarding the more specific contributions that these areas make to episodic retrieval. As a beginning effort toward such an endeavor, whole-brain functional magnetic resonance imaging (fMRI) was used to examine 14 subjects during episodic word recognition in a block-designed fMRI experiment. Study conditions were manipulated by presenting either shallow or deep encoding tasks. This manipulation yielded two recognition conditions that differed with regard to retrieval effort and retrieval success: shallow encoding yielded low levels of recognition success with high levels of retrieval effort, and deep encoding yielded high levels of recognition success with low levels of effort. Many brain areas were activated in common by these two recognition conditions compared to a low-level fixation condition, including left and right prefrontal regions often detected during PET episodic retrieval paradigms (e.g., R. L. Buckner et al., 1996, J. Neurosci. 16, 6219-6235) thereby generalizing these findings to fMRI. Characterization of the activated regions in relation to the separate recognition conditions showed (1) bilateral anterior insular regions and a left dorsal prefrontal region were more active after shallow encoding, when retrieval demanded greatest effort, and (2) right anterior prefrontal cortex, which has been implicated in episodic retrieval, was most active during successful retrieval after deep encoding. We discuss these findings in relation to component processes involved in episodic retrieval and in the context of a companion study using event-related fMRI.

  13. Comparison of the effectiveness of alternative feature sets in shape retrieval of multicomponent images

    Science.gov (United States)

    Eakins, John P.; Edwards, Jonathan D.; Riley, K. Jonathan; Rosin, Paul L.

    2001-01-01

    Many different kinds of features have been used as the basis for shape retrieval from image databases. This paper investigates the relative effectiveness of several types of global shape feature, both singly and in combination. The features compared include well-established descriptors such as Fourier coefficients and moment invariants, as well as recently-proposed measures of triangularity and ellipticity. Experiments were conducted within the framework of the ARTISAN shape retrieval system, and retrieval effectiveness assessed on a database of over 10,000 images, using 24 queries and associated ground truth supplied by the UK Patent Office. Our experiments revealed only minor differences in retrieval effectiveness between different measures, suggesting that a wide variety of shape feature combinations can provide adequate discriminating power for effective shape retrieval in multi-component image collections such as trademark registries. Marked differences between measures were observed for some individual queries, suggesting that there could be considerable scope for improving retrieval effectiveness by providing users with an improved framework for searching multi-dimensional feature space.

  14. Optimization of search algorithms for a mass spectra library

    International Nuclear Information System (INIS)

    Domokos, L.; Henneberg, D.; Weimann, B.

    1983-01-01

    The SISCOM mass spectra library search is mainly an interpretative system producing a "hit list" of similar spectra based on six comparison factors. This paper deals with extension of the system; the aim is exact identification (retrieval) of those reference spectra in the SISCOM hit list that correspond to the unknown compounds or components of the mixture. Thus, instead of a similarity measure, a decision (retrieval) function is needed to establish the identity of reference and unknown compounds by comparison of their spectra. To facilitate estimation of the weightings of the different variables in the retrieval function, pattern recognition algorithms were applied. Numerous statistical evaluations of three different library collections were made to check the quality of data bases and to derive appropriate variables for the retrieval function. (Auth.)

  15. Understanding human quality judgment in assessing online forum contents for thread retrieval purpose

    Science.gov (United States)

    Ismail, Zuriati; Salim, Naomie; Huspi, Sharin Hazlin

    2017-10-01

    Unlike traditional materials or journals, user-generated content is not peer-reviewed. The lack of quality control and the explosive growth of web content make the task of finding quality information on the web especially critical. The existence of new facilities for producing web content, such as forums, makes this issue more significant. This study focuses on online forum threads or discussions, where the forums contain valuable human-generated information in the form of discussions. Due to the unique structure of online forum pages, special techniques are required to organize and search for information in these forums. Quality-biased retrieval is a retrieval approach that searches for relevant documents and prioritizes higher-quality documents. Despite major concern about content quality and the recent development of quality-biased retrieval, there is an urgent need to understand how quality content is judged, for retrieval and performance evaluation purposes. Furthermore, even though there are various studies on the quality of information, no standard framework has been established. The primary aim of this paper is to contribute to the understanding of human quality judgment in assessing online forum contents. The foundation of this study is a comparison and evaluation of different frameworks (for quality-biased retrieval and information quality). This led to the finding that many quality dimensions are redundant and that some dimensions are understood differently in different studies. We conducted a survey of a crowdsourcing community to measure the importance of each quality dimension found in the various frameworks. Accuracy and ease of understanding are among the most important dimensions, while thread popularity and content manipulability are among the least important. This finding is beneficial in evaluating the contents of online forums.

  16. The use of categorization information in language models for question retrieval

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Cui, Bin

    2009-01-01

    Community Question Answering (CQA) has emerged as a popular type of service meeting a wide range of information needs. Such services enable users to ask and answer questions and to access existing question-answer pairs. CQA archives contain very large volumes of valuable user-generated content...... and have become important information resources on the Web. To make the body of knowledge accumulated in CQA archives accessible, effective and efficient question search is required. Question search in a CQA archive aims to retrieve historical questions that are relevant to new questions posed by users...

  17. Searching the PASCAL database - A user's perspective

    Science.gov (United States)

    Jack, Robert F.

    1989-01-01

    The operation of PASCAL, a bibliographic data base covering broad subject areas in science and technology, is discussed. The data base includes information from about 1973 to the present, including topics in engineering, chemistry, physics, earth science, environmental science, biology, psychology, and medicine. Data from 1986 to the present may be searched using DIALOG. The procedures and classification codes for searching PASCAL are presented. Examples of citations retrieved from the data base are given and suggestions are made concerning when to use PASCAL.

  18. Research Article Special Issue

    African Journals Online (AJOL)

    pc

    2018-03-07

    Mar 7, 2018 ... where the operations of all sorts of searches and sorting are the most complex ones. Existing ... The object of this article development and research is an associative ..... ARPN Journal of Engineering and Applied Sciences.

  19. Management protocols for status epilepticus in the pediatric emergency room: systematic review article.

    Science.gov (United States)

    Au, Cheuk C; Branco, Ricardo G; Tasker, Robert C

    This systematic review of national or regional guidelines published in English aimed to better understand variance in pre-hospital and emergency department treatment of status epilepticus. Systematic search of national or regional guidelines (January 2000 to February 2017) contained within PubMed and Google Scholar databases, and article reference lists. The search keywords were status epilepticus, prolonged seizure, treatment, and guideline. 356 articles were retrieved and 13 were selected according to the inclusion criteria. In all six pre-hospital guidelines, the preferred route of medication administration was to use alternatives to the intravenous route: all recommended buccal and intranasal midazolam; three also recommended intramuscular midazolam, and five recommended using rectal diazepam. All 11 emergency department guidelines described three phases in therapy. Intravenous medication, by phase, was indicated as such: initial phase - ten/11 guidelines recommended lorazepam, and eight/11 recommended diazepam; second phase - most (ten/11) guidelines recommended phenytoin, but other options were phenobarbital (nine/11), valproic acid (six/11), and either fosphenytoin or levetiracetam (each four/11); third phase - four/11 guidelines included the choice of repeating second phase therapy, whereas the other guidelines recommended using a variety of intravenous anesthetic agents (thiopental, midazolam, propofol, and pentobarbital). All of the guidelines share a similar framework for management of status epilepticus. The choice in route of administration and drug type varied across guidelines. Hence, the adoption of a particular guideline should take account of local practice options in health service delivery. Copyright © 2017 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.

  20. Management protocols for status epilepticus in the pediatric emergency room: systematic review article

    Directory of Open Access Journals (Sweden)

    Cheuk C. Au

    Full Text Available Abstract Objective: This systematic review of national or regional guidelines published in English aimed to better understand variance in pre-hospital and emergency department treatment of status epilepticus. Sources: Systematic search of national or regional guidelines (January 2000 to February 2017) contained within PubMed and Google Scholar databases, and article reference lists. The search keywords were status epilepticus, prolonged seizure, treatment, and guideline. Summary of findings: 356 articles were retrieved and 13 were selected according to the inclusion criteria. In all six pre-hospital guidelines, the preferred route of medication administration was to use alternatives to the intravenous route: all recommended buccal and intranasal midazolam; three also recommended intramuscular midazolam, and five recommended using rectal diazepam. All 11 emergency department guidelines described three phases in therapy. Intravenous medication, by phase, was indicated as such: initial phase - ten/11 guidelines recommended lorazepam, and eight/11 recommended diazepam; second phase - most (ten/11) guidelines recommended phenytoin, but other options were phenobarbital (nine/11), valproic acid (six/11), and either fosphenytoin or levetiracetam (each four/11); third phase - four/11 guidelines included the choice of repeating second phase therapy, whereas the other guidelines recommended using a variety of intravenous anesthetic agents (thiopental, midazolam, propofol, and pentobarbital). Conclusions: All of the guidelines share a similar framework for management of status epilepticus. The choice in route of administration and drug type varied across guidelines. Hence, the adoption of a particular guideline should take account of local practice options in health service delivery.

  1. Document retrieval on repetitive string collections.

    Science.gov (United States)

    Gagie, Travis; Hartikainen, Aleksi; Karhu, Kalle; Kärkkäinen, Juha; Navarro, Gonzalo; Puglisi, Simon J; Sirén, Jouni

    2017-01-01

    Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-k document retrieval (find the k documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple [Formula: see text] model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.
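
    The three query types named in this abstract can be illustrated with a naive, index-free sketch. This is not the compressed-index method of the paper (interleaved LCPs and precomputed document lists are not reproduced here); it only pins down what each query returns by scanning every document. The function names and the tie-breaking rule (lower document index first) are assumptions made for illustration.

```python
def count_occurrences(text, pattern):
    """Count possibly overlapping occurrences of pattern in text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def document_listing(docs, pattern):
    """Document listing: indices of all documents where pattern appears."""
    return [i for i, doc in enumerate(docs) if count_occurrences(doc, pattern) > 0]


def document_counting(docs, pattern):
    """Document counting: number of documents where pattern appears."""
    return len(document_listing(docs, pattern))


def top_k_documents(docs, pattern, k):
    """Top-k retrieval: the k documents where pattern appears most often.

    Ties are broken by document index (an illustrative assumption).
    """
    scored = [(count_occurrences(doc, pattern), i) for i, doc in enumerate(docs)]
    scored = [(c, i) for c, i in scored if c > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]
```

    A linear scan like this costs time proportional to the total collection size per query, which is the cost that an index over the collection is meant to avoid.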

  2. SIFT Meets CNN: A Decade Survey of Instance Retrieval.

    Science.gov (United States)

    Zheng, Liang; Yang, Yi; Tian, Qi

    2018-05-01

    In the early days, content-based image retrieval (CBIR) was studied with global features. Since 2003, image retrieval based on local descriptors (de facto SIFT) has been extensively studied for over a decade due to the advantage of SIFT in dealing with image transformations. Recently, image representations based on the convolutional neural network (CNN) have attracted increasing interest in the community and demonstrated impressive performance. Given this time of rapid evolution, this article provides a comprehensive survey of instance retrieval over the last decade. Two broad categories, SIFT-based and CNN-based methods, are presented. For the former, according to the codebook size, we organize the literature into using large/medium-sized/small codebooks. For the latter, we discuss three lines of methods, i.e., using pre-trained or fine-tuned CNN models, and hybrid methods. The first two perform a single-pass of an image to the network, while the last category employs a patch-based feature extraction scheme. This survey presents milestones in modern instance retrieval, reviews a broad selection of previous works in different categories, and provides insights on the connection between SIFT and CNN-based methods. After analyzing and comparing retrieval performance of different categories on several datasets, we discuss promising directions towards generic and specialized instance retrieval.

  3. Searching LOGIN, the Local Government Information Network.

    Science.gov (United States)

    Jack, Robert F.

    1984-01-01

    Describes a computer-based information retrieval and electronic messaging system produced by Control Data Corporation now being used by government agencies and other organizations. Background of Local Government Information Network (LOGIN), database structure, types of LOGIN units, searching LOGIN (intersect, display, and list commands), and how…

  4. Categorization and Searching of Color Images Using Mean Shift Algorithm

    Directory of Open Access Journals (Sweden)

    Prakash PANDEY

    2009-07-01

    Full Text Available Image searching is still a challenging problem in content-based image retrieval (CBIR) systems. Most CBIR systems operate on all images without pre-sorting them. The image search result contains many unrelated images. The aim of this research is to propose a new object-based indexing system based on extracting a salient region representative from the image, categorizing the image into different types, and searching for images that are similar to given query images. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Dominant objects are obtained by performing region grouping of segmented thumbnails. The category for an image is generated automatically by analyzing the image for the presence of a dominant object. The images in the database are clustered based on region feature similarity using Euclidean distance. Placing an image into a category can help the user to navigate retrieval results more effectively. Extensive experimental results illustrate excellent performance.
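
    As a rough illustration of the clustering step mentioned in this abstract, the sketch below runs a generic flat-kernel mean shift over RGB color vectors using Euclidean distance. The abstract does not state the kernel, bandwidth, or feature space the authors used, so the bandwidth value, the merge threshold of half the bandwidth, and the function names are all assumptions.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-3):
    """Shift each point to the mean of its neighbours until it stops moving.

    Uses a flat kernel: every point within `bandwidth` (Euclidean distance)
    of the current position contributes equally to the mean.
    """
    modes = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            neighbours = [q for q in points if math.dist(x, q) <= bandwidth]
            if not neighbours:  # defensive guard against rounding effects
                break
            new_x = [sum(c) / len(neighbours) for c in zip(*neighbours)]
            shift = math.dist(x, new_x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)
    return modes


def group_modes(modes, bandwidth):
    """Merge modes closer than bandwidth / 2 and label each input point."""
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical pixel colors: three reddish and three bluish samples.
pixels = [(250, 10, 10), (245, 5, 12), (255, 0, 8),
          (10, 10, 250), (5, 12, 245), (0, 8, 255)]
centers, labels = group_modes(mean_shift(pixels, bandwidth=30), bandwidth=30)
```

    Unlike k-means, mean shift does not need the number of clusters in advance; the bandwidth alone controls how many color modes emerge.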

  5. Audiovisual Narrative Creation and Creative Retrieval: How Searching for a Story Shapes the Story

    NARCIS (Netherlands)

    Sauer, Sabrina

    2017-01-01

    Media professionals – such as news editors, image researchers, and documentary filmmakers - increasingly rely on online access to digital content within audiovisual archives to create narratives. Retrieving audiovisual sources therefore requires an in-depth knowledge of how to find sources

  6. The Role of the Medical Students’ Emotional Mood in Information Retrieval from the Web

    Directory of Open Access Journals (Sweden)

    Marzieh Yari Zanganeh

    2018-04-01

    Full Text Available Background: Online information retrieval is a process the result of which is influenced by the changes in the emotional moods of the user. It seems reasonable to include emotional aspects in developing information retrieval systems in order to optimize the experience of the users. Therefore, this study aimed to identify the role of positive and negative affects in the information seeking process on the web among students of medical sciences. Methods: From the methodological perspective, the present study was an experimental and applied research. According to the nature of the experimental method, observation and questionnaire were used. The participants were the students of various fields of Medical Sciences. The research sample included 50 students of Shiraz University of Medical Sciences selected through purposeful sampling method; they regularly used the World Wide Web and the Google engine for information retrieval in educational, research, personal, or managerial activities. In order to collect the data, search tasks were characterized by the topic, sequence in a search process, difficulty level, and searcher’s interest (simple in a task). Face and content validity of the questionnaire were confirmed by the experts. Reliability of the questionnaire was tested by Cronbach’s alpha. Cronbach’s alpha coefficient (PA=0.777, NA=0.754) showed a high rate of reliability in a PANAS questionnaire. The collected data were analyzed using SPSS, version 20.0; also, to test the research hypothesis, t-test and paired-samples t-test were used. The P0.05. Conclusion: Information retrieval systems in the Web should identify positive and negative affects in the information seeking process in a set of perceiving signs in human interaction with the computer. The automatic identification of the users’ affect opens new dimensions into users moderators and information retrieval systems for successful retrieval from the Web.

  7. Designing and Implementing a Cross-Language Information Retrieval System Using Linguistic Corpora

    Directory of Open Access Journals (Sweden)

    Amin Nezarat

    2012-03-01

    Full Text Available Information retrieval (IR) is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval (CLIR) refers to a kind of information retrieval in which the language of the query and that of the searched document are different. In fact, it is a retrieval process where the user presents queries in one language to retrieve documents in another language. This paper tried to construct a bilingual lexicon of parallel chunks of English and Persian from two very large monolingual corpora an English-Persian parallel corpus which could be directly applied to cross-language information retrieval tasks. For this purpose, a statistical measure known as Association Score (AS) was used to compute the association value between every two corresponding chunks in the corpus using a couple of complicated algorithms. Once the CLIR system was developed using this bilingual lexicon, an experiment was performed on a set of one hundred English and Persian phrases and collocations to see to what extent this system was effective in assisting the users to find the most relevant and suitable equivalents of their queries in either language.

  8. JANE, A new information retrieval system for the Radiation Shielding Information Center

    International Nuclear Information System (INIS)

    Trubey, D.K.

    1991-05-01

    A new information storage and retrieval system has been developed for the Radiation Shielding Information Center (RSIC) at Oak Ridge National Laboratory to replace mainframe systems that have become obsolete. The database contains citations and abstracts of literature which were selected by RSIC analysts and indexed with terms from a controlled vocabulary. The database, begun in 1963, has been maintained continuously since that time. The new system, called JANE, incorporates automatic indexing techniques and on-line retrieval using the RSIC Data General Eclipse MV/4000 minicomputer. Automatic indexing and retrieval techniques based on fuzzy-set theory allow the presentation of results in order of Retrieval Status Value. The fuzzy-set membership function depends on term frequency in the titles and abstracts and on Term Discrimination Values which indicate the resolving power of the individual terms. These values are determined by the Cover Coefficient method. The use of a commercial database to store and retrieve the indexing information permits rapid retrieval of the stored documents. Comparisons of the new and presently-used systems for actual searches of the literature indicate that it is practical to replace the mainframe systems with a minicomputer system similar to the present version of JANE. 18 refs., 10 figs

  9. Large-scale retrieval for medical image analytics: A comprehensive review.

    Science.gov (United States)

    Li, Zhongyu; Zhang, Xiaofan; Müller, Henning; Zhang, Shaoting

    2018-01-01

    Over the past decades, medical image analytics was greatly facilitated by the explosion of digital imaging techniques, where huge amounts of medical images were produced with ever-increasing quality and diversity. However, conventional methods for analyzing medical images have achieved limited success, as they are not capable of tackling the huge amount of image data. In this paper, we review state-of-the-art approaches for large-scale medical image analysis, which are mainly based on recent advances in computer vision, machine learning and information retrieval. Specifically, we first present the general pipeline of large-scale retrieval and summarize the challenges/opportunities of medical image analytics on a large scale. Then, we provide a comprehensive review of algorithms and techniques relevant to major processes in the pipeline, including feature representation, feature indexing, searching, etc. On the basis of existing work, we introduce the evaluation protocols and multiple applications of large-scale medical image retrieval, with a variety of exploratory and diagnostic scenarios. Finally, we discuss future directions of large-scale retrieval, which can further improve the performance of medical image analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Encoding specificity manipulations do affect retrieval from memory.

    Science.gov (United States)

    Zeelenberg, René

    2005-05-01

    In a recent article, P.A. Higham (2002) [Strong cues are not necessarily weak: Thomson and Tulving (1970) and the encoding specificity principle revisited. Memory & Cognition, 30, 67-80] proposed a new way to analyze cued recall performance in terms of three separable aspects of memory (retrieval, monitoring, and report bias) by comparing performance under both free-report and forced-report instructions. He used this method to derive estimates of these aspects of memory in an encoding specificity experiment similar to that reported by D.M. Thomson and E. Tulving (1970) [Associative encoding and retrieval: weak and strong cues. Journal of Experimental Psychology, 86, 255-262]. Under forced-report instructions, the encoding specificity manipulation did not affect performance. Higham concluded that the manipulation affected monitoring and report bias, but not retrieval. I argue that this interpretation of the results is problematic because the Thomson and Tulving paradigm is confounded, and show in three experiments using a more appropriate design that encoding specificity manipulations do affect performance in forced-report cued recall. Because in Higham's framework forced-report performance provides a measure of retrieval that is uncontaminated by monitoring and report bias, it is concluded that encoding specificity manipulations do affect retrieval from memory.

  11. Differential Neural Activity during Search of Specific and General Autobiographical Memories elicited by Musical Cues

    OpenAIRE

    Ford, Jaclyn Hennessey; Addis, Donna Rose; Giovanello, Kelly S.

    2011-01-01

    Previous neuroimaging studies that have examined autobiographical memory specificity have utilized retrieval cues associated with prior searches of the event, potentially changing the retrieval processes being investigated. In the current study, musical cues were used to naturally elicit memories from multiple levels of specificity (i.e., lifetime period, general event, and event-specific). Sixteen young adults participated in a neuroimaging study in which they retrieved autobiographical memo...

  12. Using a Google Search Appliance (GSA) to search digital library collections: a case study of the INIS Collection Search

    Directory of Open Access Journals (Sweden)

    Dobrica Savic

    2014-05-01

    The International Nuclear Information System (INIS) hosts one of the world’s largest collections of published information on the peaceful uses of nuclear science and technology. It offers online access to a unique collection of 3.6 million bibliographic records and 320,000 full-texts of non-conventional (grey) literature. This large digital library collection suffered from most of the well-known shortcomings of the classic library catalogue. Searching was complex and complicated, required some training in using Boolean logic, full-text searching was not an option, and the response time was slow. An opportune moment came with the retirement of the previous catalogue software and with the adoption of Google Search Appliance (GSA) as an organization-wide search engine standard. INIS was quick to realize a great potential in using such a well-known application as a replacement for its online catalogue and this paper presents the advantages and disadvantages encountered during three years of GSA use. Based on specific INIS-based practice and experience, this paper also offers some guidelines on ways to improve classic collections of millions of bibliographic and full-text documents, while achieving multiple benefits such as increased use, accessibility, usability, expandability and improving the user search and retrieval experience.

  13. An Improved Botanical Search Application for Middle- and High-School Students

    Science.gov (United States)

    Kajiyama, Tomoko

    2016-01-01

    A previously reported botanical data retrieval application has been improved to make it better suited for use in middle- and high-school science classes. This search interface is ring-structured and treats multi-faceted metadata intuitively, enabling students not only to search for plant names but also to learn about the morphological features and…

  14. Raising Reliability of Web Search Tool Research through Replication and Chaos Theory

    OpenAIRE

    Nicholson, Scott

    1999-01-01

    Because the World Wide Web is a dynamic collection of information, the Web search tools (or "search engines") that index the Web are dynamic. Traditional information retrieval evaluation techniques may not provide reliable results when applied to the Web search tools. This study is the result of ten replications of the classic 1996 Ding and Marchionini Web search tool research. It explores the effects that replication can have on transforming unreliable results from one iteration into replica...

  15. DRUMS: a human disease related unique gene mutation search engine.

    Science.gov (United States)

    Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

    2011-10-01

    With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data by a search engine approach. In this work, we have developed disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutation as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from LSDB, or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

  16. [Biomedical information on the internet using search engines. A one-year trial].

    Science.gov (United States)

    Corrao, Salvatore; Leone, Francesco; Arnone, Sabrina

    2004-01-01

    The internet is a communication medium and content distributor that provides information in the general sense, but it could also be of great utility for the search and retrieval of biomedical information. Search engines are a valuable means of rapidly finding information on the net. However, we do not know whether general search engines and meta-search engines are reliable for finding useful and validated biomedical information. The aim of our study was to verify the reproducibility of a search by keywords (pediatric or evidence) using 9 international search engines and 1 meta-search engine at baseline and after a one-year period. We analysed the first 20 citations returned by each search. We evaluated the formal quality of Web sites and their domain extensions. Moreover, we compared the output of each search at the start of this study and after a one-year period, and we took the number of Web sites cited again as a criterion of reliability. We found some interesting results that are reported throughout the text. Our findings point to an extreme dynamicity of the information on the Web and, for this reason, we advise great caution when using search and meta-search engines as a tool for searching and retrieving reliable biomedical information. On the other hand, some search and meta-search engines could be very useful as a first step in better defining a search and, moreover, for finding institutional Web sites. This paper promotes a more conscious approach to the universe of biomedical information on the internet.
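
    The reliability criterion described in this abstract (how many of the first 20 Web sites are cited again one year later) can be sketched as a simple overlap measure. The abstract does not give the exact formula, so expressing the count as a fraction of the baseline list is an assumption, as are the function name and the example URLs.

```python
def recurrence_rate(baseline, followup, top_n=20):
    """Fraction of the first `top_n` baseline results that reappear
    among the first `top_n` follow-up results.

    Duplicate entries in the baseline list are counted once.
    Returns 0.0 for an empty baseline.
    """
    base = list(dict.fromkeys(baseline[:top_n]))  # de-duplicate, keep order
    if not base:
        return 0.0
    later = set(followup[:top_n])
    repeated = sum(1 for site in base if site in later)
    return repeated / len(base)


# Hypothetical result lists for one search engine, one year apart.
year_0 = ["site-a.org", "site-b.edu", "site-c.com", "site-d.gov"]
year_1 = ["site-b.edu", "site-d.gov", "site-e.com", "site-f.org"]
rate = recurrence_rate(year_0, year_1)
```

    A value near 1 would indicate a stable result list; a value near 0 corresponds to the strong dynamicity the authors report.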

  17. Metadata Creation, Management and Search System for your Scientific Data

    Science.gov (United States)

    Devarakonda, R.; Palanisamy, G.

    2012-12-01

    Mercury Search Systems is a set of tools for creating, searching, and retrieving biogeochemical metadata. The Mercury toolset provides orders of magnitude improvements in search speed, support for any metadata format, integration with Google Maps for spatial queries, multi-faceted search, search suggestions, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. Mercury's metadata editor provides an easy way of creating metadata, and Mercury's search interface provides a single portal to search for data and information contained in disparate data management systems, each of which may use any metadata format including FGDC, ISO-19115, Dublin-Core, Darwin-Core, DIF, ECHO, and EML. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury is being used by more than 14 different projects across 4 federal agencies. It was originally developed for NASA, with continuing development funded by NASA, USGS, and DOE for a consortium of projects. Mercury search won NASA's Earth Science Data Systems Software Reuse Award in 2008. References: R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics, DOI: 10.1007/s12145-010-0073-0 (2010).

  18. Region-Based Color Image Indexing and Retrieval

    DEFF Research Database (Denmark)

    Kompatsiaris, Ioannis; Triantafyllou, Evangelia; Strintzis, Michael G.

    2001-01-01

    In this paper a region-based color image indexing and retrieval algorithm is presented. As a basis for the indexing, a novel K-Means segmentation algorithm is used, modified so as to take into account the coherence of the regions. A new color distance is also defined for this algorithm. Based on ....... Experimental results demonstrate the performance of the algorithm. The development of an intelligent image content-based search engine for the World Wide Web is also presented, as a direct application of the presented algorithm....
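
    For orientation, the sketch below shows plain K-means clustering of pixel colors, the standard algorithm that the abstract's segmentation method starts from. It is not the authors' method: the coherence-aware modification and the new color distance are not specified in this abstract, so squared Euclidean distance in RGB is used as an assumption, and the function names and sample pixels are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iterations=10):
    """Plain K-means over RGB tuples; returns (centroids, labels).

    Assumes len(pixels) >= k. Initial centroids are k pixels taken at evenly
    spaced positions in the list, which keeps the result deterministic.
    """
    step = max(1, len(pixels) // k)
    centroids = [tuple(float(c) for c in pixels[i * step]) for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in pixels]
        # Update step: each centroid moves to the mean color of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(ch) / len(members)
                                     for ch in zip(*members))
    return centroids, labels


# Four hypothetical pixels: two reddish, two bluish.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 15, 250)]
centroids, labels = kmeans_colors(pixels, k=2)
```

    With this input the two reddish pixels share one label and the two bluish pixels share the other. A region-based method would additionally need spatial information so that clusters form connected regions, which this color-only sketch ignores.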

  19. Hooked on Music Information Retrieval

    Directory of Open Access Journals (Sweden)

    W. Bas de Haas

    2011-04-01

    This article provides a reply to 'Lure(d) into listening: The potential of cognition-based music information retrieval,' in which Henkjan Honing discusses the potential impact of his proposed Listen, Lure & Locate project on Music Information Retrieval (MIR). Honing presents some critical remarks on data-oriented approaches in MIR, which we endorse. To place these remarks in context, we first give a brief overview of the state of the art of MIR research. Then we present a series of arguments that show why purely data-oriented approaches are unlikely to take MIR research and applications to a more advanced level. Next, we propose our view on MIR research, in which the modelling of musical knowledge has a central role. Finally, we elaborate on the ideas in Honing's paper from an MIR perspective and propose some additions to the Listen, Lure & Locate project.

  20. Usefulness of systematic review search strategies in finding child health systematic reviews in MEDLINE

    NARCIS (Netherlands)

    Boluyt, Nicole; Tjosvold, Lisa; Lefebvre, Carol; Klassen, Terry P.; Offringa, Martin

    2008-01-01

    OBJECTIVE: To determine the sensitivity and precision of existing search strategies for retrieving child health systematic reviews in MEDLINE using PubMed. DESIGN: Filter (diagnostic) accuracy study. We identified existing search strategies for systematic reviews, combined them with a filter that