web-based search engine: Topics by WorldWideScience.org

Sample records for web-based search engine

Web Search Engines

OpenAIRE

Rajashekar, TB

1998-01-01

The World Wide Web is emerging as an all-in-one information source. Tools for searching Web-based information include search engines, subject directories and meta search tools. We take a look at key features of these tools and suggest practical hints for effective Web searching.
A World Wide Web Region-Based Image Search Engine

DEFF Research Database (Denmark)

Kompatsiaris, Ioannis; Triantafyllou, Evangelia; Strintzis, Michael G.

2001-01-01

In this paper the development of an intelligent image content-based search engine for the World Wide Web is presented. This system will offer a new form of media representation and access to content available on the WWW. Information Web Crawlers continuously traverse the Internet and collect images...
Knowledge-based personalized search engine for the Web-based Human Musculoskeletal System Resources (HMSR) in biomechanics.

Science.gov (United States)

Dao, Tien Tuan; Hoang, Tuan Nha; Ta, Xuan Hien; Tho, Marie Christine Ho Ba

2013-02-01

Human musculoskeletal system resources are valuable for learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot respond to the need for useful, accurate, reliable and good-quality human musculoskeletal resources related to medical processes, pathological knowledge and practical expertise. In the present work, an advanced knowledge-based personalized search engine was developed. Our search engine was based on a client-server multi-layer multi-agent architecture and the principle of semantic web services to dynamically acquire accurate and reliable HMSR information through a semantic processing and visualization approach. A security-enhanced mechanism was applied to protect the medical information. A multi-agent crawler was implemented to develop a content-based database of HMSR information. A new semantic-based PageRank score with related mathematical formulas was also defined and implemented. As a result, semantic web service descriptions were presented in OWL, WSDL and OWL-S formats. Operational scenarios with related web-based interfaces for personal computers and mobile devices were presented and analyzed. Functional comparison between our knowledge-based search engine, a conventional search engine and a semantic search engine showed the originality and the robustness of our knowledge-based personalized search engine. In fact, our knowledge-based personalized search engine allows different users, such as orthopedic patients and experts, healthcare system managers or medical students, to remotely access useful, accurate, reliable and good-quality HMSR information for their learning and medical purposes. Copyright © 2012 Elsevier Inc. All rights reserved.
Assessment and Comparison of Search capabilities of Web-based Meta-Search Engines: A Checklist Approach

Directory of Open Access Journals (Sweden)

Alireza Isfandiyari Moghadam

2010-03-01

The present investigation concerns the evaluation, comparison and analysis of search options existing within web-based meta-search engines. 64 meta-search engines were identified. 19 meta-search engines that were free, accessible and compatible with the objectives of the present study were selected. An author-constructed checklist was used for data collection. Findings indicated that all meta-search engines studied used the AND operator, phrase search, a setting for the number of results displayed, previous search query storage and help tutorials. Nevertheless, none of them offered any search options for hypertext searching or for displaying the size of the pages searched. 94.7% supported features such as truncation, keywords in title and URL search and text summary display. The checklist used in the study could serve as a model for investigating search options in search engines, digital libraries and other internet search tools.
Research on the optimization strategy of web search engine based on data mining

Science.gov (United States)

Chen, Ronghua

2018-04-01

With the wide application of search engines, web site information has become an important way for people to obtain information. However, the volume of web site information is growing explosively, which makes it very difficult for people to find what they need, and current search engines cannot meet this need. There is therefore an urgent need for personalized website information services, and data mining technology offers a way to meet this new challenge. To improve the accuracy with which people find information on websites, a website search engine optimization strategy based on data mining is proposed and verified by a website search engine optimization experiment. The results show that the proposed strategy improves the accuracy with which people find information and reduces the time they need to find it. It has important practical value.
A unified architecture for biomedical search engines based on semantic web technologies.

Science.gov (United States)

Jalali, Vahid; Matash Borujerdi, Mohammad Reza

2011-04-01

There has been huge growth in the volume of published biomedical research in recent years. Many medical search engines have been designed and developed to address the ever-growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of a common architecture for the utilized ontologies and the overall retrieval process hampers the evaluation of different search engines, and interoperability between them, under unified conditions. In this paper, a unified architecture for medical search engines is introduced. The proposed model contains standard schemas declared in semantic web languages for the ontologies and documents used by search engines. Unified models for the annotation and retrieval processes are other parts of the introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture. The search engine is evaluated using two test collections, and results are reported in terms of precision vs. recall and mean average precision for the different approaches used by this search engine.
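The evaluation measures named in this abstract (precision, recall and mean average precision) can be illustrated with a minimal Python sketch. The function names and the toy document ids below are assumptions made for illustration; they are not taken from the paper, which does not describe its evaluation code.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision and recall computed over the top-k ranked document ids."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of average_precision over (ranking, relevant set) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy ranking for one query; the ids are placeholders.
    ranking = ["d1", "d2", "d3", "d4"]
    relevant = {"d1", "d3"}
    print(precision_recall_at_k(ranking, relevant, k=2))
    print(average_precision(ranking, relevant))
```

Average precision rewards rankings that place relevant documents early, which is why it is commonly reported alongside precision vs. recall curves when comparing retrieval approaches.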
Adding a visualization feature to web search engines: it's time.

Science.gov (United States)

Wong, Pak Chung

2008-01-01

It's widely recognized that all Web search engines today are almost identical in presentation layout and behavior. In fact, the same presentation approach has been applied to depicting search engine results pages (SERPs) since the first Web search engine launched in 1993. In this Visualization Viewpoints article, I propose to add a visualization feature to Web search engines and suggest that the new addition can improve search engines' performance and capabilities, which in turn lead to better Web search technology.
The Use of Web Search Engines in Information Science Research.

Science.gov (United States)

Bar-Ilan, Judit

2004-01-01

Reviews the literature on the use of Web search engines in information science research, including: ways users interact with Web search engines; social aspects of searching; structure and dynamic nature of the Web; link analysis; other bibliometric applications; characterizing information on the Web; search engine evaluation and improvement; and…
The invisible Web uncovering information sources search engines can't see

CERN Document Server

Sherman, Chris

2001-01-01

Enormous expanses of the Internet are unreachable with standard web search engines. This book provides the key to finding these hidden resources by identifying how to uncover and use invisible web resources. Mapping the invisible Web, when and how to use it, assessing the validity of the information, and the future of Web searching are topics covered in detail. Only 16 percent of Net-based information can be located using a general search engine. The other 84 percent is what is referred to as the invisible Web, made up of information stored in databases. Unlike pages on the visible Web, informa
Personalizing Web Search based on User Profile

OpenAIRE

Utage, Sharyu; Ahire, Vijaya

2016-01-01

Web search engines are the most widely used tools for information retrieval from the World Wide Web. These Web search engines help users find the most useful information. When different users search for the same information, a search engine provides the same results without understanding who submitted the query. Personalized web search is a search technique for providing useful results. This paper models the preferences of users as hierarchical user profiles. A framework called UPS is proposed. It generalizes profile and m...
Sexual information seeking on web search engines.

Science.gov (United States)

Spink, Amanda; Koricich, Andrew; Jansen, B J; Cole, Charles

2004-02-01

Sexual information seeking is an important element within human information behavior. Seeking sexually related information on the Internet takes many forms and channels, including chat rooms discussions, accessing Websites or searching Web search engines for sexual materials. The study of sexual Web queries provides insight into sexually-related information-seeking behavior, of value to Web users and providers alike. We qualitatively analyzed queries from logs of 1,025,910 Alta Vista and AlltheWeb.com Web user queries from 2001. We compared the differences in sexually-related Web searching between Alta Vista and AlltheWeb.com users. Differences were found in session duration, query outcomes, and search term choices. Implications of the findings for sexual information seeking are discussed.
The Evolution of Web Searching.

Science.gov (United States)

Green, David

2000-01-01

Explores the interrelation between Web publishing and information retrieval technologies and lists new approaches to Web indexing and searching. Highlights include Web directories; search engines; portalisation; Internet service providers; browser providers; meta search engines; popularity based analysis; natural language searching; links-based…
GeNemo: a search engine for web-based functional genomic data.

Science.gov (United States)

Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

2016-07-08

A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Adding a Visualization Feature to Web Search Engines: It’s Time

Energy Technology Data Exchange (ETDEWEB)

Wong, Pak C.

2008-11-11

Since the first world wide web (WWW) search engine quietly entered our lives in 1994, the “information need” behind web searching has rapidly grown into a multi-billion dollar business that dominates the internet landscape, drives e-commerce traffic, propels global economy, and affects the lives of the whole human race. Today’s search engines are faster, smarter, and more powerful than those released just a few years ago. With the vast investment pouring into research and development by leading web technology providers and the intense emotion behind corporate slogans such as “win the web” or “take back the web,” I can’t help but ask why are we still using the very same “text-only” interface that was used 13 years ago to browse our search engine results pages (SERPs)? Why has the SERP interface technology lagged so far behind in the web evolution when the corresponding search technology has advanced so rapidly? In this article I explore some current SERP interface issues, suggest a simple but practical visual-based interface design approach, and argue why a visual approach can be a strong candidate for tomorrow’s SERP interface.
Web Feet Guide to Search Engines: Finding It on the Net.

Science.gov (United States)

Web Feet, 2001

2001-01-01

This guide to search engines for the World Wide Web discusses selecting the right search engine; interpreting search results; major search engines; online tutorials and guides; search engines for kids; specialized search tools for various subjects; and other specialized engines and gateways. (LRW)
The Little Engines That Could: Modeling the Performance of World Wide Web Search Engines

OpenAIRE

Eric T. Bradlow; David C. Schmittlein

2000-01-01

This research examines the ability of six popular Web search engines, individually and collectively, to locate Web pages containing common marketing/management phrases. We propose and validate a model for search engine performance that is able to represent key patterns of coverage and overlap among the engines. The model enables us to estimate the typical additional benefit of using multiple search engines, depending on the particular set of engines being considered. It also provides an estim...
AN OVERVIEW OF SEARCHING AND DISCOVERING WEB BASED INFORMATION RESOURCES

Directory of Open Access Journals (Sweden)

Cezar VASILESCU

2010-01-01

The Internet has become for most of us a daily used instrument, for professional or personal reasons. We do not even remember the times when a computer and a broadband connection were luxury items. More and more people are relying on the complicated web network to find the needed information. This paper presents an overview of issues related to Internet search and search engines, and describes the parties and the basic mechanism embedded in a search for web based information resources. It also presents ways to increase the efficiency of web searches through a better understanding of what search engines ignore in website content.
A study of medical and health queries to web search engines.

Science.gov (United States)

Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

2004-03-01

This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i). comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii). comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii). medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i). a small percentage of web queries are medical or health related, (ii). the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii). over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggests some implications for the use of the general web search engines when seeking medical/health information.
Dynamics of a macroscopic model characterizing mutualism of search engines and web sites

Science.gov (United States)

Wang, Yuanshi; Wu, Hong

2006-05-01

We present a model to describe the mutualism relationship between search engines and web sites. In the model, search engines and web sites benefit from each other while the search engines are derived products of the web sites and cannot survive independently. Our goal is to show strategies for the search engines to survive in the internet market. From mathematical analysis of the model, we show that mutualism does not always result in survival. We show various conditions under which the search engines would tend to extinction, persist or grow explosively. Then by the conditions, we deduce a series of strategies for the search engines to survive in the internet market. We present conditions under which the initial number of consumers of the search engines has little contribution to their persistence, which is in agreement with the results in previous works. Furthermore, we show novel conditions under which the initial value plays an important role in the persistence of the search engines and deduce new strategies. We also give suggestions for the web sites to cooperate with the search engines in order to form a win-win situation.
A grammar checker based on web searching

Directory of Open Access Journals (Sweden)

Joaquim Moré

2006-05-01

This paper presents an English grammar and style checker for non-native English speakers. The main characteristic of this checker is the use of an Internet search engine. As the number of web pages written in English is immense, the system hypothesises that a piece of text not found on the Web is probably badly written. The system also hypothesises that the Web will provide examples of how the content of the text segment can be expressed in a grammatically correct and idiomatic way. Thus, when the checker warns the user about the odd nature of a text segment, the Internet engine searches for contexts that can help the user decide whether he/she should correct the segment or not. By means of a search engine, the checker also suggests use of other expressions that appear on the Web more often than the expression he/she actually wrote.
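The hypothesis described in this abstract (a segment that is not found on the Web is probably badly written, and more frequent expressions are better candidates) can be sketched in Python as follows. This is only an illustration under stated assumptions: `hit_count` is a hypothetical callable standing in for a search engine query, the threshold `min_hits` is invented, and the paper's actual procedure is not described in the abstract.

```python
from typing import Callable, Dict, Iterable, List


def check_segment(
    segment: str,
    alternatives: Iterable[str],
    hit_count: Callable[[str], int],
    min_hits: int = 1,
) -> Dict[str, object]:
    """Flag a segment with too few Web hits and rank more frequent alternatives.

    hit_count is a hypothetical stand-in for an exact-phrase search engine query.
    """
    hits = hit_count(segment)
    # Keep only alternatives that occur more often than the original segment.
    better = [alt for alt in alternatives if hit_count(alt) > hits]
    suggestions: List[str] = sorted(better, key=hit_count, reverse=True)
    return {
        "segment": segment,
        "hits": hits,
        "suspicious": hits < min_hits,
        "suggestions": suggestions,
    }


if __name__ == "__main__":
    # Invented counts used in place of a real search engine.
    fake_counts = {"depend of": 0, "depend on": 120, "depend upon": 40}
    report = check_segment(
        "depend of",
        ["depend on", "depend upon"],
        hit_count=lambda phrase: fake_counts.get(phrase, 0),
    )
    print(report)
```

Passing the counting function as a parameter keeps the sketch independent of any particular search service and makes it easy to test with fixed counts.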

Virtual Reference Services through Web Search Engines: Study of Academic Libraries in Pakistan

Directory of Open Access Journals (Sweden)

Rubia Khan

2017-03-01

Web search engines (WSE) are powerful and popular tools in the field of information service management. This study is an attempt to examine the impact and usefulness of web search engines in providing virtual reference services (VRS) within academic libraries in Pakistan. The study also attempts to investigate the relevant expertise and skills of library professionals in providing digital reference services (DRS) efficiently using web search engines. Methodology used in this study is quantitative in nature. The data was collected from fifty public and private sector universities in Pakistan using a structured questionnaire. Microsoft Excel and SPSS were used for data analysis. The study concludes that web search engines are commonly used by librarians to help users (especially research scholars) by providing digital reference services. The study also finds a positive correlation between use of web search engines and quality of digital reference services provided to library users. It is concluded that although search engines have increased the expectations of users and are really big competitors to a library’s reference desk, they are however not an alternative to reference service. Findings reveal that search engines pose numerous challenges for librarians and the study also attempts to bring together possible remedial measures. This study is useful for library professionals to understand the importance of search engines in providing VRS. The study also provides an intellectual comparison among different search engines, their capabilities, limitations, challenges and opportunities to provide VRS effectively in libraries.
Web Spam, Social Propaganda and the Evolution of Search Engine Rankings

Science.gov (United States)

Metaxas, Panagiotis Takis

Search Engines have greatly influenced the way we experience the web. Since the early days of the web, users have been relying on them to get informed and make decisions. When the web was relatively small, web directories were built and maintained using human experts to screen and categorize pages according to their characteristics. By the mid 1990's, however, it was apparent that the human expert model of categorizing web pages does not scale. The first search engines appeared and they have been evolving ever since, taking over the role that web directories used to play.
GeoSearcher: Location-Based Ranking of Search Engine Results.

Science.gov (United States)

Watters, Carolyn; Amoudi, Ghada

2003-01-01

Discussion of Web queries with geospatial dimensions focuses on an algorithm that assigns location coordinates dynamically to Web sites based on the URL. Describes a prototype search system that uses the algorithm to re-rank search engine results for queries with a geospatial dimension, thus providing an alternative ranking order for search engine…
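The idea of assigning coordinates to a Web site from its URL and then re-ranking results by location can be sketched in Python. The abstract does not give the GeoSearcher algorithm, so everything below is an assumption for illustration: the lookup table `TLD_COORDS` maps a country-code top-level domain to a capital city's approximate coordinates, and results are simply re-ordered by great-circle distance from the query location.

```python
import math
from typing import List, Optional, Tuple
from urllib.parse import urlparse

Coords = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical lookup table: country-code TLD -> approximate capital coordinates.
TLD_COORDS = {
    "ca": (45.42, -75.70),   # Ottawa
    "uk": (51.51, -0.13),    # London
    "au": (-35.28, 149.13),  # Canberra
}


def locate(url: str) -> Optional[Coords]:
    """Guess coordinates for a URL from the last label of its host name."""
    host = urlparse(url).hostname or ""
    return TLD_COORDS.get(host.rsplit(".", 1)[-1])


def haversine_km(a: Coords, b: Coords) -> float:
    """Great-circle distance between two points, using a 6371 km Earth radius."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def rerank(urls: List[str], query_location: Coords) -> List[str]:
    """Order results by distance to the query location; unlocatable URLs go last."""

    def distance(url: str) -> float:
        coords = locate(url)
        return haversine_km(coords, query_location) if coords else math.inf

    # sorted() is stable, so ties keep the search engine's original order.
    return sorted(urls, key=distance)


if __name__ == "__main__":
    results = [
        "http://news.example.au/a",
        "http://example.com/b",
        "http://example.co.uk/c",
        "http://example.ca/d",
    ]
    print(rerank(results, query_location=(44.65, -63.57)))  # near Halifax, Canada
```

Because the sort is stable, URLs that cannot be located keep their original relative order at the end of the list, so the original ranking still acts as a tie-breaker.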
Web-based information search and retrieval: effects of strategy use and age on search success.

Science.gov (United States)

Stronge, Aideen J; Rogers, Wendy A; Fisk, Arthur D

2006-01-01

The purpose of this study was to investigate the relationship between strategy use and search success on the World Wide Web (i.e., the Web) for experienced Web users. An additional goal was to extend understanding of how the age of the searcher may influence strategy use. Current investigations of information search and retrieval on the Web have provided an incomplete picture of Web strategy use because participants have not been given the opportunity to demonstrate their knowledge of Web strategies while also searching for information on the Web. Using both behavioral and knowledge-engineering methods, we investigated searching behavior and system knowledge for 16 younger adults (M = 20.88 years of age) and 16 older adults (M = 67.88 years). Older adults were less successful than younger adults in finding correct answers to the search tasks. Knowledge engineering revealed that the age-related effect resulted from ineffective search strategies and amount of Web experience rather than age per se. Our analysis led to the development of a decision-action diagram representing search behavior for both age groups. Older adults had more difficulty than younger adults when searching for information on the Web. However, this difficulty was related to the selection of inefficient search strategies, which may have been attributable to a lack of knowledge about available Web search strategies. Actual or potential applications of this research include training Web users to search more effectively and suggestions to improve the design of search engines.
Categorization of web pages - Performance enhancement to search engine

Digital Repository Service at National Institute of Oceanography (India)

Lakshminarayana, S.

Sagace: A web-based search engine for biomedical databases in Japan

Directory of Open Access Journals (Sweden)

Morita Mizuki

2012-10-01

Background: In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings: We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions: Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/.
REPTREE CLASSIFIER FOR IDENTIFYING LINK SPAM IN WEB SEARCH ENGINES

Directory of Open Access Journals (Sweden)

S.K. Jayanthi

2013-01-01

Search engines are used for retrieving information from the web. Most of the time, importance is placed on the top 10 results, and sometimes this shrinks to the top 5, because of time constraints and reliance on the search engines. Users believe that the top 10 or 5 of the total results are the most relevant. Here comes the problem of spamdexing. It is a method of deceptively manipulating search result quality. Falsified metrics, such as inserting an enormous number of keywords or links into a website, may take that website to the top 10 or 5 positions. This paper proposes a classifier based on the Reptree (Regression tree representative). As an initial step, link-based features such as neighbors, pagerank, truncated pagerank, trustrank and assortativity-related attributes are inferred. Based on these features, a tree is constructed. The tree uses the feature inference to differentiate spam sites from legitimate sites. The WEBSPAM-UK-2007 dataset is taken as a base. It is preprocessed and converted into five datasets: FEATA, FEATB, FEATC, FEATD and FEATE. Only link-based features are taken for the experiments. This paper focuses on link spam alone. Finally, a representative tree is created which will more precisely classify the web spam entries. Results are given. Regression tree classification seems to perform well, as shown through experiments.
Study of Search Engine Transaction Logs Shows Little Change in How Users use Search Engines. A review of: Jansen, Bernard J., and Amanda Spink. “How Are We Searching the World Wide Web? A Comparison of Nine Search Engine Transaction Logs.” Information Processing & Management 42.1 (2006): 248‐263.

Directory of Open Access Journals (Sweden)

David Hook

2006-09-01

Objective – To examine the interactions between users and search engines, and how they have changed over time. Design – Comparative analysis of search engine transaction logs. Setting – Nine major analyses of search engine transaction logs. Subjects – Nine web search engine studies (4 European, 5 American) over a seven‐year period, covering the search engines Excite, Fireball, AltaVista, BWIE and AllTheWeb. Methods – The results from individual studies are compared by year of study for percentages of single query sessions, one term queries, operator (and, or, not, etc.) usage and single result page viewing. As well, the authors group the search queries into eleven different topical categories and compare how the breakdown has changed over time. Main Results – Based on the percentage of single query sessions, it does not appear that the complexity of interactions has changed significantly for either the U.S.‐based or the European‐based search engines. As well, there was little change observed in the percentage of one‐term queries over the years of study for either the U.S.‐based or the European‐based search engines. Few users (generally less than 20%) use Boolean or other operators in their queries, and these percentages have remained relatively stable. One area of noticeable change is in the percentage of users viewing only one results page, which has increased over the years of study. Based on the studies of the U.S.‐based search engines, the topical categories of ‘People, Place or Things’ and ‘Commerce, Travel, Employment or Economy’ are becoming more popular, while the categories of ‘Sex and Pornography’ and ‘Entertainment or Recreation’ are declining. Conclusions – The percentage of users viewing only one results page increased during the years of the study, while the percentages of single query sessions, one‐term sessions and operator usage remained stable. The increase in single result page viewing
Using the open Web as an information resource and scholarly Web search engines as retrieval tools for academic and research purposes

Directory of Open Access Journals (Sweden)

Filistea Naude

2010-08-01

This study provided insight into the significance of the open Web as an information resource and Web search engines as research tools amongst academics. The academic staff establishment of the University of South Africa (Unisa) was invited to participate in a questionnaire survey and included 1188 staff members from five colleges. This study culminated in a PhD dissertation in 2008. One hundred and eighty seven respondents participated in the survey, which gave a response rate of 15.7%. The results of this study show that academics have indeed accepted the open Web as a useful information resource and Web search engines as retrieval tools when seeking information for academic and research work. The majority of respondents used the open Web and Web search engines on a daily or weekly basis to source academic and research information. The main obstacles presented by using the open Web and Web search engines included lack of time to search and browse the Web, information overload, poor network speed and the slow downloading speed of webpages.
Experience of Developing a Meta-Semantic Search Engine

OpenAIRE

Mukhopadhyay, Debajyoti; Sharma, Manoj; Joshi, Gajanan; Pagare, Trupti; Palwe, Adarsha

2013-01-01

Today's web search scenario, which is mainly keyword based, leads to the need for the effective and meaningful search provided by the Semantic Web. Existing search engines struggle to provide relevant answers to users' queries due to their dependency on simple data available in web pages. On the other hand, semantic search engines provide efficient and relevant results, as the semantic web manages information with well-defined meaning using ontology. A Meta-Search engine is a search tool that ...
Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science

Science.gov (United States)

Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.

2006-12-01

The goal for search engines is to return results that are both accurate and complete. The search engines should find only what you really want and find everything you really want. Search engines (even meta search engines) lack semantics. Search is based simply on string matching between the user's query term and the resource database, and the semantics associated with the search string are not captured. For example, if an atmospheric scientist is searching for "pressure" related web resources, most search engines return inaccurate results such as web resources related to blood pressure. This presentation describes Noesis, a meta-search engine and resource aggregator that uses domain ontologies to provide scoped search capabilities. Noesis uses domain ontologies to help the user scope the search query to ensure that the search results are both accurate and complete. The domain ontologies guide users to refine their search query and thereby reduce the burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator. It categorizes the search results from different online resources such as education materials, publications, datasets, and web search engines that might be of interest to the user.
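The query refinement described in this abstract can be illustrated with a minimal sketch. It is not the Noesis implementation: the toy ontology, the relation names and the functions expand_query and to_boolean_query are all hypothetical and only show how a domain ontology might widen a query with synonyms and specializations while keeping it inside the atmospheric science scope.

```python
# Hypothetical toy ontology: the terms and relations are illustrative only
# and are not taken from Noesis.
ONTOLOGY = {
    "pressure": {
        "synonyms": ["atmospheric pressure", "barometric pressure"],
        "specializations": ["sea level pressure"],
        "generalizations": ["atmospheric state variable"],
        "related": ["wind", "geopotential height"],
    },
}


def expand_query(term, relations=("synonyms", "specializations")):
    """Return the original term followed by ontology terms for the chosen relations."""
    entry = ONTOLOGY.get(term.lower(), {})
    expanded = [term]
    for relation in relations:
        for candidate in entry.get(relation, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def to_boolean_query(terms):
    """Join the terms into a quoted OR query string for a keyword search engine."""
    return " OR ".join(f'"{t}"' for t in terms)


if __name__ == "__main__":
    print(to_boolean_query(expand_query("pressure")))
```

A term missing from the ontology is returned unchanged, so the sketch degrades to an ordinary keyword query.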
Web-page Prediction for Domain Specific Web-search using Boolean Bit Mask

OpenAIRE

Sinha, Sukanta; Duttagupta, Rana; Mukhopadhyay, Debajyoti

2012-01-01

A search engine is a Web-page retrieval tool. Nowadays Web searchers make efficient use of their time by relying on an efficient search engine. To improve the performance of the search engine, we introduce a unique mechanism that gives Web searchers more prominent search results. In this paper, we discuss a domain-specific Web search prototype that generates the predicted Web-page list for a user-given search string using a Boolean bit mask.
Key word placing in Web page body text to increase visibility to search engines

Directory of Open Access Journals (Sweden)

W. T. Kritzinger

2007-11-01

The growth of the World Wide Web has spawned a wide variety of new information sources, which has also left users with the daunting task of determining which sources are valid. Many users rely on the Web as an information source because of the low cost of information retrieval. It is also claimed that the Web has evolved into a powerful business tool. Examples include highly popular business services such as Amazon.com and Kalahari.net. It is estimated that around 80% of users utilize search engines to locate information on the Internet. This, by implication, places emphasis on the underlying importance of Web pages being listed in search engine indices. Empirical evidence that the placement of key words in certain areas of the body text will have an influence on the Web sites' visibility to search engines could not be found in the literature. The results of two experiments indicated that key words should be concentrated towards the top, and diluted towards the bottom, of a Web page to increase visibility. However, care should be taken in terms of key word density, to prevent search engine algorithms from raising the spam alarm.
The Effectiveness of Web Search Engines to Index New Sites from Different Countries

Science.gov (United States)

Pirkola, Ari

2009-01-01

Introduction: Investigates how effectively Web search engines index new sites from different countries. The primary interest is whether new sites are indexed equally or whether search engines are biased towards certain countries. If major search engines show biased coverage it can be considered a significant economic and political problem because…
Myanmar Language Search Engine

OpenAIRE

Pann Yu Mon; Yoshiki Mikami

2011-01-01

With the enormous growth of the World Wide Web, search engines play a critical role in retrieving information from the borderless Web. Although many search engines are available for the major languages, they are not very proficient for the less computerized languages, including Myanmar. The main reason is that those search engines do not consider the specific features of those languages. A search engine which is capable of searching the Web documents written in those languages is highly n...
Curating the Web: Building a Google Custom Search Engine for the Arts

Science.gov (United States)

Hennesy, Cody; Bowman, John

2008-01-01

Google's first foray onto the web made search simple and results relevant. With its Co-op platform, Google has taken another step toward dramatically increasing the relevancy of search results, further adapting the World Wide Web to local needs. Google Custom Search Engine, a tool on the Co-op platform, puts one in control of his or her own search…
Global polar geospatial information service retrieval based on search engine and ontology reasoning

Science.gov (United States)

Chen, Nengcheng; E, Dongcheng; Di, Liping; Gong, Jianya; Chen, Zeqiang

2007-01-01

In order to improve the access precision of polar geospatial information services on the web, a new methodology for retrieving global spatial information services based on geospatial service search and ontology reasoning is proposed. The geospatial service search is implemented to find the coarse service from the web, and the ontology reasoning is designed to find the refined service from the coarse service. The proposed framework includes standardized distributed geospatial web services, a geospatial service search engine, an extended UDDI registry, and a multi-protocol geospatial information service client. Key technologies addressed include service discovery based on a search engine, and service ontology modeling and reasoning in the Antarctic geospatial context. Finally, an Antarctica multi-protocol OWS portal prototype based on the proposed methodology is introduced.
IMPROVING PERSONALIZED WEB SEARCH USING BOOKSHELF DATA STRUCTURE

Directory of Open Access Journals (Sweden)

S.K. Jayanthi

2012-10-01

Search engines play a vital role in retrieving relevant information for the web user. In this research work, a user-profile-based web search is proposed, so that web users from different domains may receive different sets of results. The main challenge is to provide relevant results at the right level of reading difficulty. Estimating user expertise and re-ranking the results are the main aspects of this paper. The retrieved results are arranged in a Bookshelf Data Structure for easy access. Better presentation of search results in visual mode significantly increases the usability of web search engines.
Search Engine Optimization for Flash Best Practices for Using Flash on the Web

CERN Document Server

Perkins, Todd

2009-01-01

Search Engine Optimization for Flash dispels the myth that Flash-based websites won't show up in a web search by demonstrating exactly what you can do to make your site fully searchable -- no matter how much Flash it contains. You'll learn best practices for using HTML, CSS and JavaScript, as well as SWFObject, for building sites with Flash that will stand tall in search rankings.

Evidence-based Medicine Search: a customizable federated search engine.

Science.gov (United States)

Bracke, Paul J; Howse, David K; Keim, Samuel M

2008-04-01

This paper reports on the development of a tool by the Arizona Health Sciences Library (AHSL) for searching clinical evidence that can be customized for different user groups. The AHSL provides services to the University of Arizona's (UA's) health sciences programs and to the University Medical Center. Librarians at AHSL collaborated with UA College of Medicine faculty to create an innovative search engine, Evidence-based Medicine (EBM) Search, that provides users with a simple search interface to EBM resources and presents results organized according to an evidence pyramid. EBM Search was developed with a web-based configuration component that allows the tool to be customized for different specialties. Informal and anecdotal feedback from physicians indicates that EBM Search is a useful tool with potential in teaching evidence-based decision making. While formal evaluation is still being planned, a tool such as EBM Search, which can be configured for specific user populations, may help lower barriers to information resources in an academic health sciences center.
Index Compression and Efficient Query Processing in Large Web Search Engines

Science.gov (United States)

Ding, Shuai

2013-01-01

The inverted index is the main data structure used by all the major search engines. Search engines build an inverted index on their collection to speed up query processing. As the size of the web grows, the length of the inverted list structures, which can easily grow to hundreds of MBs or even GBs for common terms (roughly linear in the size of…
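As a small illustration of the data structure this abstract refers to, the sketch below builds an inverted index, answers a two-term AND query by intersecting posting lists, and compresses a posting list with gap encoding followed by variable-byte coding. These are standard textbook techniques chosen for illustration; the abstract does not say which compression schemes the thesis studies, and all function names here are assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(a, b):
    """Merge two sorted posting lists, keeping doc IDs present in both (AND query)."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def to_gaps(postings):
    """Keep the first doc ID, then store differences between consecutive doc IDs."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def from_gaps(gaps):
    """Invert to_gaps by taking a running sum."""
    postings, total = [], 0
    for gap in gaps:
        total += gap
        postings.append(total)
    return postings


def varbyte_encode(numbers):
    """Encode non-negative ints 7 bits per byte; the high bit marks the last byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)  # low 7 bits, more bytes follow
            n >>= 7
        out.append(n | 0x80)      # final byte of this number
    return bytes(out)


def varbyte_decode(data):
    """Decode the byte string produced by varbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers
```

Gap encoding helps because the differences between consecutive document IDs in a long posting list are small numbers, which the variable-byte code stores in fewer bytes than the raw IDs.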
Developing a new search engine and browser for libraries to search and organize the World Wide Web library resources

OpenAIRE

Sreenivasulu, V.

2000-01-01

Internet Granthalaya urges worldwide advocates and targets the task of creating a new search engine and dedicated browser. Internet Granthalaya may be the ultimate search engine exclusively dedicated for every library to use to search and organize the World Wide Web library resources.
A Full-Text-Based Search Engine for Finding Highly Matched Documents Across Multiple Categories

Science.gov (United States)

Nguyen, Hung D.; Steele, Gynelle C.

2016-01-01

This report demonstrates the full-text-based search engine that works on any Web-based mobile application. The engine has the capability to search databases across multiple categories based on a user's queries and identify the most relevant or similar. The search results presented here were found using an Android (Google Co.) mobile device; however, it is also compatible with other mobile phones.
Omicseq: a web-based search engine for exploring omics datasets

Science.gov (United States)

Sun, Xiaobo; Pittard, William S.; Xu, Tianlei; Chen, Li; Zwick, Michael E.; Jiang, Xiaoqian; Wang, Fusheng

2017-01-01

The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve ‘findability’ of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. PMID:28402462
A Novel Personalized Web Search Model

Institute of Scientific and Technical Information of China (English)

ZHU Zhengyu; XU Jingqiu; TIAN Yunyan; REN Xiang

2007-01-01

A novel personalized Web search model is proposed. The new system, as a middleware between a user and a Web search engine, is set up on the client machine. It can learn a user's preference implicitly and then generate the user profile automatically. When the user inputs query keywords, the system can automatically generate a few personalized expansion words by computing the term-term associations according to the current user profile, and then these words together with the query keywords are submitted to a popular search engine such as Yahoo or Google. These expansion words help to express accurately the user's search intention. The new Web search model can make a common search engine personalized, that is, the search engine can return different search results to different users who input the same keywords. The experimental results show the feasibility and applicability of the presented work.
IntegromeDB: an integrated system and biological search engine.

Science.gov (United States)

Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

2012-01-19

With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.
A Webometric Analysis of ISI Medical Journals Using Yahoo, AltaVista, and All the Web Search Engines

Directory of Open Access Journals (Sweden)

Zohreh Zahedi

2010-12-01

The World Wide Web is an important information source for scholarly communications. Examining inlinks through webometric studies has attracted particular interest among information researchers. In this study, the number of inlinks to 69 ISI medical journals retrieved by the Yahoo, AltaVista, and All The Web search engines was examined via a comparative webometric study. For data analysis, SPSS software was employed. Findings revealed that the British Medical Journal website attracted the most links of all in the three search engines. There is a significant correlation between the number of external links and the ISI impact factor. The most significant correlation in the three search engines exists between external links of Yahoo and AltaVista (100%) and the least correlation is found between external links of All The Web and the number of pages of AltaVista (0.51). There is no significant difference between the internal links and the number of pages found by the three search engines. But in the case of impact factors, significant differences are found between these three search engines. So, the study shows that journals with higher impact factor attract more links to their websites. It also indicates that the three search engines are significantly different in terms of total links, outlinks and web impact factors.
A semantics-based method for clustering of Chinese web search results

Science.gov (United States)

Zhang, Hui; Wang, Deqing; Wang, Li; Bi, Zhuming; Chen, Yong

2014-01-01

Information explosion is a critical challenge to the development of modern information systems. In particular, when the application of an information system is over the Internet, the amount of information over the web has been increasing exponentially and rapidly. Search engines, such as Google and Baidu, are essential tools for people to find information on the Internet. Valuable information, however, is still likely to be submerged in the ocean of search results from those tools. By automatically clustering the results into different groups based on subjects, a search engine with the clustering feature allows users to select the most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate snippets' semantic similarities. Finally, we improve the Chameleon algorithm to cluster snippets. Extensive experimental results have shown that the proposed algorithm outperforms the suffix tree clustering method and other traditional clustering methods.
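The first step of the method, extracting a longest common substring from search snippets, can be sketched for a single pair of snippets. Note the substitution: the paper uses a generalised suffix tree, whereas the sketch below uses plain dynamic programming, which is simpler to read but slower on many long snippets. The function name is an assumption.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b.

    Dynamic-programming stand-in for the generalised suffix tree named in the
    abstract: prev[j] holds the length of the common suffix of a[:i-1] and b[:j].
    Runs in O(len(a) * len(b)) time; on ties the earliest match in a wins.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    # Works on characters, so Chinese snippets need no word segmentation here.
    print(longest_common_substring("搜索引擎聚类", "网页搜索引擎"))
```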
An assessment of the visibility of MeSH-indexed medical web catalogs through search engines.

Science.gov (United States)

Zweigenbaum, P; Darmoni, S J; Grabar, N; Douyère, M; Benichou, J

2002-01-01

Manually indexed Internet health catalogs such as CliniWeb or CISMeF provide resources for retrieving high-quality health information. Users of these quality-controlled subject gateways are most often referred to them by general search engines such as Google, AltaVista, etc. This raises several questions, among which the following: what is the relative visibility of medical Internet catalogs through search engines? This study addresses this issue by measuring and comparing the visibility of six major, MeSH-indexed health catalogs through four different search engines (AltaVista, Google, Lycos, Northern Light) in two languages (English and French). Over half a million queries were sent to the search engines; for most of these search engines, according to our measures at the time the queries were sent, the most visible catalog for English MeSH terms was CliniWeb and the most visible one for French MeSH terms was CISMeF.
Development of health information search engine based on metadata and ontology.

Science.gov (United States)

Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

2014-04-01

The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced better search results compared to existing search engines. A health information search engine based on metadata and ontology will provide reliable health information to both information producers and information consumers.
Needle Custom Search: Recall-oriented search on the Web using semantic annotations

NARCIS (Netherlands)

Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon; de Rijke, Maarten; Kenter, Tom; de Vries, A.P.; Zhai, Chen Xiang; de Jong, Franciska M.G.; Radinsky, Kira; Hofmann, Katja

Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increases the efficiency
Needle Custom Search : Recall-oriented search on the web using semantic annotations

NARCIS (Netherlands)

Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon L.

2014-01-01

Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increases the efficiency
Omicseq: a web-based search engine for exploring omics datasets.

Science.gov (United States)

Sun, Xiaobo; Pittard, William S; Xu, Tianlei; Chen, Li; Zwick, Michael E; Jiang, Xiaoqian; Wang, Fusheng; Qin, Zhaohui S

2017-07-03

The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve 'findability' of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
CYCLOSA: Decentralizing Private Web Search Through SGX-Based Browser Extensions

OpenAIRE

Pires, Rafael; Goltzsche, David; Mokhtar, Sonia Ben; Bouchenak, Sara; Boutet, Antoine; Felber, Pascal; Kapitza, Rüdiger; Pasin, Marcelo; Schiavoni, Valerio

2018-01-01

By regularly querying Web search engines, users (unconsciously) disclose large amounts of their personal data as part of their search queries, among which some might reveal sensitive information (e.g. health issues, sexual, political or religious preferences). Several solutions exist to allow users to query search engines while improving privacy protection. However, these solutions suffer from a number of limitations: some are subject to user re-identification attacks, while others lack scala...
Is Internet search better than structured instruction for web-based health education?

Science.gov (United States)

Finkelstein, Joseph; Bedra, McKenzie

2013-01-01

The Internet provides access to vast amounts of comprehensive information regarding any health-related subject. Patients increasingly use this information for health education, using a search engine to identify education materials. An alternative approach to health education via the Internet is based on utilizing a verified web site which provides structured interactive education guided by adult learning theories. These two approaches have not been compared systematically in older patients. The aim of this study was to compare the efficacy of a web-based computer-assisted education (CO-ED) system versus searching the Internet for learning about hypertension. Sixty hypertensive older adults (age 45+) were randomized into control or intervention groups. The control patients spent 30 to 40 minutes searching the Internet using a search engine for information about hypertension. The intervention patients spent 30 to 40 minutes using the CO-ED system, which provided computer-assisted instruction about major hypertension topics. Analysis of pre- and post-knowledge scores indicated a significant improvement among CO-ED users (14.6%) as opposed to Internet users (2%). Additionally, patients using the CO-ED program rated their learning experience more positively than those using the Internet.
Changes in users' Web search performance after ten years ...

African Journals Online (AJOL)

The changes in users' Web search performance using search engines over ten years were investigated in this study. Matched data obtained from samples in 2000 and 2010 were used for the comparative analysis. The patterns of Web search engine use suggested a dominance in using a particular search engine. Statistical ...
Children's Search Engines from an Information Search Process Perspective.

Science.gov (United States)

Broch, Elana

2000-01-01

Describes cognitive and affective characteristics of children and teenagers that may affect their Web searching behavior. Reviews literature on children's searching in online public access catalogs (OPACs) and using digital libraries. Profiles two Web search engines. Discusses some of the difficulties children have searching the Web, in the…
New Architectures for Presenting Search Results Based on Web Search Engines Users Experience

Science.gov (United States)

Martinez, F. J.; Pastor, J. A.; Rodriguez, J. V.; Lopez, Rosana; Rodriguez, J. V., Jr.

2011-01-01

Introduction: The Internet is a dynamic environment which is continuously being updated. Search engines have been, currently are and in all probability will continue to be the most popular systems in this information cosmos. Method: In this work, special attention has been paid to the series of changes made to search engines up to this point,…
BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.

Science.gov (United States)

Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália

2016-07-01

Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as a proof-of-concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and similar topics. The keyword-based queries of the users are transformed into concepts, and search results are presented and ranked accordingly. The semantic graph view portrays all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations

Search Engine Optimization

CERN Document Server

Davis, Harold

2006-01-01

SEO--short for Search Engine Optimization--is the art, craft, and science of driving web traffic to web sites. Web traffic is food, drink, and oxygen--in short, life itself--to any web-based business. Whether your web site depends on broad, general traffic, or high-quality, targeted traffic, this PDF has the tools and information you need to draw more traffic to your site. You'll learn how to effectively use PageRank (and Google itself); how to get listed, get links, and get syndicated; and much more. The field of SEO is expanding into all the possible ways of promoting web traffic. This
INTERFACING GOOGLE SEARCH ENGINE TO CAPTURE USER WEB SEARCH BEHAVIOR

OpenAIRE

Fadhilah Mat Yamin; T. Ramayah

2013-01-01

The behaviour of the searcher when using the search engine, especially during query formulation, is crucial. Search engines capture users’ activities in the search log, which is stored at the search engine server. Due to the difficulty of obtaining this search log, this paper proposes and develops an interface framework to interface with the Google search engine. This interface will capture users’ queries before redirecting them to Google. The analysis of the search log will show that users are utili...
BioCarian: search engine for exploratory searches in heterogeneous biological databases.

Science.gov (United States)

Zaki, Nazar; Tennakoon, Chandana

2017-10-02

There are a large number of biological databases publicly available for scientists on the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into the Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and has additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search
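The two steps the abstract describes, converting tabular data to RDF and ranking facet values, can be sketched in plain Python. This is a hedged illustration, not BioCarian's code: the namespace http://example.org/, the table and column names, and both function names are hypothetical, and ranking facet values by frequency is only one of the "several criteria" the abstract mentions without detailing.

```python
from collections import Counter

BASE = "http://example.org/"  # hypothetical namespace, not used by BioCarian


def row_to_triples(table, row_id, row):
    """Turn one table row (a dict) into N-Triples lines, one per column."""
    subject = f"<{BASE}{table}/{row_id}>"
    triples = []
    for column, value in row.items():
        predicate = f"<{BASE}{table}#{column}>"
        # Escape backslashes and double quotes so the literal stays well-formed.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} {predicate} "{text}" .')
    return triples


def facet_counts(rows, column):
    """Rank the values of one column by how many rows carry them."""
    return Counter(row[column] for row in rows if column in row).most_common()


if __name__ == "__main__":
    rows = [
        {"gene": "TP53", "organism": "human"},
        {"gene": "BRCA1", "organism": "human"},
        {"gene": "Trp53", "organism": "mouse"},
    ]
    for row_id, row in enumerate(rows):
        print("\n".join(row_to_triples("genes", row_id, row)))
    print(facet_counts(rows, "organism"))
```

The emitted lines could be loaded into any RDF store and queried with SPARQL; the facet counts correspond to the value list a faceted interface would show next to each column.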
Meta-search Engine Ranking Techniques (Teknik Perangkingan Meta-search Engine)

OpenAIRE

Puspitaningrum, Diyah

2014-01-01

A meta-search engine organizes the merging of results from multiple search engines with the aim of improving the precision of web document search results. This survey of meta-search engine ranking techniques discusses issues of pre-processing, ranking, and various techniques for merging search results from different search engines (multi-combination). Implementation issues in merging the results of 2 search engines and of 3 search engines are also highlighted. The paper also discusses directions ...
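To make the idea of merging ranked lists from several search engines concrete, here is a minimal sketch using reciprocal rank fusion. This is a well-known fusion method chosen for illustration; the abstract does not name the specific techniques the survey covers, and the function name and the constant k=60 (a commonly used default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked result lists from several engines into one ranking.

    Each result earns 1 / (k + rank) from every list it appears in, with rank
    starting at 1. Results are returned by descending total score; ties are
    broken alphabetically so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, url in enumerate(ranking, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda url: (-scores[url], url))


if __name__ == "__main__":
    engine_a = ["a", "b", "c"]
    engine_b = ["b", "a", "d"]
    engine_c = ["b", "c", "a"]
    print(reciprocal_rank_fusion([engine_a, engine_b, engine_c]))
```

Because only rank positions are used, the engines' raw relevance scores do not need to be comparable, which is the usual difficulty when combining two or three independent search engines.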
Measuring Personalization of Web Search

DEFF Research Database (Denmark)

Hannak, Aniko; Sapiezynski, Piotr; Kakhki, Arash Molavi

2013-01-01

are simply unable to access information that the search engines’ algorithm decides is irrelevant. Despite these concerns, there has been little quantification of the extent of personalization in Web search today, or the user attributes that cause it. In light of this situation, we make three contributions...... as a result of searching with a logged in account and the IP address of the searching user. Our results are a first step towards understanding the extent and effects of personalization on Web search engines today....
Search engines that learn from their users

NARCIS (Netherlands)

Schuth, A.G.

2016-01-01

More than half the world’s population uses web search engines, resulting in over half a billion search queries every single day. For many people web search engines are among the first resources they go to when a question arises. Moreover, search engines have for many become the most trusted route to
Changes in users' mental models of Web search engines after ten ...

African Journals Online (AJOL)

Ward's cluster analyses, including the Pseudo T² statistical analyses, were used to determine the mental model clusters for the seventeen salient design features of Web search engines at each time point. The cubic clustering criterion (CCC) and the dendrogram were computed for each sample to help determine the number ...
Multitasking Web Searching and Implications for Design.

Science.gov (United States)

Ozmutlu, Seda; Ozmutlu, H. C.; Spink, Amanda

2003-01-01

Findings from a study of users' multitasking searches on Web search engines include: multitasking searches are a noticeable user behavior; multitasking search sessions are longer than regular search sessions in terms of queries per session and duration; both Excite and AlltheWeb.com users search for about three topics per multitasking session and…
Search Engine Ranking, Quality, and Content of Web Pages That Are Critical Versus Noncritical of Human Papillomavirus Vaccine.

Science.gov (United States)

Fu, Linda Y; Zook, Kathleen; Spoehr-Labutta, Zachary; Hu, Pamela; Joseph, Jill G

2016-01-01

Online information can influence attitudes toward vaccination. The aim of the present study was to provide a systematic evaluation of the search engine ranking, quality, and content of Web pages that are critical versus noncritical of human papillomavirus (HPV) vaccination. We identified HPV vaccine-related Web pages with the Google search engine by entering 20 terms. We then assessed each Web page for critical versus noncritical bias and for the following quality indicators: authorship disclosure, source disclosure, attribution of at least one reference, currency, exclusion of testimonial accounts, and readability level less than ninth grade. We also determined Web page comprehensiveness in terms of mention of 14 HPV vaccine-relevant topics. Twenty searches yielded 116 unique Web pages. HPV vaccine-critical Web pages comprised roughly a third of the top, top 5- and top 10-ranking Web pages. The prevalence of HPV vaccine-critical Web pages was higher for queries that included term modifiers in addition to root terms. Compared with noncritical Web pages, Web pages critical of HPV vaccine overall had a lower quality score (p [...] engine queries despite being of lower quality and less comprehensive than noncritical Web pages. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
[Development of domain specific search engines].

Science.gov (United States)

Takai, T; Tokunaga, M; Maeda, K; Kaminuma, T

2000-01-01

As cyberspace expands at a pace that nobody ever imagined, it becomes very important to search it efficiently and effectively. One solution to this problem is search engines. Many commercial search engines are already on the market. However, these search engines respond with results so cumbersome that domain-specific experts cannot tolerate them. Using dedicated hardware and commercial software called OpenText, we have tried to develop several domain-specific search engines. These engines cover our institute's Web contents, drugs, chemical safety, endocrine disruptors, and emergency response for chemical hazards. These engines have been on our Web site for testing.
Start Your Engines: Surfing with Search Engines for Kids.

Science.gov (United States)

Byerly, Greg; Brodie, Carolyn S.

1999-01-01

Suggests that to be an effective educator and user of the Web it is essential to know the basics about search engines. Presents tips for using search engines. Describes several search engines for children and young adults, as well as some general filtered search engines for children. (AEF)
FindZebra: A search engine for rare diseases

DEFF Research Database (Denmark)

Dragusin, Radu; Petcu, Paula; Lioma, Christina Amalia

2013-01-01

Background: The web has become a primary information resource about illnesses and treatments for both medical and non-medical users. Standard web search is by far the most common interface for such information. It is therefore of interest to find out how well web search engines work for diagnostic...... approach for web search engines for rare disease diagnosis which includes 56 real life diagnostic cases, state-of-the-art evaluation measures, and curated information resources. In addition, we introduce FindZebra, a specialized (vertical) rare disease search engine. FindZebra is powered by open source...... medical concepts to demonstrate different ways of displaying the retrieved results to medical experts. Conclusions: Our results indicate that a specialized search engine can improve the diagnostic quality without compromising the ease of use of the currently widely popular web search engines. The proposed...
Health literacy and usability of clinical trial search engines.

Science.gov (United States)

Utami, Dina; Bickmore, Timothy W; Barry, Barbara; Paasche-Orlow, Michael K

2014-01-01

Several web-based search engines have been developed to assist individuals to find clinical trials for which they may be interested in volunteering. However, these search engines may be difficult for individuals with low health and computer literacy to navigate. The authors present findings from a usability evaluation of clinical trial search tools with 41 participants across the health and computer literacy spectrum. The study consisted of 3 parts: (a) a usability study of an existing web-based clinical trial search tool; (b) a usability study of a keyword-based clinical trial search tool; and (c) an exploratory study investigating users' information needs when deciding among 2 or more candidate clinical trials. From the first 2 studies, the authors found that users with low health literacy have difficulty forming queries using keywords and have significantly more difficulty using a standard web-based clinical trial search tool compared with users with adequate health literacy. From the third study, the authors identified the search factors most important to individuals searching for clinical trials and how these varied by health literacy level.
An open-source, mobile-friendly search engine for public medical knowledge.

Science.gov (United States)

Samwald, Matthias; Hanbury, Allan

2014-01-01

The World Wide Web has become an important source of information for medical practitioners. To complement the capabilities of currently available web search engines we developed FindMeEvidence, an open-source, mobile-friendly medical search engine. In a preliminary evaluation, the quality of results from FindMeEvidence proved to be competitive with those from TRIP Database, an established, closed-source search engine for evidence-based medicine.
MuZeeker - Adapting a music search engine for mobile phones

DEFF Research Database (Denmark)

Larsen, Jakob Eg; Halling, Søren Christian; Sigurdsson, Magnus Kristinn

2010-01-01

We describe MuZeeker, a search engine with domain knowledge based on Wikipedia. MuZeeker enables the user to refine a search in multiple steps by means of category selection. In the present version we focus on multimedia search related to music and we present two prototype search applications (web-based and mobile) and discuss the issues involved in adapting the search engine for mobile phones. A category based filtering approach enables the user to refine a search through relevance feedback by category selection instead of typing additional text, which is hypothesized to be an advantage in the mobile MuZeeker application. We report from two usability experiments using the think aloud protocol, in which N=20 participants performed tasks using MuZeeker and a customized Google search engine. In both experiments web-based and mobile user interfaces were used. The experiment shows that participants are capable...
Intelligent Agent Based Semantic Web in Cloud Computing Environment

OpenAIRE

Mukhopadhyay, Debajyoti; Sharma, Manoj; Joshi, Gajanan; Pagare, Trupti; Palwe, Adarsha

2013-01-01

Considering today's web scenario, there is a need for effective and meaningful search over the web, which the Semantic Web provides. Existing search engines are keyword based. They are weak at answering intelligent queries from the user because their results depend on the information available in web pages. Semantic search engines, by contrast, provide efficient and relevant results, as the semantic web is an extension of the current web in which information is given well defined meaning....
An Innovative Approach for online Meta Search Engine Optimization

OpenAIRE

Manral, Jai; Hossain, Mohammed Alamgir

2015-01-01

This paper presents an approach to identify efficient techniques used in Web Search Engine Optimization (SEO). Understanding the SEO factors that can influence page ranking in search engines is significant for webmasters who wish to attract a large number of users to their website. Unlike previous relevant research, in this study we developed an intelligent meta-search engine which aggregates results from various search engines and ranks them based on several important SEO parameters. The r...
Distributed Deep Web Search

NARCIS (Netherlands)

Tjin-Kam-Jet, Kien

2013-01-01

The World Wide Web contains billions of documents (and counting); hence, it is likely that some document will contain the answer or content you are searching for. While major search engines like Bing and Google often manage to return relevant results to your query, there are plenty of situations in
What Can Pictures Tell Us About Web Pages? Improving Document Search Using Images.

Science.gov (United States)

Rodriguez-Vaamonde, Sergio; Torresani, Lorenzo; Fitzgibbon, Andrew W

2015-06-01

Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion of each page. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine. We present a Web-scalable system that exploits a pure text-based search engine to find an initial set of candidate documents for a given query. Then, the candidate set is reranked using visual information extracted from the images contained in the pages. The resulting system retains the computational efficiency of traditional text-based search engines with only a small additional storage cost needed to encode the visual information. We test our approach on one of the TREC Million Query Track benchmarks where we show that the exploitation of visual content yields improvement in accuracies for two distinct text-based search engines, including the system with the best reported performance on this benchmark. We further validate our approach by collecting document relevance judgements on our search results using Amazon Mechanical Turk. The results of this experiment confirm the improvement in accuracy produced by our image-based reranker over a pure text-based system.
Semantic similarity measure in biomedical domain leverage web search engine.

Science.gov (United States)

Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

2010-01-01

Semantic similarity measures play an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous research in semantic web related applications has deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenging task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated by adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provided by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.
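The record above does not say which similarity scores are computed from the page counts for P, Q and P AND Q. As a hedged illustration, the sketch below uses four co-occurrence measures commonly applied to page counts (Jaccard, Dice, overlap, and pointwise mutual information). The function names, the sample counts, and the total index size `n` are assumptions made for the example.

```python
import math


def web_jaccard(p, q, pq):
    """Jaccard coefficient: hits(P AND Q) / (hits(P) + hits(Q) - hits(P AND Q))."""
    union = p + q - pq
    return pq / union if union > 0 else 0.0


def web_dice(p, q, pq):
    """Dice coefficient: 2 * hits(P AND Q) / (hits(P) + hits(Q))."""
    return 2 * pq / (p + q) if (p + q) > 0 else 0.0


def web_overlap(p, q, pq):
    """Overlap coefficient: hits(P AND Q) / min(hits(P), hits(Q))."""
    smaller = min(p, q)
    return pq / smaller if smaller > 0 else 0.0


def web_pmi(p, q, pq, n):
    """Pointwise mutual information; n is an assumed total number of indexed pages."""
    if p == 0 or q == 0 or pq == 0:
        return 0.0
    return math.log2((pq / n) / ((p / n) * (q / n)))


# Hypothetical page counts for two terms and their conjunction
p_hits, q_hits, pq_hits, index_size = 1000, 500, 250, 1_000_000
features = [
    web_jaccard(p_hits, q_hits, pq_hits),
    web_dice(p_hits, q_hits, pq_hits),
    web_overlap(p_hits, q_hits, pq_hits),
    web_pmi(p_hits, q_hits, pq_hits, index_size),
]
```

A feature vector like `features` is the kind of input that could then be passed to a support vector machine, which is the integration step the abstract mentions.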

Drexel at TREC 2014 Federated Web Search Track

Science.gov (United States)

2014-11-01

of its input RS results. 1. INTRODUCTION Federated Web Search is the task of searching multiple search engines simultaneously and combining their... or distributed properly [5]. The goal of RS is then, for a given query, to select only the most promising search engines from all those available. Most... result pages of 149 search engines. 4000 queries are used in building the sample set. As a part of the Vertical Selection task, search engines are
Next-Gen Search Engines

Science.gov (United States)

Gupta, Amardeep

2005-01-01

Current search engines--even the constantly surprising Google--seem unable to leap the next big barrier in search: the trillions of bytes of dynamically generated data created by individual web sites around the world, or what some researchers call the "deep web." The challenge now is not information overload, but information overlook.…
Overview of the TREC 2014 Federated Web Search Track

NARCIS (Netherlands)

Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

2014-01-01

The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging are again included in
Reconsidering the Rhizome: A Textual Analysis of Web Search Engines as Gatekeepers of the Internet

Science.gov (United States)

Hess, A.

Critical theorists have often drawn from Deleuze and Guattari's notion of the rhizome when discussing the potential of the Internet. While the Internet may structurally appear as a rhizome, its day-to-day usage by millions via search engines precludes experiencing the random interconnectedness and potential democratizing function. Through a textual analysis of four search engines, I argue that Web searching has grown hierarchies, or "trees," that organize data in tracts of knowledge and place users in marketing niches rather than assist in the development of new knowledge.
Meta-Search Utilizing Evolutionary Recommendation: A Web Search Architecture Proposal

Czech Academy of Sciences Publication Activity Database

Húsek, Dušan; Keyhanipour, A.; Krömer, P.; Moshiri, B.; Owais, S.; Snášel, V.

2008-01-01

Vol. 33 (2008), pp. 189-200. ISSN 1870-4069. Institutional research plan: CEZ:AV0Z10300504. Keywords: web search; meta-search engine; intelligent re-ranking; ordered weighted averaging; Boolean search queries optimizing. Subject RIV: IN - Informatics, Computer Science
Overview of the TREC 2013 Federated Web Search Track

NARCIS (Netherlands)

Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Hiemstra, Djoerd

2014-01-01

The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and to this end provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb
Comparative analysis of some search engines

Directory of Open Access Journals (Sweden)

Taiwo O. Edosomwan

2010-10-01

We compared the information retrieval performances of some popular search engines (namely, Google, Yahoo, AlltheWeb, Gigablast, Zworks, AltaVista and Bing/MSN) in response to a list of ten queries, varying in complexity. These queries were run on each search engine and the precision and response time of the retrieved results were recorded. The first ten documents on each retrieval output were evaluated as being ‘relevant’ or ‘non-relevant’ for evaluation of the search engine’s precision. To evaluate response time, normalised recall ratios were calculated at various cut-off points for each query and search engine. This study shows that Google appears to be the best search engine in terms of both average precision (70%) and average response time (2 s). Gigablast and AlltheWeb performed the worst overall in this study.
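The precision measure described in this record (relevance judgements on the first ten documents, averaged over queries) can be written as a short sketch. The function names and the sample judgements are made up for the example and are not data from the study.

```python
def precision_at_k(judgements, k=10):
    """Fraction of the top-k results judged relevant.

    `judgements` is a list of booleans in rank order (True = relevant).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(judgements[:k]) / k


def average_precision_at_k(judgements_per_query, k=10):
    """Mean of precision@k over all queries run on one search engine."""
    scores = [precision_at_k(j, k) for j in judgements_per_query]
    return sum(scores) / len(scores)


# Hypothetical judgements for two queries on one engine
query_1 = [True] * 7 + [False] * 3   # 7 of the first 10 relevant
query_2 = [True] * 5 + [False] * 5   # 5 of the first 10 relevant
engine_score = average_precision_at_k([query_1, query_2])
```

With these two made-up queries the engine's average precision at ten is 0.6.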
Estimating Search Engine Index Size Variability

DEFF Research Database (Denmark)

Van den Bosch, Antal; Bogers, Toine; De Kunder, Maurice

2016-01-01

One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel...... method of estimating the size of a Web search engine’s index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing’s indices over a nine-year period, from March 2006...... until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find...
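The extrapolation idea in this record can be sketched as follows. If a word occurs in a known fraction of documents in a static corpus, then the hit count a search engine reports for that word, divided by that fraction, gives one estimate of the engine's index size. Combining the per-word estimates with a median is an assumption of this sketch, not necessarily the authors' procedure, and all numbers are invented.

```python
from statistics import median


def estimate_index_size(hit_counts, corpus_df, corpus_size):
    """Extrapolate index size from reported hit counts and corpus document frequencies.

    hit_counts:  word -> hit count reported by the search engine
    corpus_df:   word -> number of corpus documents containing the word
    corpus_size: total number of documents in the static corpus
    """
    estimates = [
        hits * corpus_size / corpus_df[word]
        for word, hits in hit_counts.items()
        if corpus_df.get(word, 0) > 0  # skip words unseen in the corpus
    ]
    if not estimates:
        raise ValueError("no word has a usable corpus document frequency")
    return median(estimates)


# Hypothetical corpus statistics and reported hit counts
corpus_size = 1_000_000
corpus_df = {"the": 900_000, "search": 50_000}
hit_counts = {"the": 18_000_000_000, "search": 1_100_000_000}
size = estimate_index_size(hit_counts, corpus_df, corpus_size)
```

Here the two words give estimates of 2.0e10 and 2.2e10 pages, so the combined estimate is 2.1e10.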
A Taxonomic Search Engine: federating taxonomic databases using web services.

Science.gov (United States)

Page, Roderic D M

2005-03-09

The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.
FindZebra: a search engine for rare diseases.

Science.gov (United States)

Dragusin, Radu; Petcu, Paula; Lioma, Christina; Larsen, Birger; Jørgensen, Henrik L; Cox, Ingemar J; Hansen, Lars Kai; Ingwersen, Peter; Winther, Ole

2013-06-01

The web has become a primary information resource about illnesses and treatments for both medical and non-medical users. Standard web search is by far the most common interface to this information. It is therefore of interest to find out how well web search engines work for diagnostic queries and what factors contribute to successes and failures. Among diseases, rare (or orphan) diseases represent an especially challenging and thus interesting class to diagnose as each is rare, diverse in symptoms and usually has scattered resources associated with it. We design an evaluation approach for web search engines for rare disease diagnosis which includes 56 real life diagnostic cases, performance measures, information resources and guidelines for customising Google Search to this task. In addition, we introduce FindZebra, a specialized (vertical) rare disease search engine. FindZebra is powered by open source search technology and uses curated freely available online medical information. FindZebra outperforms Google Search in both default set-up and customised to the resources used by FindZebra. We extend FindZebra with specialized functionalities exploiting medical ontological information and UMLS medical concepts to demonstrate different ways of displaying the retrieved results to medical experts. Our results indicate that a specialized search engine can improve the diagnostic quality without compromising the ease of use of the currently widely popular standard web search. The proposed evaluation approach can be valuable for future development and benchmarking. The FindZebra search engine is available at http://www.findzebra.com/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
A development process meta-model for Web based expert systems: The Web engineering point of view

DEFF Research Database (Denmark)

Dokas, I.M.; Alapetite, Alexandre

2006-01-01

raised their complexity. Unfortunately, there is so far no clear answer to the question: How may the methods and experience of Web engineering and expert systems be combined and applied in order to develop effective and successful Web based expert systems? In an attempt to answer this question...... on Web based expert systems – will be presented. The idea behind the presentation of the accessibility evaluation and its conclusions is to show to Web based expert system developers, who typically have little Web engineering background, that Web engineering issues must be considered when developing Web......Similar to many legacy computer systems, expert systems can be accessed via the Web, forming a set of Web applications known as Web based expert systems. The tough Web competition, the way people and organizations rely on Web applications and the increasing user requirements for better services have...
Use of Web Search Engines and Personalisation in Information Searching for Educational Purposes

Science.gov (United States)

Salehi, Sara; Du, Jia Tina; Ashman, Helen

2018-01-01

Introduction: Students increasingly depend on Web search for educational purposes. This causes concerns among education providers as some evidence indicates that in higher education, the disadvantages of Web search and personalised information are not justified by the benefits. Method: One hundred and twenty university students were surveyed about…
Uncovering Web search strategies in South African higher education

Directory of Open Access Journals (Sweden)

Surika Civilcharran

2016-11-01

Background: In spite of the enormous amount of information available on the Web and the fact that search engines are continuously evolving to enhance the search experience, students are nevertheless faced with the difficulty of effectively retrieving information. It is, therefore, imperative for the interaction between students and search tools to be understood and search strategies to be identified, in order to promote successful information retrieval. Objectives: This study identifies the Web search strategies used by postgraduate students and forms part of a wider study into information retrieval strategies used by postgraduate students at the University of KwaZulu-Natal (UKZN), Pietermaritzburg campus, South Africa. Method: Largely underpinned by Thatcher’s cognitive search strategies, the mixed-methods approach was utilised for this study, in which questionnaires were employed in Phase 1 and structured interviews in Phase 2. This article reports and reflects on the findings of Phase 2, which focus on identifying the Web search strategies employed by postgraduate students. The Phase 1 results were reported in Civilcharran, Hughes and Maharaj (2015). Results: Findings reveal the Web search strategies used for academic information retrieval. In spite of easy access to the invisible Web and the advent of meta-search engines, Web search engines remain the preferred search tool. The UKZN online library databases, and especially the UKZN online library Online Public Access Catalogue system, are being underutilised. Conclusion: Being ranked in the top three percent of the world’s universities, UKZN is investing in search tools that are not being used to their full potential. This evidence suggests an urgent need for students to be trained in Web searching and to have a greater exposure to a variety of search tools. This article is intended to further contribute to the design of undergraduate training programmes in order to deal
TECHNIQUES USED IN SEARCH ENGINE MARKETING

OpenAIRE

Assoc. Prof. Liviu Ion Ciora Ph. D; Lect. Ion Buligiu Ph. D

2010-01-01

Search engine marketing (SEM) is a generic term covering a variety of marketing techniques intended to attract web traffic from search engines and directories. SEM is a popular tool since it has the potential for substantial gains with minimum investment. On the one hand, most search engines and directories offer free or extremely cheap listing. On the other hand, the traffic coming from search engines and directories tends to be motivated to make purchases, making these visitors some of the ...
Dyniqx: a novel meta-search engine for metadata based cross search

OpenAIRE

Zhu, Jianhan; Song, Dawei; Eisenstadt, Marc; Barladeanu, Cristi; Rüger, Stefan

2008-01-01

The effect of metadata in collection fusion has not been sufficiently studied. In response to this, we present a novel meta-search engine called Dyniqx for metadata based cross search. Dyniqx exploits the availability of metadata in academic search services such as PubMed and Google Scholar etc for fusing search results from heterogeneous search engines. In addition, metadata from these search engines are used for generating dynamic query controls such as sliders and tick boxes etc which are ...
Overview of the TREC 2014 Federated Web Search Track

OpenAIRE

Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

2014-01-01

The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging are again included in FedWeb 2014, and we additionally introduced the task of vertical selection. Other new aspects are the required link between the Resource Selection and Results Merging, and the importance of diversi...
Clinician search behaviors may be influenced by search engine design.

Science.gov (United States)

Lau, Annie Y S; Coiera, Enrico; Zrimec, Tatjana; Compton, Paul

2010-06-30

Searching the Web for documents using information retrieval systems plays an important part in clinicians' practice of evidence-based medicine. While much research focuses on the design of methods to retrieve documents, there has been little examination of the way different search engine capabilities influence clinician search behaviors. Previous studies have shown that use of task-based search engines allows for faster searches with no loss of decision accuracy compared with resource-based engines. We hypothesized that changes in search behaviors may explain these differences. In all, 75 clinicians (44 doctors and 31 clinical nurse consultants) were randomized to use either a resource-based or a task-based version of a clinical information retrieval system to answer questions about 8 clinical scenarios in a controlled setting in a university computer laboratory. Clinicians using the resource-based system could select 1 of 6 resources, such as PubMed; clinicians using the task-based system could select 1 of 6 clinical tasks, such as diagnosis. Clinicians in both systems could reformulate search queries. System logs unobtrusively capturing clinicians' interactions with the systems were coded and analyzed for clinicians' search actions and query reformulation strategies. The most frequent search action of clinicians using the resource-based system was to explore a new resource with the same query, that is, these clinicians exhibited a "breadth-first" search behaviour. Of 1398 search actions, clinicians using the resource-based system conducted 401 (28.7%, 95% confidence interval [CI] 26.37-31.11) in this way. In contrast, the majority of clinicians using the task-based system exhibited a "depth-first" search behavior in which they reformulated query keywords while keeping to the same task profiles. Of 585 search actions conducted by clinicians using the task-based system, 379 (64.8%, 95% CI 60.83-68.55) were conducted in this way. This study provides evidence that
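The record reports 95% confidence intervals for two proportions without naming the method. A Wilson score interval with z = 1.96 appears to reproduce the reported bounds, so the sketch below uses it. That choice is an inference from the numbers, not something the abstract states.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract
breadth_lo, breadth_hi = wilson_interval(401, 1398)  # resource-based system
depth_lo, depth_hi = wilson_interval(379, 585)       # task-based system

print(f"{100 * breadth_lo:.2f}-{100 * breadth_hi:.2f}")  # prints 26.37-31.11
print(f"{100 * depth_lo:.2f}-{100 * depth_hi:.2f}")      # prints 60.83-68.55
```

The two intervals do not overlap, which is consistent with the contrast the abstract draws between the breadth-first and depth-first groups.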
Semantic similarity measures in the biomedical domain by leveraging a web search engine.

Science.gov (United States)

Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

2013-07-01

Various web-related semantic similarity measures have been deployed. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have the limitation that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption does not always hold. On the other hand, if the corpus is adequate, corpus-based methodologies can overcome the limitation. The web is now an enormous and continuously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, and P AND Q, together with the web search hit counts of the defined lexico-syntactic patterns. The similarity scores of the different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validated against two datasets (dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen) are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores compared with measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores show the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database.
Discovering How Students Search a Library Web Site: A Usability Case Study.

Science.gov (United States)

Augustine, Susan; Greene, Courtney

2002-01-01

Discusses results of a usability study at the University of Illinois Chicago that investigated whether Internet search engines have influenced the way students search library Web sites. Results show students use the Web site's internal search engine rather than navigating through the pages; have difficulty interpreting library terminology; and…
GLIDERS - A web-based search engine for genome-wide linkage disequilibrium between HapMap SNPs

Directory of Open Access Journals (Sweden)

Broxholme John

2009-10-01

Background: A number of tools for the examination of linkage disequilibrium (LD) patterns between nearby alleles exist, but none are available for quickly and easily investigating LD at longer ranges (>500 kb). We have developed a web-based query tool (GLIDERS: Genome-wide LInkage DisEquilibrium Repository and Search engine) that enables the retrieval of pairwise associations with r² ≥ 0.3 across the human genome for any SNP genotyped within HapMap phase 2 and 3, regardless of distance between the markers. Description: GLIDERS is an easy to use web tool that only requires the user to enter rs numbers of SNPs they want to retrieve genome-wide LD for (both nearby and long-range). The intuitive web interface handles both manual entry of SNP IDs as well as allowing users to upload files of SNP IDs. The user can limit the resulting inter-SNP associations with easy to use menu options. These include MAF limit (5-45%), distance limits between SNPs (minimum and maximum), r² (0.3 to 1), HapMap population sample (CEU, YRI and JPT+CHB combined) and HapMap build/release. All resulting genome-wide inter-SNP associations are displayed on a single output page, which has a link to a downloadable tab delimited text file. Conclusion: GLIDERS is a quick and easy way to retrieve genome-wide inter-SNP associations and to explore LD patterns for any number of SNPs of interest. GLIDERS can be useful in identifying SNPs with long-range LD. This can highlight mis-mapping or other potential association signal localisation problems.
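The menu options listed in this abstract amount to a set of range filters over pairwise LD records. The following is a minimal sketch of that filtering step, not the GLIDERS implementation: the `LDPair` record, its field names and the `filter_pairs` function are hypothetical, and only the default thresholds (r² of at least 0.3, MAF between 5% and 45%) come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class LDPair:
    """One pairwise association (hypothetical record layout)."""
    query_snp: str
    partner_snp: str
    r2: float
    partner_maf: float
    distance_bp: int

def filter_pairs(
    pairs: List[LDPair],
    min_r2: float = 0.3,
    maf_range: Tuple[float, float] = (0.05, 0.45),
    min_dist: int = 0,
    max_dist: Optional[int] = None,
) -> List[LDPair]:
    """Keep pairs that satisfy the r2, MAF and distance limits."""
    maf_lo, maf_hi = maf_range
    kept = []
    for pair in pairs:
        if pair.r2 < min_r2:
            continue
        if not (maf_lo <= pair.partner_maf <= maf_hi):
            continue
        if pair.distance_bp < min_dist:
            continue
        if max_dist is not None and pair.distance_bp > max_dist:
            continue
        kept.append(pair)
    return kept
```

Setting `min_dist=500_000` would restrict the output to the long-range associations that the abstract highlights.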

Overview of the TREC 2013 federated web search track

OpenAIRE

Demeester, Thomas; Trieschnigg, D; Nguyen, D; Hiemstra, D

2013-01-01

The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb 2013. The focus was on basic challenges in federated search: (1) resource selection, and (2) results merging. After an overview of the provided data collection and the relevance judgments for the ...
Web-Based Undergraduate Chemistry Problem-Solving: The Interplay of Task Performance, Domain Knowledge and Web-Searching Strategies

Science.gov (United States)

She, Hsiao-Ching; Cheng, Meng-Tzu; Li, Ta-Wei; Wang, Chia-Yu; Chiu, Hsin-Tien; Lee, Pei-Zon; Chou, Wen-Chi; Chuang, Ming-Hua

2012-01-01

This study investigates the effect of Web-based Chemistry Problem-Solving, with the attributes of Web-searching and problem-solving scaffolds, on undergraduate students' problem-solving task performance. In addition, the nature and extent of Web-searching strategies students used and its correlation with task performance and domain knowledge also…
Quality Dimensions of Internet Search Engines.

Science.gov (United States)

Xie, M.; Wang, H.; Goh, T. N.

1998-01-01

Reviews commonly used search engines (AltaVista, Excite, infoseek, Lycos, HotBot, WebCrawler), focusing on existing comparative studies; considers quality dimensions from the customer's point of view based on a SERVQUAL framework; and groups these quality expectations in five dimensions: tangibles, reliability, responsiveness, assurance, and…
A Taxonomic Search Engine: Federating taxonomic databases using web services

Directory of Open Access Journals (Sweden)

Page Roderic DM

2005-03-01

Background: The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results: The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion: The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.
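The federated approach described here, in which several sources are queried and the answers summarised in one format, can be sketched as follows. This is an illustrative sketch in Python, not the PHP code of the TSE: `federated_search`, the record keys and the idea of passing each source as a plain callable are all assumptions, and no real taxonomic web service is contacted.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Lookup = Callable[[str], List[Record]]

def federated_search(name: str, sources: Dict[str, Lookup]) -> List[Record]:
    """Query every source and return hits in one consistent format."""
    results: List[Record] = []
    for source_name, lookup in sources.items():
        try:
            hits = lookup(name)
        except Exception:
            # One unavailable source should not fail the whole query.
            hits = []
        for hit in hits:
            results.append(
                {
                    "source": source_name,
                    "name": hit.get("name", ""),
                    "id": hit.get("id", ""),
                }
            )
    return results
```

Because each source is only a callable, a real deployment could wrap a web-service client per database while tests use in-memory stand-ins.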
Characterizing interdisciplinarity of researchers and research topics using web search engines.

Science.gov (United States)

Sayama, Hiroki; Akaishi, Jin

2012-01-01

Researchers' networks have been subject to active modeling and analysis. Earlier literature mostly focused on citation or co-authorship networks reconstructed from annotated scientific publication databases, which have several limitations. Recently, general-purpose web search engines have also been utilized to collect information about social networks. Here we reconstructed, using web search engines, a network representing the relatedness of researchers to their peers as well as to various research topics. Relatedness between researchers and research topics was characterized by visibility boost, that is, the increase of a researcher's visibility by focusing on a particular topic. It was observed that researchers who had high visibility boosts by the same research topic tended to be close to each other in their network. We calculated correlations between visibility boosts by research topics and researchers' interdisciplinarity at the individual level (diversity of topics related to the researcher) and at the social level (his/her centrality in the researchers' network). We found that visibility boosts by certain research topics were positively correlated with researchers' individual-level interdisciplinarity despite their negative correlations with the general popularity of researchers. It was also found that visibility boosts by network-related topics had positive correlations with researchers' social-level interdisciplinarity. Research topics' correlations with researchers' individual- and social-level interdisciplinarities were found to be nearly independent from each other. These findings suggest that the notion of "interdisciplinarity" of a researcher should be understood as a multi-dimensional concept that should be evaluated using multiple assessment means.
'Sciencenet'--towards a global search and share engine for all scientific knowledge.

Science.gov (United States)

Lütjohann, Dominic S; Shah, Asmi H; Christen, Michael P; Richter, Florian; Knese, Karsten; Liebel, Urban

2011-06-15

Modern biological experiments create vast amounts of data which are geographically distributed. These datasets consist of petabytes of raw data and billions of documents. Yet to the best of our knowledge, a search engine technology that searches and cross-links all different data types in life sciences does not exist. We have developed a prototype distributed scientific search engine technology, 'Sciencenet', which facilitates rapid searching over this large data space. By 'bringing the search engine to the data', we do not require server farms. This platform also allows users to contribute to the search index and publish their large-scale data to support e-Science. Furthermore, a community-driven method guarantees that only scientific content is crawled and presented. Our peer-to-peer approach is sufficiently scalable for the science web without performance or capacity tradeoff. The free to use search portal web page and the downloadable client are accessible at: http://sciencenet.kit.edu. The web portal for index administration is implemented in ASP.NET, the 'AskMe' experiment publisher is written in Python 2.7, and the backend 'YaCy' search engine is based on Java 1.6.
IBRI-CASONTO: Ontology-based semantic search engine

Directory of Open Access Journals (Sweden)

Awny Sayed

2017-11-01

The vast amount of information that is added to data repositories at a very fast pace creates a challenge in extracting correct and accurate information. This has increased the competition among developers to gain access to technology that seeks to understand the intent of the searcher and the contextual meaning of terms. Arabic semantic search systems, however, are still in their infancy, which can be traced back to the complexity of the Arabic language: it has complex morphological, grammatical and semantic aspects, as it is a highly inflectional and derivational language. In this paper, we present an ontological search engine called IBRI-CASONTO for the Colleges of Applied Sciences, Oman. Our proposed engine supports both Arabic and English. It also employs two types of search: keyword-based search and semantics-based search. IBRI-CASONTO is based on different technologies such as Resource Description Framework (RDF) data and an ontological graph. The experiments are presented in two sections: the first shows a comparison between the Entity-Search and the Classical-Search inside IBRI-CASONTO itself; the second compares the Entity-Search of IBRI-CASONTO with currently used search engines, such as Kngine, Wolfram Alpha and Google, the most popular engine nowadays, in order to measure their performance and efficiency.
The effect of query complexity on Web searching results

Directory of Open Access Journals (Sweden)

B.J. Jansen

2000-01-01

This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc.) and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.
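The overlap comparison reported here can be made concrete with a small sketch. The abstract does not define how overlap was computed, so the definition below (the share of baseline results that reappear for the modified query) is an assumption, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

def overlap(baseline: Iterable[str], modified: Iterable[str]) -> float:
    """Fraction of baseline results that also appear for the modified query."""
    base = set(baseline)
    if not base:
        return 0.0
    return len(base & set(modified)) / len(base)

def mean_overlap(pairs: List[Tuple[List[str], List[str]]]) -> float:
    """Average overlap over (baseline, modified) result-list pairs."""
    if not pairs:
        return 0.0
    return sum(overlap(b, m) for b, m in pairs) / len(pairs)
```

Other definitions, such as dividing by the size of the union of both result sets, would give lower values for the same data, so the choice should be stated alongside any reported figure.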
Comparison of Physics Frameworks for WebGL-Based Game Engine

Directory of Open Access Journals (Sweden)

Yogya Resa

2014-03-01

Recently, a new technology called WebGL has shown a lot of potential for developing games. However, since this technology is still new, much of its potential in game development remains unexplored. This paper tries to uncover the potential of integrating physics frameworks with WebGL technology in a game engine for developing 2D or 3D games. Specifically, we integrated three open source physics frameworks (Bullet, Cannon, and JigLib) into a WebGL-based game engine. Through experiments, we assessed these frameworks in terms of their correctness or accuracy, performance, completeness and compatibility. The results show that it is possible to integrate open source physics frameworks into a WebGL-based game engine, and that Bullet is the best physics framework to integrate into the WebGL-based game engine.
Searching for Suicide Information on Web Search Engines in Chinese

Directory of Open Access Journals (Sweden)

Yen-Feng Lee

2017-01-01

Introduction: Suicide prevention has recently become an important public health issue. However, with the growing access to information in cyberspace, harmful information is easily accessible online. We investigated the accessibility of potentially harmful suicide-related information on the internet in order to draw attention to the issue. Methods: In April 2016, we queried five search engines (Google, Yahoo, Bing, Yam, and Sina) with four suicide-related search queries (suicide, how to suicide, suicide methods, and want to die) in traditional Chinese. A psychiatrist classified the first thirty links returned by each search engine as suicide prevention, pro-suicide, neutral, unrelated to suicide, or error websites. Results: Among the total of 352 unique websites generated, suicide prevention websites were the most frequent among the search results (37.8%), followed by websites unrelated to suicide (25.9%) and neutral websites (23.0%). However, pro-suicide websites were still easily accessible (9.7%). Compared with the search engines originating in the USA and China, the search engine originating in Taiwan had the lowest accessibility to pro-suicide information. The results of ANOVA showed a significant difference between the groups, F = 8.772, P < 0.001. Conclusions: The results of this study suggest a need for further restrictions and regulations of pro-suicide information on the internet. Providing more supportive information online may be an effective plan for suicide prevention.
ONTOLOGY BASED MEANINGFUL SEARCH USING SEMANTIC WEB AND NATURAL LANGUAGE PROCESSING TECHNIQUES

Directory of Open Access Journals (Sweden)

K. Palaniammal

2013-10-01

The semantic web extends the current World Wide Web by adding facilities for machine-understandable descriptions of meaning. The ontology-based search model is used to enhance the efficiency and accuracy of information retrieval. Ontology is the core technology of the semantic web and a mechanism for representing formal, shared domain descriptions. In this paper, we propose ontology-based meaningful search using semantic web and Natural Language Processing (NLP) techniques in the educational domain. First we build the educational ontology, then we present the semantic search system. The search model consists of three parts: embedded spell-checking, finding synonyms using the WordNet API, and querying the ontology using the SPARQL language. The results are sensitive to both spell-checking and synonymous context. The system provides more accurate results and the complete details for the selected field in a single page.
Comparative Study on Three Major Internet Search Engines ...

African Journals Online (AJOL)

, Google and ask.com search engines. Experimental method was used with ten reference questions which were used to query each of the search engines. Yahoo obtained the highest results (521,801,043) among the three Web search ...
Tales from the Field: Search Strategies Applied in Web Searching

Directory of Open Access Journals (Sweden)

Soohyung Joo

2010-08-01

In their web search processes users apply multiple types of search strategies, which consist of different search tactics. This paper identifies eight types of information search strategies with associated cases based on sequences of search tactics during the information search process. Thirty-one participants representing the general public were recruited for this study. Search logs and verbal protocols offered rich data for the identification of different types of search strategies. Based on the findings, the authors further discuss how to enhance web-based information retrieval (IR) systems to support each type of search strategy.
Search engine optimization

OpenAIRE

Marolt, Klemen

2013-01-01

Search engine optimization techniques, often shortened to “SEO,” should lead to first positions in organic search results. Some optimization techniques do not change over time, yet still form the basis for SEO. However, as the Internet and web design evolves dynamically, new optimization techniques flourish and flop. Thus, we looked at the most important factors that can help to improve positioning in search results. It is important to emphasize that none of the techniques can guarantee high ...
From people to entities new semantic search paradigms for the web

CERN Document Server

Demartini, G

2014-01-01

The exponential growth of digital information available in companies and on the Web creates the need for search tools that can respond to the most sophisticated information needs. Many user tasks would be simplified if Search Engines would support typed search, and return entities instead of just Web documents. For example, an executive who tries to solve a problem needs to find people in the company who are knowledgeable about a certain topic. In the first part of the book, we propose a model for expert finding based on the well-consolidated vector space model for Information Retrieval and inv
Raising Reliability of Web Search Tool Research through Replication and Chaos Theory

OpenAIRE

Nicholson, Scott

1999-01-01

Because the World Wide Web is a dynamic collection of information, the Web search tools (or "search engines") that index the Web are dynamic. Traditional information retrieval evaluation techniques may not provide reliable results when applied to the Web search tools. This study is the result of ten replications of the classic 1996 Ding and Marchionini Web search tool research. It explores the effects that replication can have on transforming unreliable results from one iteration into replica...
Regulating Search Engines: Taking Stock And Looking Ahead

OpenAIRE

Gasser, Urs

2006-01-01

Since the creation of the first pre-Web Internet search engines in the early 1990s, search engines have become almost as important as email as a primary online activity. Arguably, search engines are among the most important gatekeepers in today's digitally networked environment. Thus, it does not come as a surprise that the evolution of search technology and the diffusion of search engines have been accompanied by a series of conflicts among stakeholders such as search operators, content crea...
Using the open Web as an information resource and scholarly Web search engines as retrieval tools for academic and research purposes

OpenAIRE

Filistea Naude; Chris Rensleigh; Adeline S.A. du Toit

2010-01-01

This study provided insight into the significance of the open Web as an information resource and Web search engines as research tools amongst academics. The academic staff establishment of the University of South Africa (Unisa) was invited to participate in a questionnaire survey and included 1188 staff members from five colleges. This study culminated in a PhD dissertation in 2008. One hundred and eighty seven respondents participated in the survey which gave a response rate of 15.7%. The re...
Web Engineering

Energy Technology Data Exchange (ETDEWEB)

White, Bebo

2003-06-23

Web Engineering is the application of systematic, disciplined and quantifiable approaches to development, operation, and maintenance of Web-based applications. It is both a pro-active approach and a growing collection of theoretical and empirical research in Web application development. This paper gives an overview of Web Engineering by addressing the questions: (a) why is it needed? (b) what is its domain of operation? (c) how does it help and what should it do to improve Web application development? and (d) how should it be incorporated in education and training? The paper discusses the significant differences that exist between Web applications and conventional software, the taxonomy of Web applications, the progress made so far and the research issues and experience of creating a specialization at the master's level. The paper reaches a conclusion that Web Engineering at this stage is a moving target since Web technologies are constantly evolving, making new types of applications possible, which in turn may require innovations in how they are built, deployed and maintained.
Searching for information on the World Wide Web with a search engine: a pilot study on cognitive flexibility in younger and older users.

Science.gov (United States)

Dommes, Aurelie; Chevalier, Aline; Rossetti, Marilyne

2010-04-01

This pilot study investigated the age-related differences in searching for information on the World Wide Web with a search engine. 11 older adults (6 men, 5 women; M age=59 yr., SD=2.76, range=55-65 yr.) and 12 younger adults (2 men, 10 women; M=23.7 yr., SD=1.07, range=22-25 yr.) had to conduct six searches differing in complexity, and for which a search method was or was not induced. The results showed that the younger and older participants provided with an induced search method were less flexible than the others and produced fewer new keywords. Moreover, older participants took longer than the younger adults, especially in the complex searches. The younger participants were flexible in the first request and spontaneously produced new keywords (spontaneous flexibility), whereas the older participants only produced new keywords when confronted by impasses (reactive flexibility). Aging may influence web searches, especially the nature of keywords used.

Teen smoking cessation help via the Internet: a survey of search engines.

Science.gov (United States)

Edwards, Christine C; Elliott, Sean P; Conway, Terry L; Woodruff, Susan I

2003-07-01

The objective of this study was to assess Web sites related to teen smoking cessation on the Internet. Seven Internet search engines were searched using the keywords teen quit smoking. The top 20 hits from each search engine were reviewed and categorized. The keywords teen quit smoking produced between 35 and 400,000 hits depending on the search engine. Of 140 potential hits, 62% were active, unique sites; 85% were listed by only one search engine; and 40% focused on cessation. Findings suggest that legitimate on-line smoking cessation help for teens is constrained by search engine choice and the amount of time teens spend looking through potential sites. Resource listings should be updated regularly. Smoking cessation Web sites need to be picked up on multiple search engine searches. Further evaluation of smoking cessation Web sites needs to be conducted to identify the most effective help for teens.
A Longitudinal Analysis of Search Engine Index Size

DEFF Research Database (Denmark)

Van den Bosch, Antal; Bogers, Toine; De Kunder, Maurice

2015-01-01

One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine’s index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing’s indexes over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find...
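The extrapolation idea in this abstract can be sketched in a few lines. The abstract gives no formula, so the sketch assumes that a word occurs in the same fraction of documents in the search engine's index as in the reference corpus, and it averages the per-word estimates with a plain mean; both choices, and the function name, are assumptions rather than the authors' stated method.

```python
from statistics import mean
from typing import Iterable, Tuple

def estimate_index_size(
    word_stats: Iterable[Tuple[int, int]], corpus_size: int
) -> float:
    """Estimate index size from (corpus_doc_freq, engine_hit_count) pairs.

    Assumes each word's document-frequency ratio is the same in the
    reference corpus and in the search engine's index.
    """
    estimates = []
    for corpus_df, engine_hits in word_stats:
        if corpus_df <= 0:
            continue  # word unseen in the corpus: no ratio to extrapolate from
        estimates.append(engine_hits * corpus_size / corpus_df)
    if not estimates:
        raise ValueError("no usable words to extrapolate from")
    return mean(estimates)
```

For instance, a word found in 100 of 1,000 corpus pages that returns 5,000,000 hits suggests an index of about 50,000,000 pages under this assumption.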
How Will Online Affiliate Marketing Networks Impact Search Engine Rankings?

OpenAIRE

Janssen, David; Heck, Eric

2007-01-01

In online affiliate marketing networks advertising web sites offer their affiliates revenues based on provided web site traffic and associated leads and sales. Advertising web sites can have a network of thousands of affiliates providing them with web site traffic through hyperlinks on their web sites. Search engines such as Google, MSN, and Yahoo, consider hyperlinks as a proof of quality and/or reliability of the linked web sites, and therefore use them to determine the relevanc...
Search Engines: Gateway to a New "Panopticon"?

Science.gov (United States)

Kosta, Eleni; Kalloniatis, Christos; Mitrou, Lilian; Kavakli, Evangelia

Nowadays, Internet users are depending on various search engines in order to be able to find requested information on the Web. Although most users feel that they are and remain anonymous when they place their search queries, reality proves otherwise. The increasing importance of search engines for the location of the desired information on the Internet usually leads to considerable inroads into the privacy of users. The scope of this paper is to study the main privacy issues with regard to search engines, such as the anonymisation of search logs and their retention period, and to examine the applicability of the European data protection legislation to non-EU search engine providers. Ixquick, a privacy-friendly meta search engine will be presented as an alternative to privacy intrusive existing practices of search engines.
Construction of web-based nutrition education contents and searching engine for usage of healthy menu of children

Science.gov (United States)

Lee, Tae-Kyong; Chung, Hea-Jung; Park, Hye-Kyung; Lee, Eun-Ju; Nam, Hye-Seon; Jung, Soon-Im; Cho, Jee-Ye; Lee, Jin-Hee; Kim, Gon; Kim, Min-Chan

2008-01-01

Dietary habits developed in childhood last for a lifetime. In this sense, nutrition education and early exposure to healthy menus in childhood are important. Children these days have easy access to the internet. Thus, a web-based nutrition education program for children is an effective tool for nutrition education of children. This site provides nutrition education material for children using characters that personify nutrients. The 151 menus are stored on the site together with video scripts of the cooking process. The menus are classified by age, menu type and the ethnic origin of the menu. The site provides a search function with three kinds of search conditions: keywords, menu type, and a "between" expression on nutrients such as calories. The site was developed with the operating system Windows 2003 Server, the web server ZEUS 5, the development language JSP, and the database management system Oracle 10g. PMID:20126375
Information Retrieval for Education: Making Search Engines Language Aware

Science.gov (United States)

Ott, Niels; Meurers, Detmar

2010-01-01

Search engines have been a major factor in making the web the successful and widely used information source it is today. Generally speaking, they make it possible to retrieve web pages on a topic specified by the keywords entered by the user. Yet web searching currently does not take into account which of the search results are comprehensible for…
A review of the reporting of web searching to identify studies for Cochrane systematic reviews.

Science.gov (United States)

Briscoe, Simon

2018-03-01

The literature searches that are used to identify studies for inclusion in a systematic review should be comprehensively reported. This ensures that the literature searches are transparent and reproducible, which is important for assessing the strengths and weaknesses of a systematic review and re-running the literature searches when conducting an update review. Web searching using search engines and the websites of topically relevant organisations is sometimes used as a supplementary literature search method. Previous research has shown that the reporting of web searching in systematic reviews often lacks important details and is thus not transparent or reproducible. Useful details to report about web searching include the name of the search engine or website, the URL, the date searched, the search strategy, and the number of results. This study reviews the reporting of web searching to identify studies for Cochrane systematic reviews published in the 6-month period August 2016 to January 2017 (n = 423). Of these reviews, 61 reviews reported using web searching using a search engine or website as a literature search method. In the majority of reviews, the reporting of web searching was found to lack essential detail for ensuring transparency and reproducibility, such as the search terms. Recommendations are made on how to improve the reporting of web searching in Cochrane systematic reviews. Copyright © 2017 John Wiley & Sons, Ltd.
Teaching AI Search Algorithms in a Web-Based Educational System

Science.gov (United States)

Grivokostopoulou, Foteini; Hatzilygeroudis, Ioannis

2013-01-01

In this paper, we present a way of teaching AI search algorithms in a web-based adaptive educational system. Teaching is based on interactive examples and exercises. Interactive examples, which use visualized animations to present AI search algorithms in a step-by-step way with explanations, are used to make learning more attractive. Practice…
Variability of patient spine education by Internet search engine.

Science.gov (United States)

Ghobrial, George M; Mehdi, Angud; Maltenfort, Mitchell; Sharan, Ashwini D; Harrop, James S

2014-03-01

Patients are increasingly reliant upon the Internet as a primary source of medical information. The educational experience varies by search engine, search term, and changes daily. There are no tools for critical evaluation of spinal surgery websites. To highlight the variability between common search engines for the same search terms. To detect bias, by prevalence of specific kinds of websites for certain spinal disorders. Demonstrate a simple scoring system of spinal disorder website for patient use, to maximize the quality of information exposed to the patient. Ten common search terms were used to query three of the most common search engines. The top fifty results of each query were tabulated. A negative binomial regression was performed to highlight the variation across each search engine. Google was more likely than Bing and Yahoo search engines to return hospital ads (P=0.002) and more likely to return scholarly sites of peer-reviewed lite (P=0.003). Educational web sites, surgical group sites, and online web communities had a significantly higher likelihood of returning on any search, regardless of search engine, or search string (P=0.007). Likewise, professional websites, including hospital run, industry sponsored, legal, and peer-reviewed web pages were less likely to be found on a search overall, regardless of engine and search string (P=0.078). The Internet is a rapidly growing body of medical information which can serve as a useful tool for patient education. High quality information is readily available, provided that the patient uses a consistent, focused metric for evaluating online spine surgery information, as there is a clear variability in the way search engines present information to the patient. Published by Elsevier B.V.
A Systematic Understanding of Successful Web Searches in Information-Based Tasks

Science.gov (United States)

Zhou, Mingming

2013-01-01

The purpose of this study is to research how Chinese university students solve information-based problems. With the Search Performance Index as the measure of search success, participants were divided into high, medium and low-performing groups. Based on their web search logs, these three groups were compared along five dimensions of the search…
Spiders and Worms and Crawlers, Oh My: Searching on the World Wide Web.

Science.gov (United States)

Eagan, Ann; Bender, Laura

Searching on the world wide web can be confusing. A myriad of search engines exist, often with little or no documentation, and many of these search engines work differently from the standard search engines people are accustomed to using. Intended for librarians, this paper defines search engines, directories, spiders, and robots, and covers basics…
Subject Gateway Sites and Search Engine Ranking.

Science.gov (United States)

Thelwall, Mike

2002-01-01

Discusses subject gateway sites and commercial search engines for the Web and presents an explanation of Google's PageRank algorithm. The principal question addressed is the conditions under which a gateway site will increase the likelihood that a target page is found in search engines. (LRW)
Web information retrieval based on ontology

Science.gov (United States)

Zhang, Jian

2013-03-01

The purpose of Information Retrieval (IR) is to find a set of documents that are relevant to a specific information need of a user. The traditional Information Retrieval model commonly used in commercial search engines is based on a keyword indexing system and Boolean logic queries. One big drawback of traditional information retrieval is that it typically retrieves information without an explicitly defined domain of interest to the user, so that a lot of irrelevant information is returned, which burdens the user with picking out useful answers from these irrelevant results. To tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of the Semantic Web is that it enhances search mechanisms through the use of ontologies. In this paper, we present our approach to personalizing a web search engine based on ontology. In addition, key techniques are also discussed. Compared to previous research, our work concentrates on semantic similarity and on the whole process, including query submission and information annotation.
Estimating search engine index size variability: a 9-year longitudinal study.

Science.gov (United States)

van den Bosch, Antal; Bogers, Toine; de Kunder, Maurice

One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine's index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing's indices over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find that much, if not all of this variability can be explained by changes in the indexing and ranking infrastructure of Google and Bing. This casts further doubt on whether Web search engines can be used reliably for cross-sectional webometric studies.
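The extrapolation idea in this abstract can be illustrated with a minimal sketch. It assumes that a word's document frequency in the static corpus is proportional to its frequency in the engine's index; the function names and the use of a median over several words are illustrative assumptions, not the authors' published procedure.

```python
from statistics import median


def estimate_index_size(corpus_size, corpus_doc_freq, reported_hits):
    """Extrapolate an index size from a single word.

    If a word occurs in corpus_doc_freq out of corpus_size corpus documents
    and the engine reports reported_hits matching pages, then under the
    proportionality assumption:
        index_size ~= reported_hits * corpus_size / corpus_doc_freq
    """
    if corpus_doc_freq <= 0:
        raise ValueError("word must occur in at least one corpus document")
    return reported_hits * corpus_size / corpus_doc_freq


def estimate_from_words(corpus_size, observations):
    """Combine per-word estimates; observations is a list of
    (corpus_doc_freq, reported_hits) pairs. The median is an assumption
    chosen here to dampen the effect of outlier words."""
    estimates = [
        estimate_index_size(corpus_size, df, hits) for df, hits in observations
    ]
    return median(estimates)
```

Because each word yields its own estimate, day-to-day changes in reported hit counts translate directly into variability of the estimated index size.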
Music Search Engines: Specifications and Challenges

DEFF Research Database (Denmark)

Nanopoulos, Alexandros; Rafilidis, Dimitrios; Manolopoulos, Yannis

2009-01-01

Nowadays we have a proliferation of music data available over the Web. One of the imperative challenges is how to search these vast, global-scale musical resources to find preferred music. Recent research has envisaged the notion of music search engines (MSEs) that allow for searching preferred...
Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.

Science.gov (United States)

Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas

2017-07-24

Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large amounts of them eventually are available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed Nearest Neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces in combination with an optimized Earth Mover Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and optionally, 16S rRNA copy number correction. The result of a query microbiome sample is the contextualization against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny based distance metric, while providing numerous advantages over the current state of the art tool.
Exploring the Relevance of Search Engines: An Overview of Google as a Case Study

Directory of Open Access Journals (Sweden)

Ricardo Beltrán-Alfonso

2017-08-01

The huge amount of data on the Internet and the diverse list of strategies used to try to link this information with relevant searches through Linked Data have generated a revolution in data treatment and its representation. Nevertheless, conventional search engines like Google remain well-received strategies for carrying out searches. The following article presents a study of the development and evolution of search engines and, more specifically, analyzes the relevance of findings based on the number of results displayed in paging systems, with Google as a case study. Finally, it is intended to contribute to indexing criteria in search results, based on an approach to the Semantic Web as a stage in the evolution of the Web.
Automatic Planning of External Search Engine Optimization

Directory of Open Access Journals (Sweden)

Vita Jasevičiūtė

2015-07-01

This paper describes an investigation of an external search engine optimization (SEO) action planning tool, dedicated to automatically extracting a small set of the most important keywords for each month over a whole year. The keywords in the set are extracted according to externally measured parameters, such as the average number of searches during the year and for every month individually. Additionally, the position of the optimized web site for each keyword is taken into account. The generated optimization plan is similar to the optimization plans prepared manually by SEO professionals and can be successfully used as a support tool for web site search engine optimization.
Querying archetype-based EHRs by search ontology-based XPath engineering.

Science.gov (United States)

Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich

2018-05-11

Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free-text searches, which lead to false positive results, precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts in the XML document structure. Such a query specification method is practically introduced and evaluated by applying concrete research questions formulated in natural language on a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the usage of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content are necessary for a further improvement of precision and recall. A key limitation is that the application of the introduced process requires skills in ontology and software development. In the future, the time-consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects Search Terms to certain parts in XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering by the specification of complex XPath expressions without deep syntax knowledge about XPaths.
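The contrast between a free-text search and a search constrained to one part of the document can be sketched as follows. The XML layout, element names and helper functions are hypothetical and far simpler than an archetype-based EHR; the sketch uses only the limited XPath subset of Python's standard library and does not reproduce the ontology-driven generation of XPath expressions described in the abstract.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified records; real archetype-based EHRs are far richer.
EHR_XML = """
<ehr>
  <record id="1">
    <diagnosis>Carcinoma of the colon</diagnosis>
    <history>No family history of carcinoma</history>
  </record>
  <record id="2">
    <diagnosis>Appendicitis</diagnosis>
    <history>Mother treated for carcinoma</history>
  </record>
</ehr>
"""


def free_text_search(root, term):
    """Match the term anywhere in a record's text (prone to false positives)."""
    term = term.lower()
    return [
        rec.get("id")
        for rec in root.iter("record")
        if term in "".join(rec.itertext()).lower()
    ]


def constrained_search(root, term, path):
    """Match the term only in the elements selected by an XPath-style path."""
    term = term.lower()
    hits = []
    for rec in root.iter("record"):
        for node in rec.findall(path):
            if term in (node.text or "").lower():
                hits.append(rec.get("id"))
                break
    return hits


root = ET.fromstring(EHR_XML)
```

Here `free_text_search(root, "carcinoma")` also returns record 2, whose diagnosis is unrelated, while `constrained_search(root, "carcinoma", "./diagnosis")` returns only record 1.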
Federated Search and the Library Web Site: A Study of Association of Research Libraries Member Web Sites

Science.gov (United States)

Williams, Sarah C.

2010-01-01

The purpose of this study was to investigate how federated search engines are incorporated into the Web sites of libraries in the Association of Research Libraries. In 2009, information was gathered for each library in the Association of Research Libraries with a federated search engine. This included the name of the federated search service and…

Snippet-based relevance predictions for federated web search

NARCIS (Netherlands)

Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd

How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and
Balancing Efficiency and Effectiveness for Fusion-Based Search Engines in the "Big Data" Environment

Science.gov (United States)

Li, Jieyu; Huang, Chunlan; Wang, Xiuhong; Wu, Shengli

2016-01-01

Introduction: In the big data age, we have to deal with a tremendous amount of information, which can be collected from various types of sources. For information search systems such as Web search engines or online digital libraries, the collection of documents becomes larger and larger. For some queries, an information search system needs to…
Considerations for the development of task-based search engines

DEFF Research Database (Denmark)

Petcu, Paula; Dragusin, Radu

2013-01-01

Based on previous experience from working on a task-based search engine, we present a list of suggestions and ideas for an Information Retrieval (IR) framework that could inform the development of next generation professional search systems. The specific task that we start from is the clinicians' information need in finding rare disease diagnostic hypotheses at the time and place where medical decisions are made. Our experience from the development of a search engine focused on supporting clinicians in completing this task has provided us valuable insights in what aspects should be considered by the developers of vertical search engines.
Reflections on New Search Engine 新型搜索引擎畅想

OpenAIRE

Huang, Jiannian

2007-01-01

[English abstract] The rapid increase in the need for internet information resources has led to a rush of search engines. This article introduces some new types of search engines that are appearing or will appear. These search engines include the following: grey document search engine, invisible web search engine, knowledge discovery search engine, clustering meta search engine, academic clustering search engine, conception comparison and conception analogy search engine, consultation search engine, teachi...
Sound Search Engine Concept

DEFF Research Database (Denmark)

2006-01-01

Sound search is provided by the major search engines; however, indexing is text based, not sound based. We will establish a dedicated sound search service based on sound feature indexing. The current demo shows the concept of the sound search engine. The first engine will be released June...
[Biomedical information on the internet using search engines. A one-year trial].

Science.gov (United States)

Corrao, Salvatore; Leone, Francesco; Arnone, Sabrina

2004-01-01

The internet is a communication medium and content distributor that provides information in a general sense, but it could be of great utility for the search and retrieval of biomedical information. Search engines are a valuable means of rapidly finding information on the net. However, we do not know whether general search engines and meta-search engines are reliable for finding useful and validated biomedical information. The aim of our study was to verify the reproducibility of a search by keywords (pediatric or evidence) using 9 international search engines and 1 meta-search engine at baseline and after a one-year period. We analysed the first 20 citations returned by each search. We evaluated the formal quality of Web sites and their domain extensions. Moreover, we compared the output of each search at the start of this study and after a one-year period, and we took the number of Web sites cited again as a criterion of reliability. We found some interesting results that are reported throughout the text. Our findings point to an extreme dynamicity of information on the Web and, for this reason, we advise great caution when using search and meta-search engines as a tool for searching and retrieving reliable biomedical information. On the other hand, some search and meta-search engines could be very useful as a first step for better defining a search and, moreover, for finding institutional Web sites. This paper supports a more conscious approach to the universe of biomedical information on the internet.
A longitudinal analysis of search engine index size

NARCIS (Netherlands)

Bosch, A.P.J. van den; Bogers, T.; Kunder, M. de; Salah, A. A.; Tonta, Y.; Salah, A. A. A.; Sugimoto, C.; Al, U.

2015-01-01

One of the determining factors of the quality of Web search engines is the size and quality of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We
Improving Web Page Retrieval using Search Context from Clicked Domain Names

NARCIS (Netherlands)

Li, R.

Search context is a crucial factor that helps to understand a user’s information need in ad-hoc Web page retrieval. A query log of a search engine contains rich information on issued queries and their corresponding clicked Web pages. The clicked data implies its relevance to the query and can be
Quality of Web-Based Information on Cannabis Addiction

Science.gov (United States)

Khazaal, Yasser; Chatton, Anne; Cochand, Sophie; Zullino, Daniele

2008-01-01

This study evaluated the quality of Web-based information on cannabis use and addiction and investigated particular content quality indicators. Three keywords ("cannabis addiction," "cannabis dependence," and "cannabis abuse") were entered into two popular World Wide Web search engines. Websites were assessed with a standardized proforma designed…
SearchResultFinder: federated search made easy

NARCIS (Netherlands)

Trieschnigg, Rudolf Berend; Tjin-Kam-Jet, Kien; Hiemstra, Djoerd

Building a federated search engine based on a large number existing web search engines is a challenge: implementing the programming interface (API) for each search engine is an exacting and time-consuming job. In this demonstration we present SearchResultFinder, a browser plugin which speeds up
Searching for a New Way to Reach Patrons: A Search Engine Optimization Pilot Project at Binghamton University Libraries

Science.gov (United States)

Rushton, Erin E.; Kelehan, Martha Daisy; Strong, Marcy A.

2008-01-01

Search engine use is one of the most popular online activities. According to a recent OCLC report, nearly all students start their electronic research using a search engine instead of the library Web site. Instead of viewing search engines as competition, however, librarians at Binghamton University Libraries decided to employ search engine…
Grokker, KartOO, Addict-o-Matic and More: Really Different Search Engines

Science.gov (United States)

Descy, Don E.

2009-01-01

There are hundreds of unique search engines in the United States and thousands of unique search engines around the world. If people get into search engines designed just to search particular web sites, the number is in the hundreds of thousands. This article looks at: (1) clustering search engines, such as KartOO (www.kartoo.com) and Grokker…
How Google Web Search copes with very similar documents

NARCIS (Netherlands)

W. Mettrop (Wouter); P. Nieuwenhuysen; H. Smulders

2006-01-01

A significant portion of the computer files that carry documents, multimedia, programs etc. on the Web are identical or very similar to other files on the Web. How do search engines cope with this? Do they perform some kind of “deduplication”? How should users take into account that
Web search queries can predict stock market volumes.

Science.gov (United States)

Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

2012-01-01

We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and the spreading of epidemics. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which enables us to investigate also the user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.
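One simple way to check whether query volumes anticipate trading volumes is a lagged Pearson correlation, sketched below. This is an illustrative assumption about how such a relationship could be tested, not the authors' analysis; the function names are hypothetical and the two series are plain lists of daily values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(queries, volumes, lag):
    """Correlate query volume on day t with trading volume on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(queries, volumes)
    return pearson(queries[:-lag], volumes[lag:])
```

If the correlation at a lag of one day exceeds the correlation at lag zero, the query series tends to lead the trading series, which is the pattern the abstract describes for many stocks.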
Nuclear expert web search and crawler algorithm

International Nuclear Information System (INIS)

Reis, Thiago; Barroso, Antonio C.O.; Baptista, Benedito Filho D.

2013-01-01

In this paper we present preliminary research on a web search and crawling algorithm applied specifically to nuclear-related web information. We designed a web-based nuclear-oriented expert system guided by a web crawler algorithm and a neural network able to search and retrieve nuclear-related hypertextual web information in an autonomous and massive fashion. Preliminary experimental results show a retrieval precision of 80% for web pages related to any nuclear theme and a retrieval precision of 72% for web pages related only to the nuclear power theme. (author)
Nuclear expert web search and crawler algorithm

Energy Technology Data Exchange (ETDEWEB)

Reis, Thiago; Barroso, Antonio C.O.; Baptista, Benedito Filho D., E-mail: thiagoreis@usp.br, E-mail: barroso@ipen.br, E-mail: bdbfilho@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil)

2013-07-01

In this paper we present preliminary research on a web search and crawling algorithm applied specifically to nuclear-related web information. We designed a web-based nuclear-oriented expert system guided by a web crawler algorithm and a neural network able to search and retrieve nuclear-related hypertextual web information in an autonomous and massive fashion. Preliminary experimental results show a retrieval precision of 80% for web pages related to any nuclear theme and a retrieval precision of 72% for web pages related only to the nuclear power theme. (author)
Chemical Information in Scirus and BASE (Bielefeld Academic Search Engine)

Science.gov (United States)

Bendig, Regina B.

2009-01-01

The author sought to determine to what extent the two search engines, Scirus and BASE (Bielefeld Academic Search Engine), would be useful to first-year university students as the first point of searching for chemical information. Five topics were searched and the first ten records of each search result were evaluated with regard to the type of…
Collaborative Web Search Who, What, Where, When, and Why

CERN Document Server

Morris, Meredith Ringel

2009-01-01

Today, Web search is treated as a solitary experience. Web browsers and search engines are typically designed to support a single user, working alone. However, collaboration on information-seeking tasks is actually commonplace. Students work together to complete homework assignments, friends seek information about joint entertainment opportunities, family members jointly plan vacation travel, and colleagues jointly conduct research for their projects. As improved networking technologies and the rise of social media simplify the process of remote collaboration, and large, novel display form-fac
Intelligent Search Optimization using Artificial Fuzzy Logics

OpenAIRE

Manral, Jai

2015-01-01

Information on the web is prodigious; searching for relevant information is difficult, making web users rely on search engines to find relevant information on the web. Search engines index and categorize web pages according to their contents using crawlers and rank them accordingly. For a given user query they retrieve millions of webpages and display them to users according to web-page rank. Every search engine has its own algorithms based on certain parameters for ranking web-pages. Searc...

What Snippets Say About Pages in Federated Web Search

NARCIS (Netherlands)

Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd; Hou, Yuexian; Nie, Jian-Yun; Sun, Le; Wang, Bo; Zhang, Peng

2012-01-01

What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research
Evaluating search effectiveness of some selected search engines ...

African Journals Online (AJOL)

With advancement in technology, many individuals are getting familiar with the internet, and a lot of users seek information on the World Wide Web (WWW) using a variety of search engines. This research work evaluates the retrieval effectiveness of Google, Yahoo, Bing, AOL and Baidu. Precision, relative recall and response ...
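The two effectiveness measures named in this abstract can be computed as in the sketch below. It assumes a commonly used definition of relative recall (relevant documents retrieved by one engine, divided by the relevant documents retrieved by all engines together); the abstract does not state which definition the study used, and the function names are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of an engine's retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def relative_recall(results_by_engine, relevant):
    """Relevant documents found by each engine, relative to the pool of
    relevant documents found by all engines together (assumed definition)."""
    found = {
        engine: set(docs) & relevant
        for engine, docs in results_by_engine.items()
    }
    pool = set().union(*found.values()) if found else set()
    if not pool:
        return {engine: 0.0 for engine in found}
    return {engine: len(docs) / len(pool) for engine, docs in found.items()}
```

Relative recall is used in place of true recall because the full set of relevant pages on the Web cannot be enumerated.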
Semantic interpretation of search engine resultant

Science.gov (United States)

Nasution, M. K. M.

2018-01-01

In semantics, logical language can be interpreted in various forms, but the certainty of meaning is included in the uncertainty, which directly always influences the role of technology. One result of this uncertainty applies to search engines as user interfaces with information spaces such as the Web. Therefore, the behaviour of search engine results should be interpreted with certainty through semantic formulation as interpretation. Behaviour formulation shows there are various interpretations that can be done semantically: temporary, inclusion, or repeat.
Survey of formal and informal citation in Google search engine

Directory of Open Access Journals (Sweden)

Afsaneh Teymourikhani

2016-03-01

Aim: Informal citations are bibliographic information (title or Internet address) citing information resources for informal scholarly communication, and they are always neglected in traditional citation databases. This study was carried out to answer the question of whether informal citations in the web environment are traceable. The present research aims to determine what proportion of the web citations in the Google search engine is related to formal and informal citation. Research method: Webometrics is the method used. The study covers 1344 research articles from 98 open access journals, and the method used to extract web citations from the Google search engine is "Web/URL citation extraction". Findings: The findings showed that ten percent of the web citations in the Google search engine are formal and informal citations. The highest share of formal citations in the Google search engine, 19.27%, is in the field of library and information science, and the lowest, 1.54%, is in the field of civil engineering. The highest percentage of informal citations, 3.57%, is in sociology, and the lowest, 0.39%, is in civil engineering. Journal citation is highest, at 94.12%, in the surgical field and lowest, at 5.26%, in the field of philosophy. Result: Since formal and informal citations in the Google search engine amount to about 10 percent, a reduction compared to previous research, it seems that tracking citations with this engine should be treated with more caution. The amount of formal citation varies across disciplines. Cited journals are highest in the field of surgery and lowest in the field of philosophy; this indicates that in philosophy, which is a subset of the social sciences, journals do not play a significant role in scientific communication. On the other hand, books have a key role in this field
A web search on environmental topics: what is the role of ranking?

Science.gov (United States)

Covolo, Loredana; Filisetti, Barbara; Mascaretti, Silvia; Limina, Rosa Maria; Gelatti, Umberto

2013-12-01

Although the Internet is easy to use, the mechanisms and logic behind a Web search are often unknown. Reliable information can be obtained, but it may not be visible as the Web site is not located in the first positions of search results. The possible risks of adverse health effects arising from environmental hazards are issues of increasing public interest, and therefore the information about these risks, particularly on topics for which there is no scientific evidence, is very crucial. The aim of this study was to investigate whether the presentation of information on some environmental health topics differed among various search engines, assuming that the most reliable information should come from institutional Web sites. Five search engines were used: Google, Yahoo!, Bing, Ask, and AOL. The following topics were searched in combination with the word "health": "nuclear energy," "electromagnetic waves," "air pollution," "waste," and "radon." For each topic three key words were used. The first 30 search results for each query were considered. The ranking variability among the search engines and the type of search results were analyzed for each topic and for each key word. The ranking of institutional Web sites was given particular consideration. Variable results were obtained when surfing the Internet on different environmental health topics. Multivariate logistic regression analysis showed that, when searching for radon and air pollution topics, it is more likely to find institutional Web sites in the first 10 positions compared with nuclear power (odds ratio=3.4, 95% confidence interval 2.1-5.4 and odds ratio=2.9, 95% confidence interval 1.8-4.7, respectively) and also when using Google compared with Bing (odds ratio=3.1, 95% confidence interval 1.9-5.1). The increasing use of online information could play an important role in forming opinions. Web users should become more aware of the importance of finding reliable information, and health institutions should be
The sources and popularity of online drug information: an analysis of top search engine results and web page views.

Science.gov (United States)

Law, Michael R; Mintzes, Barbara; Morgan, Steven G

2011-03-01

The Internet has become a popular source of health information. However, there is little information on what drug information and which Web sites are being searched. To investigate the sources of online information about prescription drugs by assessing the most common Web sites returned in online drug searches and to assess the comparative popularity of Web pages for particular drugs. This was a cross-sectional study of search results for the most commonly dispensed drugs in the US (n=278 active ingredients) on 4 popular search engines: Bing, Google (both US and Canada), and Yahoo. We determined the number of times a Web site appeared as the first result. A linked retrospective analysis counted Wikipedia page hits for each of these drugs in 2008 and 2009. About three quarters of the first result on Google USA for both brand and generic names linked to the National Library of Medicine. In contrast, Wikipedia was the first result for approximately 80% of generic name searches on the other 3 sites. On these other sites, over two thirds of brand name searches led to industry-sponsored sites. The Wikipedia pages with the highest number of hits were mainly for opiates, benzodiazepines, antibiotics, and antidepressants. Wikipedia and the National Library of Medicine rank highly in online drug searches. Further, our results suggest that patients most often seek information on drugs with the potential for dependence, for stigmatized conditions, that have received media attention, and for episodic treatments. Quality improvement efforts should focus on these drugs.
Exploiting Semantic Web Technologies to Develop OWL-Based Clinical Practice Guideline Execution Engines.

Science.gov (United States)

Jafarpour, Borna; Abidi, Samina Raza; Abidi, Syed Sibte Raza

2016-01-01

Computerizing paper-based CPG and then executing them can provide evidence-informed decision support to physicians at the point of care. Semantic web technologies especially web ontology language (OWL) ontologies have been profusely used to represent computerized CPG. Using semantic web reasoning capabilities to execute OWL-based computerized CPG unties them from a specific custom-built CPG execution engine and increases their shareability as any OWL reasoner and triple store can be utilized for CPG execution. However, existing semantic web reasoning-based CPG execution engines suffer from lack of ability to execute CPG with high levels of expressivity, high cognitive load of computerization of paper-based CPG and updating their computerized versions. In order to address these limitations, we have developed three CPG execution engines based on OWL 1 DL, OWL 2 DL and OWL 2 DL + semantic web rule language (SWRL). OWL 1 DL serves as the base execution engine capable of executing a wide range of CPG constructs, however for executing highly complex CPG the OWL 2 DL and OWL 2 DL + SWRL offer additional executional capabilities. We evaluated the technical performance and medical correctness of our execution engines using a range of CPG. Technical evaluations show the efficiency of our CPG execution engines in terms of CPU time and validity of the generated recommendation in comparison to existing CPG execution engines. Medical evaluations by domain experts show the validity of the CPG-mediated therapy plans in terms of relevance, safety, and ordering for a wide range of patient scenarios.
A novel architecture for information retrieval system based on semantic web

Science.gov (United States)

Zhang, Hui

2011-12-01

Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor), so that the web now faces a new challenge of information overload. The challenge before us is not only to help people locate relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web documents are in human-oriented formats that are suitable for presentation, but machines cannot understand the meaning of a document. To address this issue, Berners-Lee proposed the concept of the semantic web. With semantic web technology, web information can be understood and processed by machines. It provides new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when there is not enough knowledge available to the information retrieval system, the system returns a large number of meaningless results to users because of the huge amount of information. In this paper, we present the architecture of an information retrieval system based on the semantic web. In addition, our system employs an inference engine to check whether a query should be posed to the keyword-based search engine or to the semantic search engine.
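The routing decision described in this abstract can be illustrated with a minimal sketch. It is not the paper's inference engine: the `KNOWN_CONCEPTS` vocabulary, the `route_query` function and the `min_coverage` threshold are all hypothetical stand-ins for whatever ontology knowledge a real system would consult.

```python
# Hypothetical sketch: send a query to the semantic engine only when enough
# of its terms are covered by the available knowledge; otherwise fall back
# to plain keyword search.

# Assumed toy vocabulary standing in for ontology concepts.
KNOWN_CONCEPTS = {"author", "paper", "conference", "university", "topic"}


def route_query(query: str, min_coverage: float = 0.5) -> str:
    """Return "semantic" if enough query terms map to known concepts,
    otherwise return "keyword"."""
    terms = [t.lower() for t in query.split()]
    if not terms:
        return "keyword"
    covered = sum(1 for t in terms if t in KNOWN_CONCEPTS)
    coverage = covered / len(terms)
    return "semantic" if coverage >= min_coverage else "keyword"
```

A query made up of known concepts, such as "author paper", would be routed to the semantic engine, while a query with no recognized terms would go to the keyword-based engine.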
Minimalist instruction for learning to search the World Wide Web

NARCIS (Netherlands)

Lazonder, Adrianus W.

2001-01-01

This study examined the efficacy of minimalist instruction to develop self-regulatory skills involved in Web searching. Two versions of minimalist self-regulatory skill instruction were compared to a control group that was merely taught procedural skills to operate the search engine. Acquired skills
The internet and intelligent machines: search engines, agents and robots

International Nuclear Information System (INIS)

Achenbach, S.; Alfke, H.

2000-01-01

The internet plays an important role in a growing number of medical applications. Finding relevant information is not always easy as the amount of available information on the Web is rising quickly. Even the best Search Engines can only collect links to a fraction of all existing Web pages. In addition, many of these indexed documents have been changed or deleted. The vast majority of information on the Web is not searchable with conventional methods. New search strategies, technologies and standards are combined in Intelligent Search Agents (ISA) and Robots, which can retrieve desired information in a specific approach. Conclusion: The article describes differences between ISAs and conventional Search Engines and how communication between Agents improves their ability to find information. Examples of existing ISAs are given and the possible influences on the current and future work in radiology are discussed. (orig.)
Process-oriented semantic web search

CERN Document Server

Tran, DT

2011-01-01

The book is composed of two main parts. The first part is a general study of Semantic Web Search. The second part specifically focuses on the use of semantics throughout the search process, compiling a big picture of Process-oriented Semantic Web Search from different pieces of work that target specific aspects of the process. In particular, this book provides a rigorous account of the concepts and technologies proposed for searching resources and semantic data on the Semantic Web. To collate the various approaches and to better understand what the notion of Semantic Web Search entails, this bo
The History of the Internet Search Engine: Navigational Media and the Traffic Commodity

Science.gov (United States)

van Couvering, E.

This chapter traces the economic development of the search engine industry over time, beginning with the earliest Web search engines and ending with the domination of the market by Google, Yahoo! and MSN. Specifically, it focuses on the ways in which search engines are similar to and different from traditional media institutions, and how the relations between traditional and Internet media have changed over time. In addition to its historical overview, a core contribution of this chapter is the analysis of the industry using a media value chain based on audiences rather than on content, and the development of traffic as the core unit of exchange. It shows that traditional media companies failed when they attempted to create vertically integrated portals in the late 1990s, based on the idea of controlling Internet content, while search engines succeeded in creating huge "virtually integrated" networks based on control of Internet traffic rather than Internet content.
A Web Search on Environmental Topics: What Is the Role of Ranking?

Science.gov (United States)

Filisetti, Barbara; Mascaretti, Silvia; Limina, Rosa Maria; Gelatti, Umberto

2013-01-01

Abstract Background: Although the Internet is easy to use, the mechanisms and logic behind a Web search are often unknown. Reliable information can be obtained, but it may not be visible as the Web site is not located in the first positions of search results. The possible risks of adverse health effects arising from environmental hazards are issues of increasing public interest, and therefore the information about these risks, particularly on topics for which there is no scientific evidence, is very crucial. The aim of this study was to investigate whether the presentation of information on some environmental health topics differed among various search engines, assuming that the most reliable information should come from institutional Web sites. Materials and Methods: Five search engines were used: Google, Yahoo!, Bing, Ask, and AOL. The following topics were searched in combination with the word “health”: “nuclear energy,” “electromagnetic waves,” “air pollution,” “waste,” and “radon.” For each topic three key words were used. The first 30 search results for each query were considered. The ranking variability among the search engines and the type of search results were analyzed for each topic and for each key word. The ranking of institutional Web sites was given particular consideration. Results: Variable results were obtained when surfing the Internet on different environmental health topics. Multivariate logistic regression analysis showed that, when searching for radon and air pollution topics, it is more likely to find institutional Web sites in the first 10 positions compared with nuclear power (odds ratio=3.4, 95% confidence interval 2.1–5.4 and odds ratio=2.9, 95% confidence interval 1.8–4.7, respectively) and also when using Google compared with Bing (odds ratio=3.1, 95% confidence interval 1.9–5.1). Conclusions: The increasing use of online information could play an important role in forming opinions. Web users should become
Open meta-search with OpenSearch: a case study

OpenAIRE

O'Riordan, Adrian P.

2007-01-01

The goal of this project was to demonstrate the possibilities of open source search engine and aggregation technology in a Web environment by building a meta-search engine which employs free open search engines and open protocols. In contrast many meta-search engines on the Internet use proprietary search systems. The search engines employed in this case study are all based on the OpenSearch protocol. OpenSearch-compliant systems support XML technologies such as RSS and Atom for aggregation a...
Analysis of Search Engines and Meta Search Engines' Position by University of Isfahan Users Based on Rogers' Diffusion of Innovation Theory

Directory of Open Access Journals (Sweden)

Maryam Akbari

2012-10-01

The present study analyzed the adoption process of search engines and meta search engines by University of Isfahan users during 2009-2010, based on Rogers' diffusion of innovation theory. The main aim of the research was to study the rate of adoption and to recognize the potentials and effective tools in the adoption of search engines and meta search engines among University of Isfahan users. The research method was a descriptive survey. The population of the study comprised all postgraduate students of the University of Isfahan; 351 students were selected as the sample by stratified random sampling. A questionnaire was used for collecting data. The collected data were analyzed using SPSS 16 with both descriptive and analytic statistics. For descriptive statistics, frequency, percentage and mean were used, while for analytic statistics the t-test and the Kruskal-Wallis non-parametric test (H-test) were used. The findings of the t-test and the Kruskal-Wallis test indicated that the mean adoption of search engines and meta search engines did not differ statistically by gender, level of education or faculty. The adoption process of special search engines differed by gender but not by level of education or faculty. Other results indicated that among general search engines, Google had the highest adoption rate. In addition, Google Scholar had the highest adoption rate among the special search engines, and Mamma among the meta search engines. Findings also showed that friends played an important role in how students adopted general search engines, while professors played an important role in how students adopted special search engines and meta search engines. Moreover, results showed that the place where students became most acquainted with search engines and meta search engines was the university. The findings showed that the adoption rate curve was not normal and was also not S-shaped. Moreover
Penerapan Teknik SEO (Search Engine Optimization) pada Website dalam Strategi Pemasaran melalui Internet

Directory of Open Access Journals (Sweden)

Rony Baskoro Lukito

2014-12-01

The purpose of this research is to determine how to optimize a web design so that it can increase the number of visitors. The number of Internet users in the world continues to grow in line with advances in information technology. Marketing of products and services no longer relies only on printed and electronic media. Moreover, the cost of using the Internet as a marketing medium is relatively low compared with the use of television. The Internet as a marketing medium reaches different parts of the world 24 hours a day. But for an Internet site to be visited by many Internet users, it is not enough for the site to look good from the outside. Web sites that serve as a marketing medium must be built according to the correct rules so that they become an optimal marketing medium. One of the good rules in building an Internet site as a marketing medium concerns how well the content of the site is indexed by search engines such as Google. Search engine optimization will focus on the Google search engine, because 83% of Internet users across the world use Google as their search engine. Search engine optimization, commonly known as SEO, is an important rule that makes the Internet site easier for a user to find with the desired keywords.
Enhancing discovery in spatial data infrastructures using a search engine

Directory of Open Access Journals (Sweden)

Paolo Corti

2018-05-01

A spatial data infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information. One of the key software components of an SDI is the catalogue service, which is needed to discover, query and manage the metadata. Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard, which defines common interfaces for accessing the metadata information. A search engine is a software system capable of supporting fast and reliable search, which may use ‘any means necessary’ to get users to the resources they need quickly and efficiently. These techniques may include full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting, recommendations and many others. In this paper we present an example of a search engine being added to an SDI to improve search against large collections of geospatial datasets. The Centre for Geographic Analysis (CGA) at Harvard University re-engineered the search component of its public domain SDI (Harvard WorldMap), which is based on the GeoNode platform. A search engine was added to the SDI stack to enhance the CSW catalogue discovery abilities. It is now possible to discover spatial datasets from metadata by using the standard search operations of the catalogue and to take advantage of the new abilities of the search engine, to return relevant and reliable content to SDI users.
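Two of the techniques listed in this abstract, weighted results and faceting, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the WorldMap implementation: the `RECORDS` metadata, the `FIELD_WEIGHTS` values and the `search` function are all hypothetical.

```python
from collections import Counter

# Hypothetical metadata records; the field names are assumptions.
RECORDS = [
    {"id": 1, "title": "Boston roads", "abstract": "Road network of Boston", "type": "vector"},
    {"id": 2, "title": "Elevation model", "abstract": "Raster elevation near Boston", "type": "raster"},
    {"id": 3, "title": "Boston parks", "abstract": "Parks and open space", "type": "vector"},
]

# Assumed weights: a hit in the title counts more than a hit in the abstract.
FIELD_WEIGHTS = {"title": 2.0, "abstract": 1.0}


def search(term, records=RECORDS):
    """Return (ranked record ids, facet counts by dataset type) for one term."""
    term = term.lower()
    scored = []
    for rec in records:
        # Weighted full-text score: occurrences of the term per field,
        # multiplied by that field's weight.
        score = sum(
            weight * rec[field].lower().split().count(term)
            for field, weight in FIELD_WEIGHTS.items()
        )
        if score > 0:
            scored.append((score, rec))
    # Highest score first; ties are broken by id to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]["id"]))
    ranked_ids = [rec["id"] for _, rec in scored]
    # Faceting: count the matching records per dataset type.
    facets = dict(Counter(rec["type"] for _, rec in scored))
    return ranked_ids, facets
```

Calling `search("boston")` ranks the record that matches in both title and abstract first, and the facet counts tell the user how many matching datasets are vector and how many are raster.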
Semantic Search of Web Services

Science.gov (United States)

Hao, Ke

2013-01-01

This dissertation addresses semantic search of Web services using natural language processing. We first survey various existing approaches, focusing on the fact that the expensive costs of current semantic annotation frameworks result in limited use of semantic search for large scale applications. We then propose a vector space model based service…
The Gaze of the Perfect Search Engine: Google as an Infrastructure of Dataveillance

Science.gov (United States)

Zimmer, M.

Web search engines have emerged as a ubiquitous and vital tool for the successful navigation of the growing online informational sphere. The goal of the world's largest search engine, Google, is to "organize the world's information and make it universally accessible and useful" and to create the "perfect search engine" that provides only intuitive, personalized, and relevant results. While intended to enhance intellectual mobility in the online sphere, this chapter reveals that the quest for the perfect search engine requires the widespread monitoring and aggregation of users' online personal and intellectual activities, threatening the values the perfect search engines were designed to sustain. It argues that these search-based infrastructures of dataveillance contribute to a rapidly emerging "soft cage" of everyday digital surveillance, where they, like other dataveillance technologies before them, contribute to the curtailing of individual freedom, affect users' sense of self, and present issues of deep discrimination and social justice.
Andromeda: a peptide search engine integrated into the MaxQuant environment.

Science.gov (United States)

Cox, Jürgen; Neuhauser, Nadin; Michalski, Annette; Scheltema, Richard A; Olsen, Jesper V; Mann, Matthias

2011-04-01

A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.

Custom Search Engines: Tools & Tips

Science.gov (United States)

Notess, Greg R.

2008-01-01

Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…
State-of-the-Art Review on Relevance of Genetic Algorithm to Internet Web Search

Directory of Open Access Journals (Sweden)

Kehinde Agbele

2012-01-01

People use search engines to find the information they desire, with the aim that their information needs will be met. Information retrieval (IR) is a field concerned primarily with the searching and retrieving of information in documents, and also with searching search engines, online databases, and the Internet. Genetic algorithms (GAs) are robust, efficient, and optimized methods for a wide area of search problems, motivated by Darwin’s principles of natural selection and survival of the fittest. This paper describes the components of information retrieval systems (IRS). It looks at how GAs can be applied in the field of IR and specifically at the relevance of genetic algorithms to internet web search. Finally, the proposals surveyed show that GAs are applied to diverse problem fields of internet web search.
Document Clustering Approach for Meta Search Engine

Science.gov (United States)

Kumar, Naresh, Dr.

2017-08-01

The size of the WWW is growing exponentially with continual changes in technology. This results in a huge amount of information and long lists of URLs. It is not possible to visit each page manually. If page ranking algorithms are used properly, the user's search space can be restricted to a few pages of search results. However, the available literature shows that no single search system can provide good-quality results for all domains. This paper addresses this problem by introducing a new meta search engine that determines the relevancy of a query to a web page and clusters the results accordingly. The proposed approach reduces user effort and improves the quality of results and the performance of the meta search engine.
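The general idea of merging results from several engines and grouping similar hits can be sketched as follows. The abstract does not specify the paper's algorithm, so this sketch swaps in a simple greedy clustering on Jaccard similarity of snippet words; the function names and the `threshold` value are assumptions made for illustration.

```python
def tokens(text):
    """Lower-cased set of words in a snippet."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_results(result_lists):
    """Merge per-engine lists of (url, snippet), keeping the first copy of each URL."""
    seen, merged = set(), []
    for results in result_lists:
        for url, snippet in results:
            if url not in seen:
                seen.add(url)
                merged.append((url, snippet))
    return merged


def cluster(results, threshold=0.3):
    """Greedy one-pass clustering: a result joins the first cluster whose
    seed snippet is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of (url, snippet); item 0 is the seed
    for url, snippet in results:
        for group in clusters:
            if jaccard(tokens(snippet), tokens(group[0][1])) >= threshold:
                group.append((url, snippet))
                break
        else:
            clusters.append([(url, snippet)])
    return clusters
```

With this sketch, results about the Python language and results about the python snake returned for the same ambiguous query end up in separate groups, which is the kind of reduction in user effort the abstract points to.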
Finding Web-Based Anxiety Interventions on the World Wide Web: A Scoping Review.

Science.gov (United States)

Ashford, Miriam Thiel; Olander, Ellinor K; Ayers, Susan

2016-06-01

One relatively new and increasingly popular approach of increasing access to treatment is Web-based intervention programs. The advantage of Web-based approaches is the accessibility, affordability, and anonymity of potentially evidence-based treatment. Despite much research evidence on the effectiveness of Web-based interventions for anxiety found in the literature, little is known about what is publically available for potential consumers on the Web. Our aim was to explore what a consumer searching the Web for Web-based intervention options for anxiety-related issues might find. The objectives were to identify currently publically available Web-based intervention programs for anxiety and to synthesize and review these in terms of (1) website characteristics such as credibility and accessibility; (2) intervention program characteristics such as intervention focus, design, and presentation modes; (3) therapeutic elements employed; and (4) published evidence of efficacy. Web keyword searches were carried out on three major search engines (Google, Bing, and Yahoo-UK platforms). For each search, the first 25 hyperlinks were screened for eligible programs. Included were programs that were designed for anxiety symptoms, currently publically accessible on the Web, had an online component, a structured treatment plan, and were available in English. Data were extracted for website characteristics, program characteristics, therapeutic characteristics, as well as empirical evidence. Programs were also evaluated using a 16-point rating tool. The search resulted in 34 programs that were eligible for review. A wide variety of programs for anxiety, including specific anxiety disorders, and anxiety in combination with stress, depression, or anger were identified and based predominantly on cognitive behavioral therapy techniques. The majority of websites were rated as credible, secure, and free of advertisement. The majority required users to register and/or to pay a program access
A geospatial search engine for discovering multi-format geospatial data across the web

Science.gov (United States)

Christopher Bone; Alan Ager; Ken Bunzel; Lauren Tierney

2014-01-01

The volume of publically available geospatial data on the web is rapidly increasing due to advances in server-based technologies and the ease at which data can now be created. However, challenges remain with connecting individuals searching for geospatial data with servers and websites where such data exist. The objective of this paper is to present a publically...
Burden of neurological diseases in the US revealed by web searches.

Directory of Open Access Journals (Sweden)

Ricardo Baeza-Yates

Analyzing the disease-related web searches of Internet users provides insight into the interests of the general population as well as the healthcare industry, which can be used to shape health care policies. We analyzed the searches related to neurological diseases and drugs used in neurology using the most popular search engines in the US, Google and Bing/Yahoo. We found that the most frequently searched diseases were common diseases such as dementia or Attention Deficit/Hyperactivity Disorder (ADHD), as well as medium frequency diseases with high social impact such as Parkinson's disease, MS and ALS. The most frequently searched CNS drugs were generic drugs used for pain, followed by sleep disorders, dementia, ADHD, stroke and Parkinson's disease. Regarding the interests of the healthcare industry, ADHD, Alzheimer's disease, MS, ALS, meningitis, and hypersomnia received the higher advertising bids for neurological diseases, while painkillers and drugs for neuropathic pain, drugs for dementia or insomnia, and triptans had the highest advertising bidding prices. Web searches reflect the interest of people and the healthcare industry, and are based either on the frequency or social impact of the disease.
The Search for Extension: 7 Steps to Help People Find Research-Based Information on the Internet

Science.gov (United States)

Hill, Paul; Rader, Heidi B.; Hino, Jeff

2012-01-01

For Extension's unbiased, research-based content to be found by people searching the Internet, it needs to be organized in a way conducive to the ranking criteria of a search engine. With proper web design and search engine optimization techniques, Extension's content can be found, recognized, and properly indexed by search engines and…
Quality of Web-based information on obsessive compulsive disorder.

Science.gov (United States)

Klila, Hedi; Chatton, Anne; Zermatten, Ariane; Khan, Riaz; Preisig, Martin; Khazaal, Yasser

2013-01-01

The Internet is increasingly used as a source of information for mental health issues. The burden of obsessive compulsive disorder (OCD) may lead persons with diagnosed or undiagnosed OCD, and their relatives, to search for good quality information on the Web. This study aimed to evaluate the quality of Web-based information on English-language sites dealing with OCD and to compare the quality of websites found through a general and a medically specialized search engine. Keywords related to OCD were entered into Google and OmniMedicalSearch. Websites were assessed on the basis of accountability, interactivity, readability, and content quality. The "Health on the Net" (HON) quality label and the Brief DISCERN scale score were used as possible content quality indicators. Of the 235 links identified, 53 websites were analyzed. The content quality of the OCD websites examined was relatively good. The use of a specialized search engine did not offer an advantage in finding websites with better content quality. A score ≥16 on the Brief DISCERN scale is associated with better content quality. This study shows the acceptability of the content quality of OCD websites. There is no advantage in searching for information with a specialized search engine rather than a general one. The Internet offers a number of high quality OCD websites. It remains critical, however, to have a provider-patient talk about the information found on the Web.
Development of a Google-based search engine for data mining radiology reports.

Science.gov (United States)

Erinjeri, Joseph P; Picus, Daniel; Prior, Fred W; Rubin, David A; Koppel, Paul

2009-08-01

The aim of this study is to develop a secure, Google-based data-mining tool for radiology reports using free and open source technologies and to explore its use within an academic radiology department. A Health Insurance Portability and Accountability Act (HIPAA)-compliant data repository, search engine and user interface were created to facilitate treatment, operations, and reviews preparatory to research. The Institutional Review Board waived review of the project, and informed consent was not required. Comprising 7.9 GB of disk space, 2.9 million text reports were downloaded from our radiology information system to a fileserver. Extensible markup language (XML) representations of the reports were indexed using Google Desktop Enterprise search engine software. A hypertext markup language (HTML) form allowed users to submit queries to Google Desktop, and Google's XML response was interpreted by a practical extraction and report language (PERL) script, presenting ranked results in a web browser window. The query, reason for search, results, and documents visited were logged to maintain HIPAA compliance. Indexing averaged approximately 25,000 reports per hour. Keyword search of a common term like "pneumothorax" yielded the first ten most relevant results of 705,550 total results in 1.36 s. Keyword search of a rare term like "hemangioendothelioma" yielded the first ten most relevant results of 167 total results in 0.23 s; retrieval of all 167 results took 0.26 s. Data mining tools for radiology reports will improve the productivity of academic radiologists in clinical, educational, research, and administrative tasks. By leveraging existing knowledge of Google's interface, radiologists can quickly perform useful searches.
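The query flow described in this abstract (submit a query, parse the XML response, present ranked results, log the search) can be sketched as below. The original system used a Perl script; Python is used here only for illustration. The XML layout in `SAMPLE_RESPONSE` is a hypothetical stand-in, since the abstract does not give the actual Google Desktop response schema, and `parse_results` and `log_search` are illustrative names.

```python
import xml.etree.ElementTree as ET

# Hypothetical response format, invented for this sketch.
SAMPLE_RESPONSE = """
<results count="2">
  <result><title>Chest CT</title><score>0.91</score></result>
  <result><title>Chest radiograph</title><score>0.97</score></result>
</results>
"""


def parse_results(xml_text):
    """Parse the XML response into (title, score) pairs, best score first."""
    root = ET.fromstring(xml_text.strip())
    hits = [
        (r.findtext("title"), float(r.findtext("score")))
        for r in root.iter("result")
    ]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


def log_search(audit_log, user, query, reason, hits):
    """Append an audit entry recording who searched for what and why."""
    audit_log.append(
        {"user": user, "query": query, "reason": reason, "n_results": len(hits)}
    )
```

Keeping the audit entry separate from result parsing mirrors the abstract's point that the query, the reason for the search and the results are logged alongside normal use of the tool.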
Web-Based Search and Plot System for Nuclear Reaction Data

International Nuclear Information System (INIS)

Otuka, N.; Nakagawa, T.; Fukahori, T.; Katakura, J.; Aikawa, M.; Suda, T.; Naito, K.; Korennov, S.; Arai, K.; Noto, H.; Ohnishi, A.; Kato, K.

2005-01-01

A web-based search and plot system for nuclear reaction data has been developed, covering experimental data in EXFOR format and evaluated data in ENDF format. The system is implemented for Linux OS, with Perl and MySQL used for CGI scripts and the database manager, respectively. Two prototypes for experimental and evaluated data are presented
Analysis of Web Spam for Non-English Content: Toward More Effective Language-Based Classifiers.

Directory of Open Access Journals (Sweden)

Mansour Alsaleh

Web spammers aim to obtain higher ranks for their web pages by including spam contents that deceive search engines in order to include their pages in search results even when they are not related to the search terms. Search engines continue to develop new web spam detection mechanisms, but spammers also aim to improve their tools to evade detection. In this study, we first explore the effect of the page language on spam detection features and we demonstrate how the best set of detection features varies according to the page language. We also study the performance of Google Penguin, a newly developed anti-web spamming technique for their search engine. Using spam pages in Arabic as a case study, we show that unlike similar English pages, Google anti-spamming techniques are ineffective against a high proportion of Arabic spam pages. We then explore multiple detection features for spam pages to identify an appropriate set of features that yields a high detection accuracy compared with the integrated Google Penguin technique. In order to build and evaluate our classifier, as well as to help researchers to conduct consistent measurement studies, we collected and manually labeled a corpus of Arabic web pages, including both benign and spam pages. Furthermore, we developed a browser plug-in that utilizes our classifier to warn users about spam pages after clicking on a URL and by filtering out search engine results. Using Google Penguin as a benchmark, we provide an illustrative example to show that language-based web spam classifiers are more effective for capturing spam contents.
Brief Report: Consistency of Search Engine Rankings for Autism Websites

Science.gov (United States)

Reichow, Brian; Naples, Adam; Steinhoff, Timothy; Halpern, Jason; Volkmar, Fred R.

2012-01-01

The World Wide Web is one of the most common methods used by parents to find information on autism spectrum disorders and most consumers find information through search engines such as Google or Bing. However, little is known about how the search engines operate or the consistency of the results that are returned over time. This study presents the…
Vertical Search Engines

OpenAIRE

Curran, Kevin; Mc Glinchey, Jude

2017-01-01

This paper outlines the growth in popularity of vertical search engines, their origins, the differences between them and well-known broad based search engines such as Google and Yahoo. We also discuss their use in business-to-business, their marketing and advertising costs, what the revenue streams are and who uses them.
Quality of Web-based information on obsessive compulsive disorder

Directory of Open Access Journals (Sweden)

Klila H

2013-11-01

Hedi Klila,1 Anne Chatton,2 Ariane Zermatten,2 Riaz Khan,2 Martin Preisig,1,3 Yasser Khazaal2,4 1Department of Psychiatry, Lausanne University Hospital, Lausanne, Switzerland; 2Department of Mental Health and Psychiatry, Geneva University Hospitals, Geneva, Switzerland; 3Lausanne University, Lausanne, Switzerland; 4Geneva University, Geneva, Switzerland. Background: The Internet is increasingly used as a source of information for mental health issues. The burden of obsessive compulsive disorder (OCD) may lead persons with diagnosed or undiagnosed OCD, and their relatives, to search for good quality information on the Web. This study aimed to evaluate the quality of Web-based information on English-language sites dealing with OCD and to compare the quality of websites found through a general and a medically specialized search engine. Methods: Keywords related to OCD were entered into Google and OmniMedicalSearch. Websites were assessed on the basis of accountability, interactivity, readability, and content quality. The "Health on the Net" (HON) quality label and the Brief DISCERN scale score were used as possible content quality indicators. Of the 235 links identified, 53 websites were analyzed. Results: The content quality of the OCD websites examined was relatively good. The use of a specialized search engine did not offer an advantage in finding websites with better content quality. A score ≥16 on the Brief DISCERN scale is associated with better content quality. Conclusion: This study shows the acceptability of the content quality of OCD websites. There is no advantage in searching for information with a specialized search engine rather than a general one. Practical implications: The Internet offers a number of high quality OCD websites. It remains critical, however, to have a provider–patient talk about the information found on the Web. Keywords: Internet, quality indicators, anxiety disorders, OCD, search engine
Internet Search Engines

OpenAIRE

Fatmaa El Zahraa Mohamed Abdou

2004-01-01

A general study of Internet search engines. The study addresses seven main points: the difference between search engines and search directories, the components of search engines, the percentage of sites covered by search engines, the cataloging of sites, the time needed for sites to appear in search engines, search capabilities, and types of search engines.
XML and Better Web Searching.

Science.gov (United States)

Jackson, Joe; Gilstrap, Donald L.

1999-01-01

Addresses the implications of the new Web metalanguage XML for searching on the World Wide Web and considers the future of XML on the Web. Compared to HTML, XML is more concerned with structure of data than documents, and these data structures should prove conducive to precise, context rich searching. (Author/LRW)
Taking It to the Top: A Lesson in Search Engine Optimization

Science.gov (United States)

Frydenberg, Mark; Miko, John S.

2011-01-01

Search engine optimization (SEO), the promoting of a Web site so it achieves optimal position with a search engine's rankings, is an important strategy for organizations and individuals in order to promote their brands online. Techniques for achieving SEO are relevant to students of marketing, computing, media arts, and other disciplines, and many…
Developing a search engine for pharmacotherapeutic information that is not published in biomedical journals.

Science.gov (United States)

Do Pazo-Oubiña, F; Calvo Pita, C; Puigventós Latorre, F; Periañez-Párraga, L; Ventayol Bosch, P

2011-01-01

To identify publishers of pharmacotherapeutic information not found in biomedical journals that focuses on evaluating and providing advice on medicines and to develop a search engine to access this information. Compiling web sites that publish information on the rational use of medicines and have no commercial interests. Free-access web sites in Spanish, Galician, Catalan or English. Designing a search engine using the Google "custom search" application. Overall 159 internet addresses were compiled and were classified into 9 labels. We were able to recover the information from the selected sources using a search engine, which is called "AlquimiA" and available from http://www.elcomprimido.com/FARHSD/AlquimiA.htm. The main sources of pharmacotherapeutic information not published in biomedical journals were identified. The search engine is a useful tool for searching and accessing "grey literature" on the internet. Copyright © 2010 SEFH. Published by Elsevier Espana. All rights reserved.
EIIS: An Educational Information Intelligent Search Engine Supported by Semantic Services

Science.gov (United States)

Huang, Chang-Qin; Duan, Ru-Lin; Tang, Yong; Zhu, Zhi-Ting; Yan, Yong-Jian; Guo, Yu-Qing

2011-01-01

The semantic web brings a new opportunity for efficient information organization and search. To meet the special requirements of the educational field, this paper proposes an intelligent search engine enabled by educational semantic support service, where three kinds of searches are integrated into Educational Information Intelligent Search (EIIS)…
Using Web 2.0 Techniques in NASA's Ares Engineering Operations Network (AEON) Environment - First Impressions

Science.gov (United States)

Scott, David W.

2010-01-01

The Mission Operations Laboratory (MOL) at Marshall Space Flight Center (MSFC) is responsible for Engineering Support capability for NASA's Ares rocket development and operations. In pursuit of this, MOL is building the Ares Engineering and Operations Network (AEON), a web-based portal to support and simplify two critical activities: (1) access and analyze Ares manufacturing, test, and flight performance data, with access to Shuttle data for comparison; and (2) establish and maintain collaborative communities within the Ares teams/subteams and with other projects, e.g., Space Shuttle, International Space Station (ISS). AEON seeks to provide a seamless interface to a) locally developed engineering applications and b) a Commercial-Off-The-Shelf (COTS) collaborative environment that includes Web 2.0 capabilities, e.g., blogging, wikis, and social networking. This paper discusses how Web 2.0 might be applied to the typically conservative engineering support arena, based on feedback from Integration, Verification, and Validation (IV&V) testing and on searching for their use in similar environments.

Comparing the Scale of Web Subject Directories Precision in Technical-Engineering Information Retrieval

Directory of Open Access Journals (Sweden)

Mehrdokht Wazirpour Keshmiri

2012-07-01

Full Text Available The main purpose of this research was to compare the precision of web subject directories in retrieving technical-engineering information. Data were gathered by documentary and webometric methods. Keywords were chosen for twenty different technical-engineering subjects from IEEE (Institute of Electrical and Electronics Engineers) and from engineering journals hosted on the ScienceDirect site. These keywords were searched in five heavily used web subject directories: Yahoo, Google, Infomine, Intute, and Dmoz. The first results returned by a search tool are usually the most closely related to the search keywords; therefore, the first ten results of every search were evaluated. The assessment covered the degree of precision, the degree of error, and the ratio of items retrieved in technical-engineering categories to all items retrieved. The criteria used to determine precision, drawn from widely used standards in the literature, consisted of the presence of the keywords in the title, the appearance of the keywords in the body of the retrieved pages, keyword adjacency, the URL of the page, the page description, and the subject categories. Data were analyzed with the Kruskal-Wallis test and Fisher's LSD test. The results revealed a significant difference in the precision of the web subject directories in retrieving technical-engineering information, so the hypothesis was confirmed. Ranked by precision, the web subject directories were Google, Yahoo, Intute, Dmoz, and Infomine. The degree of error observed in the first results was another criterion used to compare the directories; in this research, Yahoo had the lowest error and Infomine the highest. The research also compared the number of items retrieved across all categories of the web subject directories with the number retrieved in technical-engineering categories; the results revealed a significant difference between them. And
Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics

Science.gov (United States)

Dong, Peng; Wong, Ling Ling; Ng, Sarah; Loh, Marie; Mondry, Adrian

2004-01-01

Background Critically Appraised Topics (CATs) are a useful tool that helps physicians to make clinical decisions as the healthcare moves towards the practice of Evidence-Based Medicine (EBM). The fast growing World Wide Web has provided a place for physicians to share their appraised topics online, but an increasing amount of time is needed to find a particular topic within such a rich repository. Methods A web-based application, namely the CAT Crawler, was developed by Singapore's Bioinformatics Institute to allow physicians to adequately access available appraised topics on the Internet. A meta-search engine, as the core component of the application, finds relevant topics following keyword input. The primary objective of the work presented here is to evaluate the quantity and quality of search results obtained from the meta-search engine of the CAT Crawler by comparing them with those obtained from two individual CAT search engines. From the CAT libraries at these two sites, all possible keywords were extracted using a keyword extractor. Of those common to both libraries, ten were randomly chosen for evaluation. All ten were submitted to the two search engines individually, and through the meta-search engine of the CAT Crawler. Search results were evaluated for relevance both by medical amateurs and professionals, and the respective recall and precision were calculated. Results While achieving an identical recall, the meta-search engine showed a precision of 77.26% (±14.45) compared to the individual search engines' 52.65% (±12.0) (p ...). Conclusion The results demonstrate the validity of the CAT Crawler meta-search engine approach. The improved precision due to inherent filters underlines the practical usefulness of this tool for clinicians. PMID:15588311
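For readers unfamiliar with the two measures reported in this record, the following is a minimal sketch of how precision and recall are conventionally computed for a single query. It is not the authors' evaluation code; the function name `precision_recall` and the document identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the results returned by a search engine
    relevant:  identifiers judged relevant by the human assessors
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant items actually retrieved
    # Precision: share of retrieved items that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical data: these identifiers are invented, not taken from the study.
meta_results = ["cat1", "cat2", "cat3", "cat7"]
judged_relevant = ["cat1", "cat2", "cat3", "cat4"]
p, r = precision_recall(meta_results, judged_relevant)
```

Averaging these per-query values over all test keywords would give summary figures of the kind quoted in the abstract, under the assumption that relevance judgments are available for every query.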
A fuzzy method for improving the functionality of search engines based on user's web interactions

Directory of Open Access Journals (Sweden)

Farzaneh Kabirbeyk

2015-04-01

Full Text Available Web mining has been widely used to discover knowledge from various sources on the web. One of the important tools in web mining is the mining of web users’ behavior, which is considered a way to discover the potential knowledge in web users’ interactions. Nowadays, website personalization is popular among web users; it plays an important role in facilitating user access and provides information matching users’ requirements based on their own interests. Extracting important features of web user behavior plays a significant role in web usage mining. Such features include page visit frequency in each session, visit duration, and the dates on which certain pages were visited. This paper presents a method to predict a user’s interests and to propose a list of pages based on those interests by identifying the user’s behavior with a fuzzy technique called the fuzzy clustering method. Because users have different interests and may pursue one or more interests at a time, a user’s interests may belong to several clusters, and fuzzy clustering allows for this overlap. The resulting clusters are used to extract fuzzy rules. These rules help detect the user’s movement pattern, and a neural network is then used to provide a list of suggested pages to the user.
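The overlap that fuzzy clustering allows can be illustrated with the membership formula of standard fuzzy c-means. The abstract does not say which fuzzy clustering variant the author used, so this is only a sketch under that assumption; the feature choice (visits per session, minutes on site), the cluster centers, and the function name `fuzzy_memberships` are all hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d_j is the
    Euclidean distance from the point to center j and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        zeros = dists.count(0.0)
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
        for d_j in dists
    ]


# Hypothetical session features: (page visits per session, minutes on site).
centers = [(2.0, 1.0), (10.0, 8.0)]
session = (4.0, 3.0)
u = fuzzy_memberships(session, centers)  # one membership degree per cluster
```

The memberships sum to one, so a session close to the first center still keeps a small, non-zero degree of membership in the second, which is the overlap the abstract refers to.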
Engineering Web Applications

DEFF Research Database (Denmark)

Casteleyn, Sven; Daniel, Florian; Dolog, Peter

Nowadays, Web applications are almost omnipresent. The Web has become a platform not only for information delivery, but also for eCommerce systems, social networks, mobile services, and distributed learning environments. Engineering Web applications involves many intrinsic challenges due to their distributed nature, content orientation, and the requirement to make them available to a wide spectrum of users who are unknown in advance. The authors discuss these challenges in the context of well-established engineering processes, covering the whole product lifecycle from requirements engineering through design and implementation to deployment and maintenance. They stress the importance of models in Web application development, and they compare well-known Web-specific development processes like WebML, WSDM and OOHDM to traditional software development approaches like the waterfall model and the spiral...
PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface.

Science.gov (United States)

Uszkoreit, Julian; Maerkens, Alexandra; Perez-Riverol, Yasset; Meyer, Helmut E; Marcus, Katrin; Stephan, Christian; Kohlbacher, Oliver; Eisenacher, Martin

2015-07-02

Protein inference connects the peptide spectrum matches (PSMs) obtained from database search engines back to proteins, which are typically at the heart of most proteomics studies. Different search engines yield different PSMs and thus different protein lists. Analysis of results from one or multiple search engines is often hampered by different data exchange formats and lack of convenient and intuitive user interfaces. We present PIA, a flexible software suite for combining PSMs from different search engine runs and turning these into consistent results. PIA can be integrated into proteomics data analysis workflows in several ways. A user-friendly graphical user interface can be run either locally or (e.g., for larger core facilities) from a central server. For automated data processing, stand-alone tools are available. PIA implements several established protein inference algorithms and can combine results from different search engines seamlessly. On several benchmark data sets, we show that PIA can identify a larger number of proteins at the same protein FDR when compared to that using inference based on a single search engine. PIA supports the majority of established search engines and data in the mzIdentML standard format. It is implemented in Java and freely available at https://github.com/mpc-bioinformatics/pia.
Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

Science.gov (United States)

Alghoson, Abdullah

2017-01-01

Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…
Effects of Web-Based Interactive Modules on Engineering Students' Learning Motivations

Science.gov (United States)

Bai, Haiyan; Aman, Amjad; Xu, Yunjun; Orlovskaya, Nina; Zhou, Mingming

2016-01-01

The purpose of this study is to assess the impact of newly developed modules, the Interactive Web-Based Visualization Tools for Gluing Undergraduate Fuel Cell Systems Courses system (IGLU), on the learning motivations of engineering students using two samples (n₁ = 144 and n₂ = 135) from senior engineering classes. The…
Personalization of Rule-based Web Services.

Science.gov (United States)

Choi, Okkyung; Han, Sang Yong

2008-04-04

Nowadays Web users have clearly expressed their wishes to receive personalized services directly. Personalization is the way to tailor services directly to the immediate requirements of the user. However, the current Web Services System does not provide any features supporting this such as consideration of personalization of services and intelligent matchmaking. In this research a flexible, personalized Rule-based Web Services System to address these problems and to enable efficient search, discovery and construction across general Web documents and Semantic Web documents in a Web Services System is proposed. This system utilizes matchmaking among service requesters', service providers' and users' preferences using a Rule-based Search Method, and subsequently ranks search results. A prototype of efficient Web Services search and construction for the suggested system is developed based on the current work.
Eysenbach, Tuische and Diepgen’s Evaluation of Web Searching for Identifying Unpublished Studies for Systematic Reviews: An Innovative Study Which is Still Relevant Today.

Directory of Open Access Journals (Sweden)

Simon Briscoe

2016-09-01

Full Text Available A Review of: Eysenbach, G., Tuische, J. & Diepgen, T.L. (2001). Evaluation of the usefulness of Internet searches to identify unpublished clinical trials for systematic reviews. Medical Informatics and the Internet in Medicine, 26(3), 203-218. http://dx.doi.org/10.1080/14639230110075459 Objective – To consider whether web searching is a useful method for identifying unpublished studies for inclusion in systematic reviews. Design – Retrospective web searches using the AltaVista search engine were conducted to identify unpublished studies – specifically, clinical trials – for systematic reviews which did not use a web search engine. Setting – The Department of Clinical Social Medicine, University of Heidelberg, Germany. Subjects – n/a Methods – Pilot testing of 11 web search engines was carried out to determine which could handle complex search queries. Pre-specified search requirements included the ability to handle Boolean and proximity operators, and truncation searching. A total of seven Cochrane systematic reviews were randomly selected from the Cochrane Library Issue 2, 1998, and their bibliographic database search strategies were adapted for the web search engine, AltaVista. Each adaptation combined search terms for the intervention, problem, and study type in the systematic review. Hints to planned, ongoing, or unpublished studies retrieved by the search engine, which were not cited in the systematic reviews, were followed up by visiting websites and contacting authors for further details when required. The authors of the systematic reviews were then contacted and asked to comment on the potential relevance of the identified studies. Main Results – Hints to 14 unpublished and potentially relevant studies, corresponding to 4 of the 7 randomly selected Cochrane systematic reviews, were identified. Out of the 14 studies, 2 were considered irrelevant to the corresponding systematic review by the systematic review authors. The
Da "Search engines" a "Shop engines"

OpenAIRE

Lupi, Mauro

2001-01-01

The change occurring in “search engines” is moving towards e-commerce, transforming all the main search engines into channels for conveying information and commercial suggestions, and basing their business on this activity. In the near future we will find two main kinds of search engines: on one side, the portals that will offer a general orientation guide, acting as channels for services and products to buy; on the other side, vertical portals able to offer information and products on specifi...
Competence Centered Specialization in Web Engineering Topics in a Software Engineering Masters Degree Programme

DEFF Research Database (Denmark)

Dolog, Peter; Thomsen, Lone Leth; Thomsen, Bent

2010-01-01

Web applications and Web-based systems are becoming increasingly complex as a result of either customer requests or technology evolution which has eased other aspects of software engineering. Therefore, there is an increasing demand for highly skilled software engineers able to build and also advance the systems on the one hand as well as professionals who are able to evaluate their effectiveness on the other hand. With this idea in mind, the computer science department at Aalborg University is continuously working on improvements in its specialization in web engineering topics as well as on general competence based web engineering profiles offered also for those who specialize in other areas of software engineering. We describe the current state of the art and our experience with a web engineering curriculum within the software engineering masters degree programme. We also discuss an evolution...
Searching the Web for Earth Science Data: Semiotics to Cybernetics and Back

Directory of Open Access Journals (Sweden)

Bruce R. Barkstrom

2016-06-01

Full Text Available This paper discusses a search paradigm for numerical data in Earth science that relies on the intrinsic structure of an archive's collection. Such non-textual data lies outside the normal textual basis for the Semantic Web. The paradigm tries to bypass some of the difficulties associated with keyword searches, such as semantic heterogeneity. The suggested collection structure uses a hierarchical taxonomy based on multidimensional axes of continuous variables. This structure fits the underlying 'geometry' of Earth science data better than sets of keywords in an ontology. The alternative paradigm views the search as a two-agent cooperative game that uses a dialog between the search engine and the data user. In this view, the search engine knows about the objects in the archive. It cannot read the user's mind to identify what the user needs. We assume the user has a clear idea of the search target. However he or she may not have a clear idea of the archive's contents. The paper suggests how the user interface may provide information to deal with the user's difficulties in understanding items in the dialog.
Utilizing mixed methods research in analyzing Iranian researchers’ information search behaviour in the Web and presenting current pattern

Directory of Open Access Journals (Sweden)

Maryam Asadi

2015-12-01

Full Text Available Using a mixed methods research design, the current study analyzed Iranian researchers’ information searching behaviour on the Web. Then, based on the extracted concepts, a model of their information searching behaviour was derived. Forty-four participants, including academic staff from universities and research centers, were recruited by purposive sampling. Data were gathered from a questionnaire of ten questions and a semi-structured interview. Each participant’s memos were analyzed using grounded theory methods adapted from Strauss & Corbin (1998). Results showed that the subjects’ main objectives in using the Web were doing research, writing a paper, studying, doing assignments, downloading files and acquiring public information. The most important ways the subjects learned how to search and retrieve information were trial and error and getting help from friends. Information resources were identified by searching in information resources (e.g. search engines, references in papers, and searching in online databases), communication facilities and tools (e.g. contact with colleagues, seminars and workshops, social networking), and information services (e.g. RSS, alerting, and SDI). Findings also indicated that searching with search engines, reviewing references, searching in online databases, contacting colleagues and studying the latest issues of electronic journals were the most important means of searching. The most important strategies were using search engines and scientific tools such as Google Scholar. In addition, the simple (quick) search method was the most common among subjects. Topic, keywords and paper title were the most important elements for information retrieval. Analysis of the interviews showed that there were nine stages in researchers’ information searching behaviour: topic selection, initiating search, formulating search query, information retrieval, access to information
The poor quality of information about laparoscopy on the World Wide Web as indexed by popular search engines.

Science.gov (United States)

Allen, J W; Finch, R J; Coleman, M G; Nathanson, L K; O'Rourke, N A; Fielding, G A

2002-01-01

This study was undertaken to determine the quality of information on the Internet regarding laparoscopy. Four popular World Wide Web search engines were used with the key word "laparoscopy." Advertisements, patient- or physician-directed information, and controversial material were noted. A total of 14,030 Web pages were found, but only 104 were unique Web sites. The majority of the sites were duplicate pages, subpages within a main Web page, or dead links. Twenty-eight of the 104 pages had a medical product for sale, 26 were patient-directed, 23 were written by a physician or group of physicians, and six represented corporations. The remaining 21 were "miscellaneous." The 46 pages containing educational material were critically reviewed. At least one of the senior authors found that 32 of the pages contained controversial or misleading statements. All of the three senior authors (LKN, NAO, GAF) independently agreed that 17 of the 46 pages contained controversial information. The World Wide Web is not a reliable source for patient or physician information about laparoscopy. Authenticating medical information on the World Wide Web is a difficult task, and no government or surgical society has taken the lead in regulating what is presented as fact on the World Wide Web.
Through the Google Goggles: Sociopolitical Bias in Search Engine Design

Science.gov (United States)

Diaz, A.

Search engines like Google are essential to navigating the Web's endless supply of news, political information, and citizen discourse. The mechanisms and conditions under which search results are selected should therefore be of considerable interest to media scholars, political theorists, and citizens alike. In this chapter, I adopt a "deliberative" ideal for search engines and examine whether Google exhibits the "same old" media biases of mainstreaming, hypercommercialism, and industry consolidation. In the end, serious objections to Google are raised: Google may favor popularity over richness; it provides advertising that competes directly with "editorial" content; it so overwhelmingly dominates the industry that users seldom get a second opinion, and this is unlikely to change. Ultimately, however, the results of this analysis may speak less about Google than about contradictions in the deliberative ideal and the so-called "inherently democratic" nature of the Web.
Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics

Directory of Open Access Journals (Sweden)

Loh Marie

2004-12-01

Full Text Available Abstract Background Critically Appraised Topics (CATs) are a useful tool that helps physicians to make clinical decisions as the healthcare moves towards the practice of Evidence-Based Medicine (EBM). The fast growing World Wide Web has provided a place for physicians to share their appraised topics online, but an increasing amount of time is needed to find a particular topic within such a rich repository. Methods A web-based application, namely the CAT Crawler, was developed by Singapore's Bioinformatics Institute to allow physicians to adequately access available appraised topics on the Internet. A meta-search engine, as the core component of the application, finds relevant topics following keyword input. The primary objective of the work presented here is to evaluate the quantity and quality of search results obtained from the meta-search engine of the CAT Crawler by comparing them with those obtained from two individual CAT search engines. From the CAT libraries at these two sites, all possible keywords were extracted using a keyword extractor. Of those common to both libraries, ten were randomly chosen for evaluation. All ten were submitted to the two search engines individually, and through the meta-search engine of the CAT Crawler. Search results were evaluated for relevance both by medical amateurs and professionals, and the respective recall and precision were calculated. Results While achieving an identical recall, the meta-search engine showed a precision of 77.26% (±14.45) compared to the individual search engines' 52.65% (±12.0) (p ...). Conclusion The results demonstrate the validity of the CAT Crawler meta-search engine approach. The improved precision due to inherent filters underlines the practical usefulness of this tool for clinicians.
Chemical Search Web Utility

Data.gov (United States)

U.S. Environmental Protection Agency — The Chemical Search Web Utility is an intuitive web application that allows the public to easily find the chemical that they are interested in using, and which...
Characteristics of scientific web publications

DEFF Research Database (Denmark)

Thorlund Jepsen, Erik; Seiden, Piet; Ingwersen, Peter Emil Rerup

2004-01-01

were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both Alta...
A Software Engineering Approach based on WebML and BPMN to the Mediation Scenario of the SWS Challenge

Science.gov (United States)

Brambilla, Marco; Ceri, Stefano; Valle, Emanuele Della; Facca, Federico M.; Tziviskou, Christina

Although Semantic Web Services are expected to produce a revolution in the development of Web-based systems, very few enterprise-wide design experiences are available; one of the main reasons is the lack of sound Software Engineering methods and tools for the deployment of Semantic Web applications. In this chapter, we present an approach to software development for the Semantic Web based on classical Software Engineering methods (i.e., formal business process development, computer-aided and component-based software design, and automatic code generation) and on semantic methods and tools (i.e., ontology engineering, semantic service annotation and discovery).
Real-time earthquake monitoring using a search engine method.

Science.gov (United States)

Zhang, Jie; Zhang, Haijiang; Chen, Enhong; Zheng, Yi; Kuang, Wenhuan; Zhang, Xiong

2014-12-04

When an earthquake occurs, seismologists want to use recorded seismograms to infer its location, magnitude and source-focal mechanism as quickly as possible. If such information could be determined immediately, timely evacuations and emergency actions could be undertaken to mitigate earthquake damage. Current advanced methods can report the initial location and magnitude of an earthquake within a few seconds, but estimating the source-focal mechanism may require minutes to hours. Here we present an earthquake search engine, similar to a web search engine, that we developed by applying a computer fast search method to a large seismogram database to find waveforms that best fit the input data. Our method is several thousand times faster than an exact search. For an Mw 5.9 earthquake on 8 March 2012 in Xinjiang, China, the search engine can infer the earthquake's parameters in <1 s after receiving the long-period surface wave data.
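The idea of finding the database waveform that best fits the input data can be sketched with an exhaustive scan. This is the slow exact search that the abstract uses as a baseline, not the authors' fast search method, which the abstract does not describe. The similarity measure (zero-lag normalized correlation), the labels, and the toy traces below are assumptions made for illustration.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized correlation between two equal-length traces."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query, database):
    """Scan every stored waveform and return the key of the best-fitting one."""
    return max(database, key=lambda k: normalized_correlation(query, database[k]))


# Hypothetical toy database: labels and samples are invented for illustration.
database = {
    "strike_slip": [0.0, 1.0, 0.0, -1.0, 0.0],
    "thrust": [0.0, 1.0, 2.0, 1.0, 0.0],
    "normal": [0.0, -1.0, -2.0, -1.0, 0.0],
}
observed = [0.1, 0.9, 2.1, 1.2, -0.1]
match = best_match(observed, database)
```

An exhaustive scan like this costs time proportional to the database size for every query, which is why an approximate index would be needed to reach the sub-second response reported in the abstract.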

A Portrait of the Audience for Instruction in Web Searching: Results of a Survey Conducted at Two Canadian Universities.

Science.gov (United States)

Tillotson, Joy

2003-01-01

Describes a survey that was conducted involving participants in the library instruction program at two Canadian universities in order to describe the characteristics of students receiving instruction in Web searching. Examines criteria for evaluating Web sites, search strategies, use of search engines, and frequency of use. Questionnaire is…
An Evidence-Based Review of Academic Web Search Engines, 2014-2016: Implications for Librarians’ Practice and Research Agenda

Directory of Open Access Journals (Sweden)

Jody Condit Fagan

2017-06-01

Full Text Available Academic web search engines have become central to scholarly research. While the fitness of Google Scholar for research purposes has been examined repeatedly, Microsoft Academic and Google Books have not received much attention. Recent studies have much to tell us about the coverage and utility of Google Scholar, its coverage of the sciences, and its utility for evaluating researcher impact. But other aspects have been woefully understudied, such as coverage of the arts and humanities, books, and non-Western, non-English publications. User research has also tapered off. A small number of articles hint at the opportunity for librarians to become expert advisors concerning opportunities of scholarly communication made possible or enhanced by these platforms. This article seeks to summarize research concerning Google Scholar, Google Books, and Microsoft Academic from the past three years with a mind to informing practice and setting a research agenda. Selected literature from earlier time periods is included to illuminate key findings and to help shape the proposed research agenda, especially in understudied areas.
Search Engine Customization and Data Set Builder

OpenAIRE

Arias Moreno, Fco Javier

2009-01-01

There are two core objectives in this work: firstly, to build a data set, and secondly, to customize a search engine. The first objective is to design and implement a data set builder. There are two steps required for this. The first step is to build a crawler. The second step is to include a cleaner. The crawler collects Web links. The cleaner extracts the main content and removes noise from the files crawled. The goal of this application is crawling Web news sites to find the...
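The crawler and cleaner steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the implementation from the work itself; the class name `LinkAndTextParser`, the function `crawl_and_clean`, and the set of tags treated as noise are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collects hyperlinks (crawler step) and visible text (cleaner step)."""

    # Assumption: content inside these tags is treated as noise.
    NOISE_TAGS = {"script", "style", "nav", "footer", "header", "aside"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._noise_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.NOISE_TAGS:
            self._noise_depth += 1
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in self.NOISE_TAGS and self._noise_depth > 0:
            self._noise_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every noise tag.
        if self._noise_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def crawl_and_clean(html, base_url):
    """Return (links found on the page, main text with noise removed)."""
    parser = LinkAndTextParser(base_url)
    parser.feed(html)
    return parser.links, " ".join(parser.text_parts)
```

A real crawler would also fetch each collected link, respect robots.txt, and deduplicate URLs; those parts are omitted here.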
Web-based surveillance of public information needs for informing preconception interventions.

Directory of Open Access Journals (Sweden)

Angelo D'Ambrosio

The risk of adverse pregnancy outcomes can be minimized through the adoption of healthy lifestyles before pregnancy by women of childbearing age. Initiatives for promotion of preconception health may be difficult to implement. The Internet can be used to build tailored health interventions through identification of the public's information needs. To this aim, we developed a semi-automatic web-based system for monitoring Google searches, web pages and activity on social networks regarding preconception health. Based on the American College of Obstetricians and Gynecologists (ACOG) guidelines and on the actual search behaviors of Italian Internet users, we defined a set of keywords targeting preconception care topics. Using these keywords, we analyzed the usage of the Google search engine and identified web pages containing preconception care recommendations. We also monitored how the selected web pages were shared on social networks. We analyzed discrepancies between searched and published information and the sharing pattern of the topics. We identified 1,807 Google search queries which generated a total of 1,995,030 searches during the study period. Less than 10% of the reviewed pages contained preconception care information, and in 42.8% the information was consistent with ACOG guidelines. Facebook was the most used social network for sharing. Nutrition, Chronic Diseases and Infectious Diseases were the most published and searched topics. Regarding Genetic Risk and Folic Acid, a high search volume was not associated with a high web page production, while Medication pages were more frequently published than searched. Vaccinations elicited high sharing although web page production was low; this effect was quite variable in time. Our study represents a resource to prioritize communication on specific topics on the web, to address misconceptions, and to tailor interventions to specific populations.
What Major Search Engines Like Google, Yahoo and Bing Need to Know about Teachers in the UK?

Science.gov (United States)

Seyedarabi, Faezeh

2014-01-01

This article briefly outlines the current major search engines' approach to teachers' web searching. The aim of this article is to make Web searching easier for teachers when searching for relevant online teaching materials, in general, and UK teacher practitioners at primary, secondary and post-compulsory levels, in particular. Therefore, major…
Information retrieval implementing and evaluating search engines

CERN Document Server

Büttcher, Stefan; Cormack, Gordon V

2016-01-01

Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus -- a multiuser open-source information retrieval system developed by one of the authors and available online -- provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. In addition to its classroom use, Information Retrieval will be a valuable reference for professionals in computer science, computer engineering, and software engineering.
Meta Search Engines.

Science.gov (United States)

Garman, Nancy

1999-01-01

Describes common options and features to consider in evaluating which meta search engine will best meet a searcher's needs. Discusses number and names of engines searched; other sources and specialty engines; search queries; other search options; and results options. (AEF)
New generation of the multimedia search engines

Science.gov (United States)

Mijes Cruz, Mario Humberto; Soto Aldaco, Andrea; Maldonado Cano, Luis Alejandro; López Rodríguez, Mario; Rodríguez Vázqueza, Manuel Antonio; Amaya Reyes, Laura Mariel; Cano Martínez, Elizabeth; Pérez Rosas, Osvaldo Gerardo; Rodríguez Espejo, Luis; Flores Secundino, Jesús Abimelek; Rivera Martínez, José Luis; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Sánchez Valenzuela, Juan Carlos; Montoya Obeso, Abraham; Ramírez Acosta, Alejandro Álvaro

2016-09-01

Current search engines are based upon search methods that involve combinations of words (text-based search), which have been efficient until now. However, the Internet's growing demand indicates that its content becomes more diverse with each passing day. Text-based searches are becoming limited, as most of the information on the Internet is found in different types of content known as multimedia content (images, audio files, video files). What needs to be improved in current search engines is search content and precision, as well as an accurate display of the search results the user expects. Any search can be made more precise by using more text parameters, but this does not improve the content or speed of the search itself. One solution is to improve them by characterizing the content of multimedia files for search. In this article, an analysis of new-generation multimedia search engines is presented, focusing on the needs arising from new technologies. Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as of knowing the real tasks they must perform. Through this analysis, it is shown that there are not many search engines that can perform content searches. Research on new-generation multimedia search engines is a multidisciplinary area in constant growth, generating tools that satisfy the different needs of new-generation systems.
A Secured Cognitive Agent based Multi-strategic Intelligent Search System

Directory of Open Access Journals (Sweden)

Neha Gulati

2018-04-01

The Search Engine (SE) is the most preferred and ubiquitously used information retrieval tool. Despite the large-scale involvement of users with SEs, the limited capability of SEs to understand the searcher's context and emotions places a high cognitive, perceptual and learning load on the user to maintain search momentum. In this regard, the present work discusses a Cognitive Agent (CA) based approach to support the user in the Web-based search process. The work suggests a framework called Secured Cognitive Agent based Multi-strategic Intelligent Search System (CAbMsISS) to assist the user in the search process. It helps to reduce the contextual and emotional mismatch between the SEs and the user. After implementation of the proposed framework, performance analysis shows that the CAbMsISS framework improves Query Retrieval Time (QRT) and effectiveness in retrieving relevant results as compared to the Present Search Engine (PSE). In addition, it provides search suggestions when the user accesses a resource previously tagged with negative emotions. Overall, the goal of the system is to enhance the search experience and keep the user motivated. The framework provides suggestions through the search log, which tracks the queries searched, resources accessed and emotions experienced during the search. The implemented framework also considers user security. Keywords: BDI model, Cognitive Agent, Emotion, Information retrieval, Intelligent search, Search Engine
An ontology-based search engine for protein-protein interactions.

Science.gov (United States)

Park, Byungkyu; Han, Kyungsook

2010-01-18

Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
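The idea of answering ontology-aware queries by prime factorization can be illustrated with a deliberately simplified sketch. It is not the authors' modified Gödel numbering: here each term of a hypothetical toy ontology is simply assigned a prime, and a term's code is the product of its own prime and the primes of all its ancestors, so that a general query term matches any more specific annotation by a divisibility test. The term names, protein IDs and function names are assumptions made for the example.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (fine for a toy ontology)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Hypothetical toy ontology: term -> list of parent terms.
PARENTS = {
    "binding": [],
    "protein binding": ["binding"],
    "kinase binding": ["protein binding"],
    "catalytic activity": [],
}

# One distinct prime per ontology term.
PRIME = dict(zip(PARENTS, first_primes(len(PARENTS))))


def ancestors_and_self(term):
    """Collect the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def encode(term):
    """Code of a term: product of the primes of the term and its ancestors."""
    return prod(PRIME[t] for t in ancestors_and_self(term))


def satisfies(annotation_code, query_term):
    """True if the annotation is the query term or one of its descendants."""
    return annotation_code % PRIME[query_term] == 0


# Hypothetical interaction records: protein -> [(partner, annotation term)].
INTERACTIONS = {
    "P1": [("P2", "kinase binding"), ("P3", "catalytic activity")],
}


def find_partners(protein, query_term):
    """Partners of `protein` whose annotation satisfies the query term."""
    return [
        partner
        for partner, term in INTERACTIONS.get(protein, [])
        if satisfies(encode(term), query_term)
    ]
```

In this sketch a query for the general term "binding" also retrieves the partner annotated only with "kinase binding", which a plain keyword match on the annotation would miss.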
Security and computer forensics in web engineering education

OpenAIRE

Glisson, W.; Welland, R.; Glisson, L.M.

2010-01-01

The integration of security and forensics into Web Engineering curricula is imperative! Poor security in web-based applications is continuing to cost organizations millions and the losses are still increasing annually. Security is frequently taught as a stand-alone course, assuming that security can be 'bolted on' to a web application at some point. Security issues must be integrated into Web Engineering processes right from the beginning to create secure solutions and therefore security shou...
Searching with Experience - A Search Engine for Product Information that Learns from its Users

NARCIS (Netherlands)

Leeuwen, van J.P.; Jessurun, A.J.; Jansen, G.; Martens, B.; Brown, A.

2005-01-01

This paper describes the motivation and development of a new algorithm for ranking web pages. This development aims to enable the implementation of a search engine that can provide highly personalised results to queries. It was initiated by a request from the Dutch CAD industry, but has generic
Finding Business Information on the "Invisible Web": Search Utilities vs. Conventional Search Engines.

Science.gov (United States)

Darrah, Brenda

Researchers for small businesses, which may have no access to expensive databases or market research reports, must often rely on information found on the Internet, which can be difficult to find. Although current conventional Internet search engines are now able to index over one billion documents, there are many more documents existing in…
Developing a Grid-based search and categorization tool

CERN Document Server

Haya, Glenn; Vigen, Jens

2003-01-01

Grid technology has the potential to improve the accessibility of digital libraries. The participants in Project GRACE (Grid Search And Categorization Engine) are in the process of developing a search engine that will allow users to search through heterogeneous resources stored in geographically distributed digital collections. What differentiates this project from current search tools is that GRACE will be run on the European Data Grid, a large distributed network, and will not have a single centralized index as current web search engines do. In some cases, the distributed approach offers advantages over the centralized approach since it is more scalable, can be used on otherwise inaccessible material, and can provide advanced search options customized for each data source.
Quality of web-based information on bipolar disorder.

Science.gov (United States)

Morel, Vincent; Chatton, Anne; Cochand, Sophie; Zullino, Daniele; Khazaal, Yasser

2008-10-01

To evaluate web-based information on bipolar disorder and to assess particular content quality indicators. Two keywords, "bipolar disorder" and "manic depressive illness", were entered into popular World Wide Web search engines. Websites were assessed with a standardized proforma designed to rate sites on the basis of accountability, presentation, interactivity, readability and content quality. The "Health on the Net" (HON) quality label and DISCERN scale scores were used to verify their efficiency as quality indicators. Of the 80 websites identified, 34 were included. Based on outcome measures, the content quality of the sites turned out to be good. Content quality of web sites dealing with bipolar disorder is significantly explained by readability, accountability and interactivity as well as a global score. The overall content quality of the studied bipolar disorder websites is good.
FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search

NARCIS (Netherlands)

Demeester, Thomas; Trieschnigg, Rudolf Berend; Zhou, Ke; Nguyen, Dong-Phuong; Hiemstra, Djoerd

This paper presents 'FedWeb Greatest Hits', a large new test collection for research in web information retrieval. As a combination and extension of the datasets used in the TREC Federated Web Search Track, this collection opens up new research possibilities on federated web search challenges, as
Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.

Science.gov (United States)

De-Arteaga, Maria; Eggel, Ivan; Do, Bao; Rubin, Daniel; Kahn, Charles E; Müller, Henning

2015-08-01

Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the way in which physicians aim to access information. Medical image search is a much smaller domain but has gained much attention as it has different characteristics than search for text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital internal systems for search in a PACS/RIS (Picture Archival and Communication System, Radiology Information System) have rarely been analysed. Such a comparison between a hospital PACS/RIS search and a web system for searching images of the biomedical literature is the goal of this paper. Objectives are to identify similarities and differences in search behaviour of the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search called radTF containing 18,068 queries were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so the semantic content in the queries and the links between terms could be analysed, and synonyms for the same concept could be detected. RadLex was mainly created for the use in radiology reports, to aid structured reporting and the preparation of educational material (Lanlotz, 2006) [1]. In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System) specific terms of radiology are often
Research on Web Search Behavior: How Online Query Data Inform Social Psychology.

Science.gov (United States)

Lai, Kaisheng; Lee, Yan Xin; Chen, Hao; Yu, Rongjun

2017-10-01

The widespread use of web searches in daily life has allowed researchers to study people's online social and psychological behavior. Using web search data has advantages in terms of data objectivity, ecological validity, temporal resolution, and unique application value. This review integrates existing studies on web search data that have explored topics including sexual behavior, suicidal behavior, mental health, social prejudice, social inequality, public responses to policies, and other psychosocial issues. These studies are categorized as descriptive, correlational, inferential, predictive, and policy evaluation research. The integration of theory-based hypothesis testing in future web search research will result in even stronger contributions to social psychology.
Enhancing food engineering education with interactive web-based simulations

OpenAIRE

Alexandros Koulouris; Georgios Aroutidis; Dimitrios Vardalis; Petros Giannoulis; Paraskevi Karakosta

2015-01-01

In the traditional deductive approach in teaching any engineering topic, teachers would first expose students to the derivation of the equations that govern the behavior of a physical system and then demonstrate the use of equations through a limited number of textbook examples. This methodology, however, is rarely adequate to unmask the cause-effect and quantitative relationships between the system variables that the equations embody. Web-based simulation, which is the integration of simulat...
A real-time all-atom structural search engine for proteins.

Science.gov (United States)

Gonzalez, Gabriel; Hannigan, Brett; DeGrado, William F

2014-07-01

Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new "designability"-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).

An architecture for diversity-aware search for medical web content.

Science.gov (United States)

Denecke, K

2012-01-01

The Web provides a huge source of information, including on medical and health-related issues. In particular, the content of medical social media data can be diverse due to the background of an author, the source or the topic. Diversity in this context means that a document covers different aspects of a topic or that a topic is described in different ways. In this paper, we introduce an approach that makes it possible to consider the diverse aspects of a search query when providing retrieval results to a user. We introduce a system architecture for a diversity-aware search engine that allows retrieving medical information from the web. The diversity of retrieval results is assessed by calculating diversity measures that rely upon semantic information derived from a mapping to concepts of a medical terminology. Considering these measures, the result set is diversified by ranking more diverse texts higher. The methods and system architecture are implemented in a retrieval engine for medical web content. The diversity measures reflect the diversity of aspects considered in a text and its type of information content. They are used for result presentation, filtering and ranking. In a user evaluation we assess the user satisfaction with an ordering of retrieval results that considers the diversity measures. The evaluation shows that diversity-aware retrieval considering diversity measures in ranking could increase user satisfaction with retrieval results.
SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

Science.gov (United States)

Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

2004-07-01

SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
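Once a 3D conformation has been compressed into a 1D string over the structural alphabet, exact word search becomes ordinary string indexing. The sketch below illustrates this with a suffix array rather than the suffix tree named in the abstract (a swapped-in, simpler structure that answers the same kind of query); the encoded string and the function names are assumptions for illustration only.

```python
def build_suffix_array(text):
    """Start positions of all suffixes of `text`, in lexicographic order."""
    # Naive construction; adequate for short illustrative strings.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_word(text, suffix_array, word):
    """Return all start positions where `word` occurs exactly in `text`."""
    # Binary search for the first suffix that is >= word.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:] < word:
            lo = mid + 1
        else:
            hi = mid
    # Suffixes starting with `word` are contiguous from that point on.
    hits = []
    while lo < len(suffix_array) and text.startswith(word, suffix_array[lo]):
        hits.append(suffix_array[lo])
        lo += 1
    return sorted(hits)


# Hypothetical structure encoded as structural-alphabet letters.
encoded = "ABCABDABC"
index = build_suffix_array(encoded)
positions = find_word(encoded, index, "ABC")
```

Fuzzy word search, which the abstract treats as a 1D sequence alignment problem, would need an alignment algorithm instead and is not covered by this sketch.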
Developing a distributed HTML5-based search engine for geospatial resource discovery

Science.gov (United States)

ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

2013-12-01

With the explosive growth of data, Geospatial Cyberinfrastructure (GCI) components are developed to manage geospatial resources, such as data discovery and data publishing. However, efficient discovery of geospatial resources remains challenging because: (1) existing GCIs are usually developed for users of specific domains, so users may have to visit a number of GCIs to find appropriate resources; (2) the complexity of the decentralized network environment usually results in slow response and poor user experience; (3) users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed and HTML5-based search engine. Specifically, (1) the search engine adopts a brokering approach to retrieve geospatial metadata from various distributed GCIs; (2) the asynchronous record retrieval mode enhances the search performance and user interactivity; (3) the search engine, being based on HTML5, is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).
Information Diversity in Web Search

Science.gov (United States)

Liu, Jiahui

2009-01-01

The web is a rich and diverse information source with incredible amounts of information about all kinds of subjects in various forms. This information source affords great opportunity to build systems that support users in their work and everyday lives. To help users explore information on the web, web search systems should find information that…
Engineering Adaptive Web Applications

DEFF Research Database (Denmark)

Dolog, Peter

2007-01-01

Information and services on the web are accessible for everyone. Users of the web differ in their background, culture, political and social environment, interests and so on. Ambient intelligence was envisioned as a concept for systems which are able to adapt to user actions and needs. With the growing amount of information and services, the web applications become natural candidates to adopt the concepts of ambient intelligence. Such applications can deal with diverse user intentions and actions based on the user profile and can suggest the combination of information content and services which suit the user profile the most. This paper summarizes the domain engineering framework for such adaptive web applications. The framework provides guidelines to develop adaptive web applications as members of a family. It suggests how to utilize the design artifacts as knowledge which can be used...
Tangled in the breast cancer web: an evaluation of the usage of web-based information resources by breast cancer patients.

Science.gov (United States)

Nguyen, Sonia Kim Anh; Ingledew, Paris-Ann

2013-12-01

This study describes Internet use by breast cancer patients highlighting search patterns and examining the impact of web-based information on the clinical encounter. From September 2011 to January 2012, breast cancer patients at a cancer center completed a survey. Answers were closed and open-ended. Eighty-one patients were approached and 56 completed the survey. Forty-five (80 %) respondents used the Internet and 32 (71 %) searched for breast cancer information. All used Google as their principal search engine. To evaluate quality, 47 % referred to author credentials and 41 % examined references. Most sought information with respect to treatment or prognosis. Eighty percent felt that the information increased their knowledge and influenced treatment decision making for 53 %. This study highlights search patterns and factors used by breast cancer patients in seeking web-based information. Physicians must appreciate that patients use the Internet and address discrepancies between information sought and that which is available.
F-OWL: An Inference Engine for Semantic Web

Science.gov (United States)

Zou, Youyong; Finin, Tim; Chen, Harry

2004-01-01

Understanding and using the data and knowledge encoded in semantic web documents requires an inference engine. F-OWL is an inference engine for the semantic web language OWL, based on F-logic, an approach to defining frame-based systems in logic. F-OWL is implemented using XSB and Flora-2 and takes full advantage of their features. We describe how F-OWL computes ontology entailment and compare it with other description logic based approaches. We also describe TAGA, a trading agent environment that we have used as a test bed for F-OWL and to explore how multiagent systems can use semantic web concepts and technology.
Can Interactive Web-Based CAD Tools Improve the Learning of Engineering Drawing? A Case Study

Science.gov (United States)

Pando Cerra, Pablo; Suárez González, Jesús M.; Busto Parra, Bernardo; Rodríguez Ortiz, Diana; Álvarez Peñín, Pedro I.

2014-01-01

Many current Web-based learning environments facilitate the theoretical teaching of a subject but this may not be sufficient for those disciplines that require a significant use of graphic mechanisms to resolve problems. This research study looks at the use of an environment that can help students learn engineering drawing with Web-based CAD…
Query transformations and their role in Web searching by the members of the general public

Directory of Open Access Journals (Sweden)

Martin Whittle

2006-01-01

Introduction. This paper reports preliminary research in a primarily experimental study of how the general public search for information on the Web. The focus is on the query transformation patterns that characterise searching. Method. In this work, we have used transaction logs from the Excite search engine to develop methods for analysing query transformations that should aid the analysis of our ongoing experimental work. Our methods involve the use of similarity techniques to link queries with the most similar previous query in a train. The resulting query transformations are represented as a list of codes representing a whole search. Analysis. It is shown how query transformation sequences can be represented as graphical networks and some basic statistical results are shown. A correlation analysis is performed to examine the co-occurrence of Boolean and quotation mark changes with the syntactic changes. Results. A frequency analysis of the occurrence of query transformation codes is presented. The connectivity of graphs obtained from the query transformation is investigated and found to follow an exponential scaling law. The correlation analysis reveals a number of patterns that provide some interesting insights into Web searching by the general public. Conclusion. We have developed analytical methods based on query similarity that can be applied to our current experimental work with volunteer subjects. The results of these will form part of a database with the aim of developing an improved understanding of how the public search the Web.
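Linking each query to its most similar predecessor and coding the resulting transformation might look roughly like the sketch below. The similarity measure (Jaccard overlap of query terms), the tie-breaking rule and the single-letter codes are assumptions chosen for illustration; the abstract does not specify the measure or the coding scheme actually used in the study.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of terms."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def link_queries(queries):
    """For each query, index of the most similar earlier query (or None)."""
    links = []
    for i, query in enumerate(queries):
        terms = query.lower().split()
        best, best_sim = None, 0.0
        for j in range(i):
            sim = jaccard(terms, queries[j].lower().split())
            # Assumption: ignore zero overlap; ties go to the later query.
            if sim > 0 and sim >= best_sim:
                best, best_sim = j, sim
        links.append(best)
    return links


def transformation_code(prev, curr):
    """Hypothetical codes: I identical, S specialised, G generalised, R other."""
    p, c = set(prev.lower().split()), set(curr.lower().split())
    if p == c:
        return "I"
    if p < c:
        return "S"  # terms were added
    if c < p:
        return "G"  # terms were removed
    return "R"      # reformulation


def code_session(queries):
    """Represent a whole search session as a list of transformation codes."""
    codes = []
    for i, link in enumerate(link_queries(queries)):
        # "N" marks a new query with no similar predecessor.
        codes.append("N" if link is None else transformation_code(queries[link], queries[i]))
    return codes
```

The list returned by `link_queries` also defines the edges of a graph over the session's queries, which is one way the graphical network representation mentioned in the abstract could be built.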
SA-Search: a web tool for protein structure mining based on a Structural Alphabet

OpenAIRE

Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

2004-01-01

SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of f...
Detecting the norovirus season in Sweden using search engine data--meeting the needs of hospital infection control teams.

Science.gov (United States)

Edelstein, Michael; Wallensten, Anders; Zetterqvist, Inga; Hulth, Anette

2014-01-01

Norovirus outbreaks severely disrupt healthcare systems. We evaluated whether Websök, an internet-based surveillance system using search engine data, improved norovirus surveillance and response in Sweden. We compared Websök users' characteristics with the general population, cross-correlated weekly Websök searches with laboratory notifications between 2006 and 2013, compared the time Websök and laboratory data crossed the epidemic threshold and surveyed infection control teams about their perception and use of Websök. Users of Websök were not representative of the general population. Websök correlated with laboratory data (b = 0.88-0.89) and gave an earlier signal to the onset of the norovirus season compared with laboratory-based surveillance. 17/21 (81%) infection control teams answered the survey, of which 11 (65%) believed Websök could help with infection control plans. Websök is a low-resource, easily replicable system that detects the norovirus season as reliably as laboratory data, but earlier. Using Websök in routine surveillance can help infection control teams prepare for the yearly norovirus season.
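Cross-correlating weekly search counts with laboratory notifications at several lags can be sketched as below. This is an illustration only: the abstract does not state which correlation statistic was used, so the Pearson coefficient, the function names and the weekly counts in the usage lines are all assumptions and made-up example values.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(searches, notifications, max_lag):
    """Correlate searches in week t with notifications in week t + lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = searches[: len(searches) - lag]
        y = notifications[lag:]
        result[lag] = pearson(x, y)
    return result


# Made-up weekly counts: notifications follow searches by one week.
weekly_searches = [1, 3, 7, 4, 2, 1, 1, 2]
weekly_notifications = [0, 1, 3, 7, 4, 2, 1, 1]
correlations = lagged_correlation(weekly_searches, weekly_notifications, max_lag=2)
best_lag = max(correlations, key=correlations.get)
```

A positive best lag in such an analysis would indicate that the search signal leads the laboratory signal, which is the kind of earlier warning the abstract reports.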
Designing a Pedagogical Model for Web Engineering Education: An Evolutionary Perspective

Science.gov (United States)

Hadjerrouit, Said

2005-01-01

In contrast to software engineering, which relies on relatively well established development approaches, there is a lack of a proven methodology that guides Web engineers in building reliable and effective Web-based systems. Currently, Web engineering lacks process models, architectures, suitable techniques and methods, quality assurance, and a…
Bat-Inspired Algorithm Based Query Expansion for Medical Web Information Retrieval.

Science.gov (United States)

Khennak, Ilyes; Drias, Habiba

2017-02-01

With the increasing amount of medical data available on the Web, health information has become one of the most widely searched topics on the Internet. Patients and people of several backgrounds are now using Web search engines to acquire medical information, including information about a specific disease, medical treatment or professional advice. Nonetheless, due to a lack of medical knowledge, many laypeople have difficulties in forming appropriate queries to articulate their inquiries, which makes their search queries imprecise due to the use of unclear keywords. The use of these ambiguous and vague queries to describe the patients' needs has resulted in a failure of Web search engines to retrieve accurate and relevant information. One of the most natural and promising methods to overcome this drawback is Query Expansion. In this paper, an original approach based on the Bat Algorithm is proposed to improve the retrieval effectiveness of query expansion in the medical field. In contrast to the existing literature, the proposed approach uses the Bat Algorithm to find the best expanded query among a set of expanded query candidates, while maintaining low computational complexity. Moreover, this new approach allows the length of the expanded query to be determined empirically. Numerical results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient compared to the baseline.
A Survey On Various Web Template Detection And Extraction Methods

Directory of Open Access Journals (Sweden)

Neethu Mary Varghese

2015-03-01

Full Text Available In today's digital world, reliance on the World Wide Web as a source of information is extensive. Users increasingly rely on web-based search engines to provide accurate search results on a wide range of topics that interest them. The search engines in turn parse the vast repository of web pages searching for relevant information. However, the majority of web portals are designed using web templates, which provide a consistent look and feel to end users. The presence of these templates can, however, influence search results, leading to inaccurate results being delivered to the users. Therefore, to improve the accuracy and reliability of search results, identification and removal of web templates from the actual content is essential. A wide range of approaches are commonly employed to achieve this, and this paper focuses on the study of the various approaches of template detection and extraction that can be applied across homogenous as well as heterogeneous web pages.
Introducing Model-Based System Engineering Transforming System Engineering through Model-Based Systems Engineering

Science.gov (United States)

2014-03-31

Web Presentation ... Software ... Figure 6. Published Web Page from Data Collection ... the term Model Based Engineering (MBE), Model Driven Engineering (MDE), or Model-Based Systems
Integration of Web mining and web crawler: Relevance and State of Art

OpenAIRE

Subhendu Kumar Pani; Deepak Mohapatra; Bikram Keshari Ratha

2010-01-01

This study presents the role of the web crawler in a web mining environment. As the growth of the World Wide Web has exceeded all expectations, research on Web mining is growing more and more. Web mining is a research topic which combines two active research areas: Data Mining and the World Wide Web. So, the World Wide Web is a very advanced area for data mining research. Search engines that are based on a web crawling framework are also used in web mining to find the interacted web pages. This paper discu...
Extracting Macroscopic Information from Web Links.

Science.gov (United States)

Thelwall, Mike

2001-01-01

Discussion of Web-based link analysis focuses on an evaluation of Ingwersen's proposed external Web Impact Factor for the original use of the Web, namely the interlinking of academic research. Studies relationships between academic hyperlinks and research activities for British universities and discusses the use of search engines for Web link…
Design and implementation of Web-based SDUV-FEL engineering database system

International Nuclear Information System (INIS)

Sun Xiaoying; Shen Liren; Dai Zhimin; Xie Dong

2006-01-01

A design of Web-based SDUV-FEL engineering database and its implementation are introduced. This system will save and offer static data and archived data of SDUV-FEL, and build a proper and effective platform for share of SDUV-FEL data. It offers usable and reliable SDUV-FEL data for operators and scientists. (authors)
TOWARDS ACTIVE SEO (SEARCH ENGINE OPTIMIZATION) 2.0

Directory of Open Access Journals (Sweden)

Charles-Victor Boutet

2012-12-01

Full Text Available In the age of the writable web, new skills and new practices are appearing. In an environment that allows everyone to communicate information globally, internet referencing (or SEO) is a strategic discipline that aims to generate visibility, internet traffic and maximum exploitation of a site's publications. Often misperceived as a fraud, SEO has evolved to be a facilitating tool for anyone who wishes to reference their website with search engines. In this article we show that it is possible to achieve the first rank in search results for keywords that are very competitive. We show methods that are quick, sustainable and legal, while applying the principles of active SEO 2.0. This article also clarifies some working functions of search engines and some advanced referencing techniques (that are completely ethical and legal), and we lay the foundations for an in-depth reflection on the qualities and advantages of these techniques.
Database Search Engines: Paradigms, Challenges and Solutions.

Science.gov (United States)

Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

2016-01-01

The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

BPELPower—A BPEL execution engine for geospatial web services

Science.gov (United States)

Yu, Genong (Eugene); Zhao, Peisheng; Di, Liping; Chen, Aijun; Deng, Meixia; Bai, Yuqi

2012-10-01

The Business Process Execution Language (BPEL) has become a popular choice for orchestrating and executing workflows in the Web environment. As one special kind of scientific workflow, geospatial Web processing workflows are data-intensive, deal with complex structures in data and geographic features, and execute automatically with limited human intervention. To enable the proper execution and coordination of geospatial workflows, a specially enhanced BPEL execution engine is required. BPELPower was designed, developed, and implemented as a generic BPEL execution engine with enhancements for executing geospatial workflows. The enhancements are especially in its capabilities in handling Geography Markup Language (GML) and standard geospatial Web services, such as the Web Processing Service (WPS) and the Web Feature Service (WFS). BPELPower has been used in several demonstrations over the decade. Two scenarios were discussed in detail to demonstrate the capabilities of BPELPower. That study showed a standard-compliant, Web-based approach for properly supporting geospatial processing, with the only enhancement at the implementation level. Pattern-based evaluation and performance improvement of the engine are discussed: BPELPower directly supports 22 workflow control patterns and 17 workflow data patterns. In the future, the engine will be enhanced with high performance parallel processing and broad Web paradigms.
Resource Selection for Federated Search on the Web

NARCIS (Netherlands)

Nguyen, Dong-Phuong; Demeester, Thomas; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web
Urban networks among Chinese cities along "the Belt and Road": A case of web search activity in cyberspace.

Science.gov (United States)

Zhang, Lu; Du, Hongru; Zhao, Yannan; Wu, Rongwei; Zhang, Xiaolei

2017-01-01

"The Belt and Road" initiative has been expected to facilitate interactions among numerous city centers. This initiative would generate a number of centers, both economic and political, which would facilitate greater interaction. To explore how information flows are merged and the specific opportunities that may be offered, Chinese cities along "the Belt and Road" are selected for a case study. Furthermore, urban networks in cyberspace have been characterized by their infrastructure orientation, which implies that there is a relative dearth of studies focusing on the investigation of urban hierarchies by capturing information flows between Chinese cities along "the Belt and Road". This paper employs Baidu, the main web search engine in China, to examine urban hierarchies. The results show that urban networks become more balanced, shifting from a polycentric to a homogenized pattern. Furthermore, cities in networks tend to have both a hierarchical system and a spatial concentration primarily in regions such as Beijing-Tianjin-Hebei, Yangtze River Delta and the Pearl River Delta region. Urban hierarchy based on web search activity does not follow the existing hierarchical system based on geospatial and economic development in all cases. Moreover, urban networks, under the framework of "the Belt and Road", show several significant corridors and more opportunities for more cities, particularly western cities. Furthermore, factors that may influence web search activity are explored. The results show that web search activity is significantly influenced by the economic gap, geographical proximity and administrative rank of the city.
Detecting the Norovirus Season in Sweden Using Search Engine Data – Meeting the Needs of Hospital Infection Control Teams

Science.gov (United States)

Edelstein, Michael; Wallensten, Anders; Zetterqvist, Inga; Hulth, Anette

2014-01-01

Norovirus outbreaks severely disrupt healthcare systems. We evaluated whether Websök, an internet-based surveillance system using search engine data, improved norovirus surveillance and response in Sweden. We compared Websök users' characteristics with the general population, cross-correlated weekly Websök searches with laboratory notifications between 2006 and 2013, compared the time Websök and laboratory data crossed the epidemic threshold and surveyed infection control teams about their perception and use of Websök. Users of Websök were not representative of the general population. Websök correlated with laboratory data (b = 0.88-0.89) and gave an earlier signal to the onset of the norovirus season compared with laboratory-based surveillance. 17/21 (81%) infection control teams answered the survey, of which 11 (65%) believed Websök could help with infection control plans. Websök is a low-resource, easily replicable system that detects the norovirus season as reliably as laboratory data, but earlier. Using Websök in routine surveillance can help infection control teams prepare for the yearly norovirus season. PMID:24955857
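
The cross-correlation of weekly search counts with laboratory notifications described in this record can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the lagged Pearson correlation below is an assumption, and the function names and the sample series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(searches, notifications, max_lag):
    """Correlate searches in week t with notifications in week t + lag.

    Both inputs are assumed to be weekly counts of the same length.
    """
    result = {}
    for lag in range(max_lag + 1):
        x = searches[: len(searches) - lag]  # drop the last `lag` weeks
        y = notifications[lag:]              # drop the first `lag` weeks
        result[lag] = pearson(x, y)
    return result


def best_lag(searches, notifications, max_lag):
    """Lag (in weeks) at which search counts best track notifications."""
    corr = lagged_correlations(searches, notifications, max_lag)
    return max(corr, key=corr.get)


# Made-up weekly counts in which notifications trail searches by one week.
weekly_searches = [1, 2, 5, 9, 4, 2, 1, 1]
weekly_notifications = [0, 1, 2, 5, 9, 4, 2, 1]
lead_weeks = best_lag(weekly_searches, weekly_notifications, max_lag=3)
```

A positive best lag in this sketch means the search series leads the laboratory series, which is the kind of earlier signal the abstract reports.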
Quality of web-based information on cannabis addiction.

Science.gov (United States)

Khazaal, Yasser; Chatton, Anne; Cochand, Sophie; Zullino, Daniele

2008-01-01

This study evaluated the quality of Web-based information on cannabis use and addiction and investigated particular content quality indicators. Three keywords ("cannabis addiction," "cannabis dependence," and "cannabis abuse") were entered into two popular World Wide Web search engines. Websites were assessed with a standardized proforma designed to rate sites on the basis of accountability, presentation, interactivity, readability, and content quality. "Health on the Net" (HON) quality label, and DISCERN scale scores were used to verify their efficiency as quality indicators. Of the 94 Websites identified, 57 were included. Most were commercial sites. Based on outcome measures, the overall quality of the sites turned out to be poor. A global score (the sum of accountability, interactivity, content quality and esthetic criteria) appeared as a good content quality indicator. While cannabis education Websites for patients are widespread, their global quality is poor. There is a need for better evidence-based information about cannabis use and addiction on the Web.
An evaluation of web-based information.

Science.gov (United States)

Murphy, Rebecca; Frost, Susie; Webster, Peter; Schmidt, Ulrike

2004-03-01

To evaluate the quality of web-based information on the treatment of eating disorders and to investigate potential indicators of content quality. Two search engines were queried to obtain 15 commonly accessed websites about eating disorders. Two reviewers evaluated the characteristics, quality of content, and accountability of the sites. Intercorrelations between variables were calculated. The overall quality of the sites was poor based on the outcome measures used. All quality of content measures correlated with a measure of accountability (Silberg, W.M., Lundberg, G.D., & Mussachio, R.A., 1993). There is a lack of quality information on the treatment of eating disorders on the web. Although accountability criteria may be useful indicators of content quality, there is a need to investigate whether these can be usefully applied to other mental health areas. Copyright 2004 by Wiley Periodicals, Inc. Int J Eat Disord 35: 145-154, 2004.
SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines

OpenAIRE

Yumusak, Semih; Dogdu, Erdogan; Kodaz, Halife; Kamilaris, Andreas

2016-01-01

In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system, named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, these search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this ...
Electronic biomedical literature search for budding researcher.

Science.gov (United States)

Thakre, Subhash B; Thakre S, Sushama S; Thakre, Amol D

2013-09-01

Searching for specific and well-defined literature related to the subject of interest is the foremost step in research. When we are familiar with the topic or subject, we can frame an appropriate research question. An appropriate research question is the basis for the study objectives and hypothesis. The Internet provides quick access to an overabundance of medical literature, in the form of primary, secondary and tertiary literature. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. A web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, information, and other types of files. Search engines for internet-based search of medical literature include Google, Google Scholar, Scirus, Yahoo search engine, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web-libraries (National Library of Medicine, Cochrane, Web of Science, Medical Matrix, Emory libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine/database while searching for a particular type of data. Knowledge about types of literature, levels of evidence, and details of search engine features such as availability, user interface, ease of access, reputable content, and period of time covered allows their optimal use and maximal utility in the field of medicine. Literature search is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. It is suggested that a systematic search of literature that uses available electronic resources effectively is more likely to produce quality research.
Deep web search: an overview and roadmap

NARCIS (Netherlands)

Tjin-Kam-Jet, Kien; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

2011-01-01

We review the state-of-the-art in deep web search and propose a novel classification scheme to better compare deep web search systems. The current binary classification (surfacing versus virtual integration) hides a number of implicit decisions that must be made by a developer. We make these
Utilization of a radiology-centric search engine.

Science.gov (United States)

Sharpe, Richard E; Sharpe, Megan; Siegel, Eliot; Siddiqui, Khan

2010-04-01

Internet-based search engines have become a significant component of medical practice. Physicians increasingly rely on information available from search engines as a means to improve patient care, provide better education, and enhance research. Specialized search engines have emerged to more efficiently meet the needs of physicians. Details about the ways in which radiologists utilize search engines have not been documented. The authors categorized every 25th search query in a radiology-centric vertical search engine by radiologic subspecialty, imaging modality, geographic location of access, time of day, use of abbreviations, misspellings, and search language. Musculoskeletal and neurologic imagings were the most frequently searched subspecialties. The least frequently searched were breast imaging, pediatric imaging, and nuclear medicine. Magnetic resonance imaging and computed tomography were the most frequently searched modalities. A majority of searches were initiated in North America, but all continents were represented. Searches occurred 24 h/day in converted local times, with a majority occurring during the normal business day. Misspellings and abbreviations were common. Almost all searches were performed in English. Search engine utilization trends are likely to mirror trends in diagnostic imaging in the region from which searches originate. Internet searching appears to function as a real-time clinical decision-making tool, a research tool, and an educational resource. A more thorough understanding of search utilization patterns can be obtained by analyzing phrases as actually entered as well as the geographic location and time of origination. This knowledge may contribute to the development of more efficient and personalized search engines.
Enhancing food engineering education with interactive web-based simulations

Directory of Open Access Journals (Sweden)

Alexandros Koulouris

2015-04-01

Full Text Available In the traditional deductive approach in teaching any engineering topic, teachers would first expose students to the derivation of the equations that govern the behavior of a physical system and then demonstrate the use of equations through a limited number of textbook examples. This methodology, however, is rarely adequate to unmask the cause-effect and quantitative relationships between the system variables that the equations embody. Web-based simulation, which is the integration of simulation and internet technologies, has the potential to enhance the learning experience by offering an interactive and easily accessible platform for quick and effortless experimentation with physical phenomena. This paper presents the design and development of a web-based platform for teaching basic food engineering phenomena to food technology students. The platform contains a variety of modules (“virtual experiments”) covering the topics of mass and energy balances, fluid mechanics and heat transfer. In this paper, the design and development of three modules for mass balances and heat transfer is presented. Each webpage representing an educational module has the following features: visualization of the studied phenomenon through graphs, charts or videos, computation through a mathematical model and experimentation. The student is allowed to edit key parameters of the phenomenon and observe the effect of these changes on the outputs. Experimentation can be done in a free or guided fashion with a set of prefabricated examples that students can run and self-test their knowledge by answering multiple-choice questions.
Internet Search Engines: Copyright's "Fair Use" in Reproduction and Public Display Rights

National Research Council Canada - National Science Library

Jeweler, Robin

2007-01-01

.... If so, is the activity a "fair use" protected by the Copyright Act? These issues frequently implicate search engines, which scan the web to allow users to find content for uses, both legitimate and illegitimate...
World Wide Web Metaphors for Search Mission Data

Science.gov (United States)

Norris, Jeffrey S.; Wallick, Michael N.; Joswig, Joseph C.; Powell, Mark W.; Torres, Recaredo J.; Mittman, David S.; Abramyan, Lucy; Crockett, Thomas M.; Shams, Khawaja S.; Fox, Jason M.

2010-01-01

A software program that searches and browses mission data emulates a Web browser, containing standard metaphors for Web browsing. By taking advantage of back-end URLs, users may save and share search states. Also, since a Web interface is familiar to users, training time is reduced. Familiar back and forward buttons move through a local search history. A refresh/reload button regenerates a query, and loads in any new data. URLs can be constructed to save search results. Adding context to the current search is also handled through a familiar Web metaphor. The query is constructed by clicking on hyperlinks that represent new components to the search query. The selection of a link appears to the user as a page change; the choice of links changes to represent the updated search and the results are filtered by the new criteria. Selecting a navigation link changes the current query and also the URL that is associated with it. The back button can be used to return to the previous search state. This software is part of the MSLICE release, which was written in Java. It will run on any current Windows, Macintosh, or Linux system.
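
The idea of storing a search state in a URL, so that a link click refines the query and a back button restores the previous state, can be sketched as follows. This is an illustration only and not the MSLICE implementation, which the record says was written in Java; the base URL, the filter names and the helper functions are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical back-end address; the real URL scheme is not given in the record.
BASE_URL = "http://example.invalid/search"


def state_to_url(filters):
    """Serialize a dict of search filters into a shareable URL."""
    # Sorting the keys makes the same state always produce the same URL.
    return BASE_URL + "?" + urlencode(sorted(filters.items()))


def url_to_state(url):
    """Recover the search filters from a URL built by state_to_url."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items()}


def add_filter(url, key, value):
    """Simulate clicking a hyperlink that adds one component to the query."""
    state = url_to_state(url)
    state[key] = value
    return state_to_url(state)


# A local history list plays the role of the back and forward buttons.
history = [state_to_url({"instrument": "navcam"})]
history.append(add_filter(history[-1], "sol", "120"))
previous_state = url_to_state(history[-2])  # what a back button would restore
```

Because each state is fully described by its URL in this sketch, saving or sharing a search amounts to copying a string.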

LigSearch: a knowledge-based web server to identify likely ligands for a protein target

Energy Technology Data Exchange (ETDEWEB)

Beer, Tjaart A. P. de; Laskowski, Roman A. [European Bioinformatics Institute (EMBL–EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD (United Kingdom); Duban, Mark-Eugene [Northwestern University Feinberg School of Medicine, Chicago, Illinois (United States); Chan, A. W. Edith [University College London, London WC1E 6BT (United Kingdom); Anderson, Wayne F. [Northwestern University Feinberg School of Medicine, Chicago, Illinois (United States); Thornton, Janet M., E-mail: thornton@ebi.ac.uk [European Bioinformatics Institute (EMBL–EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD (United Kingdom)

2013-12-01

LigSearch is a web server for identifying ligands likely to bind to a given protein. Identifying which ligands might bind to a protein before crystallization trials could provide a significant saving in time and resources. LigSearch, a web server aimed at predicting ligands that might bind to and stabilize a given protein, has been developed. Using a protein sequence and/or structure, the system searches against a variety of databases, combining available knowledge, and provides a clustered and ranked output of possible ligands. LigSearch can be accessed at http://www.ebi.ac.uk/thornton-srv/databases/LigSearch.
Chemical-text hybrid search engines.

Science.gov (United States)

Zhou, Yingyao; Zhou, Bin; Jiang, Shumei; King, Frederick J

2010-01-01

As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from the inclusion of "false negative" and "false positive" results, and cannot accommodate the diverse repertoire of formats currently available for chemical documents. To address these concerns, we developed an approach called Entity-Canonical Keyword Indexing (ECKI), which converts a chemical entity embedded in a data source into its canonical keyword representation prior to being indexed by text search engines. We implemented ECKI using Microsoft Office SharePoint Server Search, and the resultant hybrid search engine not only supported complex mixed chemical and keyword queries but also was applied to both intranet and Internet environments. We envision that the adoption of ECKI will empower researchers to pose more complex search questions that were not readily attainable previously and to obtain answers at much improved speed and accuracy.
TDCCREC: AN EFFICIENT AND SCALABLE WEB-BASED RECOMMENDATION SYSTEM

Directory of Open Access Journals (Sweden)

K.Latha

2010-10-01

Full Text Available Web browsers are presented with a complex information space in which the volume of available information is huge. A recommender system addresses this by recommending web pages that are related to the current web page, providing the user with further customized reading material. To enhance the performance of recommender systems, we propose a web-based recommendation system, Truth Discovery based Content and Collaborative RECommender (TDCCREC), which is capable of addressing scalability. Existing approaches such as learning automata deal with the usage and navigational patterns of users. On the other hand, the Weighted Association Rule is applied for recommending web pages by assigning weights to each page in all the transactions. Both have their own disadvantages. The websites recommended by search engines carry no guarantee of information correctness and often deliver conflicting information. To address these problems, content-based filtering and collaborative filtering techniques are introduced for recommending web pages to the active user, along with the trustworthiness of the website and the confidence of facts, which outperforms the existing methods. Our results show how the proposed recommender system performs better in predicting the next request of web users.
Identify Web-page Content meaning using Knowledge based System for Dual Meaning Words

OpenAIRE

Sinha, Sukanta; Dattagupta, Rana; Mukhopadhyay, Debajyoti

2012-01-01

The meaning of Web-page content plays a big role when a search engine produces a search result. In most cases the Web-page meaning is stored in the title or meta-tag area, but those meanings do not always match the Web-page content. To overcome this situation we need to go through the Web-page content to identify the Web-page meaning. In cases where the Web-page content holds dual-meaning words, it is really difficult to identify the meaning of the Web-page. In this paper, we are introdu...
Web-Based Simulation Games for the Integration of Engineering and Business Fundamentals

Science.gov (United States)

Calfa, Bruno; Banholzer, William; Alger, Monty; Doherty, Michael

2017-01-01

This paper describes a web-based suite of simulation games that have the purpose to enhance the chemical engineering curriculum with business-oriented decisions. Two simulation cases are discussed whose teaching topics include closing material and energy balances, importance of recycle streams, price-volume relationship in a dynamic market, impact…
Research Proposal for Distributed Deep Web Search

NARCIS (Netherlands)

Tjin-Kam-Jet, Kien

2010-01-01

This proposal identifies two main problems related to deep web search, and proposes a step by step solution for each of them. The first problem is about searching deep web content by means of a simple free-text interface (with just one input field, instead of a complex interface with many input

FPS-RAM: Fast Prefix Search RAM-Based Hardware for Forwarding Engine

Science.gov (United States)

Zaitsu, Kazuya; Yamamoto, Koji; Kuroda, Yasuto; Inoue, Kazunari; Ata, Shingo; Oka, Ikuo

Ternary content addressable memory (TCAM) is becoming very popular for designing high-throughput forwarding engines on routers. However, TCAM has potential problems in terms of hardware and power costs, which limits its ability to deploy large amounts of capacity in IP routers. In this paper, we propose new hardware architecture for fast forwarding engines, called fast prefix search RAM-based hardware (FPS-RAM). We designed FPS-RAM hardware with the intent of maintaining the same search performance and physical user interface as TCAM because our objective is to replace the TCAM in the market. Our RAM-based hardware architecture is completely different from that of TCAM and has dramatically reduced the costs and power consumption to 62% and 52%, respectively. We implemented FPS-RAM on an FPGA to examine its lookup operation.
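
For readers unfamiliar with the lookup a forwarding engine performs, the following is a minimal software sketch of longest-prefix matching over IPv4 routes. It is not the FPS-RAM architecture, whose internals the abstract does not describe; it uses one hash table per prefix length, and the class name, routes and next-hop labels are hypothetical.

```python
import ipaddress


class PrefixTable:
    """Longest-prefix-match table for IPv4, one dict per prefix length."""

    def __init__(self):
        # Maps prefix length -> {network address as int: next hop}.
        self._by_length = {}

    def add(self, cidr, next_hop):
        """Register a route such as '10.1.0.0/16'."""
        net = ipaddress.ip_network(cidr)
        table = self._by_length.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = next_hop

    def lookup(self, address):
        """Return the next hop of the most specific matching prefix, or None."""
        addr = int(ipaddress.ip_address(address))
        # Try the longest prefixes first so the first hit is the best match.
        for length in sorted(self._by_length, reverse=True):
            mask = ((1 << length) - 1) << (32 - length)
            hop = self._by_length[length].get(addr & mask)
            if hop is not None:
                return hop
        return None


routes = PrefixTable()
routes.add("0.0.0.0/0", "default-gateway")
routes.add("10.0.0.0/8", "port-A")
routes.add("10.1.0.0/16", "port-B")
chosen = routes.lookup("10.1.2.3")  # both 10/8 and 10.1/16 match; /16 wins
```

A TCAM compares an address against all stored prefixes in parallel, whereas this sketch probes one table per prefix length in sequence; it only shows what result the hardware must produce, not how fast it produces it.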
Are cannabis prevalence estimates comparable across countries and regions? A cross-cultural validation using search engine query data.

Science.gov (United States)

Steppan, Martin; Kraus, Ludwig; Piontek, Daniela; Siciliano, Valeria

2013-01-01

Prevalence estimation of cannabis use is usually based on self-report data. Although there is evidence on the reliability of this data source, its cross-cultural validity is still a major concern. External objective criteria are needed for this purpose. In this study, cannabis-related search engine query data are used as an external criterion. Data on cannabis use were taken from the 2007 European School Survey Project on Alcohol and Other Drugs (ESPAD). Provincial data came from three Italian nation-wide studies using the same methodology (2006-2008; ESPAD-Italia). Information on cannabis-related search engine query data was based on Google search volume indices (GSI). (1) Reliability analysis was conducted for GSI. (2) Latent measurement models of "true" cannabis prevalence were tested using perceived availability, web-based cannabis searches and self-reported prevalence as indicators. (3) Structure models were set up to test the influences of response tendencies and geographical position (latitude, longitude). In order to test the stability of the models, analyses were conducted on country level (Europe, US) and on provincial level in Italy. Cannabis-related GSI were found to be highly reliable and constant over time. The overall measurement model was highly significant in both data sets. On country level, no significant effects of response bias indicators and geographical position on perceived availability, web-based cannabis searches and self-reported prevalence were found. On provincial level, latitude had a significant positive effect on availability indicating that perceived availability of cannabis in northern Italy was higher than expected from the other indicators. Although GSI showed weaker associations with cannabis use than perceived availability, the findings underline the external validity and usefulness of search engine query data as external criteria. The findings suggest an acceptable relative comparability of national (provincial) prevalence
Development and tuning of an original search engine for patent libraries in medicinal chemistry.

Science.gov (United States)

Pasche, Emilie; Gobeill, Julien; Kreim, Olivier; Oezdemir-Zaech, Fatma; Vachon, Therese; Lovis, Christian; Ruch, Patrick

2014-01-01

The large increase in the size of patent collections has led to the need for efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which intensively uses patent libraries for competitive intelligence and drug development. We describe here the development of an advanced retrieval engine to search information in patent collections in the field of medicinal chemistry. We investigate and combine different strategies and evaluate their respective impact on the performance of the search engine applied to various search tasks, which cover the putatively most frequent search behaviours of intellectual property officers in medical chemistry: 1) a prior art search task; 2) a technical survey task; and 3) a variant of the technical survey task, sometimes called known-item search task, where a single patent is targeted. The optimal tuning of our engine resulted in a top-precision of 6.76% for the prior art search task, 23.28% for the technical survey task and 46.02% for the variant of the technical survey task. We observed that co-citation boosting was an appropriate strategy to improve prior art search tasks, while IPC classification of queries improved retrieval effectiveness for technical survey tasks. Surprisingly, the use of the full body of the patent was always detrimental to search effectiveness. It was also observed that normalizing biomedical entities using curated dictionaries had simply no impact on the search tasks we evaluated. The search engine was finally implemented as a web application within Novartis Pharma. The application is briefly described in the report. We have presented the development of a search engine dedicated to patent search, based on state of the art methods applied to patent corpora. We have shown that a proper tuning of the system to adapt to the various search tasks
An overview of biomedical literature search on the World Wide Web in the third millennium.

Science.gov (United States)

Kumar, Prince; Goel, Roshni; Jain, Chandni; Kumar, Ashish; Parashar, Abhishek; Gond, Ajay Ratan

2012-06-01

Complete access to the existing pool of biomedical literature and the ability to "hit" upon the exact information of the relevant specialty are becoming essential elements of academic and clinical expertise. With the rapid expansion of the literature database, it is almost impossible to keep up to date with every innovation. Using the Internet, however, most people can freely access this literature at any time, from almost anywhere. This paper highlights the use of the Internet in obtaining valuable biomedical research information, which is mostly available from journals, databases, textbooks and e-journals in the form of web pages, text materials, images, and so on. The authors present an overview of web-based resources for biomedical researchers, providing information about Internet search engines (e.g., Google), web-based bibliographic databases (e.g., PubMed, IndMed) and how to use them, and other online biomedical resources that can assist clinicians in reaching well-informed clinical decisions.
Spatial Visualization Learning in Engineering: Traditional Methods vs. a Web-Based Tool

Science.gov (United States)

Pedrosa, Carlos Melgosa; Barbero, Basilio Ramos; Miguel, Arturo Román

2014-01-01

This study compares an interactive learning manager for graphic engineering to develop spatial vision (ILMAGE_SV) to traditional methods. ILMAGE_SV is an asynchronous web-based learning tool that allows the manipulation of objects with a 3D viewer, self-evaluation, and continuous assessment. In addition, student learning may be monitored, which…
DRUMS: a human disease related unique gene mutation search engine.

Science.gov (United States)

Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

2011-10-01

With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data by a search engine approach. In this work, we have developed the disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutations, as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from the LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts.

Science.gov (United States)

Naito, Yuki; Bono, Hidemasa

2012-07-01

GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users.
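The mismatch-tolerant and degenerate-code matching described in this abstract can be illustrated with a naive sliding-window scan. This is only a sketch of the matching rule, not GGRNA's actual implementation (the abstract does not describe its index), and it would be far too slow for a real transcript database. The function name `find_matches` and the choice to let both features be combined in one call are assumptions made for the illustration; the abstract presents them as alternatives.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_matches(query, target, max_mismatches=0):
    """Return 0-based start positions where query matches target.

    A query symbol matches a target base if the base is in the set the
    IUPAC code stands for. Up to max_mismatches positions may disagree.
    """
    query = query.upper()
    target = target.upper()
    if not query:
        raise ValueError("query must not be empty")
    hits = []
    width = len(query)
    for start in range(len(target) - width + 1):
        mismatches = 0
        for q, t in zip(query, target[start:start + width]):
            # Unknown query symbols only match themselves.
            if t not in IUPAC.get(q, q):
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append(start)
    return hits
```

For example, `find_matches("ACNT", "TTACGTTT")` uses the degenerate code N, while `find_matches("ACCT", "TTACGTTT", max_mismatches=1)` tolerates one substitution.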
The LAILAPS Search Engine: Relevance Ranking in Life Science Databases

Directory of Open Access Journals (Sweden)

Lange Matthias

2010-06-01

Search engines and retrieval systems are popular tools on the life science desktop. Manually inspecting hundreds of database entries that reflect a life science concept or fact is time-intensive daily work. What matters is not the number of query results but their relevance. In this paper, we present the LAILAPS search engine for life science databases. The concept is to combine a novel feature model for relevance ranking, a machine learning approach to model user relevance profiles, ranking improvement by user feedback tracking, and an intuitive, slim web user interface that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and are expanded with synonyms. Because it supports a flexible text index and a simple data import format, LAILAPS can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases.
IdentiPy: an extensible search engine for protein identification in shotgun proteomics.

Science.gov (United States)

Levitsky, Lev I; Ivanov, Mark V; Lobas, Anna A; Bubis, Julia A; Tarasova, Irina A; Solovyeva, Elizaveta M; Pridatchenko, Marina L; Gorshkov, Mikhail V

2018-04-23

We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of the X!Tandem scoring algorithm and its novel "auto-tune" feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Auto-tune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the auto-tune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in post-processing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source and can be freely extended to use third-party scoring functions or processing algorithms, and allows customization of the search workflow for specialized applications.
A Web portal for the Engineering and Equipment Data Management System at CERN

International Nuclear Information System (INIS)

Tsyganov, A; Petit, S; Martel, P; Milenkovic, S; Suwalska, A; Delamare, C; Widegren, D; Amerigo, S Mallon; Pettersson, T

2010-01-01

CERN, the European Laboratory for Particle Physics, located in Geneva, Switzerland, has recently started the Large Hadron Collider (LHC), a 27 km particle accelerator. The CERN Engineering and Equipment Data Management Service (EDMS) provides support for managing engineering and equipment information throughout the entire lifecycle of a project. Based on several data management systems, both commercial and developed in-house, this service supports management and follow-up of different kinds of information throughout the lifecycle of the LHC project: design, manufacturing, installation, commissioning data, maintenance and more. The data collection phase, carried out by specialists, is now being replaced by a phase during which data will be consulted on an extensive basis by non-expert users. In order to address this change, a Web portal for the EDMS has been developed. It brings together in one space all the aspects covered by the EDMS: project and document management, asset tracking and safety follow-up. This paper presents the EDMS Web portal, its dynamic content management and its 'one click' information search engine.
ERRATUM: TOWARDS ACTIVE SEO (SEARCH ENGINE OPTIMIZATION) 2.0

Directory of Open Access Journals (Sweden)

Charles-Victor Boutet

2013-04-01

In the age of the writable web, new skills and new practices are appearing. In an environment that allows everyone to communicate information globally, internet referencing (or SEO) is a strategic discipline that aims to generate visibility, internet traffic and maximum exploitation of a site's publications. Often misperceived as fraud, SEO has evolved into a facilitating tool for anyone who wishes to reference their website with search engines. In this article we show that it is possible to achieve the first rank in search results for very competitive keywords. We show methods that are quick, sustainable and legal, while applying the principles of active SEO 2.0. This article also clarifies some working functions of search engines and some advanced referencing techniques (that are completely ethical and legal), and we lay the foundations for an in-depth reflection on the qualities and advantages of these techniques.
Exposing the Hidden-Web Induced by Ajax

NARCIS (Netherlands)

Mesbah, A.; Van Deursen, A.

2008-01-01

AJAX is a very promising approach for improving rich interactivity and responsiveness of web applications. At the same time, AJAX techniques increase the totality of the hidden web by shattering the metaphor of a web ‘page’ upon which general search engines are based. This paper describes a
FDRAnalysis: a tool for the integrated analysis of tandem mass spectrometry identification results from multiple search engines.

Science.gov (United States)

Wedge, David C; Krishna, Ritesh; Blackhurst, Paul; Siepen, Jennifer A; Jones, Andrew R; Hubbard, Simon J

2011-04-01

Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the postprocessing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server and a downloadable application that makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline also provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download.
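The decoy-database idea mentioned in this abstract can be sketched as a generic target-decoy estimate: rank the peptide spectrum matches (PSMs) by score, estimate the FDR at each score threshold as decoys divided by targets, and convert those estimates to monotone q-values. This is a textbook-style sketch under stated assumptions, not the multi-engine combination algorithm that FDRAnalysis implements; the function name `decoy_fdr_qvalues` and the `(score, is_decoy)` tuple layout are hypothetical.

```python
def decoy_fdr_qvalues(psms):
    """Estimate q-values for PSMs with a simple target-decoy approach.

    psms: iterable of (score, is_decoy) pairs, higher score = better.
    Returns a list of (score, is_decoy, q_value), best score first.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = 0
    decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / targets) if targets else 1.0)

    # q-value: the smallest FDR at this threshold or any looser one,
    # which makes the values non-decreasing down the ranked list.
    qvalues = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]
```

A caller would typically keep only target PSMs whose q-value falls below a chosen cutoff such as 0.01.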
Introduction to Chemical Engineering Reactor Analysis: A Web-Based Reactor Design Game

Science.gov (United States)

Orbey, Nese; Clay, Molly; Russell, T.W. Fraser

2014-01-01

An approach to explain chemical engineering through a Web-based interactive game design was developed and used with college freshman and junior/senior high school students. The goal of this approach was to demonstrate how to model a lab-scale experiment, and use the results to design and operate a chemical reactor. The game incorporates both…
Development of Web-Based Learning Environment Model to Enhance Cognitive Skills for Undergraduate Students in the Field of Electrical Engineering

Science.gov (United States)

Lakonpol, Thongmee; Ruangsuwan, Chaiyot; Terdtoon, Pradit

2015-01-01

This research aimed to develop a web-based learning environment model for enhancing cognitive skills of undergraduate students in the field of electrical engineering. The research is divided into 4 phases: 1) investigating the current status and requirements of web-based learning environment models. 2) developing a web-based learning environment…
Optimizing Online Suicide Prevention: A Search Engine-Based Tailored Approach.

Science.gov (United States)

Arendt, Florian; Scherr, Sebastian

2017-11-01

Search engines are increasingly used to seek suicide-related information online, which can serve both harmful and helpful purposes. Google acknowledges this fact and presents a suicide-prevention result for particular search terms. Unfortunately, the result is only presented to a limited number of visitors. Hence, Google is missing the opportunity to provide help to vulnerable people. We propose a two-step approach to a tailored optimization: First, research will identify the risk factors. Second, search engines will reweight algorithms according to the risk factors. In this study, we show that the query share of the search term "poisoning" on Google shows substantial peaks corresponding to peaks in actual suicidal behavior. Accordingly, thresholds for showing the suicide-prevention result should be set to the lowest levels during the spring, on Sundays and Mondays, on New Year's Day, and on Saturdays following Thanksgiving. Search engines can help to save lives globally by utilizing a more tailored approach to suicide prevention.
Using Google’s Custom Search Engine Product to Discover Scholarly Open Access and Cost-Free eBooks from Latin America

Directory of Open Access Journals (Sweden)

Melissa Gasparotto

2018-05-01

Many Latin American scholarly monographs are available for free to read and download in a scattered fashion across the web, hosted on educational, institutional and government websites as well as commercial websites and publishing platforms. There is as of yet no single way to identify all of this content at once, but web-based discovery leveraging existing search engine indexing would seem to be a likely option. This case study suggests and evaluates one such method for discovery of open access and other cost-free scholarly monographs produced in Latin America. One possible configuration of Google’s Custom Search Engine product is proposed and evaluated, and findings suggest its usefulness for a variety of applications, including for collection development, the preparation of thematic research guides with open content, and the enrichment of existing lists of open access eBook sources from Latin America. Unlike existing open access eBook portals, which search across known collections of such materials, search portals such as the one proposed allow users to search across the entire web to uncover scholarly free eBook sources that were previously unknown to them alongside known content sources, a key advantage to this method of discovery. The results further suggest the importance of pursuing discovery of these monograph titles outside established known collections, as an astonishing 45 % of all monographs identified through the Custom Search Engine portal were not discoverable in any edition, print or electronic, through WorldCat, and only 27 % were indexed by Google Books. Additionally, the low number of these eBook titles hosted in preservation-worthy repositories raises cause for concern about their long-term digital availability.
Quality analysis of patient information about knee arthroscopy on the World Wide Web.

Science.gov (United States)

Sambandam, Senthil Nathan; Ramasamy, Vijayaraj; Priyanka, Priyanka; Ilango, Balakrishnan

2007-05-01

This study was designed to ascertain the quality of patient information available on the World Wide Web on the topic of knee arthroscopy. For the purpose of quality analysis, we used a pool of 232 search results obtained from 7 different search engines. We used a modified assessment questionnaire to assess the quality of these Web sites. This questionnaire was developed based on similar studies evaluating Web site quality and includes items on illustrations, accessibility, availability, accountability, and content of the Web site. We also compared results obtained with different search engines and tried to establish the best possible search strategy to attain the most relevant, authentic, and adequate information with minimum time consumption. For this purpose, we first compared 100 search results from the single most commonly used search engine (AltaVista) with the pooled sample containing 20 search results from each of the 7 different search engines. The search engines used were metasearch (Copernic and Mamma), general search (Google, AltaVista, and Yahoo), and health topic-related search engines (MedHunt and Healthfinder). The phrase "knee arthroscopy" was used as the search terminology. Excluding the repetitions, there were 117 Web sites available for quality analysis. These sites were analyzed for accessibility, relevance, authenticity, adequacy, and accountability by use of a specially designed questionnaire. Our analysis showed that most of the sites providing patient information on knee arthroscopy contained outdated information, were inadequate, and were not accountable. Only 16 sites were found to be providing reasonably good patient information and hence can be recommended to patients. Understandably, most of these sites were from nonprofit organizations and educational institutions. Furthermore, our study revealed that using multiple search engines increases patients' chances of obtaining more relevant information rather than using a single search
Determination of geographic variance in stroke prevalence using Internet search engine analytics.

Science.gov (United States)

Walcott, Brian P; Nahed, Brian V; Kahle, Kristopher T; Redjal, Navid; Coumans, Jean-Valery

2011-06-01

Previous methods to determine stroke prevalence, such as nationwide surveys, are labor-intensive endeavors. Recent advances in search engine query analytics have led to a new metric for disease surveillance to evaluate symptomatic phenomena, such as influenza. The authors hypothesized that the use of search engine query data can determine the prevalence of stroke. The Google Insights for Search database was accessed to analyze anonymized search engine query data. The authors' search strategy utilized common search queries used when attempting either to identify the signs and symptoms of a stroke or to perform stroke education. The search logic was as follows: (stroke signs + stroke symptoms + mini stroke - heat) from January 1, 2005, to December 31, 2010. The relative number of searches performed (the interest level) for this search logic was established for all 50 states and the District of Columbia. A Pearson product-moment correlation coefficient was calculated from the state-specific stroke prevalence data previously reported. Web search engine interest level was available for all 50 states and the District of Columbia for the period from January 1, 2005, to December 31, 2010. The interest level was highest in Alabama and Tennessee (100 and 96, respectively) and lowest in California and Virginia (58 and 53, respectively). The Pearson correlation coefficient (r) was calculated to be 0.47 (p = 0.0005, 2-tailed). Search engine query data analysis allows for the determination of relative stroke prevalence. Further investigation will reveal the reliability of this metric to determine temporal pattern analysis and prevalence in this and other symptomatic diseases.
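The Pearson product-moment correlation used in this study can be computed directly from paired observations, for instance one search interest level and one reported prevalence figure per state. The following is a minimal sketch of that formula only; the function name `pearson_r` is an assumption, and no data from the study is reproduced here beyond what the abstract reports.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In the study's setting, `xs` would hold the 51 interest levels and `ys` the corresponding state-specific prevalence values; a significance test such as the reported two-tailed p-value needs an additional step that is not shown here.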
Multimedia Search Engines : Concept, Performance, and Types

OpenAIRE

Sayed Rabeh Sayed

2005-01-01

This research on multimedia search engines begins by defining search engines in general and multimedia search engines in particular, then explains how they work and divides them into video search engines, image search engines, and audio search engines. Finally, it reviews a sample of multimedia search engines.

Quantifying retrieval bias in Web archive search

NARCIS (Netherlands)

Samar, Thaer; Traub, Myriam C.; van Ossenbruggen, Jacco; Hardman, Lynda; de Vries, Arjen P.

2018-01-01

A Web archive usually contains multiple versions of documents crawled from the Web at different points in time. One possible way for users to access a Web archive is through full-text search systems. However, previous studies have shown that these systems can induce a bias, known as the
Market Dominance and Search Quality in the Search Engine Market

NARCIS (Netherlands)

Lianos, I.; Motchenkova, E.I.

2013-01-01

We analyze a search engine market from a law and economics perspective and incorporate the choice of quality-improving innovations by a search engine platform in a two-sided model of Internet search engine. In the proposed framework, we first discuss the legal issues the search engine market raises
Improvement of natural image search engines results by emotional filtering

Directory of Open Access Journals (Sweden)

Patrice Denis

2016-04-01

With the Internet 2.0 era, managing user emotions is a problem that more and more actors are interested in. Historically, the first notions of emotion sharing were expressed and defined with emoticons. They allowed users to show their emotional status to others in an impersonal and emotionless digital world. Now, in the Internet of social media, users share lots of content with each other every day on Facebook, Twitter, Google+ and so on. Several popular web sites such as Flickr, Picasa, Pinterest, Instagram or DeviantArt are now specifically based on sharing image content as well as personal emotional status. This kind of information is economically very valuable, as it can for instance help commercial companies sell more efficiently. In fact, with this kind of emotional information, companies can better target their customers' needs and/or sell them more products. Research has been, and still is, interested in mining emotional information from user data. In this paper, we focus on the impact of emotions from images that have been collected from image search engines. More specifically, our proposition is the creation of a filtering layer applied to the results of such image search engines. Our peculiarity lies in the fact that it is, to our knowledge, the first attempt to filter image search engine results with an emotional filtering approach.
Exploring the academic invisible web

OpenAIRE

Lewandowski, Dirk; Mayr, Philipp

2006-01-01

Purpose: To provide a critical review of Bergman’s 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far. Design/methodol...
The Impact of User Knowledge on Web Search Satisfaction

OpenAIRE

Fadhilah M. Yamin; T. Ramayah

2011-01-01

Problem statement: Searching on the web is a tedious process as it requires knowledge and skills on what and how to search. What to search is basically, the core of the searching activity as it represents the need of the searcher. How to search is related to the knowledge on how the facilities available on the web can be utilized in order to achieve the needs. Search satisfaction is the level of measurement that describes the achievement of the searcher towards his/her information needs. Appr...
Which Search Engine Is the Most Used One among University Students?

Science.gov (United States)

Cavus, Nadire; Alpan, Kezban

2010-01-01

The importance of information is increasing in the information age that we are living in with internet becoming the major information resource for people with rapidly increasing number of documents. This situation makes finding information on the internet without web search engines impossible. The aim of the study is revealing most widely used…
Incorporating the surfing behavior of web users into PageRank

OpenAIRE

Ashyralyyev, Shatlyk

2013-01-01

Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013. Thesis (Master's) -- Bilkent University, 2013. Includes bibliographical references leaves 68-73 One of the most crucial factors that determines the effectiveness of a large-scale commercial web search engine is the ranking (i.e., order) in which web search results are presented to the end user. In modern web search engines, the skeleton for the rank...
Applying Russian Search Engines to market Finnish Corporates

OpenAIRE

Pankratovs, Vladimirs

2013-01-01

The goal of this thesis work is to provide basic knowledge of Russia-based Search Engines marketing capabilities. After reading this material, the user is able to diverge different kinds of Search Engine Marketing tools and can perform advertising campaigns. This study includes information about the majority of tools available to the user and provides up to date screenshots of Russian Search engines front-end, which can be useful in further work. Study discusses the main principles and ba...
Situational Requirements Engineering for the Development of Content Management System-based Web Applications

NARCIS (Netherlands)

Souer, J.; van de Weerd, I.; Versendaal, J.M.; Brinkkemper, S.

2005-01-01

Web applications are evolving towards strong content-centered Web applications. The development processes and implementation of these applications are unlike the development and implementation of traditional information systems. In this paper we propose WebEngineering Method; a method for developing
More Effective Web Search Using Bigrams and Trigrams

OpenAIRE

Peter Vamplew; Vishv Malhotra; David Johnson

2006-01-01

This paper investigates the effectiveness of quoted bigrams and trigrams as query terms to target web search. Prior research in this area has largely focused on static corpora each containing only a few million documents, and has reported mixed (usually negative) results. We investigate the bigram/trigram extraction problem and present an extraction algorithm that shows promising results when applied to real-time web search. We also present a prototype augmented search software package that c...
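The idea of using quoted bigrams and trigrams as query terms can be illustrated with a simple frequency-based extractor. This is a hedged sketch only: the extraction algorithm the authors propose is not described in the visible part of the abstract, so plain n-gram counting is swapped in here, and the function name `quoted_ngram_terms` and its parameters are hypothetical.

```python
import re
from collections import Counter


def quoted_ngram_terms(text, n, top_k=3):
    """Return the top_k most frequent word n-grams as quoted query terms."""
    # Lowercase and split into alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(
        " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
    )
    # Wrap each n-gram in double quotes so a search engine treats it
    # as an exact phrase.
    return ['"%s"' % gram for gram, _count in counts.most_common(top_k)]
```

The returned strings could be joined with spaces and appended to a base query; with `n=2` the function yields bigrams and with `n=3` trigrams.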
A web-based approach to data imputation

KAUST Repository

Li, Zhixu; Sharaf, Mohamed Abdel Fattah; Sitbon, Laurianne; Sadiq, Shazia Wasim; Indulska, Marta; Zhou, Xiaofang

2013-01-01

principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme
Usability evaluation of an experimental text summarization system and three search engines: implications for the reengineering of health care interfaces.

Science.gov (United States)

Kushniruk, Andre W; Kan, Min-Yem; McKeown, Kathleen; Klavans, Judith; Jordan, Desmond; LaFlamme, Mark; Patel, Vimia L

2002-01-01

This paper describes the comparative evaluation of an experimental automated text summarization system, Centrifuser, and three conventional search engines: Google, Yahoo and About.com. Centrifuser provides information to patients and families relevant to their questions about specific health conditions. It then produces a multidocument summary of articles retrieved by a standard search engine, tailored to the user's question. Subjects, consisting of friends or family of hospitalized patients, were asked to "think aloud" as they interacted with the four systems. The evaluation involved audio and video recording of subject interactions with the interfaces in situ at a hospital. Results of the evaluation show that subjects found Centrifuser's summarization capability useful and easy to understand. In comparing Centrifuser to the three search engines, subjects' ratings varied; however, specific interface features were deemed useful across interfaces. We conclude with a discussion of the implications for engineering Web-based retrieval systems.
[Advanced online search techniques and dedicated search engines for physicians].

Science.gov (United States)

Nahum, Yoav

2008-02-01

In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.
Critical Reading of the Web

Science.gov (United States)

Griffin, Teresa; Cohen, Deb

2012-01-01

The ubiquity and familiarity of the world wide web means that students regularly turn to it as a source of information. In doing so, they "are said to rely heavily on simple search engines, such as Google to find what they want." Researchers have also investigated how students use search engines, concluding that "the young web users tended to…
Digging Deeper: The Deep Web.

Science.gov (United States)

Turner, Laura

2001-01-01

Focuses on the Deep Web, defined as Web content in searchable databases of the type that can be found only by direct query. Discusses the problems of indexing; inability to find information not indexed in the search engine's database; and metasearch engines. Describes 10 sites created to access online databases or directly search them. Lists ways…
Improving PHENIX search with Solr, Nutch and Drupal

International Nuclear Information System (INIS)

Morrison, Dave; Sourikova, Irina

2012-01-01

During its 20 years of R and D, construction and operation the PHENIX experiment at the Relativistic Heavy Ion Collider (RHIC) has accumulated large amounts of proprietary collaboration data that is hosted on many servers around the world and is not open for commercial search engines for indexing and searching. The legacy search infrastructure did not scale well with the fast growing PHENIX document base and produced results inadequate in both precision and recall. After considering the possible alternatives that would provide an aggregated, fast, full text search of a variety of data sources and file formats we decided to use Nutch [1] as a web crawler and Solr [2] as a search engine. To present XML-based Solr search results in a user-friendly format we use Drupal [3] as a web interface to Solr. We describe the experience of building a federated search for a heterogeneous collection of 10 million PHENIX documents with Nutch, Solr and Drupal.
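As an illustration of how a web front end might ask Solr for XML results, here is a minimal sketch that builds a request URL for Solr's standard `/select` handler. The host, the core name `phenix`, and the field names are hypothetical; the abstract does not describe the actual PHENIX schema or deployment, and older Solr releases exposed the handler at a slightly different path.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, core, text, rows=10,
                     fields=("id", "title", "url"), writer="xml"):
    """Return a URL for Solr's /select request handler.

    q, rows, fl and wt are standard Solr query parameters; the core
    name and field names passed in are assumptions for illustration.
    """
    params = {
        "q": text,                 # full-text query string
        "rows": rows,              # number of hits to return
        "fl": ",".join(fields),    # stored fields to include in each hit
        "wt": writer,              # response writer: "xml" or "json"
    }
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and core name.
    print(build_solr_query("http://localhost:8983", "phenix", "muon tracker"))
```

A front end such as the Drupal interface mentioned above would fetch this URL and render the returned documents; that step is omitted here because it depends on the site's configuration.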
Improving PHENIX search with Solr, Nutch and Drupal.

Science.gov (United States)

Morrison, Dave; Sourikova, Irina

2012-12-01

During its 20 years of R&D, construction and operation the PHENIX experiment at the Relativistic Heavy Ion Collider (RHIC) has accumulated large amounts of proprietary collaboration data that is hosted on many servers around the world and is not open for commercial search engines for indexing and searching. The legacy search infrastructure did not scale well with the fast growing PHENIX document base and produced results inadequate in both precision and recall. After considering the possible alternatives that would provide an aggregated, fast, full text search of a variety of data sources and file formats we decided to use Nutch [1] as a web crawler and Solr [2] as a search engine. To present XML-based Solr search results in a user-friendly format we use Drupal [3] as a web interface to Solr. We describe the experience of building a federated search for a heterogeneous collection of 10 million PHENIX documents with Nutch, Solr and Drupal.
World Wide Web-based system for the calculation of substituent parameters and substituent similarity searches.

Science.gov (United States)

Ertl, P

1998-02-01

Easy to use, interactive, and platform-independent WWW-based tools are ideal for development of chemical applications. By using the newly emerging Web technologies such as Java applets and sophisticated scripting, it is possible to deliver powerful molecular processing capabilities directly to the desk of synthetic organic chemists. In Novartis Crop Protection in Basel, a Web-based molecular modelling system has been in use since 1995. In this article two new modules of this system are presented: a program for interactive calculation of important hydrophobic, electronic, and steric properties of organic substituents, and a module for substituent similarity searches enabling the identification of bioisosteric functional groups. Various possible applications of calculated substituent parameters are also discussed, including automatic design of molecules with the desired properties and creation of targeted virtual combinatorial libraries.
Search of the Deep and Dark Web via DARPA Memex

Science.gov (United States)

Mattmann, C. A.

2015-12-01

Search has progressed through several stages due to the increasing size of the Web. Search engines first focused on text and its rate of occurrence; then on link analysis and citation; then on interactivity and guided search; and now on the use of social media - who we interact with, what we comment on, and who we follow (and who follows us). The next stage, referred to as "deep search," requires solutions that can bring together text, images, video, importance, interactivity, and social media to solve this challenging problem. The Apache Nutch project provides an open framework for large-scale, targeted, vertical search with capabilities to support all past and potential future search engine foci. Nutch is a flexible infrastructure allowing open access to ranking, to URL selection and filtering approaches, and to the link graph generated from search, and Nutch has spawned entire sub-communities including Apache Hadoop and Apache Tika. It addresses many current needs with the capability to support new technologies such as image and video. On the DARPA Memex project, we are creating specific extensions to Nutch that will directly improve its overall technological superiority for search and that will directly allow us to address complex search problems including human trafficking. We are integrating state-of-the-art algorithms developed by Kitware for IARPA Aladdin combined with work by Harvard to provide image and video understanding support allowing automatic detection of people and things and massive deployment via Nutch. We are expanding Apache Tika for scene understanding, object/person detection and classification in images/video. We are delivering an interactive and visual interface for initiating Nutch crawls. The interface uses Python technologies to expose Nutch data and to provide a domain specific language for crawls. With the Bokeh visualization library we are delivering simple interactive crawl visualization and
`Googling' Terrorists: Are Northern Irish Terrorists Visible on Internet Search Engines?

Science.gov (United States)

Reilly, P.

In this chapter, the analysis suggests that Northern Irish terrorists are not visible on Web search engines when net users employ conventional Internet search techniques. Editors of mass media organisations traditionally have had the ability to decide whether a terrorist atrocity is `newsworthy,' controlling the `oxygen' supply that sustains all forms of terrorism. This process, also known as `gatekeeping,' is often influenced by the norms of social responsibility, or alternatively, with regard to the interests of the advertisers and corporate sponsors that sustain mass media organisations. The analysis presented in this chapter suggests that Internet search engines can also be characterised as `gatekeepers,' albeit without the ability to shape the content of Websites before it reaches net users. Instead, Internet search engines give priority retrieval to certain Websites within their directory, pointing net users towards these Websites rather than others on the Internet. Net users are more likely to click on links to the more `visible' Websites on Internet search engine directories, these sites invariably being the highest `ranked' in response to a particular search query. A number of factors including the design of the Website and the number of links to external sites determine the `visibility' of a Website on Internet search engines. The study suggests that Northern Irish terrorists and their sympathisers are unlikely to achieve a greater degree of `visibility' online than they enjoy in the conventional mass media through the perpetration of atrocities. Although these groups may have a greater degree of freedom on the Internet to publicise their ideologies, they are still likely to be speaking to the converted or members of the press. Although it is easier to locate Northern Irish terrorist organisations on Internet search engines by linking in via ideology, ideological description searches, such as `Irish Republican' and `Ulster Loyalist,' are more likely to

Evaluating aggregated search using interleaving

NARCIS (Netherlands)

Chuklin, A.; Schuth, A.; Hofmann, K.; Serdyukov, P.; de Rijke, M.

2013-01-01

A result page of a modern web search engine is often much more complicated than a simple list of "ten blue links." In particular, a search engine may combine results from different sources (e.g., Web, News, and Images), and display these as grouped results to provide a better user experience. Such a
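For readers unfamiliar with interleaving, the following is a minimal sketch of team-draft interleaving, one widely used interleaving method. It is offered only as background: the truncated abstract does not say which interleaving variant the authors use or how they adapt it to grouped (aggregated) results, and the function names here are hypothetical.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Merge two rankings; return (documents, team label per position)."""
    rng = rng or random.Random()
    docs, teams, seen = [], [], set()
    picks = {"A": 0, "B": 0}
    while len(docs) < length:
        left_a = [d for d in ranking_a if d not in seen]
        left_b = [d for d in ranking_b if d not in seen]
        if not left_a and not left_b:
            break  # both rankings exhausted
        # The team with fewer picks goes next; ties are broken by a coin flip.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = "A" if rng.random() < 0.5 else "B"
        # If the chosen team has nothing left, the other team picks instead.
        if team == "A" and not left_a:
            team = "B"
        elif team == "B" and not left_b:
            team = "A"
        doc = left_a[0] if team == "A" else left_b[0]
        docs.append(doc)
        teams.append(team)
        seen.add(doc)
        picks[team] += 1
    return docs, teams


def click_credit(teams, clicked_positions):
    """Count clicks credited to each ranker: (clicks for A, clicks for B)."""
    a = sum(1 for i in clicked_positions if teams[i] == "A")
    b = sum(1 for i in clicked_positions if teams[i] == "B")
    return a, b


if __name__ == "__main__":
    docs, teams = team_draft_interleave(["d1", "d2", "d3"], ["d3", "d1", "d4"], 4)
    print(docs, teams, click_credit(teams, [0]))
```

Over many impressions, the ranker whose contributed documents collect more clicks is preferred; a single impression, as in the usage above, says little on its own.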
The Semantic Web: opportunities and challenges for next-generation Web applications

Directory of Open Access Journals (Sweden)

2002-01-01

Full Text Available Recently there has been a growing interest in the investigation and development of the next generation web - the Semantic Web. While most of the current forms of web content are designed to be presented to humans, but are barely understandable by computers, the content of the Semantic Web is structured in a semantic way so that it is meaningful to computers as well as to humans. In this paper, we report a survey of recent research on the Semantic Web. In particular, we present the opportunities that this revolution will bring to us: web-services, agent-based distributed computing, semantics-based web search engines, and semantics-based digital libraries. We also discuss the technical and cultural challenges of realizing the Semantic Web: the development of ontologies, formal semantics of Semantic Web languages, and trust and proof models. We hope that this will shed some light on the direction of future work on this field.
Surfing for suicide methods and help: content analysis of websites retrieved with search engines in Austria and the United States.

Science.gov (United States)

Till, Benedikt; Niederkrotenthaler, Thomas

2014-08-01

The Internet provides a variety of resources for individuals searching for suicide-related information. Structured content-analytic approaches to assess intercultural differences in web contents retrieved with method-related and help-related searches are scarce. We used the 2 most popular search engines (Google and Yahoo/Bing) to retrieve US-American and Austrian search results for the term suicide, method-related search terms (e.g., suicide methods, how to kill yourself, painless suicide, how to hang yourself), and help-related terms (e.g., suicidal thoughts, suicide help) on February 11, 2013. In total, 396 websites retrieved with US search engines and 335 websites from Austrian searches were analyzed with content analysis on the basis of current media guidelines for suicide reporting. We assessed the quality of websites and compared findings across search terms and between the United States and Austria. In both countries, protective outweighed harmful website characteristics by approximately 2:1. Websites retrieved with method-related search terms (e.g., how to hang yourself) contained more harmful (United States: P search engines generally had more protective characteristics (P search engines. Resources with harmful characteristics were better ranked than those with protective characteristics (United States: P < .01, Austria: P < .05). The quality of suicide-related websites obtained depends on the search terms used. Preventive efforts to improve the ranking of preventive web content, particularly regarding method-related search terms, seem necessary. © Copyright 2014 Physicians Postgraduate Press, Inc.
Classifying web genres in context: a case study documenting the web genres used by a software engineer

NARCIS (Netherlands)

Montesi, M.; Navarrete, T.

2008-01-01

This case study analyzes the Internet-based resources that a software engineer uses in his daily work. Methodologically, we studied the web browser history of the participant, classifying all the web pages he had seen over a period of 12 days into web genres. We interviewed him before and after the
Federated Search in the Wild: the combined power of over a hundred search engines

NARCIS (Netherlands)

Nguyen, Dong-Phuong; Demeester, Thomas; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

2012-01-01

Federated search has the potential of improving web search: the user becomes less dependent on a single search provider and parts of the deep web become available through a unified interface, leading to a wider variety in the retrieved search results. However, a publicly available dataset for
Engineering Compensations in Web Service Environment

DEFF Research Database (Denmark)

Schäfer, Michael; Dolog, Peter; Nejdl, Wolfgang

2007-01-01

Business to business integration has recently been performed by employing Web service environments. Moreover, such environments are being provided by major players on the technology markets. Those environments are based on open specifications for transaction coordination. When a failure in such an environment occurs, a compensation can be initiated to recover from the failure. However, current environments have only limited capabilities for compensations, and are usually based on backward recovery. In this paper, we introduce an engineering approach and an environment to deal with advanced compensations based on forward recovery principles. We extend the existing Web service transaction coordination architecture and infrastructure in order to support flexible compensation operations. A contract-based approach is being used, which allows the specification of permitted compensations at runtime. We...
Content Based Searching for INIS

International Nuclear Information System (INIS)

Jain, V.; Jain, R.K.

2016-01-01

Full text: Whatever a user wants is available on the internet, but to retrieve the information efficiently, a multilingual and most-relevant document search engine is a must. Most current search engines are word based or pattern based. They do not consider the meaning of the query posed to them: they work purely on the keywords of the query, do not support multilingual queries, and do not dismiss nonrelevant results. Current information-retrieval techniques either rely on an encoding process, using a certain perspective or classification scheme, to describe a given item, or perform a full-text analysis, searching for user-specified words. Neither case guarantees content matching because an encoded description might reflect only part of the content and the mere occurrence of a word does not necessarily reflect the document’s content. For general documents, there doesn’t yet seem to be a much better option than lazy full-text analysis, by manually going through those endless results pages. In contrast to this, a new search engine should extract the meaning of the query and then perform the search based on this extracted meaning. A new search engine should also employ Interlingua based machine translation technology to present information in the language of choice of the user. (author)
A web-based online collaboration platform for formulating engineering design projects

Science.gov (United States)

Varikuti, Sainath

Effective communication and collaboration among students, faculty and industrial sponsors play a vital role while formulating and solving engineering design projects. With the advent of web technology, online platforms and systems have been proposed to facilitate interactions and collaboration among different stakeholders in the context of senior design projects. However, there are noticeable gaps in the literature with respect to understanding the effects of online collaboration platforms for formulating engineering design projects. Most of the existing literature is focused on exploring the utility of online platforms on activities after the problem is defined and teams are formed. Also, there is a lack of mechanisms and tools to guide the project formation phase in senior design projects, which makes it challenging for students and faculty to collaboratively develop and refine project ideas and to establish appropriate teams. In this thesis a web-based online collaboration platform is designed and implemented to share, discuss and obtain feedback on project ideas and to facilitate collaboration among students and faculty prior to the start of the semester. The goal of this thesis is to understand the impact of an online collaboration platform for formulating engineering design projects, and how a web-based online collaboration platform affects the amount of interactions among stakeholders during the early phases of the design process. A survey measuring the amount of interactions among students and faculty is administered. Initial findings show a marked improvement in the students' ability to share project ideas and form teams with other students and faculty. Students found the online platform simple to use. The suggestions for improving the tool generally included features that were not necessarily design specific, indicating that the underlying concept of this collaborative platform provides a strong basis and can be extended for future online platforms.
Web-Based Problem-Solving Assignment and Grading System

Science.gov (United States)

Brereton, Giles; Rosenberg, Ronald

2014-11-01

In engineering courses with very specific learning objectives, such as fluid mechanics and thermodynamics, it is conventional to reinforce concepts and principles with problem-solving assignments and to measure success in problem solving as an indicator of student achievement. While the modern-day ease of copying and searching for online solutions can undermine the value of traditional assignments, web-based technologies also provide opportunities to generate individualized well-posed problems with an infinite number of different combinations of initial/final/boundary conditions, so that the probability of any two students being assigned identical problems in a course is vanishingly small. Such problems can be designed and programmed to be: single or multiple-step, self-grading, allow students single or multiple attempts; provide feedback when incorrect; selectable according to difficulty; incorporated within gaming packages; etc. In this talk, we discuss the use of a homework/exam generating program of this kind in a single-semester course, within a web-based client-server system that ensures secure operation.
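The idea of individualized, self-grading problems can be sketched as follows. This is only an assumed design, not the authors' program: each student's parameters are drawn from a random generator seeded by a hash of a hypothetical assignment name and student ID, and the example problem (a pipe-flow Reynolds number, Re = ρvD/μ, with water properties fixed) and the 2% grading tolerance are illustrative choices.

```python
import hashlib
import random


def make_problem(student_id, assignment="hw1"):
    """Generate a reproducible, student-specific Reynolds-number problem."""
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode()).hexdigest()
    rng = random.Random(int(digest, 16))       # same student -> same problem
    velocity = round(rng.uniform(0.5, 3.0), 2)     # m/s
    diameter = round(rng.uniform(0.01, 0.10), 3)   # m
    density, viscosity = 998.0, 1.0e-3             # water: kg/m^3, Pa*s
    return {
        "text": (f"Water flows at {velocity} m/s through a pipe of diameter "
                 f"{diameter} m. Compute the Reynolds number."),
        "answer": density * velocity * diameter / viscosity,
    }


def grade(problem, submitted, rel_tol=0.02):
    """Accept answers within a relative tolerance of the reference value."""
    return abs(submitted - problem["answer"]) <= rel_tol * abs(problem["answer"])


if __name__ == "__main__":
    p = make_problem("student-042")
    print(p["text"])
    print(grade(p, p["answer"]))
```

Seeding from the student ID means the server does not need to store each generated problem: it can regenerate the same parameters when the answer is submitted.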
Web based electronic logbook and experiment run database viewer for Alcator C-Mod

International Nuclear Information System (INIS)

Fredian, T.W.; Stillerman, J.A.

2006-01-01

Since 1991, the scientists and engineers at the Alcator C-Mod experiment at MIT have been recording text entries about the experiments being performed in an electronic logbook. In addition, separate documents such as run plans, run summaries and experimental proposals have been created and stored in a variety of formats in computer files. This information has now been organized and made available via any modern web browser. The new web based interface permits the user to browse through all the logbook entries, run information and even view some key data traces of the experiment. Since this information is being catalogued by Internet search engines, these tools can also be used to quickly locate information. The web based logbook and run information interface provides some additional capabilities. Once logged into the web site, users can add, delete or modify logbook entries directly from their browser. The logbook window on their browser also provides dynamic updating when any new logbook entries are made. There is also live C-Mod operation status information with optional audio announcements available. The user can receive the same state change announcements such as 'entering init' or 'entering pulse' as they would if they were sitting in the C-Mod control room. This paper will describe the functionality of the web based logbook and how it was implemented
Interactive WebGL-based 3D visualizations for EAST experiment

International Nuclear Information System (INIS)

Xia, J.Y.; Xiao, B.J.; Li, Dan; Wang, K.R.

2016-01-01

Highlights: • Developing a user-friendly interface to visualize the EAST experimental data and the device is important to scientists and engineers. • The Web3D visualization system is based on HTML5 and WebGL, which runs without the need for plug-ins or third party components. • The interactive WebGL-based 3D visualization system is a web-portal integrating EAST 3D models, experimental data and plasma videos. • The original CAD model was discretized into different layers with different simplification to enable realistic rendering and improve performance. - Abstract: In recent years EAST (Experimental Advanced Superconducting Tokamak) experimental data are being shared and analyzed by an increasing number of international collaborators. Developing a user-friendly interface to visualize the data, meta data and the relevant parts of the device is becoming more and more important to aid scientists and engineers. Compared with the previous virtual EAST system based on VRML/Java3D [1] (Li et al., 2014), a new technology is being adopted to create a 3D visualization system based on HTML5 and WebGL, which runs without the need for plug-ins or third party components. The interactive WebGL-based 3D visualization system is a web-portal integrating EAST 3D models, experimental data and plasma videos. It offers a highly interactive interface allowing scientists to roam inside EAST device and view the complex 3-D structure of the machine. It includes technical details of the device and various diagnostic components, and provides visualization of diagnostic metadata with a direct link to each signal name and its stored data. In order for the quick access to the device 3D model, the original CAD model was discretized into different layers with different simplification. It allows users to search for plasma videos in any experiment and analyze the video frame by frame. In this paper, we present the implementation details to enable realistic rendering and improve performance.
Interactive WebGL-based 3D visualizations for EAST experiment

Energy Technology Data Exchange (ETDEWEB)

Xia, J.Y., E-mail: jyxia@ipp.ac.cn [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei, Anhui (China); University of Science and Technology of China, Hefei, Anhui (China); Xiao, B.J. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei, Anhui (China); University of Science and Technology of China, Hefei, Anhui (China); Li, Dan [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei, Anhui (China); Wang, K.R. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei, Anhui (China); University of Science and Technology of China, Hefei, Anhui (China)

2016-11-15

Highlights: • Developing a user-friendly interface to visualize the EAST experimental data and the device is important to scientists and engineers. • The Web3D visualization system is based on HTML5 and WebGL, which runs without the need for plug-ins or third party components. • The interactive WebGL-based 3D visualization system is a web-portal integrating EAST 3D models, experimental data and plasma videos. • The original CAD model was discretized into different layers with different simplification to enable realistic rendering and improve performance. - Abstract: In recent years EAST (Experimental Advanced Superconducting Tokamak) experimental data are being shared and analyzed by an increasing number of international collaborators. Developing a user-friendly interface to visualize the data, meta data and the relevant parts of the device is becoming more and more important to aid scientists and engineers. Compared with the previous virtual EAST system based on VRML/Java3D [1] (Li et al., 2014), a new technology is being adopted to create a 3D visualization system based on HTML5 and WebGL, which runs without the need for plug-ins or third party components. The interactive WebGL-based 3D visualization system is a web-portal integrating EAST 3D models, experimental data and plasma videos. It offers a highly interactive interface allowing scientists to roam inside EAST device and view the complex 3-D structure of the machine. It includes technical details of the device and various diagnostic components, and provides visualization of diagnostic metadata with a direct link to each signal name and its stored data. In order for the quick access to the device 3D model, the original CAD model was discretized into different layers with different simplification. It allows users to search for plasma videos in any experiment and analyze the video frame by frame. In this paper, we present the implementation details to enable realistic rendering and improve performance.
Web document engineering

International Nuclear Information System (INIS)

White, B.

1996-05-01

This tutorial provides an overview of several document engineering techniques which are applicable to the authoring of World Wide Web documents. It illustrates how pre-WWW hypertext research is applicable to the development of WWW information resources
Quality of Web-based information on cocaine addiction.

Science.gov (United States)

Khazaal, Yasser; Chatton, Anne; Cochand, Sophie; Zullino, Daniele

2008-08-01

To evaluate the quality of web-based information on cocaine use and addiction and to investigate potential content quality indicators. Three keywords: cocaine, cocaine addiction and cocaine dependence were entered into two popular World Wide Web search engines. Websites were assessed with a standardized proforma designed to rate sites on the basis of accountability, presentation, interactivity, readability and content quality. "Health on the Net" (HON) quality label, and DISCERN scale scores aiding people without content expertise to assess quality of written health publication were used to verify their efficiency as quality indicators. Of the 120 websites identified, 61 were included. Most were commercial sites. The results of the study indicate low scores on each of the measures including content quality. A global score (the sum of accountability, interactivity, content quality and aesthetic criteria) appeared as a good content quality indicator. While cocaine education websites for patients are widespread, their global quality is poor. There is a need for better evidence-based information about cocaine use and addiction on the web. The poor and variable quality of web-based information and its possible impact on physician-patient relationship argue for a serious provider for patient talk about the health information found on Internet. Internet sites could improve their content using the global score as a quality indicator.
General vs health specialized search engine: a blind comparative evaluation of top search results.

Science.gov (United States)

Pletneva, Natalia; Ruiz de Castaneda, Rafael; Baroz, Frederic; Boyer, Celia

2014-01-01

This paper presents the results of a blind comparison of top ten search results retrieved by Google.ch (French) and Khresmoi for everyone, a health specialized search engine. Participants, students of the Faculty of Medicine of the University of Geneva, had to complete three tasks and select their preferred results. The majority of the participants preferred Google results, while Khresmoi results showed potential to compete in specific topics. The coverage of the results seems to be one of the reasons; the second is that participants do not know how to select quality and transparent health web pages. More awareness, tools and education about the matter are required for students of medicine to be able to efficiently distinguish trustworthy online health information.
Evaluating a Federated Medical Search Engine

Science.gov (United States)

Belden, J.; Williams, J.; Richardson, B.; Schuster, K.

2014-01-01

Summary Background Federated medical search engines are health information systems that provide a single access point to different types of information. Their efficiency as clinical decision support tools has been demonstrated through numerous evaluations. Despite their rigor, very few of these studies report holistic evaluations of medical search engines and even fewer base their evaluations on existing evaluation frameworks. Objectives To evaluate a federated medical search engine, MedSocket, for its potential net benefits in an established clinical setting. Methods This study applied the Human, Organization, and Technology (HOT-fit) evaluation framework in order to evaluate MedSocket. The hierarchical structure of the HOT-factors allowed for identification of a combination of efficiency metrics. Human fit was evaluated through user satisfaction and patterns of system use; technology fit was evaluated through the measurements of time-on-task and the accuracy of the found answers; and organization fit was evaluated from the perspective of system fit to the existing organizational structure. Results Evaluations produced mixed results and suggested several opportunities for system improvement. On average, participants were satisfied with MedSocket searches and confident in the accuracy of retrieved answers. However, MedSocket did not meet participants’ expectations in terms of download speed, access to information, and relevance of the search results. These mixed results made it necessary to conclude that in the case of MedSocket, technology fit had a significant influence on the human and organization fit. Hence, improving technological capabilities of the system is critical before its net benefits can become noticeable. Conclusions The HOT-fit evaluation framework was instrumental in tailoring the methodology for conducting a comprehensive evaluation of the search engine. Such multidimensional evaluation of the search engine resulted in recommendations for
Personal health records: retrieving contextual information with Google Custom Search.

Science.gov (United States)

Ahsan, Mahmud; Seldon, H Lee; Sayeed, Shohel

2012-01-01

Ubiquitous personal health records, which can accompany a person everywhere, are a necessary requirement for ubiquitous healthcare. Contextual information related to health events is important for the diagnosis and treatment of disease and for the maintenance of good health, yet it is seldom recorded in a health record. We describe a dual cellphone-and-Web-based personal health record system which can include 'external' contextual information. Much contextual information is available on the Internet and we can use ontologies to help identify relevant sites and information. But a search engine is required to retrieve information from the Web and developing a customized search engine is beyond our scope, so we can use Google Custom Search API Web service to get contextual data. In this paper we describe a framework which combines a health-and-environment 'knowledge base' or ontology with the Google Custom Search API to retrieve relevant contextual information related to entries in a ubiquitous personal health record.
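A minimal sketch of the combination described above: an ontology maps a health event to contextual search terms, and each term becomes a request to the Google Custom Search JSON API. The endpoint and the `key`, `cx`, `q` and `num` parameters belong to that API; the tiny ontology, the event name and the helper functions are hypothetical and stand in for the paper's knowledge base, which the abstract does not detail.

```python
from urllib.parse import urlencode

# Hypothetical health-and-environment ontology: event -> contextual terms.
CONTEXT_TERMS = {
    "asthma attack": ["air quality", "pollen count"],
}


def build_context_queries(event, location, ontology=CONTEXT_TERMS):
    """Turn a health-record event into location-specific search queries."""
    return [f"{term} {location}" for term in ontology.get(event, [])]


def custom_search_url(query, api_key, engine_id, num=5):
    """Build a request URL for the Google Custom Search JSON API."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return "https://www.googleapis.com/customsearch/v1?" + urlencode(params)


if __name__ == "__main__":
    for q in build_context_queries("asthma attack", "Kuala Lumpur"):
        # API key and engine ID are placeholders, not working credentials.
        print(custom_search_url(q, "YOUR_API_KEY", "YOUR_ENGINE_ID"))
```

Fetching each URL returns JSON whose result items could then be attached to the corresponding health-record entry; that step needs real credentials and is left out.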
Design of personalized search engine based on user-webpage dynamic model

Science.gov (United States)

Li, Jihan; Li, Shanglin; Zhu, Yingke; Xiao, Bo

2013-12-01

Personalized search engine focuses on establishing a user-webpage dynamic model. In this model, users' personalized factors are introduced so that the search engine is better able to provide the user with targeted feedback. This paper constructs user and webpage dynamic vector tables, introduces singular value decomposition analysis in the processes of topic categorization, and extends the traditional PageRank algorithm.
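One standard way to bring user preferences into PageRank is to bias the teleport (random-jump) distribution toward pages the user cares about. The sketch below shows that textbook variant by power iteration; it is not the extension proposed in the paper, whose details the abstract does not give, and the graph and preference weights are made up.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=50):
    """Power-iteration PageRank with a user-specific teleport vector.

    links: dict page -> list of pages it links to (every target must be a key).
    preference: dict page -> non-negative weight expressing user interest.
    """
    pages = list(links)
    total = sum(preference.get(p, 0.0) for p in pages)
    teleport = {p: preference.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank according to the teleport vector.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    uniform = personalized_pagerank(graph, {"a": 1, "b": 1, "c": 1})
    biased = personalized_pagerank(graph, {"b": 1})
    print(uniform)
    print(biased)
```

With all teleport weight on page `b`, that page's score rises relative to the uniform case, which is the effect a personalized ranking aims for.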
Generating crop calendars with Web search data

International Nuclear Information System (INIS)

Van der Velde, Marijn; See, Linda; Fritz, Steffen; Khabarov, Nikolay; Obersteiner, Michael; Verheijen, Frank G A

2012-01-01

This paper demonstrates the potential of using Web search volumes for generating crop specific planting and harvesting dates in the USA integrating climatic, social and technological factors affecting crop calendars. Using Google Insights for Search, clear peaks in volume occur at times of planting and harvest at the national level, which were used to derive corn specific planting and harvesting dates at a weekly resolution. Disaggregated to state level, search volumes for corn planting generally are in agreement with planting dates from a global crop calendar dataset. However, harvest dates were less discriminatory at the state level, indicating that peaks in search volume may be blurred by broader searches on harvest as a time of cultural events. The timing of other agricultural activities such as purchase of seed and response to weed and pest infestation was also investigated. These results highlight the future potential of using Web search data to derive planting dates in countries where the data are sparse or unreliable, once sufficient search volumes are realized, as well as the potential for monitoring in real time the response of farmers to climate change over the coming decades. Other potential applications of search volume data of relevance to agronomy are also discussed. (letter)
APLIKASI SEARCH ENGINE PAPER KARYA ILMIAH BERBASIS WEB DENGAN METODE FUZZY RELATION

Directory of Open Access Journals (Sweden)

Bernard Adytia Darmadi

2005-01-01

The number of papers collected by an educational institution increases each year, which creates a need for a method of finding the right paper whenever someone needs a reference. So far, most search engines still depend on keyword matching (string matching) to find appropriate results, so they return only papers in which the entered keyword occurs. This research discusses a search system based on fuzzy relations, in which the relation between keywords and papers is found and quantified. Such a system can include in its results papers that do not contain the keyword, because words that occur in those papers are related to the keyword entered. Keywords: search engine, fuzzy relation, intelligent system.
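One common way to realize such a fuzzy relation is max-min composition of a keyword-to-term relation with a term-to-paper relation. The abstract does not say which composition the author used, so the sketch below is an assumption, and all membership values and paper names are made up for illustration.

```python
def fuzzy_search(keyword, term_relation, term_doc):
    """Score papers by max-min composition of two fuzzy relations.

    term_relation[k][t] is the degree to which term t is related to keyword k;
    term_doc[t][d] is the degree to which term t belongs to paper d.
    """
    scores = {}
    for term, strength in term_relation.get(keyword, {}).items():
        for doc, membership in term_doc.get(term, {}).items():
            candidate = min(strength, membership)
            scores[doc] = max(scores.get(doc, 0.0), candidate)
    # Highest-scoring papers first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical relations: only paper_a contains the keyword itself.
term_relation = {"retrieval": {"retrieval": 1.0, "search": 0.8, "index": 0.6}}
term_doc = {
    "retrieval": {"paper_a": 0.9},
    "search": {"paper_b": 0.7},
    "index": {"paper_b": 0.4, "paper_c": 0.5},
}
results = fuzzy_search("retrieval", term_relation, term_doc)
print(results)
```

Here paper_b and paper_c appear in the results even though neither contains the keyword, which is the behaviour the abstract describes.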

In Search of Search Engine Marketing Strategy Amongst SME's in Ireland

Science.gov (United States)

Barry, Chris; Charleton, Debbie

Researchers have identified the Web as a searcher's first port of call for locating information. Search Engine Marketing (SEM) strategies have been noted as a key consideration when developing, maintaining and managing Websites. A study presented here of SEM practices of Irish small to medium enterprises (SMEs) reveals they plan to spend more resources on SEM in the future. Most firms utilize an informal SEM strategy, where Website optimization is perceived as most effective in attracting traffic. Respondents cite the use of ‘keywords in title and description tags’ as the most used SEM technique, followed by the use of ‘keywords throughout the whole Website’, while ‘Pay for Placement’ was the most widely used Paid Search technique. In concurrence with the literature, measuring SEM performance remains a significant challenge, with many firms unsure if they measure it effectively. An encouraging finding is that Irish SMEs adopt a positive ethical posture when undertaking SEM.
Opportunities for web-based indicators in environmental sciences.

Directory of Open Access Journals (Sweden)

Sergio Malcevschi

This paper proposes a set of web-based indicators for quantifying and ranking the relevance of terms related to key issues in Ecology and Sustainability Science. Search engines that operate in different contexts (e.g. global, social, scientific) are considered as web information carriers (WICs) and are able to analyse: (i) relevance on different levels: global web, individual/personal sphere, on-line news, and culture/science; (ii) time trends of relevance; (iii) relevance of keywords for environmental governance. For the purposes of this study, several indicators and specific indices (relational indices and dynamic indices) were applied to a test-set of 24 keywords. Outputs consistently show that traditional study topics in environmental sciences such as water and air have remained the most quantitatively relevant keywords, while interest in systemic issues (i.e. ecosystem and landscape) has grown over the last 20 years. Nowadays, the relevance of new concepts such as resilience and ecosystem services is increasing, but the actual ability of these concepts to influence environmental governance needs to be further studied and understood. The proposed approach, which is based on intuitive and easily replicable procedures, can support the decision-making processes related to environmental governance.
The effect of patient narratives on information search in a web-based breast cancer decision aid: an eye-tracking study.

Science.gov (United States)

Shaffer, Victoria A; Owens, Justin; Zikmund-Fisher, Brian J

2013-12-17

Previous research has examined the impact of patient narratives on treatment choices, but to our knowledge, no study has examined the effect of narratives on information search. Further, no research has considered the relative impact of their format (text vs video) on health care decisions in a single study. Our goal was to examine the impact of video and text-based narratives on information search in a Web-based patient decision aid for early stage breast cancer. Fifty-six women were asked to imagine that they had been diagnosed with early stage breast cancer and needed to choose between two surgical treatments (lumpectomy with radiation or mastectomy). Participants were randomly assigned to view one of four versions of a Web decision aid. Two versions of the decision aid included videos of interviews with patients and physicians or videos of interviews with physicians only. To distinguish between the effect of narratives and the effect of videos, we created two text versions of the Web decision aid by replacing the patient and physician interviews with text transcripts of the videos. Participants could freely browse the Web decision aid until they developed a treatment preference. We recorded participants' eye movements using the Tobii 1750 eye-tracking system equipped with Tobii Studio software. A priori, we defined 24 areas of interest (AOIs) in the Web decision aid. These AOIs were either separate pages of the Web decision aid or sections within a single page covering different content. We used multilevel modeling to examine the effect of narrative presence, narrative format, and their interaction on information search. There was a significant main effect of condition, P=.02; participants viewing decision aids with patient narratives spent more time searching for information than participants viewing the decision aids without narratives. The main effect of format was not significant, P=.10. However, there was a significant condition by format interaction on
Applying Web Usage Mining for Personalizing Hyperlinks in Web-Based Adaptive Educational Systems

Science.gov (United States)

Romero, Cristobal; Ventura, Sebastian; Zafra, Amelia; de Bra, Paul

2009-01-01

Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in…
Improving Web Search for Difficult Queries

Science.gov (United States)

Wang, Xuanhui

2009-01-01

Search engines have now become essential tools in all aspects of our life. Although a variety of information needs can be served very successfully, there are still a lot of queries that search engines can not answer very effectively and these queries always make users feel frustrated. Since it is quite often that users encounter such "difficult…
Publicizing Your Web Resources for Maximum Exposure.

Science.gov (United States)

Smith, Kerry J.

2001-01-01

Offers advice to librarians for marketing their Web sites on Internet search engines. Advises against relying solely on spiders and recommends adding metadata to the source code and delivering that information directly to the search engines. Gives an overview of metadata and typical coding for meta tags. Includes Web addresses for a number of…
Exploring the academic invisible web

OpenAIRE

Lewandowski, Dirk

2006-01-01

The Invisible Web is often discussed in the academic context, where its contents (mainly in the form of databases) are of great importance. But this discussion is mainly based on some seminal research done by Sherman and Price (2001) and Bergman (2001), respectively. We focus on the types of Invisible Web content relevant for academics and the improvements made by search engines to deal with these content types. In addition, we question the volume of the Invisible Web as stated by Bergman. Ou...
CWI and TU Delft at TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks

NARCIS (Netherlands)

A. Bellogín Kouki (Alejandro); G.G. Gebremeskel (Gebre); J. He (Jiyin); J.J.P. Lin (Jimmy); A. Said (Alan); T. Samar (Thaer); A.P. de Vries (Arjen); J.B.P. Vuurens (Jeroen)

2014-01-01

This paper provides an overview of the work done at the Centrum Wiskunde & Informatica (CWI) and Delft University of Technology (TU Delft) for different tracks of TREC 2013. We participated in the Contextual Suggestion Track, the Federated Web Search Track, the Knowledge Base
Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation

DEFF Research Database (Denmark)

Hansen, Niels Dalum; Mølbak, Kåre; Cox, Ingemar Johansson

2017-01-01

Influenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since influenza...
Web-based Analysis Services Report

CERN Document Server

AUTHOR|(CDS)2108758; Canali, Luca; Grancher, Eric; Lamanna, Massimo; McCance, Gavin; Mato Vila, Pere; Piparo, Danilo; Moscicki, Jakub; Pace, Alberto; Brito Da Rocha, Ricardo; Simko, Tibor; Smith, Tim; Tejedor Saavedra, Enric; CERN. Geneva. IT Department

2017-01-01

Web-based services (cloud services) is an important trend to innovate end-user services while optimising the service operational costs. CERN users are constantly proposing new approaches (inspired from services existing on the web, tools used in education or other science or based on their experience in using existing computing services). In addition, industry and open source communities have recently made available a large number of powerful and attractive tools and platforms that enable large scale data processing. “Big Data” software stacks notably provide solutions for scalable storage, distributed compute and data analysis engines, data streaming, web-based interfaces (notebooks). Some of those platforms and tools, typically available as open source products, are experiencing a very fast adoption in industry and science such that they are becoming “de facto” references in several areas of data engineering, data science and machine learning. In parallel to users' requests, WLCG is considering to c...
Query Log Analysis of an Electronic Health Record Search Engine

Science.gov (United States)

Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.

2011-01-01

We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150
The Theory of Planned Behaviour Applied to Search Engines as a Learning Tool

Science.gov (United States)

Liaw, Shu-Sheng

2004-01-01

Search engines have been developed for helping learners to seek online information. Based on theory of planned behaviour approach, this research intends to investigate the behaviour of using search engines as a learning tool. After factor analysis, the results suggest that perceived satisfaction of search engine, search engines as an information…
Is It "Writing on Water" or "Strike It Rich?" The Experiences of Prospective Teachers in Using Search Engines

Science.gov (United States)

Sahin, Abdurrahman; Cermik, Hulya; Dogan, Birsen

2010-01-01

Information searching skills have become increasingly important for prospective teachers with the exponential growth of learning materials on the web. This study is an attempt to understand the experiences of prospective teachers with search engines through metaphoric images and to further investigate whether their experiences are related to the…
WEB-server for search of a periodicity in amino acid and nucleotide sequences

Science.gov (United States)

E Frenkel, F.; Skryabin, K. G.; Korotkov, E. V.

2017-12-01

A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server operation is based upon a new mathematical method of searching for multiple alignments, which is founded on the optimization of position weight matrices, as well as on the implementation of two-dimensional dynamic programming. This approach allows the construction of multiple alignments of indistinctly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise comparisons of the sequences. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as information that could be obtained using the web server.
The effects of link format and screen location on visual search of web pages.

Science.gov (United States)

Ling, Jonathan; Van Schaik, Paul

2004-06-22

Navigation of web pages is of critical importance to the usability of web-based systems such as the World Wide Web and intranets. The primary means of navigation is through the use of hyperlinks. However, few studies have examined the impact of the presentation format of these links on visual search. The present study used a two-factor mixed measures design to investigate whether there was an effect of link format (plain text, underlined, bold, or bold and underlined) upon speed and accuracy of visual search and subjective measures in both the navigation and content areas of web pages. An effect of link format on speed of visual search for both hits and correct rejections was found. This effect was observed in the navigation and the content areas. Link format did not influence accuracy in either screen location. Participants showed highest preference for links that were in bold and underlined, regardless of screen area. These results are discussed in the context of visual search processes and design recommendations are given.
Self-learning search engines

NARCIS (Netherlands)

Schuth, A.

2015-01-01

How does a search engine such as Google know which search results to display? There are many competing algorithms that generate search results, but which one works best? We developed a new probabilistic method for quickly comparing large numbers of search algorithms by examining the results users
Social Search: A Taxonomy of, and a User-Centred Approach to, Social Web Search

Science.gov (United States)

McDonnell, Michael; Shiri, Ali

2011-01-01

Purpose: The purpose of this paper is to introduce the notion of social search as a new concept, drawing upon the patterns of web search behaviour. It aims to: define social search; present a taxonomy of social search; and propose a user-centred social search method. Design/methodology/approach: A mixed method approach was adopted to investigate…
Making Statistical Data More Easily Accessible on the Web Results of the StatSearch Case Study

CERN Document Server

Rajman, M; Boynton, I M; Fridlund, B; Fyhrlund, A; Sundgren, B; Lundquist, P; Thelander, H; Wänerskär, M

2005-01-01

In this paper we present the results of the StatSearch case study that aimed at providing an enhanced access to statistical data available on the Web. In the scope of this case study we developed a prototype of an information access tool combining a query-based search engine with semi-automated navigation techniques exploiting the hierarchical structuring of the available data. This tool enables a better control of the information retrieval, improving the quality and ease of the access to statistical information. The central part of the presented StatSearch tool consists in the design of an algorithm for automated navigation through a tree-like hierarchical document structure. The algorithm relies on the computation of query related relevance score distributions over the available database to identify the most relevant clusters in the data structure. These most relevant clusters are then proposed to the user for navigation, or, alternatively, are the support for the automated navigation process. Several appro...
Web-based resources for critical care education.

Science.gov (United States)

Kleinpell, Ruth; Ely, E Wesley; Williams, Ged; Liolios, Antonios; Ward, Nicholas; Tisherman, Samuel A

2011-03-01

To identify, catalog, and critically evaluate Web-based resources for critical care education. A multilevel search strategy was utilized. Literature searches were conducted (from 1996 to September 30, 2010) using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature with the terms "Web-based learning," "computer-assisted instruction," "e-learning," "critical care," "tutorials," "continuing education," "virtual learning," and "Web-based education." The Web sites of relevant critical care organizations (American College of Chest Physicians, American Society of Anesthesiologists, American Thoracic Society, European Society of Intensive Care Medicine, Society of Critical Care Medicine, World Federation of Societies of Intensive and Critical Care Medicine, American Association of Critical Care Nurses, and World Federation of Critical Care Nurses) were reviewed for the availability of e-learning resources. Finally, Internet searches and e-mail queries to critical care medicine fellowship program directors and members of national and international acute/critical care listserves were conducted to 1) identify the use of and 2) review and critique Web-based resources for critical care education. To ensure credibility of Web site information, Web sites were reviewed by three independent reviewers on the basis of the criteria of authority, objectivity, authenticity, accuracy, timeliness, relevance, and efficiency in conjunction with suggested formats for evaluating Web sites in the medical literature. Literature searches using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature resulted in >250 citations. Those pertinent to critical care provide examples of the integration of e-learning techniques, the development of specific resources, reports of the use of types of e-learning, including interactive tutorials, case studies, and simulation, and reports of student or learner satisfaction, among other general
NASA Indexing Benchmarks: Evaluating Text Search Engines

Science.gov (United States)

Esler, Sandra L.; Nelson, Michael L.

1997-01-01

The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner. This study develops criteria for analytically comparing the index and search engines and presents results for a number of freely available search engines. A product of this research is a toolkit capable of automatically indexing, searching, and extracting performance statistics from each of the focused search engines. This toolkit is highly configurable and has the ability to run these benchmark tests against other engines as well. Results demonstrate that the tested search engines can be grouped into two levels. Level one engines are efficient on small to medium sized data collections, but show weaknesses when used for collections 100MB or larger. Level two search engines are recommended for data collections up to and beyond 100MB.

ROLE AND IMPORTANCE OF SEARCH ENGINE OPTIMIZATION

OpenAIRE

Gurneet Kaur

2017-01-01

Search Engines are an indispensable platform for users all over the globe to search for relevant information online. Search Engine Optimization (SEO) is the exercise of improving the position of a website in search engine rankings, for a chosen set of keywords. SEO is divided into two parts: On-Page and Off-Page SEO. In order to be successful, both areas require equal attention. This paper aims to explain the functioning of the search engines along with the role and importance of search e...
Utility of Web search query data in testing theoretical assumptions about mephedrone.

Science.gov (United States)

Kapitány-Fövény, Máté; Demetrovics, Zsolt

2017-05-01

With growing access to the Internet, people who use drugs and traffickers started to obtain information about novel psychoactive substances (NPS) via online platforms. This paper aims to analyze whether a decreasing Web interest in formerly banned substances (cocaine, heroin, and MDMA) and the legislative status of mephedrone predict Web interest in this NPS. Google Trends was used to measure changes in Web interest in cocaine, heroin, MDMA, and mephedrone. Google search results for mephedrone within the same time frame were analyzed and categorized. Web interest in classic drugs was found to be more persistent. Regarding geographical distribution, the location of Web searches for heroin and cocaine was less centralized. The illicit status of mephedrone was a negative predictor of its Web search query rates. The connection between mephedrone-related Web search rates and the legislative status of this substance was significantly mediated by ecstasy-related Web search queries, the number of documentaries, and forum/blog entries about mephedrone. The results might provide support for the hypothesis that mephedrone's popularity was highly correlated with its legal status and that it functioned as a potential substitute for MDMA. Google Trends was found to be a useful tool for testing theoretical assumptions about NPS. Copyright © 2017 John Wiley & Sons, Ltd.
An ant colony optimization based feature selection for web page classification.

Science.gov (United States)

Saraç, Esra; Özel, Selma Ayşe

2014-01-01

The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.
Project GRACE A grid based search tool for the global digital library

CERN Document Server

Scholze, Frank; Vigen, Jens; Prazak, Petra; The Seventh International Conference on Electronic Theses and Dissertations

2004-01-01

The paper will report on the progress of an ongoing EU project called GRACE - Grid Search and Categorization Engine (http://www.grace-ist.org). The project participants are CERN, Sheffield Hallam University, Stockholm University, Stuttgart University, GL 2006 and Telecom Italia. The project started in 2002 and will finish in 2005, resulting in a Grid based search engine that will search across a variety of content sources including a number of electronic thesis and dissertation repositories. The Open Archives Initiative (OAI) is expanding and is clearly an interesting movement for a community advocating open access to ETD. However, the OAI approach alone may not be sufficiently scalable to achieve a truly global ETD Digital Library. Many universities simply offer their collections to the world via their local web services without being part of any federated system for archiving and even those dissertations that are provided with OAI compliant metadata will not necessarily be picked up by a centralized OAI Ser...
Developing a Data Discovery Tool for Interdisciplinary Science: Leveraging a Web-based Mapping Application and Geosemantic Searching

Science.gov (United States)

Albeke, S. E.; Perkins, D. G.; Ewers, S. L.; Ewers, B. E.; Holbrook, W. S.; Miller, S. N.

2015-12-01

The sharing of data and results is paramount for advancing scientific research. The Wyoming Center for Environmental Hydrology and Geophysics (WyCEHG) is a multidisciplinary group that is driving scientific breakthroughs to help manage water resources in the Western United States. WyCEHG is mandated by the National Science Foundation (NSF) to share their data. However, the infrastructure from which to share such diverse, complex and massive amounts of data did not exist within the University of Wyoming. We developed an innovative framework to meet the data organization, sharing, and discovery requirements of WyCEHG by integrating both open and closed source software, embedded metadata tags, semantic web technologies, and a web-mapping application. The infrastructure uses a Relational Database Management System as the foundation, providing a versatile platform to store, organize, and query myriad datasets, taking advantage of both structured and unstructured formats. Detailed metadata are fundamental to the utility of datasets. We tag data with Uniform Resource Identifiers (URIs) to specify concepts with formal descriptions (i.e. semantic ontologies), thus allowing users to search metadata based on the intended context rather than through conventional keyword searches. Additionally, WyCEHG data are geographically referenced. Using the ArcGIS API for JavaScript, we developed a web mapping application leveraging database-linked spatial data services, providing a means to visualize and spatially query available data in an intuitive map environment. Using server-side scripting (PHP), the mapping application, in conjunction with semantic search modules, dynamically communicates with the database and file system, providing access to available datasets. Our approach provides a flexible, comprehensive infrastructure from which to store and serve WyCEHG's highly diverse research-based data. This framework has not only allowed WyCEHG to meet its data stewardship
A Web-based Tool for SDSS and 2MASS Database Searches

Science.gov (United States)

Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.

We have developed a web site using HTML, PHP, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborator; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
Practical and Efficient Searching in Proteomics: A Cross Engine Comparison

Science.gov (United States)

Paulo, Joao A.

2014-01-01

Background: Analysis of large datasets produced by mass spectrometry-based proteomics relies on database search algorithms to sequence peptides and identify proteins. Several such scoring methods are available, each based on different statistical foundations and thereby not producing identical results. Here, the aim is to compare peptide and protein identifications using multiple search engines and examine the additional proteins gained by increasing the number of technical replicate analyses. Methods: A HeLa whole cell lysate was analyzed on an Orbitrap mass spectrometer for 10 technical replicates. The data were combined and searched using Mascot, SEQUEST, and Andromeda. Comparisons were made of peptide and protein identifications among the search engines. In addition, searches using each engine were performed with an incrementing number of technical replicates. Results: The number and identity of peptides and proteins differed across search engines. For all three search engines, the differences in protein identifications were greater than the differences in peptide identifications, indicating that the major source of the disparity may be at the protein inference grouping level. The data also revealed that analysis of 2 technical replicates can increase protein identifications by up to 10-15%, while a third replicate results in an additional 4-5%. Conclusions: The data emphasize two practical methods of increasing the robustness of mass spectrometry data analysis. The data show that 1) using multiple search engines can expand the number of identified proteins (union) and validate protein identifications (intersection), and 2) analysis of 2 or 3 technical replicates can substantially expand protein identifications. Moreover, information can be extracted from a dataset by performing database searching with different engines and performing technical repeats, which requires no additional sample preparation and effectively utilizes research time and effort. PMID:25346847
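The union and intersection mentioned in the conclusions can be illustrated with plain set operations. This is a minimal sketch: the engine names come from the abstract, while the protein identifiers are placeholders and not data from the study.

```python
def combine_identifications(results_by_engine):
    """Return (union, intersection) of protein IDs across search engines."""
    id_sets = [set(ids) for ids in results_by_engine.values()]
    union = set().union(*id_sets)  # proteins found by at least one engine
    intersection = set.intersection(*id_sets)  # proteins found by every engine
    return union, intersection


# Placeholder identifiers, invented purely for illustration.
results_by_engine = {
    "Mascot": {"PROT_1", "PROT_2", "PROT_3"},
    "SEQUEST": {"PROT_2", "PROT_3", "PROT_4"},
    "Andromeda": {"PROT_3", "PROT_4", "PROT_5"},
}
expanded, validated = combine_identifications(results_by_engine)
print(sorted(expanded))
print(sorted(validated))
```

The union corresponds to the expanded list of identified proteins and the intersection to the identifications confirmed by all engines; the function expects at least one engine in its input.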
SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines

Science.gov (United States)

Yumusak, Semih; Dogdu, Erdogan; Kodaz, Halife; Kamilaris, Andreas; Vandenbussche, Pierre-Yves

In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system, named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, these search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this method, most of the currently listed SPARQL endpoints in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. Finally, we have developed a new SPARQL endpoint crawler (SpEC) for crawling and link analysis.
Cardiac Resynchronization Therapy Online: What Patients Find when Searching the World Wide Web.

Science.gov (United States)

Modi, Minal; Laskar, Nabila; Modi, Bhavik N

2016-06-01

To objectively assess the quality of information available on the World Wide Web on cardiac resynchronization therapy (CRT). Patients frequently search the internet regarding their healthcare issues. It has been shown that patients seeking information can help or hinder their healthcare outcomes depending on the quality of information consulted. On the internet, this information can be produced and published by anyone, resulting in the risk of patients accessing inaccurate and misleading information. The search term "Cardiac Resynchronisation Therapy" was entered into the three most popular search engines and the first 50 pages on each were pooled and analyzed, after excluding websites inappropriate for objective review. The "LIDA" instrument (a validated tool for assessing quality of healthcare information websites) was used to generate scores on Accessibility, Reliability, and Usability. Readability was assessed using the Flesch Reading Ease Score (FRES). Of the 150 web-links, 41 sites met the eligibility criteria. The sites were assessed using the LIDA instrument and the FRES. The mean total LIDA score for all the websites assessed was 123.5 of a possible 165 (74.8%). The average Accessibility of the sites assessed was 50.1 of 60 (84.3%), on Usability 41.4 of 54 (76.6%), on Reliability 31.5 of 51 (61.7%), and 41.8 on FRES. There was significant variability among sites and, interestingly, there was no correlation between the sites' search engine ranking and their scores. This study has illustrated the variable quality of online material on the topic of CRT. Furthermore, there was also no apparent correlation between highly ranked, popular websites and their quality. Healthcare professionals should be encouraged to guide their patients toward the online material that contains reliable information. © 2016 Wiley Periodicals, Inc.
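For readers unfamiliar with the readability measure, the standard Flesch Reading Ease formula can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, the word, sentence and syllable counts are assumed to be supplied by the caller (syllable counting is not handled), and the abstract does not say which tool the study used to compute its scores.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease score; higher values mean easier text."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(100, 5, 150)
```

On the usual interpretation scale, scores in the 30 to 50 range are considered difficult to read, which is where the reported mean of 41.8 falls.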
Spatial Search Techniques for Mobile 3D Queries in Sensor Web Environments

Directory of Open Access Journals (Sweden)

James D. Carswell

2013-03-01

Full Text Available Developing mobile geo-information systems for sensor web applications involves technologies that can access linked geographical and semantically related Internet information. Additionally, in tomorrow’s Web 4.0 world, it is envisioned that trillions of inexpensive micro-sensors placed throughout the environment will also become available for discovery based on their unique geo-referenced IP address. Exploring these enormous volumes of disparate heterogeneous data on today’s location and orientation aware smartphones requires context-aware smart applications and services that can deal with “information overload”. 3DQ (Three Dimensional Query) is our novel mobile spatial interaction (MSI) prototype that acts as a next-generation base for human interaction within such geospatial sensor web environments/urban landscapes. It filters information using “Hidden Query Removal” functionality that intelligently refines the search space by calculating the geometry of a three dimensional visibility shape (Vista space) at a user’s current location. This 3D shape then becomes the query “window” in a spatial database for retrieving information on only those objects visible within a user’s actual 3D field-of-view. 3DQ reduces information overload and serves to heighten situation awareness on constrained commercial off-the-shelf devices by providing visibility space searching as a mobile web service. The effects of variations in mobile spatial search techniques in terms of query speed vs. accuracy are evaluated and presented in this paper.
Do two heads search better than one? Effects of student collaboration on web search behavior and search outcomes.

NARCIS (Netherlands)

Lazonder, Adrianus W.

2005-01-01

This study compared Pairs of students with Single students in web search tasks. The underlying hypothesis was that peer-to-peer collaboration encourages students to articulate their thoughts, which in turn has a facilitative effect on the regulation of the search process as well as search outcomes.
Web-based Logbook System for EAST Experiments

International Nuclear Information System (INIS)

Yang Fei; Xiao Bingjia

2010-01-01

The implementation of a web-based logbook system for EAST is introduced. The system stores comments on the experiments in a database and makes the documents accessible via various web browsers. A three-tier software architecture and asynchronous access technology are adopted to improve the system's effectiveness. Authorized users can view real-time discharge information, comments from others and signal plots; add, delete, or revise their own comments; search signal data or comments under complex search conditions; and collect relevant information and export it to an Excel file. The web pages are updated automatically after a new discharge is completed, without a manual refresh.
Search Engine Liability for Copyright Infringement

Science.gov (United States)

Fitzgerald, B.; O'Brien, D.; Fitzgerald, A.

The chapter provides a broad overview of the topic of search engine liability for copyright infringement. In doing so, the chapter examines some of the key copyright law principles and their application to search engines. The chapter also discusses some of the most important cases decided by the courts of the United States, Australia, China and Europe regarding the liability of search engines for copyright infringement. Finally, the chapter concludes with some thoughts for reform, including how copyright law can be amended in order to accommodate and realise the great informative power which search engines have to offer society.
Web components and the semantic web

OpenAIRE

Casey, Maire; Pahl, Claus

2003-01-01

Component-based software engineering on the Web differs from traditional component and software engineering. We investigate Web component engineering activities that are crucial for the development, composition, and deployment of components on the Web. The current Web Services and Semantic Web initiatives strongly influence our work. Focussing on Web component composition we develop description and reasoning techniques that support a component developer in the composition activities, focussing...
Web-Searching to Learn: The Role of Internet Self-Efficacy in Pre-School Educators' Conceptions and Approaches

Science.gov (United States)

Kao, Chia-Pin; Chien, Hui-Min

2017-01-01

This study was conducted to explore the relationships between pre-school educators' conceptions of and approaches to learning by web-searching through Internet Self-efficacy. Based on data from 242 pre-school educators who had prior experience of participating in web-searching in Taiwan for path analyses, it was found in this study that…
CHIME : service-oriented framework for adaptive web-based systems

NARCIS (Netherlands)

Chepegin, V.; Aroyo, L.M.; De Bra, P.M.E.; Houben, G.J.P.M.; De Bra, P.M.E.

2003-01-01

In this paper we present our view on how the current development of knowledge engineering in the context of Semantic Web can contribute to the better applicability, reusability and sharability of adaptive web-based systems. We propose a service-oriented framework for adaptive web-based systems,
MODEST: a web-based design tool for oligonucleotide-mediated genome engineering and recombineering

DEFF Research Database (Denmark)

Bonde, Mads; Klausen, Michael Schantz; Anderson, Mads Valdemar

2014-01-01

Recombineering and multiplex automated genome engineering (MAGE) offer the possibility to rapidly modify multiple genomic or plasmid sites at high efficiencies. This enables efficient creation of genetic variants including both single mutants with specifically targeted modifications as well......, which confers the corresponding genetic change, is performed manually. To address these challenges, we have developed the MAGE Oligo Design Tool (MODEST). This web-based tool allows designing of MAGE oligos for (i) tuning translation rates by modifying the ribosomal binding site, (ii) generating...
Developing Creativity and Problem-Solving Skills of Engineering Students: A Comparison of Web- and Pen-and-Paper-Based Approaches

Science.gov (United States)

Valentine, Andrew; Belski, Iouri; Hamilton, Margaret

2017-01-01

Problem-solving is a key engineering skill, yet is an area in which engineering graduates underperform. This paper investigates the potential of using web-based tools to teach students problem-solving techniques without the need to make use of class time. An idea generation experiment involving 90 students was designed. Students were surveyed…
Combining Search Engines for Comparative Proteomics

Science.gov (United States)

Tabb, David

2012-01-01

Many proteomics laboratories have found spectral counting to be an ideal way to recognize biomarkers that differentiate cohorts of samples. This approach assumes that proteins that differ in quantity between samples will generate different numbers of identifiable tandem mass spectra. Increasingly, researchers are employing multiple search engines to maximize the identifications generated from data collections. This talk evaluates four strategies to combine information from multiple search engines in comparative proteomics. The “Count Sum” model pools the spectra across search engines. The “Vote Counting” model combines the judgments from each search engine by protein. Two other models employ parametric and non-parametric analyses of protein-specific p-values from different search engines. We evaluated the four strategies in two different data sets. The ABRF iPRG 2009 study generated five LC-MS/MS analyses of “red” E. coli and five analyses of “yellow” E. coli. NCI CPTAC Study 6 generated five concentrations of Sigma UPS1 spiked into a yeast background. All data were identified with X!Tandem, Sequest, MyriMatch, and TagRecon. For both sample types, “Vote Counting” appeared to manage the diverse identification sets most effectively, yielding heightened discrimination as more search engines were added.
Automatic identification of web-based risk markers for health events

DEFF Research Database (Denmark)

Yom-Tov, Elad; Borsa, Diana; Hayward, Andrew C.

2015-01-01

but these are often limited in size and cost and can fail to take full account of diseases where there are social stigmas or to identify transient acute risk factors. Objective: Here we report that Web search engine queries coupled with information on Wikipedia access patterns can be used to infer health events...

Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE.

Science.gov (United States)

Demelo, Jonathan; Parsons, Paul; Sedig, Kamran

2017-02-02

Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts.
A Powerful, Cost Effective, Web Based Engineering Solution Supporting Conjunction Detection and Visual Analysis

Science.gov (United States)

Novak, Daniel M.; Biamonti, Davide; Gross, Jeremy; Milnes, Martin

2013-08-01

An innovative and visually appealing tool is presented for efficient all-vs-all conjunction analysis on a large catalogue of objects. The conjunction detection uses a nearest neighbour search algorithm, based on spatial binning and identification of pairs of objects in adjacent bins. This results in the fastest all-vs-all filtering the authors are aware of. The tool is built on a server-client architecture, where the server broadcasts the conjunction data and ephemerides to the client, while the client supports the user interface through a modern browser, without plug-ins. In order to make the tool flexible and maintainable, Java software technologies were used on the server side, including Spring, Camel, ActiveMQ and CometD. The user interface and visualisation are based on the latest web technologies: HTML5, WebGL, THREE.js. Emphasis has been placed on the ergonomics and visual appeal of the software. In fact, certain design concepts have been borrowed from the gaming industry.
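The idea of spatial binning with a check of adjacent bins can be sketched as follows. This is a generic illustration at a single time step, assuming static 3D positions and a cubic grid whose cell size equals the distance threshold; the function name `close_pairs` is hypothetical, and the authors' actual filter (which works on propagated orbits and is implemented in Java) is not described in enough detail here to reproduce.

```python
import math
from collections import defaultdict
from itertools import product


def close_pairs(positions, threshold):
    """Return sorted index pairs (i, j), i < j, whose distance is <= threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Assign each object to a cubic bin with edge length equal to the threshold,
    # so any pair within the threshold lies in the same or an adjacent bin.
    bins = defaultdict(list)
    for idx, point in enumerate(positions):
        key = tuple(math.floor(c / threshold) for c in point)
        bins[key].append(idx)
    offsets = list(product((-1, 0, 1), repeat=3))  # the bin itself plus 26 neighbours
    pairs = set()
    for (bx, by, bz), members in bins.items():
        for dx, dy, dz in offsets:
            neighbours = bins.get((bx + dx, by + dy, bz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(positions[i], positions[j]) <= threshold:
                        pairs.add((i, j))
    return sorted(pairs)


# Hypothetical positions in arbitrary units.
objects = [(0, 0, 0), (0.5, 0, 0), (10, 10, 10), (10.2, 10, 10), (0, 0, 0.9)]
found = close_pairs(objects, 1.0)
```

Only objects in the same or neighbouring bins are compared, so the number of distance checks stays far below the full all-vs-all count when objects are sparsely distributed.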
Web-based child pornography: The global impact of deterrence efforts and its consumption on mobile platforms.

Science.gov (United States)

Steel, Chad M S

2015-06-01

Our study is the first to look at mobile device use for child sexual exploitation material (CSEM) consumption, and at the global impact of deterrence efforts by search providers. We used data from Google, Bing, and Yandex to assess how web searches for CSEM are being conducted, both at present and historically. Our findings show that the blocking efforts by Google and Microsoft have resulted in a 67% drop in the past year in web-based searches for CSEM. Additionally, our findings show that mobile devices are a substantial platform for web-based consumption of CSEM, with tablets and smartphones representing 32% of all queries associated with CSEM conducted on Bing. Further, our findings show that a major search engine not located in the United States, Yandex, did not undertake blocking efforts similar to those implemented by Google and Microsoft and has seen no commensurate drop in CSEM searches and continues to profit from ad revenue on these queries. While the efforts by Google and Microsoft have had a deterrence effect in the United States, searchers from Russia and other locations where child pornography possession is not criminalized have continued to use these services. Additionally, the same lax enforcement environment has allowed searchers from the United States to utilize Yandex with little fear of detection or referral to United States law enforcement from the Russian authorities. Copyright © 2015 Elsevier Ltd. All rights reserved.
Integrating ecosystem engineering and food webs

NARCIS (Netherlands)

Sanders, Dirk; Jones, Clive G.; Thebault, Elisa; Bouma, Tjeerd J.; van der Heide, Tjisse; van Belzen, Jim; Barot, Sebastien

Ecosystem engineering, the physical modification of the environment by organisms, is a common and often influential process whose significance to food web structure and dynamics is largely unknown. In the light of recent calls to expand food web studies to include non-trophic interactions, we
Integrating ecosystem engineering and food webs

NARCIS (Netherlands)

Sanders, D.; Jones, C.G.; Thébault, E.; Bouma, T.J.; van der Heide, T.; van Belzen, J.; Barot, S.

2014-01-01

Ecosystem engineering, the physical modification of the environment by organisms, is a common and often influential process whose significance to food web structure and dynamics is largely unknown. In the light of recent calls to expand food web studies to include non-trophic interactions, we
Performance of information retrieval systems on the Web: evaluation of search services (search engines) [Rendimiento de los sistemas de recuperación de información en la web: evaluación de servicios de búsqueda]

Directory of Open Access Journals (Sweden)

Olvera Lobo, María Dolores

2000-09-01

Full Text Available Ten search engines (AltaVista, Excite, Hotbot, Infoseek, Lycos, Magellan, OpenText, WebCrawler, WWWWorm, Yahoo) were evaluated by means of a questionnaire with 20 items, each put to all 10 systems (adding up to a total of 200 queries). The first 20 results for each query were analysed in terms of relevance, and values of precision and recall were computed for the resulting approximately 4000 references. The results are also analyzed in terms of the type of question (boolean or natural language) and topic (specialized vs. general interest). The results showed that Excite, Infoseek and AltaVista generally performed better. The conclusion of this methodological trial was that the method used allows the evaluation of the performance of Information Retrieval Systems on the Web. As for the results, web search engines are not very precise but extremely exhaustive.
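Precision and recall over the first 20 results of a query can be computed as in the sketch below. It is an illustration only: the function name and the example figures are hypothetical, and the abstract does not state how the study estimated the total number of relevant documents needed for recall, so that value is simply taken as an input here.

```python
def precision_recall(judgments, total_relevant):
    """Compute (precision, recall) for one query.

    judgments: list of booleans, one per retrieved result (True = relevant).
    total_relevant: assumed number of relevant documents that exist for the query.
    """
    if total_relevant < 0:
        raise ValueError("total_relevant must be non-negative")
    hits = sum(1 for relevant in judgments if relevant)
    precision = hits / len(judgments) if judgments else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


# Hypothetical query: 8 of the first 20 results judged relevant,
# with 40 relevant documents assumed to exist in total.
top20 = [True] * 8 + [False] * 12
precision, recall = precision_recall(top20, 40)
```

Averaging these per-query values over all 20 questions would give one precision and one recall figure per search engine.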
WebVR——Web Virtual Reality Engine Based on P2P network

OpenAIRE

zhihan LV; Tengfei Yin; Yong Han; Yong Chen; Ge Chen

2011-01-01

WebVR, a multi-user online virtual reality engine, is introduced. The main contributions are mapping the geographical space and virtual space to the P2P overlay network space, and dividing the three spaces by quad-tree method. The geocoding is identified with Hash value, which is used to index the user list, terrain data, and the model object data. Sharing of data through improved Kademlia network model is designed and implemented. In this model, XOR algorithm is used to calculate the distanc...
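The XOR distance used in Kademlia-style overlays can be sketched in a few lines. This is a generic illustration, not the WebVR implementation: the use of SHA-1 to derive identifiers and the helper names `node_id`, `xor_distance` and `closest_nodes` are assumptions made for the example.

```python
import hashlib


def node_id(key: str) -> int:
    """Derive a 160-bit identifier from a string key (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_nodes(target: int, node_ids, k: int = 3):
    """Return the k identifiers closest to target under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


# Small integer identifiers keep the example readable.
nearest = closest_nodes(5, [1, 4, 7, 12], k=2)
```

Because XOR is symmetric and zero only for identical identifiers, every node can rank all other nodes by closeness to any key, which is what allows a hashed geocode to be routed to the peers responsible for it.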
Promoting Your Web Site.

Science.gov (United States)

Raeder, Aggi

1997-01-01

Discussion of ways to promote sites on the World Wide Web focuses on how search engines work and how they retrieve and identify sites. Appropriate Web links for submitting new sites and for Internet marketing are included. (LRW)
search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information.

Science.gov (United States)

Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Siążnik, Artur

2013-03-01

Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user's query, advanced data searching based on the specified user's query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. search GenBank extends standard capabilities of the
Surfing the World Wide Web to Education Hot-Spots.

Science.gov (United States)

Dyrli, Odvard Egil

1995-01-01

Provides a brief explanation of Web browsers and their use, as well as technical information for those considering access to the WWW (World Wide Web). Curriculum resources and addresses to useful Web sites are included. Sidebars show sample searches using Yahoo and Lycos search engines, and a list of recommended Web resources. (JKP)
Discovering Land Cover Web Map Services from the Deep Web with JavaScript Invocation Rules

Directory of Open Access Journals (Sweden)

Dongyang Hou

2016-06-01

Full Text Available Automatic discovery of isolated land cover web map services (LCWMSs) can potentially help in sharing land cover data. Currently, various search engine-based and crawler-based approaches have been developed for finding services dispersed throughout the surface web. In fact, with the prevalence of geospatial web applications, a considerable number of LCWMSs are hidden in JavaScript code, which belongs to the deep web. However, discovering LCWMSs from JavaScript code remains an open challenge. This paper aims to solve this challenge by proposing a focused deep web crawler for finding more LCWMSs from deep web JavaScript code and the surface web. First, the names of a group of JavaScript links are abstracted as initial judgements. Through name matching, these judgements are utilized to judge whether or not the fetched webpages contain predefined JavaScript links that may prompt JavaScript code to invoke WMSs. Secondly, some JavaScript invocation functions and URL formats for WMS are summarized as JavaScript invocation rules from prior knowledge of how WMSs are employed and coded in JavaScript. These invocation rules are used to identify the JavaScript code for extracting candidate WMSs through rule matching. The above two operations are incorporated into a traditional focused crawling strategy situated between the tasks of fetching webpages and parsing webpages. Thirdly, LCWMSs are selected by matching services with a set of land cover keywords. Moreover, a search engine for LCWMSs is implemented that uses the focused deep web crawler to retrieve and integrate the LCWMSs it discovers. In the first experiment, eight online geospatial web applications serve as seed URLs (Uniform Resource Locators) and crawling scopes; the proposed crawler addresses only the JavaScript code in these eight applications. All 32 available WMSs hidden in JavaScript code were found using the proposed crawler, while not one WMS was discovered through the focused crawler-based
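Rule matching of the kind described in the abstract above might look like the following sketch. The patterns are illustrative assumptions, not the rule set from the paper: they look for two widely used JavaScript mapping calls (`OpenLayers.Layer.WMS` and Leaflet's `L.tileLayer.wms`) and then keep quoted URLs that contain a WMS hint. The function name `candidate_wms_urls` is hypothetical.

```python
import re

# Hypothetical invocation rules: calls that commonly create a WMS layer.
INVOCATION_PATTERNS = [
    re.compile(r"OpenLayers\.Layer\.WMS\s*\("),
    re.compile(r"L\.tileLayer\.wms\s*\("),
]
# Any quoted http(s) URL in the code.
URL_PATTERN = re.compile(r"""["'](https?://[^"']+)["']""")
# Hypothetical URL-format rule: a SERVICE=WMS parameter or a path ending in /wms.
WMS_HINT = re.compile(r"service=wms|/wms\b", re.IGNORECASE)


def candidate_wms_urls(js_code):
    """Return candidate WMS endpoints found in a piece of JavaScript code."""
    if not any(p.search(js_code) for p in INVOCATION_PATTERNS):
        return []
    candidates = []
    for url in URL_PATTERN.findall(js_code):
        if WMS_HINT.search(url) and url not in candidates:
            candidates.append(url)
    return candidates


# Hypothetical snippet of page script.
snippet = 'var lc = L.tileLayer.wms("http://example.org/geoserver/wms", {layers: "landcover"});'
urls = candidate_wms_urls(snippet)
```

A crawler would still need to confirm each candidate, for example by requesting its capabilities document, and then filter the layers against land cover keywords as the abstract describes.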
MEASURING THE PERFORMANCE OF SIMILARITY PROPAGATION IN AN SEMANTIC SEARCH ENGINE

Directory of Open Access Journals (Sweden)

S. K. Jayanthi

2013-10-01

Full Text Available Personalization of web page results currently plays a vital role. Nearly 80% of users expect the best results on the first page itself and do not have the patience to browse further in URL mode. This research work focuses on two main themes: online semantic web search and offline domain-based search. The first part is to find an effective method that groups similar results together using a BookShelf Data Structure and organizes the various clusters. The second part focuses on offline search within the academic domain. This paper focuses on finding documents that are similar and on how the vector space model can be used to do so. Accordingly, more weight is given to the principles and working methodology of similarity propagation. The cosine similarity measure is used to determine the relevance among the documents.
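Cosine similarity in a vector space model can be sketched as follows. This is a minimal illustration using raw term frequencies and whitespace tokenization, both of which are simplifying assumptions; the paper's weighting scheme and its BookShelf Data Structure are not reproduced here.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Only terms shared by both documents contribute to the dot product.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical two-word documents sharing one term.
similarity = cosine_similarity("semantic search", "semantic web")
```

Values range from 0 (no shared terms) to 1 (identical term distributions), so documents can be ranked or clustered by this score.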
Comparing the diversity of information by word-of-mouth vs. web spread

Science.gov (United States)

Sela, Alon; Shekhtman, Louis; Havlin, Shlomo; Ben-Gal, Irad

2016-06-01

Many studies have explored spreading and diffusion through complex networks. The following study examines a specific case of spreading of opinions in modern society through two spreading schemes, defined as being either through “word of mouth” (WOM) or through online search engines (WEB). We apply both modelling and real experimental results and compare the opinions people adopt through exposure to their friends' opinions with the opinions they adopt when using a search engine based on the PageRank algorithm. A simulated study shows that when members in a population adopt decisions through the use of the WEB scheme, the population ends up with a few dominant views, while other views are barely expressed. In contrast, when members adopt decisions based on the WOM scheme, there is a far more diverse distribution of opinions in that population. The simulation results are further supported by an online experiment which finds that people searching for information through a search engine end up with far more homogeneous opinions than those asking their friends.
The influence that JavaScript(TM) has on the visibility of a Website to search engines - a pilot study

Directory of Open Access Journals (Sweden)

M. Weideman

2006-01-01

Full Text Available Introduction. In this research project, an empirical pilot study on the relationship between JavaScript(TM) usage and Website visibility was carried out. The main purpose was to establish whether JavaScript(TM)-based hyperlinks attract or repel crawlers, resulting in an increase or decrease in Website visibility. Method. A literature survey has established that there appears to be contradiction amongst claims by various authors as to whether or not crawlers can parse or interpret JavaScript(TM). The chosen methodology involved the creation of a Website that contains different kinds of links to other pages, where actual data files were stored. Search engine crawler visits to the pages pointed to by the different kinds of links were monitored and recorded. Analysis. This experiment took into account the fact that JavaScript(TM) can be embedded within the HTML of a Web page or referenced as an external '.js' file. It also considered different ways of specifying links within JavaScript(TM). Results. The results obtained indicated that text links provide the highest level of opportunity for crawlers to discover and index non-homepages. In general, crawlers did not follow JavaScript(TM)-based links to Web pages blindly. Conclusion. Most crawlers evade JavaScript(TM) links, implying that Web pages using forms of this technology, for example in pop-up/pull-down menus, could be jeopardising their chances of achieving high search engine rankings. Certain JavaScript(TM) links were not followed at all, which has serious implications for designers of e-Commerce Websites.
Mining social media and web searches for disease detection.

Science.gov (United States)

Yang, Y Tony; Horneffer, Michael; DiLisio, Nicole

2013-04-28

Web-based social media is increasingly being used across different settings in the health care industry. The increased frequency in the use of the Internet via computer or mobile devices provides an opportunity for social media to be the medium through which people can be provided with valuable health information quickly and directly. While traditional methods of detection relied predominately on hierarchical or bureaucratic lines of communication, these often failed to yield timely and accurate epidemiological intelligence. New web-based platforms promise increased opportunities for a more timely and accurate spreading of information and analysis. This article aims to provide an overview and discussion of the availability of timely and accurate information. It is especially useful for the rapid identification of an outbreak of an infectious disease that is necessary to promptly and effectively develop public health responses. These web-based platforms include search queries, data mining of web and social media, process and analysis of blogs containing epidemic key words, text mining, and geographical information system data analyses. These new sources of analysis and information are intended to complement traditional sources of epidemic intelligence. Despite the attractiveness of these new approaches, further study is needed to determine the accuracy of blogger statements, as increases in public participation may not necessarily mean the information provided is more accurate.
Mining social media and web searches for disease detection

Directory of Open Access Journals (Sweden)

Y. Tony Yang

2013-05-01

Full Text Available Web-based social media is increasingly being used across different settings in the health care industry. The increased frequency in the use of the Internet via computer or mobile devices provides an opportunity for social media to be the medium through which people can be provided with valuable health information quickly and directly. While traditional methods of detection relied predominately on hierarchical or bureaucratic lines of communication, these often failed to yield timely and accurate epidemiological intelligence. New web-based platforms promise increased opportunities for a more timely and accurate spreading of information and analysis. This article aims to provide an overview and discussion of the availability of timely and accurate information. It is especially useful for the rapid identification of an outbreak of an infectious disease that is necessary to promptly and effectively develop public health responses. These web-based platforms include search queries, data mining of web and social media, process and analysis of blogs containing epidemic key words, text mining, and geographical information system data analyses. These new sources of analysis and information are intended to complement traditional sources of epidemic intelligence. Despite the attractiveness of these new approaches, further study is needed to determine the accuracy of blogger statements, as increases in public participation may not necessarily mean the information provided is more accurate.
A Web Based Approach to Integrate Space Culture and Education

Science.gov (United States)

Gerla, F.

2002-01-01

Our intention is to dedicate a large section of our web site to space education. As the national User Support and Operation Center (USOC) for the International Space Station, MARS Center is also willing to provide material, such as videos and data, for educational purposes. In order to base our initiative on authoritative precedents, our first step has been a comparative analysis of different space agency education web sites, such as those of ESA and NASA. As is well known, the Internet is a powerful reality, capable of connecting people all over the world and making a huge amount of information public. The first problem, then, is to organize this information in order to use the web as an efficient education tool. That is why fields such as User Modeling (UM), Human Computer Interaction (HCI) and the Semantic Web have become more important in Information Technology and Science. Traditional search engines are unable to provide optimal retrieval of the content users are actually searching for. The Semantic Web is a valid alternative: according to its theories, web information should be represented using a metadata language. Users should be able and enabled to successfully search, obtain and study new information from the web. Forging knowledge in an intelligent manner, preventing users from making errors, and making this formidable quantity of information easily available have also been the starting points for HCI methodologies for defining Adaptable Interfaces. Here the information is divided into different sets, on the basis of the intended user profile, in order to prevent users from getting lost. Realized as an adaptable interface, an education web site can help users to effectively retrieve the information necessary for their purposes (teaching for a teacher and learning for a student). For students it's a great advantage to use interfaces designed on the basis of their age and scholastic level. Indeed, an adaptable interface is intended not just for students, but also for teachers
A Web-Based Learning System for Software Test Professionals

Science.gov (United States)

Wang, Minhong; Jia, Haiyang; Sugumaran, V.; Ran, Weijia; Liao, Jian

2011-01-01

Fierce competition, globalization, and technology innovation have forced software companies to search for new ways to improve competitive advantage. Web-based learning is increasingly being used by software companies as an emergent approach for enhancing the skills of knowledge workers. However, the current practice of Web-based learning is…
A fuzzy-match search engine for physician directories.

Science.gov (United States)

Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon

2014-11-04

A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly. Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the top-ten accuracy of the fuzzy-match engine (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and the top-one accuracy of the fuzzy-match engine (71%). We designed a fuzzy-match search engine for physician directories, created a reference implementation, and evaluated it. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
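The combination of techniques named in this abstract (edit distance, Soundex coding, and a nickname lookup) can be illustrated with a minimal Python sketch. It is not the authors' customized algorithm, which the abstract does not specify: the `NICKNAMES` table, the 0.2 phonetic bonus, and the function names `match_score` and `top_matches` are all hypothetical choices made for illustration only.

```python
# Hypothetical nickname table; the paper derives its table from US census data.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}

# Standard American Soundex consonant codes.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def soundex(name: str) -> str:
    """Four-character Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    last = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != last:
            out += code
        if ch not in "hw":  # 'h' and 'w' do not separate repeated codes
            last = code
    return (out + "000")[:4]


def match_score(query: str, candidate: str) -> float:
    """Similarity in [0, 1]; the phonetic bonus of 0.2 is an arbitrary assumption."""
    q = NICKNAMES.get(query.lower(), query.lower())
    c = candidate.lower()
    similarity = 1.0 - levenshtein(q, c) / max(len(q), len(c), 1)
    if soundex(q) == soundex(c):
        similarity = min(1.0, similarity + 0.2)
    return similarity


def top_matches(query: str, directory: list, k: int = 10) -> list:
    """Return the k directory names that score highest against the query."""
    return sorted(directory, key=lambda n: match_score(query, n), reverse=True)[:k]
```

Calling `top_matches("Bill", ["William", "Wilson", "Robert"], k=1)` resolves the nickname first and then ranks by the blended score, which mirrors the three failure types (typographic, phonetic, nickname) listed in the abstract.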
How Users Search the Mobile Web: A Model for Understanding the Impact of Motivation and Context on Search Behaviors

Directory of Open Access Journals (Sweden)

Dan Wu

2016-03-01

Purpose: This study explores how search motivation and context influence mobile Web search behaviors. Design/methodology/approach: We studied 30 experienced mobile Web users via questionnaires, semi-structured interviews, and an online diary tool that participants used to record their daily search activities. SQLite Developer was used to extract data from the users' phone logs for correlation analysis in Statistical Product and Service Solutions (SPSS). Findings: One quarter of mobile search sessions were driven by two or more search motivations. It was especially difficult to distinguish curiosity from time killing in particular user reporting. Multi-dimensional contexts and motivations influenced mobile search behaviors, and among the context dimensions, gender, place, activities they engaged in while searching, task importance, portal, and interpersonal relations (whether accompanied or alone when searching) correlated with each other. Research limitations: The sample consisted entirely of college students, so our findings may not generalize to other populations. More participants and longer experimental duration will improve the accuracy and objectivity of the research. Practical implications: Motivation analysis and search context recognition can help mobile service providers design applications and services for particular mobile contexts and usages. Originality/value: Most current research focuses on specific contexts, such as studies on place, or other contextual influences on mobile search, and lacks a systematic analysis of mobile search context. Based on analysis of the impact of mobile search motivations and search context on search behaviors, we built a multi-dimensional model of mobile search behaviors.

Andromeda - a peptide search engine integrated into the MaxQuant environment

DEFF Research Database (Denmark)

Cox, Jurgen; Neuhauser, Nadin; Michalski, Annette

2011-01-01

A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data Andromeda performs as well as Mascot......, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly...... phosphorylated peptides and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination...
Using Internet Search Engines to Obtain Medical Information: A Comparative Study

Science.gov (United States)

Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun

2012-01-01

Background The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. Objective To compare major Internet search engines in their usability of obtaining medical and health information. Methods We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Results Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search
Using Internet search engines to obtain medical information: a comparative study.

Science.gov (United States)

Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun; Xu, Dong

2012-05-16

The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. To compare major Internet search engines in their usability of obtaining medical and health information. We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search results highly overlapped between the
Multi-objective Search-based Mobile Testing

OpenAIRE

Mao, K.

2017-01-01

Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis...
Search 3.0: Present, Personal, Precise

Science.gov (United States)

Spivack, Nova

The next generation of Web search is already beginning to emerge. With it we will see several shifts in the way people search, and the way major search engines provide search functionality to consumers.
A Study of HTML Title Tag Creation Behavior of Academic Web Sites

Science.gov (United States)

Noruzi, Alireza

2007-01-01

The HTML title tag information should identify and describe exactly what a Web page contains. This paper analyzes the "Title element" and raises a significant question: "Why is the title tag important?" Search engines base search results and page rankings on certain criteria. Among the most important criteria is the presence of the search keywords…
Building a semantic search engine with games and crowdsourcing

OpenAIRE

Wieser, Christoph

2014-01-01

Semantic search engines aim at improving conventional search with semantic information, or meta-data, on the data searched for and/or on the searchers. So far, approaches to semantic search exploit characteristics of the searchers like age, education, or spoken language for selecting and/or ranking search results. Such data allow a semantic search engine to be built as an extension of a conventional search engine. The crawlers of well-established search engines like Google, Yahoo! or Bing ...
Quality of web-based information on pathological gambling.

Science.gov (United States)

Khazaal, Yasser; Chatton, Anne; Cochand, Sophie; Jermann, Françoise; Osiek, Christian; Bondolfi, Guido; Zullino, Daniele

2008-09-01

The present study aims to evaluate the quality of web-based information on gambling and to investigate potential content quality indicators. The following key words: gambling, pathological gambling, excessive gambling, gambling problem and gambling addiction were entered into two popular search engines: Google and Yahoo. Websites were assessed with a standardized proforma designed to rate sites on the basis of "accountability", "presentation", "interactivity", "readability" and "content quality". "Health on the Net" (HON) quality label, and DISCERN scale scores aiding people without content expertise to assess quality of written health publication were used to verify their efficiency as quality indicators. Of the 200 links identified, 75 websites were included. The results of the study indicate low scores on each of the measures. A composite global score appeared as a good content quality indicator. While gambling-related education websites for patients are common, their global quality is poor. There is a need for useful evidence-based information about gambling on the web. As the phenomenon has greatly increased, it could be relevant for Internet sites to improve their content by using global score as a quality indicator.
A survey of the current status of web-based databases indexing Iranian journals.

Science.gov (United States)

Merat, Shahin; Khatibzadeh, Shahab; Mesgarpour, Bita; Malekzadeh, Reza

2009-05-01

The scientific output of Iran has been increasing rapidly in recent years. Unfortunately, most papers are published in journals which are not indexed by popular indexing systems and many of them are in Persian without English translation. This makes the results of Iranian scientific research unavailable to other researchers, including Iranians. The aim of this study was to evaluate the quality of current web-based databases indexing scientific articles published in Iran. We identified web-based databases which indexed scientific journals published in Iran using popular search engines. The sites were then subjected to a series of tests to evaluate their coverage, search capabilities, stability, accuracy of information, consistency, accessibility, ease of use, and other features. Results were compared with each other to identify strengths and shortcomings of each site. Five web sites were identified. None had complete coverage of Iranian scientific journals. The search capabilities were less than optimal in most sites. English translations of research titles, author names, keywords, and abstracts of Persian-language articles did not follow standards. Some sites did not cover abstracts. Numerous typing errors make searches ineffective and citation indexing unreliable. None of the currently available indexing sites are capable of presenting Iranian research to the international scientific community. The government should intervene by enforcing policies designed to facilitate indexing through a systematic approach. The policies should address Iranian journals, authors, and indexing sites. Iranian journals should be required to provide their indexing data, including references, electronically; authors should provide correct indexing information to journals; and indexing sites should improve their software to meet standards set by the government.
Web page sorting algorithm based on query keyword distance relation

Science.gov (United States)

Yang, Han; Cui, Hong Gang; Tang, Hao

2017-08-01

To improve page ranking, this paper proposes the idea of query keyword clustering, based on the relationships among the search keywords as they appear within a web page. This idea is quantified as the degree of aggregation of the search keywords in the page. Building on the PageRank algorithm, a clustering degree factor for the query keywords is added so that it takes part in the quantitative calculation. The result is an improved PageRank algorithm based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.
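The abstract does not give the formula for its keyword clustering factor, so the following Python sketch is only one possible reading, offered for illustration. It pairs a textbook PageRank power iteration with a hypothetical proximity measure (the number of distinct query keywords divided by the shortest token window containing all of them) and combines the two multiplicatively with an assumed weight `alpha`. The names `clustering_degree` and `combined_score`, and the combination rule itself, are assumptions rather than the authors' method.

```python
def pagerank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Textbook PageRank by power iteration; links maps page -> list of targets."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank


def min_window(tokens: list, keywords: list):
    """Length of the shortest token window containing every keyword, or None."""
    needed = set(keywords)
    last_seen = {}
    best = None
    for i, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = i
            if len(last_seen) == len(needed):
                width = i - min(last_seen.values()) + 1
                if best is None or width < best:
                    best = width
    return best


def clustering_degree(text: str, keywords: list) -> float:
    """Hypothetical proximity factor in [0, 1]; 1.0 means the keywords are adjacent."""
    kws = [k.lower() for k in keywords]
    width = min_window(text.lower().split(), kws)
    return 0.0 if width is None else len(set(kws)) / width


def combined_score(rank: float, degree: float, alpha: float = 1.0) -> float:
    """Assumed combination rule: boost link-based rank by keyword proximity."""
    return rank * (1.0 + alpha * degree)
```

Under this sketch, two pages with equal PageRank are separated by how tightly the query keywords cluster in their text: a page where "search" and "engine" are adjacent receives a degree of 1.0, while one where they are five tokens apart receives 0.4.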
Evaluating company growth potential using AI and web media data

DEFF Research Database (Denmark)

Droll, Andrew; Khan, Shahzad; Tanev, Stoyan

2017-01-01

The article focuses on adapting and validating the use of an existing web search and analytics engine to evaluate the growth and competitive potential of new technology start-ups and existing firms in the newly emerging precision medicine sector. The results are based on two different search...... includes new technology firms in the same sector. The firms in the second sample were used as test cases in examining if their growth related web search scores would relate to the degree of their innovativeness. The second part of the study applied the same methodology to the real time monitoring of firms...
Search Engine For Ebook Portal

Directory of Open Access Journals (Sweden)

Prashant Kanade

2017-05-01

The purpose of this paper is to establish the textual analytics involved in developing a search engine for an ebook portal. We have extracted our dataset from Project Gutenberg using a robot harvester. Textual analytics is used for efficient search retrieval. The entire dataset is represented using the Vector Space Model, where each document is a vector in the vector space. Further, for computational purposes, we represent our dataset in the form of a Term Frequency-Inverse Document Frequency (tf-idf) matrix. The first step involves obtaining the most coherent sequence of words of the search query entered. The entered query is processed using front-end algorithms, which include Spell Checker, Text Segmentation and Language Modeling. Back-end processing includes Similarity Modeling, Clustering, Indexing and Retrieval. The relationship between documents and words is established using cosine similarity measured between the documents and words in the vector space. Clustering is used to suggest books that are similar to the search query entered by the user. Lastly, the Lucene-based Elasticsearch engine is used for indexing the documents. This allows faster retrieval of data. Elasticsearch returns a dictionary and creates a tf-idf matrix. The processed query is compared with the dictionary obtained, and the tf-idf matrix is used to calculate the score for each match to give the most relevant result.
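The vector space model with tf-idf weighting and cosine similarity described in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses whitespace tokenization, a smoothed idf formula chosen here for convenience, and hypothetical function names (`build_tfidf`, `vectorize`, `rank`); it does not involve Elasticsearch, spell checking, or clustering.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def build_tfidf(docs: list):
    """Return one sparse tf-idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf so that terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def vectorize(query: str, idf: dict) -> dict:
    """Project a query into the same space, ignoring out-of-vocabulary terms."""
    toks = [t for t in tokenize(query) if t in idf]
    tf = Counter(toks)
    return {t: (c / len(toks)) * idf[t] for t, c in tf.items()}


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query: str, docs: list) -> list:
    """Return document indices ordered from most to least similar to the query."""
    vectors, idf = build_tfidf(docs)
    q = vectorize(query, idf)
    scores = [cosine(q, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

For a toy corpus such as `["the whale hunts the sea", "a tale of two cities", "sea voyage and whale song"]`, the query `"two cities"` shares terms only with the second document, so that document is ranked first.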
The end of meta search engines in Europe?

NARCIS (Netherlands)

Husovec, Martin

2015-01-01

The technology behind the meta search engines supports a countless number of Internet services ranging from the price and quality comparison websites to more sophisticated traffic connection finders and general search engines like Google. Meta search engines generally increase market transparency,
A Literature Review of Indexing and Searching Techniques Implementation in Educational Search Engines

Science.gov (United States)

El Guemmat, Kamal; Ouahabi, Sara

2018-01-01

The objective of this article is to analyze the searching and indexing techniques of educational search engines' implementation while treating future challenges. Educational search engines could greatly help in the effectiveness of e-learning if used correctly. However, these engines have several gaps which influence the performance of e-learning…
Design considerations for a large-scale image-based text search engine in historical manuscript collections

NARCIS (Netherlands)

Schomaker, Lambertus

2016-01-01

This article gives an overview of design considerations for a handwriting search engine based on pattern recognition and high-performance computing, “Monk”. In order to satisfy multiple and often conflicting technological requirements, an architecture is used which heavily relies on high-performance
Knowledge engineering in a temporal semantic web context

NARCIS (Netherlands)

Milea, D.V.; Frasincar, F.; Kaymak, U.; Schwabe, D.; Curbera, F.; Dantzig, P.

2008-01-01

The emergence of Web 2.0 and the semantic Web as established technologies is fostering a whole new breed of Web applications and systems. These are often centered around knowledge engineering and context awareness. However, adequate temporal formalisms underlying context awareness are currently
Predicting consumer behavior with Web search.

Science.gov (United States)

Goel, Sharad; Hofman, Jake M; Lahaie, Sébastien; Pennock, David M; Watts, Duncan J

2010-10-12

Recent work has demonstrated that Web search volume can "predict the present," meaning that it can be used to accurately track outcomes such as unemployment levels, auto and home sales, and disease prevalence in near real time. Here we show that what consumers are searching for online can also predict their collective future behavior days or even weeks in advance. Specifically we use search query volume to forecast the opening weekend box-office revenue for feature films, first-month sales of video games, and the rank of songs on the Billboard Hot 100 chart, finding in all cases that search counts are highly predictive of future outcomes. We also find that search counts generally boost the performance of baseline models fit on other publicly available data, where the boost varies from modest to dramatic, depending on the application in question. Finally, we reexamine previous work on tracking flu trends and show that, perhaps surprisingly, the utility of search data relative to a simple autoregressive model is modest. We conclude that in the absence of other data sources, or where small improvements in predictive performance are material, search queries provide a useful guide to the near future.
Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

Science.gov (United States)

Porter, Brandi

2009-01-01

Millennial students make up a large portion of undergraduate students attending colleges and universities, and they have a variety of online resources available to them to complete academically related information searches, primarily Web based and library-based online information retrieval systems. The content, ease of use, and required search…
How to Search the Internet Archive Without Indexing It

DEFF Research Database (Denmark)

Kanhabua, Nattiya; Kemkes, Philipp; Nejdl, Wolfgang

2016-01-01

Significant parts of our cultural heritage have been produced on the Web in recent years. While the easy accessibility to the current Web is a good baseline, optimal access to the past of the Web faces several challenges. This includes dealing with large-scale web archive collections, as well as lacking...... search results to the WayBack Machine; thus allowing keyword search on the Internet Archive without processing and indexing its raw content. Our system complements existing web archive search tools through a user interface, which comes close to the functionalities of modern web search engines (e...
The Effect of Internet Searches on Afforestation: The Case of a Green Search Engine

Directory of Open Access Journals (Sweden)

Pedro Palos-Sanchez

2018-01-01

Ecosia is an Internet search engine that plants trees with the income obtained from advertising. This study explored the factors that affect the adoption of Ecosia.org from the perspective of technology adoption and trust. This was done by using the Unified Theory of Acceptance and Use of Technology (UTAUT2) and then analyzing the results with PLS-SEM (Partial Least Squares-Structural Equation Modeling). Subsequently, a survey was conducted with a structured questionnaire on search engines, which yielded the following results: (1) the idea of a company helping to mitigate the effects of climate change by planting trees is well received by Internet users. However, few people accept the idea of changing their habits from using traditional search engines; (2) Ecosia is a search engine believed to have higher compatibility rates and to need fewer hardware resources; and (3) ecological marketing is an appropriate and future strategy that can increase the intention to use a technological product. Based on the results obtained, this study shows that a search engine or other service provided by the Internet, which can be audited (visits, searches, files, etc.), can also contribute to curbing the effects of deforestation and climate change. In addition, companies, and especially technological start-ups, are advised to take into account that users feel better using these tools. Finally, this study urges foundations and non-governmental organizations to fight against the effects of deforestation by supporting these initiatives. The study also urges companies to support technological services, and follow the behavior of Ecosia.org in order to positively influence user satisfaction by using ecological marketing strategies.

The HMMER Web Server for Protein Sequence Similarity Search.

Science.gov (United States)

Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

2017-12-08

Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.
LoyalTracker: Visualizing Loyalty Dynamics in Search Engines.

Science.gov (United States)

Shi, Conglei; Wu, Yingcai; Liu, Shixia; Zhou, Hong; Qu, Huamin

2014-12-01

The huge amount of user log data collected by search engine providers creates new opportunities to understand user loyalty and defection behavior at an unprecedented scale. However, this also poses a great challenge to analyze the behavior and glean insights into the complex, large data. In this paper, we introduce LoyalTracker, a visual analytics system to track user loyalty and switching behavior towards multiple search engines from the vast amount of user log data. We propose a new interactive visualization technique (flow view) based on a flow metaphor, which conveys a proper visual summary of the dynamics of user loyalty of thousands of users over time. Two other visualization techniques, a density map and a word cloud, are integrated to enable analysts to gain further insights into the patterns identified by the flow view. Case studies and the interview with domain experts are conducted to demonstrate the usefulness of our technique in understanding user loyalty and switching behavior in search engines.
The internet and intelligent machines: search engines, agents and robots; Radiologische Informationssuche im Internet: Datenbanken, Suchmaschinen und intelligente Agenten

Energy Technology Data Exchange (ETDEWEB)

Achenbach, S; Alfke, H [Marburg Univ. (Germany). Abt. fuer Strahlendiagnostik

2000-04-01

The internet plays an important role in a growing number of medical applications. Finding relevant information is not always easy as the amount of available information on the Web is rising quickly. Even the best Search Engines can only collect links to a fraction of all existing Web pages. In addition, many of these indexed documents have been changed or deleted. The vast majority of information on the Web is not searchable with conventional methods. New search strategies, technologies and standards are combined in Intelligent Search Agents (ISA) and Robots, which can retrieve desired information in a specific approach. Conclusion: The article describes differences between ISAs and conventional Search Engines and how communication between Agents improves their ability to find information. Examples of existing ISAs are given and the possible influences on the current and future work in radiology are discussed. (orig.) [German abstract, translated] The Internet is increasingly used in medical applications, but finding relevant information is not always easy. The number of documents available on the World Wide Web is growing so quickly that searching poses increasing problems: even good search engines capture only a few percent of the existing pages in their databases. In addition, constant changes mean that only a portion of these searchable documents still exists at all. Most of the Internet therefore cannot be accessed with conventional methods. New standards, search strategies and technologies come together in search agents and robots, which can identify content in a more targeted and intelligent way. Conclusion: The article describes how an intelligent search agent (ISA) differs from a search engine and how, by cooperating with other agents, it can better meet users' requirements. In addition to the fundamentals, exemplary applications that exist on the Web today are shown, and an outlook
Combining results of multiple search engines in proteomics.

Science.gov (United States)

Shteynberg, David; Nesvizhskii, Alexey I; Moritz, Robert L; Deutsch, Eric W

2013-09-01

A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.
Combining Results of Multiple Search Engines in Proteomics

Science.gov (United States)

Shteynberg, David; Nesvizhskii, Alexey I.; Moritz, Robert L.; Deutsch, Eric W.

2013-01-01

A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques. PMID:23720762
Keeping Dublin Core Simple: Cross-Domain Discovery or Resource Description?; First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging E-Book Environment; Interoperability: Digital Rights Management and the Emerging EBook Environment; Searching the Deep Web: Direct Query Engine Applications at the Department of Energy.

Science.gov (United States)

Lagoze, Carl; Neylon, Eamonn; Mooney, Stephen; Warnick, Walter L.; Scott, R. L.; Spence, Karen J.; Johnson, Lorrie A.; Allen, Valerie S.; Lederman, Abe

2001-01-01

Includes four articles that discuss Dublin Core metadata, digital rights management and electronic books, including interoperability; and directed query engines, a type of search engine designed to access resources on the deep Web that is being used at the Department of Energy. (LRW)
L1000CDS2: LINCS L1000 characteristic direction signatures search engine.

Science.gov (United States)

Duan, Qiaonan; Reid, St Patrick; Clark, Neil R; Wang, Zichen; Fernandez, Nicolas F; Rouillard, Andrew D; Readhead, Ben; Tritsch, Sarah R; Hodos, Rachel; Hafner, Marc; Niepel, Mario; Sorger, Peter K; Dudley, Joel T; Bavari, Sina; Panchal, Rekha G; Ma'ayan, Avi

2016-01-01

The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of
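The abstract above mentions cosine similarity between signature vectors as the basis for target prediction. A minimal sketch of that measure in Python follows, assuming each signature is a plain list of per-gene values in a shared gene order. The function name and the toy vectors are hypothetical and are not taken from the tool itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must share the same gene order and length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)

# Toy signatures (illustrative values only, not real expression data)
drug = [1.0, -2.0, 0.5]
mimic = [2.0, -4.0, 1.0]       # same direction as `drug`
reverser = [-1.0, 2.0, -0.5]   # opposite direction to `drug`

print(cosine_similarity(drug, mimic))
print(cosine_similarity(drug, reverser))
```

Under this measure a value near 1 would suggest a signature that mimics the input, and a value near -1 one that reverses it.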
Interactive Web-based e-learning for Studying Flexible Manipulator Systems

Directory of Open Access Journals (Sweden)

Abul K. M. Azad

2008-03-01

This paper presents a web-based e-learning facility for simulation, modeling, and control of flexible manipulator systems. The simulation and modeling part includes finite difference and finite element simulations along with neural network and genetic algorithm based modeling strategies for flexible manipulator systems. The controller part comprises a number of open-loop and closed-loop designs. Closed-loop control designs include the classical, adaptive, and neuro-model based strategies. The Matlab software package and its associated toolboxes are used to implement these. The Matlab web server is used as the gateway between the facility and web access. ASP.NET technology and an SQL database are used to develop web applications for access control, user account and password maintenance, administrative management, and facility utilization monitoring. The reported facility provides a flexible but effective approach to web-based interactive e-learning for an engineering system. This can be extended to incorporate additional engineering systems within the e-learning framework.
Web-Based Antismoking Advertising to Promote Smoking Cessation: A Randomized Controlled Trial.

Science.gov (United States)

Yom-Tov, Elad; Muennig, Peter; El-Sayed, Abdulrahman M

2016-11-21

Although hundreds of millions of dollars are spent each year on public health advertising, the advertisement content, design, and placement are usually developed by intuition rather than research. The objective of our study was to develop a methodology for testing Web-based advertisements to promote smoking cessation. We developed 10 advertisements that varied by their content (those that empower viewers to quit, help viewers to quit, or discuss the effects of smoking). We then conducted a series of Web-based randomized controlled trials that explored the effects of exposing users of Microsoft's Bing search engine to antismoking advertisements that differed by content, placement, or other characteristics. Finally, we followed users to explore whether they conducted subsequent searches for smoking cessation products or services. The advertisements were shown 710,106 times and clicked on 1167 times. In general, empowering advertisements had the greatest impact (hazard ratio [HR] 2.6, standard error [SE] 0.09 relative to nonempowering advertisements), but we observed significant variations by gender. For instance, we found that men exposed to smoking cessation advertisements were less likely than women to subsequently conduct smoking cessation searches (HR 0.2, SE 0.07), but that this likelihood increased 3.5 times in men exposed to advertisements containing empowering content. Women were more influenced by advertisements that emphasized the health effects of smoking. We also found that appearing at the top right of the page (HR 2.1, SE 0.07) or at the bottom rather than the top of a list (HR 1.1, SE 0.02) can improve smoking cessation advertisements' effectiveness in prompting future searches related to smoking cessation. Advertising should be targeted to different demographic groups in ways that are not always intuitive. Our study provides a method for testing the effectiveness of Web-based antismoking advertisements and demonstrates the importance of advertisements
Ontology-based Semantic Search Engine for Healthcare Services

OpenAIRE

Jotsna Molly Rajan; M. Deepa Lakshmi

2012-01-01

With the development of Web Services, the retrieval of relevant services has become a challenge. The keyword-based discovery mechanism using UDDI and WSDL is insufficient due to the retrieval of a large amount of irrelevant information. Also, keywords are insufficient in expressing semantic concepts since a single concept can be referred to using syntactically different terms. Hence, service capabilities need to be manually analyzed, which led to the development of the Semantic Web for automatic...
WebBio, a web-based management and analysis system for patient data of biological products in hospital.

Science.gov (United States)

Lu, Ying-Hao; Kuo, Chen-Chun; Huang, Yaw-Bin

2011-08-01

We selected HTML, PHP and JavaScript as the programming languages to build "WebBio", a web-based system for patient data of biological products, and used MySQL as the database. WebBio is based on the PHP-MySQL suite and is run by an Apache server on a Linux machine. WebBio provides data management, search and data analysis functions for 20 kinds of biological products (plasma expanders, human immunoglobulin and hematological products). There are two particular features in WebBio: (1) pharmacists can rapidly find out which patients used contaminated products, for medication safety, and (2) the statistics charts for a specific product can be automatically generated to reduce pharmacists' workload. WebBio has successfully turned traditional paperwork into web-based data management.
An approach in building a chemical compound search engine in oracle database.

Science.gov (United States)

Wang, H; Volarath, P; Harrison, R

2005-01-01

Searching for or identifying chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is devoted to eliminating data type constraints for potential search algorithms, which is a crucial step toward building a domain specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.
Multiple Presents: How Search Engines Re-write the Past

NARCIS (Netherlands)

Hellsten, I; Leydesdorff, L.; Wouters, P.

2006-01-01

Internet search engines function in a present which changes continuously. The search engines update their indices regularly, overwriting webpages with newer ones, adding new pages to the index and losing older ones. Some search engines can be used to search for information on the internet for
Using Internet search engines to estimate word frequency.

Science.gov (United States)

Blair, Irene V; Urland, Geoffrey R; Ma, Jennifer E

2002-05-01

The present research investigated Internet search engines as a rapid, cost-effective alternative for estimating word frequencies. Frequency estimates for 382 words were obtained and compared across four methods: (1) Internet search engines, (2) the Kucera and Francis (1967) analysis of a traditional linguistic corpus, (3) the CELEX English linguistic database (Baayen, Piepenbrock, & Gulikers, 1995), and (4) participant ratings of familiarity. The results showed that Internet search engines produced frequency estimates that were highly consistent with those reported by Kucera and Francis and those calculated from CELEX, highly consistent across search engines, and very reliable over a 6-month period of time. Additional results suggested that Internet search engines are an excellent option when traditional word frequency analyses do not contain the necessary data (e.g., estimates for forenames and slang). In contrast, participants' familiarity judgments did not correspond well with the more objective estimates of word frequency. Researchers are advised to use search engines with large databases (e.g., AltaVista) to ensure the greatest representativeness of the frequency estimates.
Search Techniques for the Web of Things: A Taxonomy and Survey

Science.gov (United States)

Zhou, Yuchao; De, Suparna; Wang, Wei; Moessner, Klaus

2016-01-01

The Web of Things aims to make physical world objects and their data accessible through standard Web technologies to enable intelligent applications and sophisticated data analytics. Due to the amount and heterogeneity of the data, it is challenging to perform data analysis directly; especially when the data is captured from a large number of distributed sources. However, the size and scope of the data can be reduced and narrowed down with search techniques, so that only the most relevant and useful data items are selected according to the application requirements. Search is fundamental to the Web of Things while challenging by nature in this context, e.g., mobility of the objects, opportunistic presence and sensing, continuous data streams with changing spatial and temporal properties, efficient indexing for historical and real time data. The research community has developed numerous techniques and methods to tackle these problems as reported by a large body of literature in the last few years. A comprehensive investigation of the current and past studies is necessary to gain a clear view of the research landscape and to identify promising future directions. This survey reviews the state-of-the-art search methods for the Web of Things, which are classified according to three different viewpoints: basic principles, data/knowledge representation, and contents being searched. Experiences and lessons learned from the existing work and some EU research projects related to Web of Things are discussed, and an outlook to the future research is presented. PMID:27128918
People searching for people: analysis of a people search engine log

NARCIS (Netherlands)

Weerkamp, W.; Berendsen, R.; Kovachev, B.; Meij, E.; Balog, K.; de Rijke, M.

2011-01-01

Recent years show an increasing interest in vertical search: searching within a particular type of information. Understanding what people search for in these "verticals" gives direction to research and provides pointers for the search engines themselves. In this paper we analyze the search logs of
An Exploratory Survey of Student Perspectives Regarding Search Engines

Science.gov (United States)

Alshare, Khaled; Miller, Don; Wenger, James

2005-01-01

This study explored college students' perceptions regarding their use of search engines. The main objective was to determine how frequently students used various search engines, whether advanced search features were used, and how many search engines were used. Various factors that might influence student responses were examined. Results showed…
Evaluating Open-Source Full-Text Search Engines for Matching ICD-10 Codes.

Science.gov (United States)

Jurcău, Daniel-Alexandru; Stoicu-Tivadar, Vasile

2016-01-01

This research presents the results of evaluating multiple free, open-source engines on matching ICD-10 diagnostic codes via full-text searches. The study investigates what it takes to get an accurate match when searching for a specific diagnostic code. For each code the evaluation starts by extracting the words that make up its text and continues with building full-text search queries from the combinations of these words. The queries are then run against all the ICD-10 codes until the code in question is returned as the match with the highest relative score. This method identifies the minimum number of words that must be provided in order for the search engines to choose the desired entry. The engines analyzed include a popular Java-based full-text search engine, a lightweight engine written in JavaScript which can even execute on the user's browser, and two popular open-source relational database management systems.
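The evaluation procedure described above (build queries from word combinations, run them against all entries, stop when the target ranks first) can be sketched roughly as follows. This is a toy illustration only: the scoring function is a simple word-overlap count standing in for a real full-text engine, and the code keys and descriptions are hypothetical placeholders, not actual ICD-10 entries.

```python
from itertools import combinations

def score(query_words, description):
    """Toy relevance score: number of query words present in the description."""
    doc_words = set(description.lower().split())
    return sum(1 for w in query_words if w in doc_words)

def min_words_to_match(target_code, codes):
    """Find the smallest word combination that ranks the target strictly first."""
    words = sorted(set(codes[target_code].lower().split()))
    for k in range(1, len(words) + 1):
        for query in combinations(words, k):
            scores = {code: score(query, text) for code, text in codes.items()}
            best = max(scores.values())
            winners = [c for c, s in scores.items() if s == best]
            if winners == [target_code]:
                return k, query
    return None  # no combination singles out the target

# Hypothetical catalogue (placeholder keys and texts)
codes = {
    "X1": "acute bronchitis",
    "X2": "chronic bronchitis",
    "X3": "acute chronic sinusitis",
}

print(min_words_to_match("X1", codes))
print(min_words_to_match("X3", codes))
```

In this toy catalogue, "X1" needs both of its words because each one alone ties with another entry, while "X3" is identified by a single distinctive word.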
Effects of Diacritics on Web Search Engines’ Performance for Retrieval of Yoruba Documents

Directory of Open Access Journals (Sweden)

Toluwase Victor Asubiaro

2014-06-01

This paper aims to find out the possible effect of the use or nonuse of diacritics in Yoruba search queries on the performance of major search engines, AOL, Bing, Google and Yahoo!, in retrieving documents. 30 Yoruba queries created from the most searched keywords from Nigeria on Google search logs were submitted to the search engines. The search queries were posed to the search engines without diacritics and then with diacritics. All of the search engines retrieved more sites in response to the queries without diacritics. Also, they all retrieved more precise results for queries without diacritics. The search engines also answered more queries without diacritics. There was no significant difference in the precision values of any two of the four search engines for diacritized and undiacritized queries. There was a significant difference in the effectiveness of AOL and Yahoo when diacritics were applied and when they were not applied. The findings of the study indicate that the search engines do not find a relationship between the diacritized Yoruba words and the undiacritized versions. Therefore, there is a need for search engines to add normalization steps to pre-process Yoruba queries and indexes. This study concentrates on a problem with search engines that has not been previously investigated.
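One possible form of the normalization step called for above is to fold diacritized and undiacritized spellings onto the same key before indexing and querying. The sketch below uses Python's standard `unicodedata` module. The function name is hypothetical, and the abstract does not say which normalization the authors have in mind, so this is an assumption rather than their method.

```python
import unicodedata

def strip_diacritics(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks
    # such as tone accents and the dot below.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

# A diacritized and an undiacritized spelling map to the same index key.
print(strip_diacritics("Yorùbá") == strip_diacritics("Yoruba"))
```

Folding discards tone and vowel-quality marks that distinguish different Yoruba words, so a real system might use the folded form only as a fallback key alongside the exact form.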
Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

Science.gov (United States)

Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

2017-09-13

Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

BioTCM-SE: a semantic search engine for the information retrieval of modern biology and traditional Chinese medicine.

Science.gov (United States)

Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui

2014-01-01

Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.
BioTCM-SE: A Semantic Search Engine for the Information Retrieval of Modern Biology and Traditional Chinese Medicine

Directory of Open Access Journals (Sweden)

Xi Chen

2014-01-01

Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.
Quality of web-based information on social phobia: a cross-sectional study.

Science.gov (United States)

Khazaal, Yasser; Fernandez, Sebastien; Cochand, Sophie; Reboh, Isabel; Zullino, Daniele

2008-01-01

The objective of the study is to evaluate the quality of web-based information on social phobia and to investigate particular quality indicators. Two keywords, "Social phobia" and "Social Anxiety Disorder", were entered into five popular World Wide Web search engines. Websites were assessed with a standardized proforma designed to rate sites on the basis of accountability, presentation, interactivity, readability, and content quality. The "Health On the Net" (HON) quality label and DISCERN scale scores, which help people without content expertise to assess the quality of written health publications, were used to verify their efficiency as quality indicators. This study evaluates the quality of web-based information on social phobia. Of the 200 identified links, 58 were included. On the basis of outcome measures, the overall quality of the sites turned out to be poor. DISCERN and the HON label were indicators of good quality. Accountability criteria were poor indicators of site quality. Although social phobia education Websites for patients are common, educational material varies highly in quality and content. There is a need for better evidence-based information about social phobia on the Web and a need to reconsider the role of accountability criteria as indicators of site quality. Clinicians should advise patients of the HON label and DISCERN as useful indicators of site quality. (c) 2007 Wiley-Liss, Inc.
The “I’m Feeling Lucky Syndrome”: Teacher-Candidates’ Knowledge of Web Searching Strategies

Directory of Open Access Journals (Sweden)

Corinne Laverty

2008-06-01

The need for web literacy has become increasingly important with the exponential growth of learning materials on the web that are freely accessible to educators. Teachers need the skills to locate these tools and also the ability to teach their students web search strategies and evaluation of websites so they can effectively explore the web by themselves. This study examined the web searching strategies of 253 teachers-in-training using both a survey (247 participants) and live screen capture with think-aloud audio recording (6 participants). The results present a picture of the strategic, syntactic, and evaluative search abilities of these students that librarians and faculty can use to plan how instruction can target information skill deficits in university student populations.
RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures

Directory of Open Access Journals (Sweden)

Wasik Szymon

2010-05-01

Background: Recent discoveries concerning novel functions of RNA, such as RNA interference, have contributed towards the growing importance of the field. In this respect, a deeper knowledge of complex three-dimensional RNA structures is essential to understand their new biological functions. A number of bioinformatic tools have been proposed to explore two major structural databases (PDB, NDB) in order to analyze various aspects of RNA tertiary structures. One of these tools is RNA FRABASE 1.0, the first web-accessible database with an engine for automatic search of 3D fragments within PDB-derived RNA structures. This search is based upon the user-defined RNA secondary structure pattern. In this paper, we present and discuss RNA FRABASE 2.0. This second version of the system represents a major extension of this tool in terms of providing new data and a wide spectrum of novel functionalities. An intuitively operated web server platform enables very fast user-tailored search of three-dimensional RNA fragments, their multi-parameter conformational analysis and visualization. Description: RNA FRABASE 2.0 has stored information on 1565 PDB-deposited RNA structures, including all NMR models. The RNA FRABASE 2.0 search engine algorithms operate on the database of the RNA sequences and the new library of RNA secondary structures, coded in the dot-bracket format extended to hold multi-stranded structures and to cover residues whose coordinates are missing in the PDB files. The library of RNA secondary structures (and their graphics) is made available. A high level of efficiency of the 3D search has been achieved by introducing novel tools to formulate advanced searching patterns and to screen highly populated tertiary structure elements. RNA FRABASE 2.0 also stores data and conformational parameters in order to provide "on the spot" structural filters to explore the three-dimensional RNA structures. An instant visualization of the 3D RNA
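The record above refers to secondary structures coded in dot-bracket format. As a rough illustration of the basic notation only, the sketch below converts a plain dot-bracket string into base-pair index pairs. It does not model the extensions the abstract mentions (multi-stranded structures, residues with missing coordinates), and the function name is hypothetical.

```python
def parse_dot_bracket(structure):
    """Return sorted (i, j) index pairs for matched parentheses; dots are unpaired."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unsupported symbol {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)

# A small hairpin: two stacked base pairs closing a two-residue loop.
print(parse_dot_bracket("((..))"))
```

A pattern search over secondary structures could then compare such pair lists, although how the database itself matches patterns is not described in the abstract.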
Surveillance Tools Emerging From Search Engines and Social Media Data for Determining Eye Disease Patterns.

Science.gov (United States)

Deiner, Michael S; Lietman, Thomas M; McLeod, Stephen D; Chodosh, James; Porco, Travis C

2016-09-01

Internet-based search engine and social media data may provide a novel complementary source for better understanding the epidemiologic factors of infectious eye diseases, which could better inform eye health care and disease prevention. To assess whether data from internet-based social media and search engines are associated with objective clinic-based diagnoses of conjunctivitis. Data from encounters of 4143 patients diagnosed with conjunctivitis from June 3, 2012, to April 26, 2014, at the University of California San Francisco (UCSF) Medical Center, were analyzed using Spearman rank correlation of each weekly observation to compare demographics and seasonality of nonallergic conjunctivitis with allergic conjunctivitis. Data for patient encounters with diagnoses for glaucoma and influenza were also obtained for the same period and compared with conjunctivitis. Temporal patterns of Twitter and Google web search data, geolocated to the United States and associated with these clinical diagnoses, were compared with the clinical encounters. The a priori hypothesis was that weekly internet-based searches and social media posts about conjunctivitis may reflect the true weekly clinical occurrence of conjunctivitis. Weekly total clinical diagnoses at UCSF of nonallergic conjunctivitis, allergic conjunctivitis, glaucoma, and influenza were compared using Spearman rank correlation with equivalent weekly data on Tweets related to disease or disease-related keyword searches obtained from Google Trends. Seasonality of clinical diagnoses of nonallergic conjunctivitis among the 4143 patients (2364 females [57.1%] and 1776 males [42.9%]) with 5816 conjunctivitis encounters at UCSF correlated strongly with results of Google searches in the United States for the term pink eye (ρ, 0.68 [95% CI, 0.52 to 0.78]; P < .001) and correlated moderately with Twitter results about pink eye (ρ, 0.38 [95% CI, 0.16 to 0.56]; P < .001) and with clinical diagnosis of influenza (ρ, 0
Users' Understanding of Search Engine Advertisements

Directory of Open Access Journals (Sweden)

Lewandowski, Dirk

2017-12-01

In this paper, a large-scale study on users' understanding of search-based advertising is presented. It is based on (1) a survey, (2) a task-based user study, and (3) an online experiment. Data were collected from 1,000 users representative of the German online population. Findings show that users generally lack an understanding of Google's business model and the workings of search-based advertising. 42% of users self-report that they either do not know that it is possible to pay Google for preferred listings for one's company on the SERPs or do not know how to distinguish between organic results and ads. In the task-based user study, we found that only 1.3 percent of participants were able to mark all areas correctly. 9.6 percent had all their identifications correct but did not mark all results they were required to mark. For none of the screenshots given were more than 35% of users able to mark all areas correctly. In the experiment, we found that users who are not able to distinguish between the two results types choose ads around twice as often as users who can recognize the ads. The implications are that models of search engine advertising and of information seeking need to be amended, and that there is a severe need for regulating search-based advertising.
Online information for parents caring for their premature baby at home: A focus group study and systematic web search.

Science.gov (United States)

Alderdice, Fiona; Gargan, Phyl; McCall, Emma; Franck, Linda

2018-01-30

Online resources are a source of information for parents of premature babies when their baby is discharged from hospital. To explore what topics parents deemed important after returning home from hospital with their premature baby and to evaluate the quality of existing websites that provide information for parents post-discharge. In stage 1, 23 parents living in Northern Ireland participated in three focus groups and shared their information and support needs following the discharge of their infant(s). In stage 2, a World Wide Web (WWW) search was conducted using Google, Yahoo and Bing search engines. Websites meeting pre-specified inclusion criteria were reviewed using two website assessment tools and by calculating a readability score. Website content was compared to the topics identified by parents in the focus groups. Five overarching topics were identified across the three focus groups: life at home after neonatal care, taking care of our family, taking care of our premature baby, baby's growth and development and help with getting support and advice. Twenty-nine sites were identified that met the systematic web search inclusion criteria. Fifteen (52%) covered all five topics identified by parents to some extent and 9 (31%) provided current, accurate and relevant information based on the assessment criteria. Parents reported the need for information and support post-discharge from hospital. This was not always available to them, and relevant online resources were of varying quality. Listening to parents' needs and preferences can facilitate the development of high-quality, evidence-based, parent-centred resources. © 2018 The Authors Health Expectations published by John Wiley & Sons Ltd.
Síntesis y crítica de las evaluaciones de la efectividad de los motores de búsqueda en la Web. (Synthesis and critical review of evaluations of the effectiveness of Web search engines)

Directory of Open Access Journals (Sweden)

Francisco Javier Martínez Méndez

2003-01-01

Full Text Available A considerable number of proposals for measuring the effectiveness of information retrieval systems have been made since the early days of such systems. The consolidation of the World Wide Web as the paradigmatic method for developing the Information Society, and the continuous multiplication of the number of documents published in this environment, have led to the implementation of the most advanced and extensive information retrieval systems, in the shape of web search engines. Nevertheless, there is an underlying concern about the effectiveness of these systems, especially when they usually present, in response to a question, many documents with little relevance to the users' information needs. The evaluation of these systems has been, up to now, dispersed and varied. The scattering is due to the lack of uniformity in the criteria used in evaluation, and this disparity derives from their aperiodicity and variable coverage. In this review, we identify three groups of studies: explicit evaluations, experimental evaluations and, more recently, several proposals for the establishment of a global framework to evaluate these systems.
Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine.

Science.gov (United States)

Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai

2017-03-01

The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.
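The idea of suggesting semantically interchangeable terms for a user-entered query can be illustrated with a minimal sketch. The abstract says the actual system relies on MetaMap and UMLS; the toy `SYNONYMS` table and the `recommend_terms` function below are hypothetical stand-ins introduced only for illustration, not the authors' algorithm.

```python
from typing import Dict, List

# Hypothetical, hand-written synonym table (an assumption for this sketch);
# the system described above derives its variants from UMLS and search logs.
SYNONYMS: Dict[str, List[str]] = {
    "heart attack": ["myocardial infarction", "mi"],
    "high blood pressure": ["hypertension"],
}


def recommend_terms(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> List[str]:
    """Return interchangeable terms for every known concept found in the query."""
    lowered = query.lower()
    suggestions: List[str] = []
    for concept, variants in synonyms.items():
        if concept in lowered:  # naive substring match stands in for concept detection
            for variant in variants:
                if variant not in suggestions:
                    suggestions.append(variant)
    return suggestions


if __name__ == "__main__":
    print(recommend_terms("History of heart attack"))
```

A real concept extractor would handle word boundaries, abbreviations and negation; the substring match here is only the simplest possible placeholder.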
Web の探索行動と情報評価過程の分析 (Analysis of Web search behavior and the information evaluation process)

OpenAIRE

種市, 淳子; 逸村, 裕; TANEICHI, Junko; ITSUMURA, Hiroshi

2005-01-01

In this study, we discussed information seeking behavior on the Web. First, the current Web-searching studies are reviewed from the perspective of: (1) Web-searching characteristics; (2) the process model for how users evaluate Web resources. Secondly, we investigated information seeking processes using the Web search engine and online public access catalogue (OPAC) system by undergraduate students, through an experiment and its protocol analysis. The results indicate that: (1) Web-searching p...
The LAILAPS search engine: a feature model for relevance ranking in life science databases.

Science.gov (United States)

Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

2010-03-25

Efficient and effective information retrieval in life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases into a vast network of interconnected information systems is to the same extent a big challenge and a great chance for life science research. The knowledge found in the Web, in particular in life-science databases, is a valuable major resource. In order to bring it to the scientist's desktop, it is essential to have well-performing search engines. Here, neither the response time nor the number of results is the decisive factor. The most crucial factor for millions of query results is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by the observation of user behavior during their inspection of search engine results, we condensed a set of 9 relevance discriminating features. These features are intuitively used by scientists, who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance, and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that have been trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
On development of search engine for geodata

Directory of Open Access Journals (Sweden)

David Procházka

2010-01-01

Full Text Available Effective management and sharing of geodata is one of the priorities of the European Union (INSPIRE activity) and of companies all around the world. Many different companies and organisations publish their geodata using web mapping services. This situation leads to multiple publishing of similar or identical geodata. On the other hand, it is frequently a problem to determine an appropriate mapserver with the required data. This paper presents a geodata search engine that addresses the problem of how to access geodata more effectively. The presented solution aggregates data from different mapservers and provides an interface according to the Open Geospatial Consortium Web Map Server specification. This allows our solution to be used in standard GIS tools as a common mapserver. A completely new feature is a request that selects map layers which fulfill specified criteria. The selection can be given by keywords in a map layer description and by defining a bounding box on the Earth's surface. The response is a list of appropriate layers sorted according to their relevance. The presented solution could be, among other applications, a significant source of information for many data mining techniques. It allows processed data to be interconnected with their spatio-temporal context.
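The layer-selection request described above (keywords in the layer description plus a bounding box, with results sorted by relevance) can be sketched roughly as follows. The `Layer` record, the `search_layers` function and the keyword-count relevance score are assumptions made for this illustration; the abstract does not specify how relevance is actually computed.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat); this layout is an assumption.
BBox = Tuple[float, float, float, float]


@dataclass
class Layer:
    """Hypothetical record for one map layer harvested from a mapserver."""
    name: str
    description: str
    bbox: BBox


def intersects(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_layers(layers: List[Layer], keywords: List[str], bbox: BBox) -> List[Layer]:
    """Return layers inside the box that match a keyword, most matches first."""
    hits = []
    for layer in layers:
        if not intersects(layer.bbox, bbox):
            continue
        text = layer.description.lower()
        # Relevance here is simply the number of keywords found in the description.
        score = sum(1 for kw in keywords if kw.lower() in text)
        if score > 0:
            hits.append((score, layer))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [layer for _, layer in hits]


if __name__ == "__main__":
    catalogue = [
        Layer("roads", "Road network of Brno", (16.4, 49.1, 16.8, 49.3)),
        Layer("rivers", "Rivers and road bridges in Brno", (16.4, 49.1, 16.8, 49.3)),
        Layer("far", "Road network", (0.0, 0.0, 1.0, 1.0)),
    ]
    for layer in search_layers(catalogue, ["road", "bridges"], (16.5, 49.0, 16.7, 49.2)):
        print(layer.name)
```

Counting keyword hits keeps the example short; a production service would more plausibly weight terms and take the degree of spatial overlap into account.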
Search Techniques for the Web of Things: A Taxonomy and Survey

Directory of Open Access Journals (Sweden)

Yuchao Zhou

2016-04-01

Full Text Available The Web of Things aims to make physical world objects and their data accessible through standard Web technologies to enable intelligent applications and sophisticated data analytics. Due to the amount and heterogeneity of the data, it is challenging to perform data analysis directly; especially when the data is captured from a large number of distributed sources. However, the size and scope of the data can be reduced and narrowed down with search techniques, so that only the most relevant and useful data items are selected according to the application requirements. Search is fundamental to the Web of Things while challenging by nature in this context, e.g., mobility of the objects, opportunistic presence and sensing, continuous data streams with changing spatial and temporal properties, efficient indexing for historical and real time data. The research community has developed numerous techniques and methods to tackle these problems as reported by a large body of literature in the last few years. A comprehensive investigation of the current and past studies is necessary to gain a clear view of the research landscape and to identify promising future directions. This survey reviews the state-of-the-art search methods for the Web of Things, which are classified according to three different viewpoints: basic principles, data/knowledge representation, and contents being searched. Experiences and lessons learned from the existing work and some EU research projects related to Web of Things are discussed, and an outlook to the future research is presented.
Search Engine : an effective tool for exploring the Internet

OpenAIRE

Ranasinghe, W.M. Tharanga Dilruk

2006-01-01

The Internet has become the largest source of information. Today, millions of Websites exist and this number continues to grow. Finding the right information at the right time is the challenge in the Internet age. A search engine is a searchable database which allows users to locate information on the Internet by submitting keywords. Search engines can be divided into two categories: individual and meta search engines. This article discusses the features of these search engines in detail.
Millennial Undergraduate Research Strategies in Web and Library Information Retrieval Systems

Science.gov (United States)

Porter, Brandi

2011-01-01

This article summarizes the author's dissertation regarding search strategies of millennial undergraduate students in Web and library online information retrieval systems. Millennials bring a unique set of search characteristics and strategies to their research since they have never known a world without the Web. Through the use of search engines,…
Search Engine Advertising Effectiveness in a Multimedia Campaign

NARCIS (Netherlands)

Zenetti, German; Bijmolt, Tammo H. A.; Leeflang, Peter S. H.; Klapper, Daniel

2014-01-01

Search engine advertising has become a multibillion-dollar business and one of the dominant forms of advertising on the Internet. This study examines the effectiveness of search engine advertising within a multimedia campaign, with explicit consideration of the interaction effects between search
Google Patents: The global patent search engine

OpenAIRE

Noruzi, Alireza; Abdekhoda, Mohammadhiwa

2014-01-01

Google Patents (www.google.com/patents) includes over 8 million full-text patents. Google Patents works in the same way as the Google search engine. Google Patents is the global patent search engine that lets users search through patents from the USPTO (United States Patent and Trademark Office), EPO (European Patent Office), etc. This study begins with an overview of how to use Google Patent and identifies advanced search techniques not well-documented by Google Patent. It makes several sug...
Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines.

Science.gov (United States)

Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W

2009-03-01

LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
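The decoy-database idea mentioned above can be made concrete with a small sketch of a target-decoy FDR estimate for a single search engine. This is not the combined FDR Score developed in the paper; the `fdr_at_threshold` function and the decoys/targets ratio are a commonly used simplification, assumed here for illustration only.

```python
from typing import List, Tuple

# Each identification is (score, is_decoy); higher scores are assumed to be better.
Identification = Tuple[float, bool]


def fdr_at_threshold(ids: List[Identification], threshold: float) -> float:
    """Estimate the FDR among identifications scoring at or above the threshold.

    The estimate is (number of decoy hits) / (number of target hits), one
    common target-decoy convention; other conventions exist.
    """
    targets = sum(1 for score, is_decoy in ids if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in ids if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
    print(fdr_at_threshold(example, 7.0))
```

Lowering the threshold admits more identifications at the price of a higher estimated FDR, which is the trade-off that a fixed-FDR comparison between search engines holds constant.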
Survey of Techniques for Deep Web Source Selection and Surfacing the Hidden Web Content

OpenAIRE

Khushboo Khurana; M.B. Chandak

2016-01-01

Large and continuously growing dynamic web content has created new opportunities for large-scale data analysis in recent years. There is a huge amount of information that traditional web crawlers cannot access, since they use a link analysis technique by which only the surface web can be accessed. Traditional search engine crawlers require web pages to be linked to other pages via hyperlinks, causing a large amount of web data to be hidden from the crawlers. Enormous data is available in...

Multilingual Federated Searching Across Heterogeneous Collections.

Science.gov (United States)

Powell, James; Fox, Edward A.

1998-01-01

Describes a scalable system for searching heterogeneous multilingual collections on the World Wide Web. Details Searchable Database Markup Language (SearchDB-ML) for describing the characteristics of a search engine and its interface, and a protocol for requesting word translations between languages. (Author)
iPixel: a visual content-based and semantic search engine for retrieving digitized mammograms by using collective intelligence.

Science.gov (United States)

Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A

2012-09-01

Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.
Pedagogy for teaching and learning cooperatively on the Web: a Web-based pharmacology course.

Science.gov (United States)

Tse, Mimi M Y; Pun, Sandra P Y; Chan, Moon Fai

2007-02-01

The Internet is becoming a preferred place to find information. Millions of people go online in search of health and medical information. Likewise, the demand for Web-based courses grows. This article presents the development, utilization and evaluation of a web-based pharmacology course for nursing students. The course was developed based on 150 commonly used drugs. A total of 110 year 1 nursing students took part in the course. After attending six hours of face-to-face pharmacology lectures over three weeks, students were invited to complete a questionnaire (pre-test) about learning pharmacology. The course materials were then uploaded to WebCT for students' self-directed learning and for attempts at two scheduled online quizzes. At the end of the semester, students were given the same questionnaire (post-test). There was a significant increase in understanding compared with memorizing the subject content, in the development of problem-solving ability in learning pharmacology and in becoming an independent learner (p < 0.05). Online quizzes yielded satisfactory results. In the focus group interview, students appreciated the time flexibility and convenience associated with web-based learning, and they also made good suggestions for enhancing web-based learning. A web-based approach is promising for teaching and learning pharmacology for nurses and other health-care professionals.
Manually Classifying User Search Queries on an Academic Library Web Site

Science.gov (United States)

Chapman, Suzanne; Desai, Shevon; Hagedorn, Kat; Varnum, Ken; Mishra, Sonali; Piacentine, Julie

2013-01-01

The University of Michigan Library wanted to learn more about the kinds of searches its users were conducting through the "one search" search box on the Library Web site. Library staff conducted two investigations. A preliminary investigation in 2011 involved the manual review of the 100 most frequently occurring queries conducted…
BAIK – PROGRAMMING LANGUAGE BASED ON INDONESIAN LEXICAL PARSING FOR MULTITIER WEB DEVELOPMENT

Directory of Open Access Journals (Sweden)

Haris Hasanudin

2012-05-01

Full Text Available Business software development with global teams is increasing rapidly, and the programming language as a development tool plays an important role in global web development. A truly user-friendly programming language should be written in the local language of programmers whose native language is not English. This paper presents our design of the BAIK (Bahasa Anak Indonesia untuk Komputer) scripting language, whose syntax is modeled on Bahasa Indonesia, for multitier web development. The researchers propose the implementation of an Indonesian parsing engine and a binary search tree structure for memory allocation of variables, and compose language features that support basic object-oriented programming, the Common Gateway Interface, HTML style manipulation and database connection. Our goal is to build a real programming language from a simple structural design for web development using Indonesian lexical words.
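To illustrate how a binary search tree can hold an interpreter's variables, here is a minimal sketch keyed by variable name. The `Node` and `VariableTable` classes are hypothetical and written for this illustration; the abstract does not describe BAIK's actual data layout, and the tree here is unbalanced.

```python
from typing import Any, Optional


class Node:
    """One variable binding stored as a node of the binary search tree."""

    def __init__(self, name: str, value: Any) -> None:
        self.name = name
        self.value = value
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


class VariableTable:
    """Unbalanced binary search tree mapping variable names to values."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def set(self, name: str, value: Any) -> None:
        """Insert a new variable or overwrite an existing one."""
        if self.root is None:
            self.root = Node(name, value)
            return
        node = self.root
        while True:
            if name == node.name:
                node.value = value
                return
            branch = "left" if name < node.name else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, Node(name, value))
                return
            node = child

    def get(self, name: str) -> Any:
        """Look up a variable, raising KeyError when it is undefined."""
        node = self.root
        while node is not None:
            if name == node.name:
                return node.value
            node = node.left if name < node.name else node.right
        raise KeyError(name)


if __name__ == "__main__":
    table = VariableTable()
    table.set("nilai", 1)
    table.set("angka", 2)
    table.set("nilai", 5)  # reassignment overwrites the stored value
    print(table.get("nilai"), table.get("angka"))
```

Lookups cost time proportional to the tree height, so a table like this performs well only while insertions arrive in a reasonably mixed order.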
Specification framework for engineering adaptive web applications

NARCIS (Netherlands)

Frasincar, F.; Houben, G.J.P.M.; Vdovják, R.

2002-01-01

The growing demand for data-driven Web applications has led to the need for a structured and controlled approach to the engineering of such applications. Both designers and developers need a framework that in all stages of the engineering process allows them to specify the relevant aspects of the
QUEST: An Assessment Tool for Web-Based Learning.

Science.gov (United States)

Choren, Ricardo; Blois, Marcelo; Fuks, Hugo

In 1997, the Software Engineering Laboratory at Pontifical Catholic University of Rio de Janeiro (Brazil) implemented the first version of AulaNet (TM) a World Wide Web-based educational environment. Some of the teaching staff will use this environment in 1998 to offer regular term disciplines through the Web. This paper introduces Quest, a tool…
Development and implementation of an institutional repository within a Science, Engineering and Technology (SET) environment

CSIR Research Space (South Africa)

Van der Merwe, Adèle

2008-10-01

Full Text Available -based searches. The scholarly federated search engine of Google (http://scholar.google.com) has been used extensively but not exclusively. Subscription databases such as ISI’s Web of Knowledge were also used. An analysis of the exiting proprietary database... internal controls to prevent unauthorized changes. • Registration of the IR with search engines and service providers such as Google, OAIster and DOAR demands that the IR manager keep abreast with developments in terms of suitable search engines...
Human Flesh Search Engine and Online Privacy.

Science.gov (United States)

Zhang, Yang; Gao, Hong

2016-04-01

The human flesh search engine can be a double-edged sword, bringing convenience on the one hand and leading to infringement of personal privacy on the other. This paper discusses the ethical problems brought about by the human flesh search engine, as well as possible solutions.
Conceptual Web Users' Actions Prediction for Ontology-Based Browsing Recommendations

Science.gov (United States)

Robal, Tarmo; Kalja, Ahto

The Internet consists of thousands of web sites with different kinds of structures. However, users are browsing the web according to their informational expectations towards the web site searched, having an implicit conceptual model of the domain in their minds. Nevertheless, people tend to repeat themselves and have partially shared conceptual views while surfing the web, finding some areas of web sites more interesting than others. Herein, we take advantage of the latter and provide a model and a study on predicting users' actions based on the web ontology concepts and their relations.
A cognitive evaluation of four online search engines for answering definitional questions posed by physicians.

Science.gov (United States)

Yu, Hong; Kaufman, David

2007-01-01

The Internet is having a profound impact on physicians' medical decision making. One recent survey of 277 physicians showed that 72% of physicians regularly used the Internet to research medical information and 51% admitted that information from web sites influenced their clinical decisions. This paper describes the first cognitive evaluation of four state-of-the-art Internet search engines: Google (i.e., Google and Scholar.Google), MedQA, Onelook, and PubMed for answering definitional questions (i.e., questions with the format of "What is X?") posed by physicians. Onelook is a portal for online definitions, and MedQA is a question answering system that automatically generates short texts to answer specific biomedical questions. Our evaluation criteria include quality of answer, ease of use, time spent, and number of actions taken. Our results show that MedQA outperforms Onelook and PubMed in most of the criteria, and that MedQA surpasses Google in time spent and number of actions, two important efficiency criteria. Our results show that Google is the best system for quality of answer and ease of use. We conclude that Google is an effective search engine for medical definitions, and that MedQA exceeds the other search engines in that it provides users direct answers to their questions; while the users of the other search engines have to visit several sites before finding all of the pertinent information.
Understanding and modeling users of modern search engines

NARCIS (Netherlands)

Chuklin, A.

2017-01-01

As search is being used by billions of people, modern search engines are becoming more and more complex. And complexity does not just come from the algorithms. Richer and richer content is being added to search engine result pages: news and sports results, definitions and translations, images and
A reverse engineering approach for automatic annotation of Web pages

NARCIS (Netherlands)

R. de Virgilio (Roberto); F. Frasincar (Flavius); W. Hop (Walter); S. Lachner (Stephan)

2013-01-01

The Semantic Web is gaining increasing interest to fulfill the need of sharing, retrieving, and reusing information. Since Web pages are designed to be read by people, not machines, searching and reusing information on the Web is a difficult task without human participation. To this aim
A Web-based modeling tool for the SEMAT Essence theory of software engineering

Directory of Open Access Journals (Sweden)

Daniel Graziotin

2013-09-01

Full Text Available As opposed to more mature subjects, software engineering lacks general theories that establish its foundations as a discipline. The Essence Theory of software engineering (Essence) has been proposed by the Software Engineering Methods and Theory (SEMAT) initiative. The goal of Essence is to develop a theoretically sound basis for software engineering practice and its wide adoption. However, Essence is far from reaching academic- and industry-wide adoption. The reasons for this include a struggle to foresee its utilization potential and a lack of tools for implementation. SEMAT Accelerator (SematAcc) is a Web-positioning tool for a software engineering endeavor, which implements the SEMAT’s Essence kernel. SematAcc permits the use of Essence, thus helping to understand it. The tool enables the teaching, adoption, and research of Essence in controlled experiments and case studies.
Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea.

Science.gov (United States)

Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

2016-07-04

As suggested as early as 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P [...] search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data.
A web-based approach to data imputation

KAUST Repository

Li, Zhixu

2013-10-24

In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, for improving the accuracy and efficiency of WebPut. Moreover, several optimization techniques are also proposed to reduce the cost of estimating the confidence of imputation queries at both the tuple-level and the database-level. Experiments based on several real-world data collections demonstrate not only the effectiveness of WebPut compared to existing approaches, but also the efficiency of our proposed algorithms and optimization techniques. © 2013 Springer Science+Business Media New York.
Search, Read and Write: An Inquiry into Web Accessibility for People with Dyslexia.

Science.gov (United States)

Berget, Gerd; Herstad, Jo; Sandnes, Frode Eika

2016-01-01

Universal design in context of digitalisation has become an integrated part of international conventions and national legislations. A goal is to make the Web accessible for people of different genders, ages, backgrounds, cultures and physical, sensory and cognitive abilities. Political demands for universally designed solutions have raised questions about how it is achieved in practice. Developers, designers and legislators have looked towards the Web Content Accessibility Guidelines (WCAG) for answers. WCAG 2.0 has become the de facto standard for universal design on the Web. Some of the guidelines are directed at the general population, while others are targeted at more specific user groups, such as the visually impaired or hearing impaired. Issues related to cognitive impairments such as dyslexia receive less attention, although dyslexia is prevalent in at least 5-10% of the population. Navigation and search are two common ways of using the Web. However, while navigation has received a fair amount of attention, search systems are not explicitly included, although search has become an important part of people's daily routines. This paper discusses WCAG in the context of dyslexia for the Web in general and search user interfaces specifically. Although certain guidelines address topics that affect dyslexia, WCAG does not seem to fully accommodate users with dyslexia.
Applying Semantic Web technologies to improve the retrieval, credibility and use of health-related web resources.

Science.gov (United States)

Mayer, Miguel A; Karampiperis, Pythagoras; Kukurikos, Antonis; Karkaletsis, Vangelis; Stamatakis, Kostas; Villarroel, Dagmar; Leis, Angela

2011-06-01

The number of health-related websites is increasing day-by-day; however, their quality is variable and difficult to assess. Various "trust marks" and filtering portals have been created in order to assist consumers in retrieving quality medical information. Consumers are using search engines as the main tool to get health information; however, the major problem is that the meaning of the web content is not machine-readable in the sense that computers cannot understand words and sentences as humans can. In addition, trust marks are invisible to search engines, thus limiting their usefulness in practice. During the last five years there have been different attempts to use Semantic Web tools to label health-related web resources to help internet users identify trustworthy resources. This paper discusses how Semantic Web technologies can be applied in practice to generate machine-readable labels and display their content, as well as to empower end-users by providing them with the infrastructure for expressing and sharing their opinions on the quality of health-related web resources.
Building maps to search the web: the method Sewcom

Directory of Open Access Journals (Sweden)

Corrado Petrucco

2002-01-01

Seeking information on the Internet is becoming a necessity at school, at work and in every social sphere. Unfortunately, the difficulties inherent in the use of search engines, together with the unconscious use of inefficient cognitive approaches, limit their effectiveness. To address this, a method called SEWCOM is presented that lets users create conceptual maps through interaction with search engines.
Deep Web Search Interface Identification: A Semi-Supervised Ensemble Approach

Directory of Open Access Journals (Sweden)

Hong Wang

2014-12-01

To surface the Deep Web, one crucial task is to predict whether a given web page has a search interface (a searchable HyperText Markup Language (HTML) form) or not. Previous studies have focused on supervised classification with labeled examples. However, labeled data are scarce, hard to get and require tedious manual work, while unlabeled HTML forms are abundant and easy to obtain. In this research, we consider the plausibility of using both labeled and unlabeled data to train better models to identify search interfaces more effectively. We present a semi-supervised co-training ensemble learning approach using both neural networks and decision trees to deal with the search interface identification problem. We show that the proposed model outperforms previous methods using only labeled data. We also show that adding unlabeled data improves the effectiveness of the proposed model.

Enhanced Web Interfaces for Administering Invenio Digital Library

CERN Document Server

Batista, João

2012-01-01

Invenio is an open source web-based application that implements a digital library or document server, and it is used at CERN as the base of the CERN Document Server Institutional Repository and the Inspire High Energy Physics Subject Repository. The purpose of this work was to reimplement the administrative interface of the search engine in Invenio, using new and proven open source technologies, to simplify the code base and lay the foundations for the work that will be done in porting the rest of the administrative interfaces to these newer technologies. In my time as a CERN openlab summer student I was able to implement some of the features for the WebSearch Admin Interfaces, enhance some of the existing code with new features and find solutions to technical challenges that will be common when porting the other administrative interface modules.
Use of search engine optimization factors for Google page rank prediction

OpenAIRE

Tvrdi, Barbara

2012-01-01

Over the years, search engines have become an important tool for finding information. It is known that users select the link on the first page of search results in 62% of the cases. Search engine optimization techniques enable website improvement and therefore a better ranking in search engines. The exact specification of the factors that affect website ranking is not disclosed by search engine owners. In this thesis we tried to choose some most frequently mentioned search engine optimizatio...
Searching Choices: Quantifying Decision-Making Processes Using Search Engine Data.

Science.gov (United States)

Moat, Helen Susannah; Olivola, Christopher Y; Chater, Nick; Preis, Tobias

2016-07-01

When making a decision, humans consider two types of information: information they have acquired through their prior experience of the world, and further information they gather to support the decision in question. Here, we present evidence that data from search engines such as Google can help us model both sources of information. We show that statistics from search engines on the frequency of content on the Internet can help us estimate the statistical structure of prior experience; and, specifically, we outline how such statistics can inform psychological theories concerning the valuation of human lives, or choices involving delayed outcomes. Turning to information gathering, we show that search query data might help measure human information gathering, and it may predict subsequent decisions. Such data enable us to compare information gathered across nations, where analyses suggest, for example, a greater focus on the future in countries with a higher per capita GDP. We conclude that search engine data constitute a valuable new resource for cognitive scientists, offering a fascinating new tool for understanding the human decision-making process. Copyright © 2016 The Authors. Topics in Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society.
How much data resides in a web collection: how to estimate size of a web collection

NARCIS (Netherlands)

Khelghati, Mohammadreza; Hiemstra, Djoerd; van Keulen, Maurice

2013-01-01

With increasing amount of data in deep web sources (hidden from general search engines behind web forms), accessing this data has gained more attention. In the algorithms applied for this purpose, it is the knowledge of a data source size that enables the algorithms to make accurate decisions in
Anamneses-Based Internet Information Supply: Can a Combination of an Expert System and Meta-Search Engine Help Consumers find the Health Information they Require?

Science.gov (United States)

Honekamp, Wilfried; Ostermann, Herwig

2010-04-09

An increasing number of people search for health information online. During the last 10 years various researchers have determined the requirements for an ideal consumer health information system. The aim of this study was to determine whether medical laymen can find a more accurate diagnosis for a given anamnesis via the developed prototype health information system than via ordinary internet search. In a randomized controlled trial, the prototype information system was evaluated by the assessment of two sample cases. Participants had to determine the diagnosis of a patient with a headache via information found searching the web. A patient's history sheet and a computer with internet access were provided to the participants and they were guided through the study by a specially designed study website. The intervention group used the prototype information system; the control group used common search engines and portals. The numbers of correct diagnoses in each group were compared. A total of 140 (60/80) participants took part in two study sections. In the first case, which determined a common diagnosis, both groups did equally well. In the second section, which determined a less common and more complex case, the intervention group did significantly better (P=0.031) due to the tailored information supply. Using medical expert systems in combination with a portal-searching meta-search engine represents a feasible strategy to provide reliable patient-tailored information and can ultimately contribute to patient safety with respect to information found via the internet.
Search as Learning (Dagstuhl Seminar 17092)

OpenAIRE

Collins-Thompson, Kevyn; Hansen, Preben; Hauff, Claudia

2017-01-01

This report describes the program and the results of Dagstuhl Seminar 17092 "Search as Learning", which brought together 26 researchers from diverse research backgrounds. The motivation for the seminar stems from the fact that modern Web search engines are largely engineered and optimized to fulfill lookup tasks instead of complex search tasks. The latter though are an essential component of information discovery and learning. The 3-day seminar started with four perspective talks, providing f...
Earth Science Mining Web Services

Science.gov (United States)

Pham, Long; Lynnes, Christopher; Hegde, Mahabaleshwa; Graves, Sara; Ramachandran, Rahul; Maskey, Manil; Keiser, Ken

2008-01-01

To allow scientists further capabilities in the area of data mining and web services, the Goddard Earth Sciences Data and Information Services Center (GES DISC) and researchers at the University of Alabama in Huntsville (UAH) have developed a system to mine data at the source without the need of network transfers. The system has been constructed by linking together several pre-existing technologies: the Simple Scalable Script-based Science Processor for Measurements (S4PM), a processing engine at the GES DISC; the Algorithm Development and Mining (ADaM) system, a data mining toolkit from UAH that can be configured in a variety of ways to create customized mining processes; ActiveBPEL, a workflow execution engine based on BPEL (Business Process Execution Language); XBaya, a graphical workflow composer; and the EOS Clearinghouse (ECHO). XBaya is used to construct an analysis workflow at UAH using ADaM components, which are also installed remotely at the GES DISC, wrapped as Web Services. The S4PM processing engine searches ECHO for data using space-time criteria, staging them to cache, allowing the ActiveBPEL engine to remotely orchestrate the processing workflow within S4PM. As mining is completed, the output is placed in an FTP holding area for the end user. The goals are to give users control over the data they want to process, while mining data at the data source using the server's resources rather than transferring the full volume over the internet. These diverse technologies have been infused into a functioning, distributed system with only minor changes to the underlying technologies. The key to the infusion is the loosely coupled, Web-Services based architecture: all of the participating components are accessible (one way or another) through SOAP (Simple Object Access Protocol)-based Web Services.
Finding Specification Pages from the Web

Science.gov (United States)

Yoshinaga, Naoki; Torisawa, Kentaro

This paper presents a method of finding a specification page on the Web for a given object (e.g., "Ch. d'Yquem") and its class label (e.g., "wine"). A specification page for an object is a Web page which gives concise attribute-value information about the object (e.g., "county"-"Sauternes") in well formatted structures. A simple unsupervised method using layout and symbolic decoration cues was applied to a large number of the Web pages to acquire candidate attributes for each class (e.g., "county" for a class "wine"). We then filter out irrelevant words from the putative attributes through an author-aware scoring function that we called site frequency. We used the acquired attributes to select a representative specification page for a given object from the Web pages retrieved by a normal search engine. Experimental results revealed that our system greatly outperformed the normal search engine in terms of this specification retrieval.
Database with web interface and search engine as a diagnostics tool for electromagnetic calorimeter

CERN Document Server

Paluoja, Priit

2017-01-01

During 2016 data collection, the Compact Muon Solenoid Data Acquisition (CMS DAQ) system has shown very good reliability. Nevertheless, the high complexity of the hardware and the software involved is, by its nature, prone to occasional problems. As a CMS subdetector, the electromagnetic calorimeter (ECAL) is affected in the same way. Some of the issues are not predictable and can appear more than once during the year, such as components getting noisy, power shortcuts or failing communication between machines. The detection-diagnosis-intervention chain must be as fast as possible to minimise the downtime of the detector. The aim of this project was to create diagnostic software for the ECAL crew, consisting of a database and a web interface that allows users to search, add and edit the contents of the database.
Efficient Top-k Locality Search for Co-located Spatial Web Objects

DEFF Research Database (Denmark)

Qu, Qiang; Liu, Siyuan; Yang, Bin

2014-01-01

In step with the web being used widely by mobile users, user location is becoming an essential signal in services, including local intent search. Given a large set of spatial web objects consisting of a geographical location and a textual description (e.g., online business directory entries of re...
Web-based information on the treatment of oral leukoplakia - quality and readability.

Science.gov (United States)

Wiriyakijja, Paswach; Fedele, Stefano; Porter, Stephen; Ni Riordain, Richeal

2016-09-01

To categorise the content and assess the quality and readability of the online information regarding the treatment for oral leukoplakia. An online search using the term 'leukoplakia treatment' was carried out on 8th June 2015 using the Google search engine. The content, quality and readability of the first 100 sites were explored. The quality of the web information was assessed using the following tools, the DISCERN instrument and the Journal of the American Medical Association (JAMA) benchmarks for website analysis and the HON seal. Readability was assessed via the Flesch Reading Ease Score. The search strategy generated 357 000 sites on the Google search engine. Due to duplicate links, non-operating links and irrelevant links, a total of 47 of the first 100 websites were included in this study. The mean overall rating achieved by included websites using the DISCERN instrument was 2.3. With regard to the JAMA benchmarks, the vast majority of examined websites (95.7%) completely fulfilled the disclosure benchmark and less than 50% of included websites met the three remaining criteria. A mean total readability score of 47.5 was recorded with almost 90% of websites having a readability level ranging from fairly difficult to very difficult. Based on this study, the online health information regarding oral leukoplakia has challenging readability with content of questionable accuracy. As patients often search for health information online, it would be prudent for clinicians to highlight the caution with which online information should be interpreted. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
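For reference, the Flesch Reading Ease Score used in the abstract above is conventionally computed from the average sentence length and the average number of syllables per word. The sketch below is a minimal illustration of that standard formula, not the tooling used in the study. The function name and the sample counts are hypothetical, and the function takes pre-computed counts because automatic syllable counting is itself heuristic.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease formula.

    Higher scores indicate easier text. All three counts are assumed to
    have been obtained beforehand (hypothetical inputs, not study data).
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Illustrative counts only: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(100, 5, 150)
print(round(score, 3))
```

Longer sentences and more syllables per word both lower the score, which is why dense clinical text tends to score as harder to read.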
GASS-WEB: a web server for identifying enzyme active sites based on genetic algorithms.

Science.gov (United States)

Moraes, João P A; Pappa, Gisele L; Pires, Douglas E V; Izidoro, Sandro C

2017-07-03

Enzyme active sites are important and conserved functional regions of proteins whose identification can be an invaluable step toward protein function prediction. Most of the existing methods for this task are based on active site similarity and present limitations including performing only exact matches on template residues, template size restraints, and not being capable of finding inter-domain active sites. To fill this gap, we proposed GASS-WEB, a user-friendly web server that uses GASS (Genetic Active Site Search), a method based on an evolutionary algorithm to search for similar active sites in proteins. GASS-WEB can be used under two different scenarios: (i) given a protein of interest, to match a set of specific active site templates; or (ii) given an active site template, looking for it in a database of protein structures. The method has been shown to be very effective on a range of experiments and was able to correctly identify >90% of the catalogued active sites from the Catalytic Site Atlas. It also managed to achieve a Matthews correlation coefficient of 0.63 using the Critical Assessment of protein Structure Prediction (CASP 10) dataset. In our analysis, GASS ranked fourth among 18 methods. GASS-WEB is freely available at http://gass.unifei.edu.br/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
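The Matthews correlation coefficient reported in the abstract above summarises a binary confusion matrix in a single value between -1 and 1. The sketch below is a minimal illustration of the standard definition, not the evaluation code behind GASS-WEB. The function name and the example counts are hypothetical, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns 0.0 when any marginal total is zero
    (an assumed convention for the undefined case).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Illustrative counts only, not data from the paper.
print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))  # perfect prediction: 1.0
print(matthews_corrcoef(tp=6, tn=3, fp=1, fn=2))  # partial agreement
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, so it stays informative when positive and negative classes are unbalanced.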
Rare disease diagnosis: A review of web search, social media and large-scale data-mining approaches.

Science.gov (United States)

Svenstrup, Dan; Jørgensen, Henrik L; Winther, Ole

2015-01-01

Physicians and the general public are increasingly using web-based tools to find answers to medical questions. The field of rare diseases is especially challenging and important as shown by the long delay and many mistakes associated with diagnoses. In this paper we review recent initiatives on the use of web search, social media and data mining in data repositories for medical diagnosis. We compare the retrieval accuracy on 56 rare disease cases with known diagnosis for the web search tools google.com, pubmed.gov, omim.org and our own search tool findzebra.com. We give a detailed description of IBM's Watson system and make a rough comparison between findzebra.com and Watson on subsets of the Doctor's dilemma dataset. The recall@10 and recall@20 (fraction of cases where the correct result appears in top 10 and top 20) for the 56 cases are found to be 29%, 16%, 27% and 59% and 32%, 18%, 34% and 64%, respectively. Thus, FindZebra has a significantly (p mining tools and social media are some of the areas that hold promise.
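The recall@k measure defined in the abstract above (the fraction of cases where the correct result appears in the top k) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the data layout (one ranked list per case) and the toy cases are hypothetical and are not taken from the 56-case evaluation.

```python
def recall_at_k(ranked_results, correct_answers, k):
    """Fraction of cases whose correct answer is among the top k results.

    ranked_results: one ranked list of results per case (best first).
    correct_answers: the known correct result for each case.
    """
    if len(ranked_results) != len(correct_answers):
        raise ValueError("need exactly one correct answer per case")
    hits = sum(
        1
        for results, answer in zip(ranked_results, correct_answers)
        if answer in results[:k]
    )
    return hits / len(correct_answers)


# Toy data: four cases with the correct answer at rank 2, rank 3,
# rank 1, and missing from the list, respectively.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"], ["m", "n", "o"]]
truth = ["b", "z", "p", "k"]

print(recall_at_k(ranked, truth, 1))  # 0.25
print(recall_at_k(ranked, truth, 2))  # 0.5
print(recall_at_k(ranked, truth, 3))  # 0.75
```

Because a hit at rank 10 is also a hit at rank 20, recall@20 can never be lower than recall@10 for the same tool, which matches the paired figures quoted in the abstract.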
Comparing the Influence of Title and URL in Information Retrieval Relevance in Search Engines Results between Human Science and Agriculture Science

Directory of Open Access Journals (Sweden)

Parisa Allami

2012-12-01

As the World Wide Web provides scientists with suitable methods for producing and publishing information, it has become a medium for publishing information. This environment comprises billions of web pages, each with its own title, content, address and purpose. Search engines provide a variety of facilities for limiting search results in order to raise the likelihood of relevant retrieval results. One of these facilities is restricting keywords and search terms to the title or URL, which can significantly increase the likelihood of relevant results. Search engines claim that results limited to title and URL are the most relevant. This research compared, from the users' point of view, the relevance of results limited to the title and to the URL in the agricultural and humanities areas. It also compared the presence of keywords in the title and URL between the two areas, and examined the relationship between the number of search query keywords and the matching of keywords in titles and their URLs. For this purpose, 30 students in each area who were in an MA program and working on their theses were chosen. The results were significantly relevant when the students limited their information needs to title and URL. URL-limited results were significantly relevant in the agricultural area, but there was no significant difference between title and URL results in the humanities. To compare the number of keywords in title and URL in the two areas, 30 keywords in each area were chosen. There was no significant difference between the number of keywords in the title and URL of websites in the two areas. To show the relationship between the number of search keywords and the matching of title and URL, 45 keywords in each area were chosen. They were divided into three groups (one keyword, two keywords and three keywords). It was determined that the fewer the search keywords, the greater the matching between title and URL, and if the matching
In Search of a Better Search Engine

Science.gov (United States)

Kolowich, Steve

2009-01-01

Early this decade, the number of Web-based documents stored on the servers of the University of Florida hovered near 300,000. By the end of 2006, that number had leapt to four million. Two years later, the university hosts close to eight million Web documents. Web sites for colleges and universities everywhere have become repositories for data…
Quality of Web-Based Educational Interventions for Clinicians on Human Papillomavirus Vaccine: Content and Usability Assessment.

Science.gov (United States)

Rosen, Brittany L; Bishop, James M; McDonald, Skye L; Kahn, Jessica A; Kreps, Gary L

2018-02-16

Human papillomavirus (HPV) vaccination rates fall far short of Healthy People 2020 objectives. A leading reason is that clinicians do not recommend the vaccine consistently and strongly to girls and boys in the age group recommended for vaccination. Although Web-based HPV vaccine educational interventions for clinicians have been created to promote vaccination recommendations, rigorous evaluations of these interventions have not been conducted. Such evaluations are important to maximize the efficacy of educational interventions in promoting clinician recommendations for HPV vaccination. The objectives of our study were (1) to expand previous research by systematically identifying HPV vaccine Web-based educational interventions developed for clinicians and (2) to evaluate the quality of these Web-based educational interventions as defined by access, content, design, user evaluation, interactivity, and use of theory or models to create the interventions. Current HPV vaccine Web-based educational interventions were identified from general search engines (ie, Google), continuing medical education search engines, health department websites, and professional organization websites. Web-based educational interventions were included if they were created for clinicians (defined as individuals qualified to deliver health care services, such as physicians, clinical nurses, and school nurses, to patients aged 9 to 26 years), delivered information about the HPV vaccine and how to increase vaccination rates, and provided continuing education credits. The interventions' content and usability were analyzed using 6 key indicators: access, content, design, evaluation, interactivity, and use of theory or models. A total of 21 interventions were identified, out of which 7 (33%) were webinars, 7 (33%) were videos or lectures, and 7 (33%) were other (eg, text articles, website modules). Of the 21 interventions, 17 (81%) identified the purpose of the intervention, 12 (57%) provided the
Blending vertical and web results: A case study using video intent

NARCIS (Netherlands)

Lefortier, D.; Serdyukov, P.; Romanenko, F.; de Rijke, M.; de Rijke, M.; Kenter, T.; de Vries, A.P.; Zhai, C.X.; de Jong, F.; Radinsky, K.; Hofmann, K.

2014-01-01

Modern search engines aggregate results from specialized verticals into the Web search results. We study a setting where vertical and Web results are blended into a single result list, a setting that has not been studied before. We focus on video intent and present a detailed observational study of
Implementasi Seo Web Design Methodology Pada Official Homepage Pondok Pesantren Qodratullah

OpenAIRE

Ependi, Usman

2013-01-01

A homepage or website is a way for an organization to deliver information to the public. The number of homepages and websites, both personal and organizational, increases every day. To communicate and disseminate information, the homepage/website of the Islamic Boarding School of Qodratullah needs a reliable approach, namely the Search Engine Optimization Web Design Methodology. Conducted with the implementation of the Search Engine Optimization Web Design Methodology on the homepage/ website...
Problem-Based Learning in Web Environments: The Case of "Virtual eBMS" for Business Engineering Education

Science.gov (United States)

Elia, Gianluca; Secundo, Giustina; Taurino, Cesare

This chapter presents a case study where the Problem Based Learning (PBL) approach is applied to a Web-based environment. It first describes the main features of PBL for creating Business Engineers able to face the grand technological challenges of 2020. Then it introduces a Web-based system supporting the PBL strategy, called the “Virtual eBMS”. This system has been designed and implemented at the e-Business Management Section of the Scuola Superiore ISUFI - University of Salento (Italy), in the framework of a research project carried out in collaboration with IBM. Besides the logical and technological description of Virtual eBMS, the chapter presents two applications of the platform in two different contexts: an academic context (international master) and an entrepreneurial context (awareness workshop with companies and entrepreneurs). The system is illustrated starting from the description of an operational framework for designing PBL-based curricula from the author's perspective and then illustrating a typical scenario of a learner accessing the curricula. The description highlights both the “structured” way and the “unstructured” way to create and follow an entire learning path.
A Web Search on Environmental Topics: What Is the Role of Ranking?

OpenAIRE

Covolo, Loredana; Filisetti, Barbara; Mascaretti, Silvia; Limina, Rosa Maria; Gelatti, Umberto

2013-01-01

Background: Although the Internet is easy to use, the mechanisms and logic behind a Web search are often unknown. Reliable information can be obtained, but it may not be visible as the Web site is not located in the first positions of search results. The possible risks of adverse health effects arising from environmental hazards are issues of increasing public interest, and therefore the information about these risks, particularly on topics for which there is no scientific evidence, is ver...
