WorldWideScience

Sample records for web query federation

  1. Federated query processing for the semantic web

    CERN Document Server

    Buil-Aranda, C

    2014-01-01

    In recent years, the amount of RDF data exposed on the Web via SPARQL endpoints has increased exponentially. These SPARQL endpoints allow users to direct SPARQL queries to the RDF data. Federated SPARQL query processing makes it possible to query several of these RDF databases as if they were a single one, integrating the results from all of them. This is a key concept in the Web of Data and a hot topic in the community. In addition, the W3C SPARQL-WG has standardized it in the new Recommendation SPARQL 1.1. This book provides a formalisation of the W3C proposed recommendation. Thi
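
    As a rough illustration of what a federated query looks like, the sketch below builds a SPARQL 1.1 query string that uses two SERVICE clauses to join data from two remote endpoints. The endpoint URLs and the helper name build_federated_query are hypothetical; only the SERVICE keyword and the FOAF properties come from existing specifications, and the query is printed rather than executed.

```python
# Hypothetical endpoint URLs, used only for illustration.
ENDPOINT_A = "http://example.org/sparql-a"
ENDPOINT_B = "http://example.org/sparql-b"


def build_federated_query(endpoint_a: str, endpoint_b: str) -> str:
    """Return a SPARQL 1.1 query that joins patterns evaluated at two endpoints."""
    return f"""
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name ?interest
WHERE {{
  SERVICE <{endpoint_a}> {{
    ?person foaf:name ?name .
  }}
  SERVICE <{endpoint_b}> {{
    ?person foaf:topic_interest ?interest .
  }}
}}
""".strip()


query = build_federated_query(ENDPOINT_A, ENDPOINT_B)
print(query)
```

    The shared variable ?person is what joins the two partial results; each SERVICE block is evaluated by the endpoint it names.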

  2. A journey to Semantic Web query federation in the life sciences.

    Science.gov (United States)

    Cheung, Kei-Hoi; Frost, H Robert; Marshall, M Scott; Prud'hommeaux, Eric; Samwald, Matthias; Zhao, Jun; Paschke, Adrian

    2009-10-01

    As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and the Vocabulary of Interlinked Datasets have emerged in recent years. In addition to data warehouse construction, these technological approaches can be used to support dynamic query federation. As a community effort, the BioRDF task force, within the Semantic Web for Health Care and Life Sciences Interest Group, is exploring how these emerging approaches can be utilized to execute distributed queries across different neuroscience data sources. We have created two health care and life science knowledge bases. We have explored a variety of Semantic Web approaches to describe, map, and dynamically query multiple datasets. We have demonstrated several federation approaches that integrate diverse types of information about neurons and receptors that play an important role in basic, clinical, and translational neuroscience research. In particular, we have created a prototype receptor explorer which uses OWL mappings to provide an integrated list of receptors and executes individual queries against different SPARQL endpoints. We have also employed the AIDA Toolkit, which is directed at groups of knowledge workers who cooperatively search, annotate, interpret, and enrich large collections of heterogeneous documents from diverse locations. We have explored a tool called "FeDeRate", which enables a global SPARQL query to be decomposed into subqueries against the remote databases offering either SPARQL or SQL query interfaces. Finally, we have explored how to use the Vocabulary of Interlinked Datasets (voiD) to create metadata for describing datasets exposed as Linked Data URIs or SPARQL endpoints. We have demonstrated the use of a set of novel and state-of-the-art Semantic Web technologies in support of a neuroscience query

  3. Efficient Query Rewrite for Structured Web Queries

    CERN Document Server

    Gollapudi, Sreenivas; Ntoulas, Alexandros; Paparizos, Stelios

    2011-01-01

    Web search engines and specialized online verticals are increasingly incorporating results from structured data sources to answer semantically rich user queries. For example, the query "Samsung 50 inch led tv" can be answered using information from a table of television data. However, the users are not domain experts and quite often enter values that do not match precisely the underlying data. Samsung makes 46- or 55-inch led tvs, but not 50-inch ones. So a literal execution of the above-mentioned query will return zero results. For optimal user experience, a search engine would prefer to return at least a minimum number of results as close to the original query as possible. Furthermore, due to typical fast retrieval speeds in web-search, a search engine query execution is time-bound. In this paper, we address these challenges by proposing algorithms that rewrite the user query in a principled manner, surfacing at least the required number of results while satisfying the low-latency constraint. We f...
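
    The television example can be made concrete with a small sketch. This is not the paper's algorithm; it is a naive stand-in that widens the numeric size constraint one inch at a time until a minimum number of results appears. The catalogue rows, the Television class, and the function names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Television:
    brand: str
    size_inches: int
    panel: str


# Toy catalogue; the rows are invented for illustration.
CATALOGUE = [
    Television("Samsung", 46, "led"),
    Television("Samsung", 55, "led"),
    Television("Sony", 50, "led"),
]


def search(brand, size, panel, tolerance):
    """Return items whose size is within `tolerance` inches of `size`."""
    return [
        tv for tv in CATALOGUE
        if tv.brand == brand and tv.panel == panel
        and abs(tv.size_inches - size) <= tolerance
    ]


def relaxed_search(brand, size, panel, min_results, max_tolerance=10):
    """Widen the size constraint step by step until enough results appear."""
    for tolerance in range(max_tolerance + 1):
        results = search(brand, size, panel, tolerance)
        if len(results) >= min_results:
            return tolerance, results
    # Give up relaxing and return whatever the widest constraint finds.
    return max_tolerance, search(brand, size, panel, max_tolerance)


tolerance, results = relaxed_search("Samsung", 50, "led", min_results=1)
print(tolerance, [tv.size_inches for tv in results])
```

    With this toy data the literal query (tolerance 0) finds nothing, and the first non-empty answer appears once the size constraint is relaxed by 4 inches. A latency-bounded engine would need something smarter than re-running the search for each step.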

  4. Identifying Aspects for Web-Search Queries

    OpenAIRE

    Wu, Fei; Madhavan, Jayant; Halevy, Alon

    2014-01-01

    Many web-search queries serve as the beginning of an exploration of an unknown space of information, rather than looking for a specific web page. To answer such queries effectively, the search engine should attempt to organize the space of relevant information in a way that facilitates exploration. We describe the Aspector system that computes aspects for a given query. Each aspect is a set of search queries that together represent a distinct information need relevant to the original search...

  5. Responsive web design with jQuery

    CERN Document Server

    Carlos, Gilberto

    2013-01-01

    Responsive Web Design with jQuery follows a standard tutorial-based approach, covering various aspects of responsive web design by building a comprehensive website. "Responsive Web Design with jQuery" is aimed at web designers who are interested in building device-agnostic websites. You should have a grasp of standard HTML, CSS, and JavaScript development, and have a familiarity with graphic design. Some exposure to jQuery and HTML5 will be beneficial but isn't essential.

  6. Web development with jQuery

    CERN Document Server

    York, Richard

    2015-01-01

    Newly revised and updated resource on jQuery's many features and advantages. Web Development with jQuery offers a major update to the popular Beginning JavaScript and CSS Development with jQuery from 2009. More than half of the content is new or updated, and reflects recent innovations with regard to mobile applications, jQuery mobile, and the spectrum of associated plugins. Readers can expect thorough revisions with expanded coverage of events, CSS, AJAX, animation, and drag and drop. New chapters bring developers up to date on popular features like jQuery UI, navigation, tables, interacti

  7. Animating the Web with jQuery

    Directory of Open Access Journals (Sweden)

    Asokan M

    2013-02-01

    Globalization and present-day technology are rapidly increasing the number of web users. Every website tries to attract these users, and website creators and developers add different kinds of animations to their sites. Many software packages are available for creating animation. jQuery can be used to create interactive and powerful web pages with animations. jQuery is a JavaScript library intended to make JavaScript programming easier and more fun. A JavaScript library is a complex JavaScript program that both simplifies difficult tasks and solves cross-browser problems. With jQuery, we can accomplish tasks in a single line of code. jQuery is used on millions of websites. This paper discusses the advantages and usage statistics of jQuery on the web. A complete procedure for creating slider and banner plug-ins is also included. The plug-ins are tested with different browsers.

  8. Improving query services of web map by web mining

    Science.gov (United States)

    Huang, Maojun

    2007-11-01

    A web map is a hybrid of a map and the World Wide Web (known as the Web). It is usually created with WebGIS techniques. With rapid social development, web maps oriented toward the public are under pressure because they fail to satisfy growing demand. The geocoding database plays a key role in supporting query services effectively. The traditional geocoding method is laborious and time-consuming, while much spatial information is available online and could serve as a supplementary information source for geocoding. Therefore, this paper discusses how to improve query services by web mining. The improvement can be described from three facets: first, improving location queries by discovering and extracting address information from the Web to extend the geocoding database; second, enhancing optimum-path queries for public transport and buffer queries through spatial analysis and reasoning on the extended geocoding database; third, adjusting data collection strategies according to patterns discovered by mining web map queries. Finally, this paper presents the design of the application system and experimental results.

  9. Date restricted queries in web search engines

    OpenAIRE

    Lewandowski, Dirk

    2004-01-01

    Search engines usually offer a date restricted search on their advanced search pages. But determining the actual update of a web page is not without problems. We conduct a study testing date restricted queries on the search engines Google, Teoma and Yahoo!. We find that these searches fail to work properly in the examined engines. We discuss implications of this for further research and search engine development.

  10. Deep web query interface understanding and integration

    CERN Document Server

    Dragut, Eduard C; Yu, Clement T

    2012-01-01

    There are millions of searchable data sources on the Web and to a large extent their contents can only be reached through their own query interfaces. There is an enormous interest in making the data in these sources easily accessible. There are primarily two general approaches to achieve this objective. The first is to surface the contents of these sources from the deep Web and add the contents to the index of regular search engines. The second is to integrate the searching capabilities of these sources and support integrated access to them. In this book, we introduce the state-of-the-art tech

  11. An Efficient Query Rewriting Approach for Web Cached Data Management

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    With the development of the Internet, querying data on the Web has attracted attention as a problem that involves information from distributed, and often dynamically related, Web sources. Some sub-queries can be effectively cached from previous queries or materialized views in order to achieve better query performance based on the notion of rewriting queries. In this paper, we propose a novel query-rewriting model, called Hierarchical Query Tree, for representing Web queries. Hierarchical Query Tree is a labeled tree that is suitable for representing the inherent hierarchy of data on the Web. Based on Hierarchical Query Tree, we use a case-based approach to determine what the query results should be. The definitions of queries and query results are both represented as labeled trees. Thus, we can use the same model for representing cases, and the intermediate query results can also be dynamically updated by the user queries. We show that our case-based method can be used to answer a new query based on the combination of previous queries, including changes of requirements and various information sources.

  12. A novel methodology for querying web images

    Science.gov (United States)

    Prabhakara, Rashmi; Lee, Ching Cheng

    2005-01-01

    Ever since the advent of the Internet, there has been an immense growth in the amount of image data that is available on the World Wide Web. With such a magnitude of image availability, an efficient and effective image retrieval system is required to make use of this information. This research presents an effective image matching and indexing technique that improves on existing integrated image retrieval methods. The proposed technique follows a two-phase approach, integrating query by topic and query by example specification methods. The first phase consists of topic-based image retrieval using an improved text information retrieval (IR) technique that makes use of the structured format of HTML documents. It consists of a focused crawler that lets the user enter not only the keyword for the topic-based search but also the scope in which to look for images. The second phase uses the query by example specification to perform a low-level content-based image match, retrieving a smaller set of results that are closer to the example image. Information related to the image feature is automatically extracted from the query image by the image processing system. A technique based on color features that is not computationally intensive is used to perform content-based matching of images. The main goal is to develop a functional image search and indexing system and to demonstrate that better retrieval results can be achieved with this proposed hybrid search technique.

  13. A Preliminary Mapping of Web Queries Using Existing Image Query Schemes.

    Science.gov (United States)

    Jansen, Bernard J.

    End user searching on the Web has become the primary method of locating images for many people. This study investigates the nature of Web image queries by attempting to map them to known image classification schemes. In this study, approximately 100,000 image queries from a major Web search engine were collected in 1997, 1999, and 2001. A…

  14. Federated query services provided by the Seamless SAR Archive project

    Science.gov (United States)

    Baker, S.; Bryson, G.; Buechler, B.; Meertens, C. M.; Crosby, C. J.; Fielding, E. J.; Nicoll, J.; Youn, C.; Baru, C.

    2013-12-01

    The NASA Advancing Collaborative Connections for Earth System Science (ACCESS) seamless synthetic aperture radar (SAR) archive (SSARA) project is a 2-year collaboration between UNAVCO, the Alaska Satellite Facility (ASF), the Jet Propulsion Laboratory (JPL), and OpenTopography at the San Diego Supercomputer Center (SDSC) to design and implement a seamless distributed access system for SAR data and derived data products (i.e. interferograms). A major milestone for the first year of the SSARA project was a unified application programming interface (API) for SAR data search and results at ASF and UNAVCO (WInSAR and EarthScope data archives) through the use of simple web services. A federated query service was developed using the unified APIs, providing users a single search interface for both archives (http://www.unavco.org/ws/brokered/ssara/sar/search). A command line client that utilizes this new service is provided as an open source utility for the community on GitHub (https://github.com/bakerunavco/SSARA). Further API development and enhancements added more InSAR-specific keywords and quality control parameters (Doppler centroid, Faraday rotation, InSAR stack size, and perpendicular baselines). To facilitate InSAR processing, the federated query service incorporated URLs for DEM (from OpenTopography) and tropospheric corrections (from the JPL OSCAR service) in addition to the URLs for SAR data. This federated query service will provide relevant QC metadata for selecting pairs of SAR data for InSAR processing and all the URLs necessary for interferogram generation. Interest from the international community has prompted an effort to incorporate other SAR data archives (the ESA Virtual Archive 4 and the DLR TerraSAR-X_SSC Geohazard Supersites and Natural Laboratories collections) into the federated query service, which provide data for researchers outside the US and North America.

  15. Improved query difficulty prediction for the web

    NARCIS (Netherlands)

    Hauff, C.; Murdock, V.; Baeza-Yates, R.

    2008-01-01

    Query performance prediction aims to predict whether a query will have a high average precision given retrieval from a particular collection, or low average precision. An accurate estimator of the quality of search engine results can allow the search engine to decide to which queries to apply query

  16. OntoQuery: easy-to-use web-based OWL querying.

    Science.gov (United States)

    Tudose, Ilinca; Hastings, Janna; Muthukrishnan, Venkatesh; Owen, Gareth; Turner, Steve; Dekker, Adriano; Kale, Namrata; Ennis, Marcus; Steinbeck, Christoph

    2013-11-15

    The Web Ontology Language (OWL) provides a sophisticated language for building complex domain ontologies and is widely used in bio-ontologies such as the Gene Ontology. The Protégé-OWL ontology editing tool provides a query facility that allows composition and execution of queries with the human-readable Manchester OWL syntax, with syntax checking and entity label lookup. No equivalent query facility such as the Protégé Description Logics (DL) query yet exists in web form. However, many users interact with bio-ontologies such as chemical entities of biological interest and the Gene Ontology using their online Web sites, within which DL-based querying functionality is not available. To address this gap, we introduce the OntoQuery web-based query utility. The source code for this implementation together with instructions for installation is available at http://github.com/IlincaTudose/OntoQuery. OntoQuery software is fully compatible with all OWL-based ontologies and is available for download (CC-0 license). The ChEBI installation, ChEBI OntoQuery, is available at http://www.ebi.ac.uk/chebi/tools/ontoquery. Contact: hastings@ebi.ac.uk.

  17. Semantic Annotations and Querying of Web Data Sources

    Science.gov (United States)

    Hornung, Thomas; May, Wolfgang

    A large part of the Web, actually holding a significant portion of the useful information throughout the Web, consists of views on hidden databases, provided by numerous heterogeneous interfaces that are partly human-oriented via Web forms ("Deep Web"), and partly based on Web Services (only machine accessible). In this paper we present an approach for annotating these sources in a way that makes them citizens of the Semantic Web. We illustrate how queries can be stated in terms of the ontology, and how the annotations are used to select and access appropriate sources and to answer the queries.

  18. A Hierarchical Approach to Model Web Query Interfaces for Web Source Integration

    OpenAIRE

    Kabisch, Thomas; Dragut, Eduard; Yu, Clement; Leser, Ulf

    2009-01-01

    Much data in the Web is hidden behind Web query interfaces. In most cases the only means to "surface" the content of a Web database is by formulating complex queries on such interfaces. Applications such as Deep Web crawling and Web database integration require an automatic usage of these interfaces. Therefore, an important problem to be addressed is the automatic extraction of query interfaces into an appropriate model. We hypothesize the existence of a set of domain-independent "commonsense...

  19. Error Checking for Chinese Query by Mining Web Log

    Directory of Open Access Journals (Sweden)

    Jianyong Duan

    2015-01-01

    For search engines, erroneous query input is a common phenomenon. This paper uses web logs as the training set for query error checking. Queries are analyzed and checked with an n-gram language model trained on the web logs. Some features, including query words and their number, are introduced into the model. At the same time, a data smoothing algorithm is used to address the data sparseness problem, which improves the overall accuracy of the n-gram model. The experimental results show that the approach is effective.
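
    A minimal sketch of the general idea, under several assumptions: the abstract does not say which n-gram order, smoothing method, or decision rule the paper uses, so the code below picks a bigram model with add-one (Laplace) smoothing and flags a query when its average log-probability falls below an arbitrary threshold. The toy English log stands in for a real (Chinese) web log, and all names are hypothetical.

```python
import math
from collections import Counter

# Toy query log standing in for a real web log (invented for illustration).
QUERY_LOG = [
    ["cheap", "flight", "tickets"],
    ["cheap", "hotel", "deals"],
    ["flight", "tickets", "online"],
    ["cheap", "flight", "deals"],
]

context_counts = Counter()  # how often each token appears as a left context
bigram_counts = Counter()   # how often each (previous, current) pair appears
vocabulary = set()
for tokens in QUERY_LOG:
    padded = ["<s>"] + tokens
    context_counts.update(padded[:-1])
    bigram_counts.update(zip(padded, padded[1:]))
    vocabulary.update(tokens)


def bigram_prob(prev, word):
    """Add-one smoothed estimate of P(word | prev); never zero for unseen pairs."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocabulary))


def avg_log_prob(tokens):
    """Average log-probability per token under the bigram model."""
    padded = ["<s>"] + tokens
    total = sum(math.log(bigram_prob(p, w)) for p, w in zip(padded, padded[1:]))
    return total / len(tokens)


def looks_erroneous(tokens, threshold=-1.5):
    """Flag queries that the model finds unusually unlikely (threshold is arbitrary)."""
    return avg_log_prob(tokens) < threshold


good = ["cheap", "flight", "tickets"]
bad = ["cheap", "flihgt", "tickets"]  # transposition typo
print(looks_erroneous(good), looks_erroneous(bad))
```

    Smoothing is what keeps the unseen pair ("cheap", "flihgt") from getting probability zero, which would otherwise make every query containing a new word equally impossible.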

  20. Web Database Schema Identification through Simple Query Interface

    Science.gov (United States)

    Lin, Ling; Zhou, Lizhu

    Web databases provide different types of query interfaces to access the data records stored in the backend databases. While most existing works exploit a complex query interface with multiple input fields to perform schema identification of the Web databases, little attention has been paid to identifying the schema of Web databases (WDBs) through a simple query interface (SQI), which has only a single query text input field. This paper proposes a new method of instance-based query probing to identify WDBs' interface and result schema for SQI. The interface schema identification problem is defined as generating the full-condition query of SQI, and a novel query probing strategy is proposed. The result schema is also identified based on the result webpages of SQI's full-condition query, and an extended identification of the non-query attributes is proposed to improve the attribute recall rate. Experimental results on web databases of online shopping for books, movies and mobile phones show that our method is effective and efficient.

  1. Improving Web Search for Difficult Queries

    Science.gov (United States)

    Wang, Xuanhui

    2009-01-01

    Search engines have now become essential tools in all aspects of our life. Although a variety of information needs can be served very successfully, there are still a lot of queries that search engines cannot answer very effectively and these queries always make users feel frustrated. Since it is quite often that users encounter such "difficult…

  2. The effect of query complexity on Web searching results

    Directory of Open Access Journals (Sweden)

    B.J. Jansen

    2000-01-01

    This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc.) and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.
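
    To make the overlap figure concrete, here is one plausible way to compute it. The abstract does not define the measure, so this sketch assumes overlap means the share of baseline results that reappear in the results of the modified query; the URLs and the function name are invented.

```python
def overlap_percent(baseline, modified):
    """Percentage of baseline result URLs that also appear in the modified results."""
    baseline_set = set(baseline)
    if not baseline_set:
        return 0.0
    shared = baseline_set & set(modified)
    return 100.0 * len(shared) / len(baseline_set)


# Invented result lists for a simple query and its operator-based variant.
simple_results = ["http://a.example", "http://b.example",
                  "http://c.example", "http://d.example"]
advanced_results = ["http://a.example", "http://c.example",
                    "http://d.example", "http://e.example"]

print(overlap_percent(simple_results, advanced_results))  # 75.0
```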

  3. SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

    Science.gov (United States)

    Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

    2014-08-15

    Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.

  4. Ensemble Learned Vaccination Uptake Prediction using Web Search Queries

    OpenAIRE

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields...
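
    For readers unfamiliar with the metric quoted above, Root Mean Squared Error is the square root of the mean of the squared differences between predicted and observed values. The sketch below computes it on invented numbers; they are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equally long sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented uptake percentages; the errors are 1, -1, 7 and -7.
observed = [70, 72, 75, 80]
forecast = [71, 71, 82, 73]
print(rmse(observed, forecast))  # 5.0
```

    Because the errors are squared before averaging, the two large misses dominate the result, which is why RMSE penalizes occasional big errors more than a mean absolute error would.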

  5. Query Translation on the Fly in Deep Web Integration

    Institute of Scientific and Technical Information of China (English)

    JIANG Fangjiao; JIA Linlin; MENG Xiaofeng

    2007-01-01

    To help users access the desired information, much research has been dedicated to Deep Web (i.e. Web database) integration. We focus on query translation, which is an important part of Deep Web integration. Our aim is to automatically construct a set of constraint mapping rules so that the system can use them to translate a query from the integrated interface to the Web database interfaces. We construct a concept hierarchy for the attributes of the query interfaces and, in particular, store the synonyms and the types (e.g. Number, Text, etc.) for every concept. At the same time, we construct data hierarchies for some concepts where necessary. Then we present an algorithm to generate the constraint mapping rules based on these hierarchies. The approach scales well for such applications and, being domain independent, can be extended easily from one domain to another. Experimental results show its effectiveness and efficiency.

  6. BioFed: federated query processing over life sciences linked open data.

    Science.gov (United States)

    Hasnain, Ali; Mehmood, Qaiser; Sana E Zainab, Syeda; Saleem, Muhammad; Warren, Claude; Zehra, Durre; Decker, Stefan; Rebholz-Schuhmann, Dietrich

    2017-03-15

    endpoint's availability based on the EndpointData graph. Our evaluation of BioFed against FedX is based on 20 heterogeneous federated SPARQL queries and shows competitive execution performance in comparison to FedX, which can be attributed to the provision of provenance information for the source selection. Developing and testing federated query engines for life sciences data is still a challenging task. According to our findings, it is advantageous to optimise the source selection. The cataloguing of SPARQL endpoints, including type and property indexing, leads to efficient querying of data resources over the Web of Data. This could even be further improved through the use of ontologies, e.g., for abstract normalisation of query terms.

  7. On the Querying for Places on the Mobile Web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2011-01-01

    The web is undergoing a fundamental transformation: it is becoming mobile and is acquiring a spatial dimension. Thus, the web is increasingly being used from mobile devices, notably smartphones, that can be geo-positioned using GPS or technologies that exploit wireless communication networks....... In addition, web content is being geo-tagged. This transformation calls for new, spatio-textual query functionality. The research community is hard at work enabling efficient support for such functionality....

  8. Ensemble learned vaccination uptake prediction using web search queries

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official...... vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields comparative performance. To our knowledge, this is the first study to predict vaccination uptake...

  9. Data management and query processing in semantic web databases

    CERN Document Server

    Groppe, Sven

    2011-01-01

    The Semantic Web, which is intended to establish a machine-understandable Web, is currently changing from being an emerging trend to a technology used in complex real-world applications. A number of standards and techniques have been developed by the World Wide Web Consortium (W3C), e.g., the Resource Description Framework (RDF), which provides a general method for conceptual descriptions for Web resources, and SPARQL, an RDF querying language. Recent examples of large RDF data with billions of facts include the UniProt comprehensive catalog of protein sequence, function and annotation data, t

  10. Nowcasting Mobile Games Ranking Using Web Search Query Data

    Directory of Open Access Journals (Sweden)

    Yoones A. Sekhavat

    2016-01-01

    In recent years, the Internet has become embedded into the purchasing decision of consumers. The purpose of this paper is to study whether the Internet behavior of users correlates with their actual behavior in the computer games market. Rather than proposing the most accurate model for computer game sales, we aim to investigate to what extent web search query data can be exploited to nowcast (a contraction of “now” and “forecasting”, referring to techniques used to make short-term forecasts), that is, to predict the present status of the ranking of mobile games in the world. Google search query data is used for this purpose, since this data can provide a real-time view on the topics of interest. Various statistical techniques are used to show the effectiveness of using web search query data to nowcast mobile games ranking.

  11. Developing responsive web applications with Ajax and jQuery

    CERN Document Server

    Patel, Sandeep Kumar

    2014-01-01

    This book is a standard tutorial for web application developers presented in a comprehensive, step-by-step manner to explain the nuances involved. It has an abundance of code and examples supporting explanations of each feature. This book is intended for Java developers wanting to create rich and responsive applications using AJAX. Basic experience of using jQuery is assumed.

  12. jQuery mobile web development essentials

    CERN Document Server

    Camden, Raymond

    2013-01-01

    Packed with practical examples, code, and screenshots, this book will show you how to create mobile optimized sites using the easiest, most practical HTML/JavaScript framework available today. If you are a web developer looking to create mobile optimized websites then this book is for you. Basic knowledge of HTML is required. Some familiarity with JavaScript will help, but is not required.

  13. Federated Space-Time Query for Earth Science Data Using OpenSearch Conventions

    Science.gov (United States)

    Lynnes, C.; Beaumont, B.; Duerr, R. E.; Hua, H.

    2009-12-01

    The past decade has seen a burgeoning of remote sensing and Earth science data providers, as evidenced in the growth of the Earth Science Information Partner (ESIP) federation. At the same time, the need to combine diverse data sets to enable understanding of the Earth as a system has also grown. While the expansion of data providers is in general a boon to such studies, the diversity presents a challenge to finding useful data for a given study. Locating all the data files with aerosol information for a particular volcanic eruption, for example, may involve learning and using several different search tools to execute the requisite space-time queries. To address this issue, the ESIP federation is developing a federated space-time query framework, based on the OpenSearch convention (www.opensearch.org), with Geo and Time extensions. In this framework, data providers publish OpenSearch Description Documents that describe in a machine-readable form how to execute queries against the provider. The novelty of OpenSearch is that the space-time query interface becomes both machine callable and easy enough to integrate into the web browser's search box. This flexibility, together with a simple REST (HTTP-get) interface, should allow a variety of data providers to participate in the federated search framework, from large institutional data centers to individual scientists. The simple interface enables trivial querying of multiple data sources and participation in recursive-like federated searches, all using the same common OpenSearch interface. This simplicity also makes the construction of clients easy, as do existing OpenSearch client libraries in a variety of languages. Moreover, a number of clients and aggregation services already exist and OpenSearch is already supported by a number of web browsers such as Firefox and Internet Explorer.
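
    The following sketch shows how a client might turn an OpenSearch URL template into a concrete space-time query. The template host, the query-string keys (q, bbox, start, end), the sample values, and the fill_template helper are assumptions made for illustration; the placeholder names {searchTerms}, {geo:box}, {time:start} and {time:end} follow the OpenSearch core, Geo and Time conventions, where a trailing "?" marks a parameter as optional.

```python
import re
from urllib.parse import quote

# Hypothetical template of the kind found in an OpenSearch Description Document.
TEMPLATE = (
    "http://example.org/search?q={searchTerms}"
    "&bbox={geo:box?}&start={time:start?}&end={time:end?}"
)

# Matches {name} and {name?}; group 2 is "?" for optional parameters.
PLACEHOLDER = re.compile(r"\{([A-Za-z:]+)(\?)?\}")


def fill_template(template, values):
    """Substitute URL-encoded values; drop optional parameters that are not given."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")
    return PLACEHOLDER.sub(substitute, template)


url = fill_template(TEMPLATE, {
    "searchTerms": "aerosol",
    "geo:box": "-25.0,63.0,-13.0,67.0",    # west,south,east,north
    "time:start": "2010-04-14T00:00:00Z",
})
print(url)
```

    Because every provider publishes its own template but shares the placeholder vocabulary, the same values dictionary could be applied to each provider's template in turn, which is what makes the federated client simple to write.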

  14. A study of medical and health queries to web search engines.

    Science.gov (United States)

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i) comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii) comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii) medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i) a small percentage of web queries are medical or health related, (ii) the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii) over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggest some implications for the use of general web search engines when seeking medical/health information.

  15. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Directory of Open Access Journals (Sweden)

    Toni U Wagner

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  16. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Science.gov (United States)

    Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred

    2011-01-01

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  17. What Snippets Say About Pages in Federated Web Search

    NARCIS (Netherlands)

    Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Dolf; Develder, Chris; Hiemstra, Djoerd; Hou, Yuexian; Nie, Jian-Yun; Sun, Le; Wang, Bo; Zhang, Peng

    2012-01-01

    What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research qu

  18. SQL level global query resolving for web based GIS

    Science.gov (United States)

    Chen, Bin; Huang, Fengru; Huang, Zhou; Sun, Yumei; Fang, Yu

    2007-06-01

    This paper introduced an SQL-level approach to resolving global spatial queries in a Web-based heterogeneous distributed spatial database environment. The main merit of this SQL-level approach was its widespread compatibility and standardization. First, an SQL-based Equivalent Distributed Program (EDP) was introduced to express distributed spatial processing transactions. Then, global resource directories describing the virtual global view were discussed as a way to organize the information that query resolving needs. The global resource directories comprised a data storage directory, a hosts directory and a working status directory. With these mechanisms, relational algebra expression equivalence principles were used to resolve global spatial queries into EDPs. Finally, several samples were presented to show the resolving process. This approach was suitable for all sorts of distributed computing environments, whether centralized, such as CORBA, or decentralized, such as P2P computing platforms.

  19. Web search queries can predict stock market volumes.

    Science.gov (United States)

    Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

    2012-01-01

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which also enables us to investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.
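
    The notion of query volume "anticipating" trading volume by a day can be illustrated with a lagged correlation. The sketch below is not the authors' analysis: the two series are synthetic, and the function names `pearson` and `lagged_correlation` are introduced here only for illustration. It correlates query volume on day t with trading volume on day t + lag and picks the lag with the highest correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def lagged_correlation(queries, trades, lag):
    """Correlate query volume on day t with trading volume on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(queries, trades)
    return pearson(queries[:-lag], trades[lag:])


# Synthetic daily series (assumption): trading volume follows query
# volume with a delay of exactly one day.
queries = [10, 12, 30, 11, 9, 25, 10, 8]
trades = [100] + [q * 10 for q in queries[:-1]]

best_lag = max(range(3), key=lambda k: lagged_correlation(queries, trades, k))
print(best_lag)
```

    On real data the peak would be far less sharp, and a correlation at a positive lag would still need a significance test before being read as anticipation.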

  20. Web search queries can predict stock market volumes.

    Directory of Open Access Journals (Sweden)

    Ilaria Bordino

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which also enables us to investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.

  1. Web Search Queries Can Predict Stock Market Volumes

    Science.gov (United States)

    Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

    2012-01-01

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people’s actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which also enables us to investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www. PMID:22829871

  2. VIQI: A New Approach for Visual Interpretation of Deep Web Query Interfaces

    CERN Document Server

    Boughamoura, Radhouane; Omri, Mohamed Nazih

    2012-01-01

    Deep Web databases contain more than 90% of the pertinent information on the Web. Despite their importance, users do not benefit from this treasure. Many deep web services offer competitive services in terms of price, quality of service, and facilities. As the number of services is growing rapidly, users have difficulty querying many web services at the same time. In this paper, we imagine a system where users can formulate one query using one query interface, and the system then translates the query to the remaining query interfaces. However, because interfaces are created by designers to be interpreted visually by users, machines cannot interpret a query from a given interface. We propose a new approach which emulates users' capacity for interpretation and extracts queries from deep web query interfaces. Our approach has shown good performance on two standard datasets.

  3. Extracting Result Schema Based on Query Instances in the Deep Web

    Institute of Scientific and Technical Information of China (English)

    NIE Tiezheng; YU Ge; SHEN Derong; KOU Yue; LIU Wei

    2007-01-01

    Deep Web sources contain a large amount of high-quality, query-related structured data. One of the challenges in the Deep Web is extracting the result schemas of Deep Web sources. To address this challenge, this paper describes a novel approach that extracts both the result data and the result schema of a Web database. The approach first models the query interface of a Deep Web source and fills it in with a specific query instance. Then the result pages of the Deep Web source are formatted as a tree structure to retrieve subtrees that contain elements of the query instance. Next, the result schema of the Deep Web source is extracted by matching the subtrees' nodes with the query instance, using a two-phase schema extraction method to obtain a more accurate result schema. Finally, experiments on real Deep Web sources show the utility of our approach, which provides high precision and recall.

  4. A probabilistic approach for mapping free-text queries to complex web forms

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    Web applications with complex interfaces consisting of multiple input fields should understand free-text queries. We propose a probabilistic approach to map parts of a free-text query to the fields of a complex web form. Our method uses token models rather than only static dictionaries to create

  5. Web 2.0 Technologies with jQuery and Ajax

    Directory of Open Access Journals (Sweden)

    Tamas Lorand

    2009-10-01

    The development of a web 2.0 portal using Ajax and jQuery techniques. This paper describes the development of a web portal using technologies such as PHP, jQuery and Ajax. Regular web portals simply use PHP and MySQL, which is not enough to provide the interactivity users need from a web portal. jQuery is designed to change the way you write JavaScript, because it is very compact and easy to use and understand. jQuery is also very popular, being used by Google, IBM, NBC, Amazon, Wordpress and many others. Ajax is used to increase the responsiveness and interactivity of web pages by exchanging small amounts of data "behind the scenes", so that entire web pages do not have to be reloaded each time data needs to be fetched from the server.

  6. Mandatory Class 1 Federal Areas Web Service

    Data.gov (United States)

    U.S. Environmental Protection Agency — This web service contains the following layers: Mandatory Class 1 Federal Area polygons and Mandatory Class 1 Federal Area labels in the United States. The polygon...

  7. Wild Card Queries for Searching Resources on the Web

    CERN Document Server

    Rafiei, Davood

    2009-01-01

    We propose a domain-independent framework for searching and retrieving facts and relationships within natural language text sources. In this framework, an extraction task over a text collection is expressed as a query that combines text fragments with wild cards, and the query result is a set of facts in the form of unary, binary and general $n$-ary tuples. A significance of our querying mechanism is that, despite being both simple and declarative, it can be applied to a wide range of extraction tasks. A problem in querying natural language text though is that a user-specified query may not retrieve enough exact matches. Unlike term queries which can be relaxed by removing some of the terms (as is done in search engines), removing terms from a wild card query without ruining its meaning is more challenging. Also, any query expansion has the potential to introduce false positives. In this paper, we address the problem of query expansion, and also analyze a few ranking alternatives to score the results and to r...

  8. Query transformations and their role in Web searching by the members of the general public

    Directory of Open Access Journals (Sweden)

    Martin Whittle

    2006-01-01

    Introduction. This paper reports preliminary research in a primarily experimental study of how the general public search for information on the Web. The focus is on the query transformation patterns that characterise searching. Method. In this work, we have used transaction logs from the Excite search engine to develop methods for analysing query transformations that should aid the analysis of our ongoing experimental work. Our methods involve the use of similarity techniques to link queries with the most similar previous query in a train. The resulting query transformations are represented as a list of codes representing a whole search. Analysis. It is shown how query transformation sequences can be represented as graphical networks and some basic statistical results are shown. A correlation analysis is performed to examine the co-occurrence of Boolean and quotation mark changes with the syntactic changes. Results. A frequency analysis of the occurrence of query transformation codes is presented. The connectivity of graphs obtained from the query transformation is investigated and found to follow an exponential scaling law. The correlation analysis reveals a number of patterns that provide some interesting insights into Web searching by the general public. Conclusion. We have developed analytical methods based on query similarity that can be applied to our current experimental work with volunteer subjects. The results of these will form part of a database with the aim of developing an improved understanding of how the public search the Web.
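
    Linking each query to its most similar predecessor can be sketched as below. The abstract does not say which similarity measure was used, so Jaccard similarity over query terms is an assumption made here, and the names `jaccard` and `link_to_previous` are hypothetical.

```python
def jaccard(query_a, query_b):
    """Jaccard similarity between the term sets of two queries."""
    terms_a = set(query_a.lower().split())
    terms_b = set(query_b.lower().split())
    if not terms_a and not terms_b:
        return 1.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def link_to_previous(session):
    """Link every query after the first to its most similar earlier query.

    Returns (query_index, linked_index) pairs; ties go to the more recent query.
    """
    links = []
    for i in range(1, len(session)):
        best = max(range(i), key=lambda j: (jaccard(session[i], session[j]), j))
        links.append((i, best))
    return links


# A made-up search session (a "train" of queries from one user).
session = [
    "cheap flights",
    "cheap flights paris",
    "paris hotels",
    "cheap paris hotels",
]
print(link_to_previous(session))
```

    The resulting pairs form the edges of a graph over the session, which is one way to picture the transformation networks the abstract mentions.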

  9. VIQI: A New Approach for Visual Interpretation of Deep Web Query Interfaces

    OpenAIRE

    Boughamoura, Radhouane; Hlaoua, Lobna; Omri, Mohamed nazih

    2012-01-01

    Deep Web databases contain more than 90% of the pertinent information on the Web. Despite their importance, users do not benefit from this treasure. Many deep web services offer competitive services in terms of price, quality of service, and facilities. As the number of services is growing rapidly, users have difficulty querying many web services at the same time. In this paper, we imagine a system where users can formulate one query using one query interface and then the sys...

  10. A Deep Web Query Interfaces Classification Method Based on RBF Neural Network

    Institute of Scientific and Technical Information of China (English)

    YUAN Fang; ZHAO Yao; ZHOU Xu

    2007-01-01

    This paper proposes a new approach to classifying Deep Web query interfaces, which extracts features from the form's text data on the query interfaces, assisted by a synonym library, and uses a radial basis function neural network (RBFNN) algorithm to classify the query interfaces. The applied RBFNN is an effective feed-forward artificial neural network, which has a simple network structure but offers excellent nonlinear approximation, fast convergence and global convergence. A TEL_8 query interface data set from the UIUC on-line database is used in our experiments, which consists of 477 query interfaces in 8 typical domains. Experimental results show that the proposed approach can efficiently classify the query interfaces with an accuracy of 95.67%.

  11. Managing and Querying Web Services Communities: A Survey

    CERN Document Server

    Limam, Hela

    2011-01-01

    With the advance of Web Services technologies and the emergence of Web Services into the information space, tremendous opportunities for empowering users and organizations appear in various application domains including electronic commerce, travel, intelligence information gathering and analysis, health care, digital government, etc. However, the technology to organize, search, and integrate these Web Services has not kept pace with the rapid growth of the available information space. The number of Web Services to be integrated may be large and continuously changing. To ease and improve the process of Web services discovery in an open environment like the Internet, it is suggested to gather similar Web services into groups known as communities. Although Web services are intensively investigated, the community management issues have not been addressed yet. In this paper we draw an overview of several Web services Communities' management approaches based on some currently existing communities platforms and framework...

  12. Semantics and the medical web: a review of barriers and breakthroughs in effective healthcare query.

    Science.gov (United States)

    Lorence, Daniel P; Spink, Amanda

    2004-06-01

    This paper provides an overview of the research into current medical vocabularies and their impact on searching the Web for health information. The Web provides growing opportunities for laypersons to gain knowledge about specific health conditions, though research to date has been incomplete. Many studies have examined aspects of controlled medical vocabularies. Other studies have examined aspects of medical Web searching vocabularies. In this context, there is a growing need to examine more closely laypersons' Web queries using controlled medical vocabularies that were designed to serve the needs of medical professionals. It may be the case that the average consumer of Web health services is not able to use correct medical terminology, and may not be able to choose analogous or synonymous terms from a search result list. Our review suggests a growing need for studies to examine the current applicability of controlled medical vocabularies as well as alternatives to semantic query by Web search engine users.

  13. Accelerating the Response of Query in Semantic Web

    Directory of Open Access Journals (Sweden)

    Nooshin Azimi

    2014-07-01

    Today, XML has become one of the important formats for storing and exchanging data. The flexibility of XML's structure encourages its use, and the volume of XML documents is increasing constantly. Since a file management system cannot manage such a volume of data, managing XML documents requires a comprehensive management system. With the striking growth of such databases, the need to accelerate query execution is felt. In this paper, we search for a method that has the required ability for a large set of queries; a method that accesses fewer nodes and gets the answer in a shorter period of time than similar methods; a method that can match the indicators of similar methods and use them to accelerate queries. We are seeking a method that is able to jump over useless nodes and produces intermediate data, as compared to similar ones; a method in which node processing is not performed directly and automatically through a pattern matching guide.

  14. Combining Local Scoring and Global Aggregation to Rank Entities for Deep Web Queries

    Institute of Scientific and Technical Information of China (English)

    Yue Kou; De-Rong Shen; Ge Yu; Tie-Zheng Nie

    2009-01-01

    With the rapid growth of Web databases, it is necessary to extract and integrate large-scale data available in the Deep Web automatically. But current Web search engines conduct page-level ranking, which is becoming inadequate for entity-oriented vertical search. In this paper, we present an entity-level ranking mechanism called LG-ERM for Deep Web queries based on local scoring and global aggregation. Unlike traditional approaches, LG-ERM considers more rank influencing factors including the uncertainty of entity extraction, the style information of the entities and the importance of the Web sources, as well as the entity relationship. By combining local scoring and global aggregation in ranking, the query result can be more accurate and effective to meet users' needs. The experiments demonstrate the feasibility and effectiveness of the key techniques of LG-ERM.

  15. Empirical Performance Metrics Study of Execution of Database Queries in Implementation of Web Services

    Directory of Open Access Journals (Sweden)

    M. A. Maluk Mohamed

    2012-01-01

    Problem statement: Web services are increasingly being deployed in business applications, due to their unique features such as flexibility and interoperability. Most business applications involve extensive use of database operations for data management in the back end. Further, business applications demand a very high level of performance from software solutions, and this is a continual and never-ending process. This study focuses on measurement and analysis of performance metrics of database queries in implementation of web services. Approach: In this experimentation, web services are implemented in two popular and standard platforms and database queries are realized through all commercial and standard databases. Performance measurement is done by implementing a common sample application on each realization and using a pair of performance metrics, response time and packet count, that is, the number of packets involved in communication between the layers of implementation. Results and Conclusion: This novel study summarizes the various performance aspects of databases in web services with a basic set of database queries in the back end and concludes with firm results on optimum performance offered by database in execution of database queries in realization of web services.

  16. Effective Filtering of Query Results on Updated User Behavioral Profiles in Web Mining.

    Science.gov (United States)

    Sadesh, S; Suganthe, R C

    2015-01-01

    Web with tremendous volume of information retrieves result for user related queries. With the rapid growth of web page recommendation, results retrieved based on data mining techniques did not offer higher performance filtering rate because relationships between user profile and queries were not analyzed in an extensive manner. At the same time, existing user profile based prediction in web data mining is not exhaustive in producing personalized result rate. To improve the query result rate on dynamics of user behavior over time, Hamilton Filtered Regime Switching User Query Probability (HFRS-UQP) framework is proposed. HFRS-UQP framework is split into two processes, where filtering and switching are carried out. The data mining based filtering in our research work uses the Hamilton Filtering framework to filter user result based on personalized information on automatic updated profiles through search engine. Maximized result is fetched, that is, filtered out with respect to user behavior profiles. The switching performs accurate filtering updated profiles using regime switching. The updating in profile change (i.e., switches) regime in HFRS-UQP framework identifies the second- and higher-order association of query result on the updated profiles. Experiment is conducted on factors such as personalized information search retrieval rate, filtering efficiency, and precision ratio.

  17. Index Compression and Efficient Query Processing in Large Web Search Engines

    Science.gov (United States)

    Ding, Shuai

    2013-01-01

    The inverted index is the main data structure used by all the major search engines. Search engines build an inverted index on their collection to speed up query processing. As the size of the web grows, the length of the inverted list structures, which can easily grow to hundreds of MBs or even GBs for common terms (roughly linear in the size of…
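
    For readers unfamiliar with the data structure, a minimal sketch of an inverted index follows. It is a toy illustration only, with no compression and none of the optimizations the thesis studies; the documents and the function names `build_index`, `intersect` and `search` are hypothetical. Each term maps to a sorted list of document ids, and a conjunctive query is answered by intersecting those lists.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term].append(doc_id)  # ids arrive in increasing order
    return index


def intersect(list_a, list_b):
    """Merge-style intersection of two sorted lists of document ids."""
    i = j = 0
    common = []
    while i < len(list_a) and j < len(list_b):
        if list_a[i] == list_b[j]:
            common.append(list_a[i])
            i += 1
            j += 1
        elif list_a[i] < list_b[j]:
            i += 1
        else:
            j += 1
    return common


def search(index, query):
    """Return ids of documents that contain every term of the query."""
    lists = [index.get(term, []) for term in query.lower().split()]
    if not lists:
        return []
    lists.sort(key=len)  # start from the shortest list to keep work small
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


docs = [
    "web search engines",
    "inverted index for web search",
    "index compression",
]
index = build_index(docs)
print(search(index, "web index"))
```

    Since the lists for common terms dominate both storage and query time, the length of those lists is what compression and early-termination techniques target.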

  18. Towards Distributed Information Retrieval in the Semantic Web: Query Reformulation Using the oMAP Framework

    NARCIS (Netherlands)

    Straccia, U.; Troncy, R.

    2006-01-01

    This paper introduces a general methodology for performing distributed search in the Semantic Web. We propose to define this task as a three-step process, namely resource selection, query reformulation/ontology alignment and rank aggregation/data fusion. For the second problem, we have implemented

  19. Spatial Search Techniques for Mobile 3D Queries in Sensor Web Environments

    Directory of Open Access Journals (Sweden)

    James D. Carswell

    2013-03-01

    Developing mobile geo-information systems for sensor web applications involves technologies that can access linked geographical and semantically related Internet information. Additionally, in tomorrow’s Web 4.0 world, it is envisioned that trillions of inexpensive micro-sensors placed throughout the environment will also become available for discovery based on their unique geo-referenced IP address. Exploring these enormous volumes of disparate heterogeneous data on today’s location and orientation aware smartphones requires context-aware smart applications and services that can deal with “information overload”. 3DQ (Three Dimensional Query) is our novel mobile spatial interaction (MSI) prototype that acts as a next-generation base for human interaction within such geospatial sensor web environments/urban landscapes. It filters information using “Hidden Query Removal” functionality that intelligently refines the search space by calculating the geometry of a three dimensional visibility shape (Vista space) at a user’s current location. This 3D shape then becomes the query “window” in a spatial database for retrieving information on only those objects visible within a user’s actual 3D field-of-view. 3DQ reduces information overload and serves to heighten situation awareness on constrained commercial off-the-shelf devices by providing visibility space searching as a mobile web service. The effects of variations in mobile spatial search techniques in terms of query speed vs. accuracy are evaluated and presented in this paper.

  20. Utility of Web search query data in testing theoretical assumptions about mephedrone.

    Science.gov (United States)

    Kapitány-Fövény, Máté; Demetrovics, Zsolt

    2017-05-01

    With growing access to the Internet, people who use drugs and traffickers started to obtain information about novel psychoactive substances (NPS) via online platforms. This paper aims to analyze whether a decreasing Web interest in formerly banned substances (cocaine, heroin, and MDMA) and the legislative status of mephedrone predict Web interest in this NPS. Google Trends was used to measure changes of Web interest in cocaine, heroin, MDMA, and mephedrone. Google search results for mephedrone within the same time frame were analyzed and categorized. Web interest in classic drugs was found to be more persistent. Regarding geographical distribution, the location of Web searches for heroin and cocaine was less centralized. The illicit status of mephedrone was a negative predictor of its Web search query rates. The connection between mephedrone-related Web search rates and the legislative status of this substance was significantly mediated by ecstasy-related Web search queries, the number of documentaries, and forum/blog entries about mephedrone. The results might provide support for the hypothesis that mephedrone's popularity was highly correlated with its legal status and that it functioned as a potential substitute for MDMA. Google Trends was found to be a useful tool for testing theoretical assumptions about NPS. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Querying OCLC Web Services for Name, Subject, and ISBN

    Directory of Open Access Journals (Sweden)

    Ya’aqov Ziso

    2010-03-01

    Full Text Available Using Web services, search terms can be sent to WorldCat's centralized authority and identifier files to retrieve authorized terminology that helps users get a comprehensive set of relevant search results. This article presents methods for searching names, subjects or ISBNs in various WorldCat databases and displaying the results to users. Exploiting WorldCat's databases in this way opens up future possibilities for more seamless integration of authority-controlled vocabulary lists into new discovery interfaces and a reduction in libraries’ dependence on local name and subject authority files.

  2. High-performance web services for querying gene and variant annotation.

    Science.gov (United States)

    Xin, Jiwen; Mark, Adam; Afrasiabi, Cyrus; Tsueng, Ginger; Juchler, Moritz; Gopal, Nikhil; Stupp, Gregory S; Putman, Timothy E; Ainscough, Benjamin J; Griffith, Obi L; Torkamani, Ali; Whetzel, Patricia L; Mungall, Christopher J; Mooney, Sean D; Su, Andrew I; Wu, Chunlei

    2016-05-06

    Efficient tools for data management and integration are essential for many aspects of high-throughput biology. In particular, annotations of genes and human genetic variants are commonly used but highly fragmented across many resources. Here, we describe MyGene.info and MyVariant.info, high-performance web services for querying gene and variant annotation information. These web services are currently accessed more than three million times per month. They also demonstrate a generalizable cloud-based model for organizing and querying biological annotation information. MyGene.info and MyVariant.info are provided as high-performance web services, accessible at http://mygene.info and http://myvariant.info . Both are offered free of charge to the research community.

  3. PRESY: A Context Based Query Reformulation Tool for Information Retrieval on the Web

    Directory of Open Access Journals (Sweden)

    Abdelkrim Bouramoul

    2010-01-01

    Full Text Available Problem statement: The huge amount of information on the web, as well as the growing number of new inexperienced users, creates new challenges for information retrieval. It has become increasingly difficult for these users to find relevant documents that satisfy their individual needs. Certainly, the current search engines (such as Google, Bing and Yahoo) offer an efficient way to browse the web content. However, the result quality depends heavily on users' queries, which need to be more precise to find relevant documents. This task is still complicated for the majority of inexperienced users, who cannot express their needs with significant words in the query. For that reason, we believe that a reformulation of the user's initial query can be a good alternative to improve the information selectivity. This study proposes a novel approach and presents a prototype system called Profile-based Reformulation System (PRESY) for information retrieval on the web. Approach: It used an incremental approach to categorize users by constructing a contextual base. The latter was composed of two types of context (static and dynamic) obtained using the users' profiles. The proposed architecture was implemented in the .Net environment to perform query reformulation tests. Results: The experiments given at the end of this article show that the precision of the returned content is effectively improved. The tests were performed with the most popular search engines (i.e., Google, Bing and Yahoo), selected in particular for their high selectivity. Among the given results, we found that query reformulation improved the first three results by 10.7% and the next seven returned elements by 11.7%. The reformulation of users' initial queries thus improves the pertinence of returned content. Conclusion/Recommendations: Therefore, we believe that the exploitation of contextual data based on users' profiles could be a very good way to reformulate user queries. 
This complementary mechanism would

  4. Profile-IQ: Web-based data query system for local health department infrastructure and activities.

    Science.gov (United States)

    Shah, Gulzar H; Leep, Carolyn J; Alexander, Dayna

    2014-01-01

    To demonstrate the use of National Association of County & City Health Officials' Profile-IQ, a Web-based data query system, and how policy makers, researchers, the general public, and public health professionals can use the system to generate descriptive statistics on local health departments. This article is a descriptive account of an important health informatics tool based on information from the project charter for Profile-IQ and the authors' experience and knowledge in design and use of this query system. Profile-IQ is a Web-based data query system that is based on open-source software: MySQL 5.5, Google Web Toolkit 2.2.0, Apache Commons Math library, Google Chart API, and Tomcat 6.0 Web server deployed on an Amazon EC2 server. It supports dynamic queries of National Profile of Local Health Departments data on local health department finances, workforce, and activities. Profile-IQ's customizable queries provide a variety of statistics not available in published reports and support the growing information needs of users who do not wish to work directly with data files for lack of staff skills or time, or to avoid a data use agreement. Profile-IQ also meets the growing demand of public health practitioners and policy makers for data to support quality improvement, community health assessment, and other processes associated with voluntary public health accreditation. It represents a step forward in the recent health informatics movement of data liberation and use of open source information technology solutions to promote public health.

  5. A topological framework for interactive queries on 3D models in the Web.

    Science.gov (United States)

    Figueiredo, Mauro; Rodrigues, José I; Silvestre, Ivo; Veiga-Pires, Cristina

    2014-01-01

    Several technologies exist to create 3D content for the web. With X3D, WebGL, and X3DOM, it is possible to visualize and interact with 3D models in a web browser. Frequently, three-dimensional objects are stored using the X3D file format for the web. However, there is no explicit topological information, which makes it difficult to design fast algorithms for applications that require adjacency and incidence data. This paper presents a new open source toolkit TopTri (Topological model for Triangle meshes) for Web3D servers that builds the topological model for triangular meshes of manifold or nonmanifold models. Web3D client applications using this toolkit make queries to the web server to get adjacency and incidence information of vertices, edges, and faces. This paper shows the application of the topological information to get minimal local points and iso-lines in a 3D mesh in a web browser. As an application, we also present the interactive identification of stalactites in a cave chamber in a 3D web browser. Several tests show that even for large triangular meshes with millions of triangles, the adjacency and incidence information is returned in real time, making the presented toolkit appropriate for interactive Web3D applications.
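
    The kind of adjacency and incidence data mentioned in this abstract can be illustrated with a minimal sketch. It is not the TopTri implementation; the function names `build_topology` and `adjacent_faces` are hypothetical, and the mesh is assumed to be a plain list of vertex-index triples.

```python
from collections import defaultdict


def build_topology(triangles):
    """Build incidence maps for a triangle mesh.

    `triangles` is a list of (a, b, c) vertex-index triples (an assumed input
    format). Returns two dictionaries:
      vertex_faces: vertex index -> set of incident face indices
      edge_faces:   frozenset({u, w}) -> set of incident face indices
    """
    vertex_faces = defaultdict(set)
    edge_faces = defaultdict(set)
    for face_id, (a, b, c) in enumerate(triangles):
        for vertex in (a, b, c):
            vertex_faces[vertex].add(face_id)
        for u, w in ((a, b), (b, c), (c, a)):
            # Undirected edge key, so (u, w) and (w, u) map to the same entry.
            edge_faces[frozenset((u, w))].add(face_id)
    return vertex_faces, edge_faces


def adjacent_faces(face_id, triangles, edge_faces):
    """Return the faces that share at least one edge with `face_id`."""
    a, b, c = triangles[face_id]
    neighbours = set()
    for u, w in ((a, b), (b, c), (c, a)):
        neighbours |= edge_faces[frozenset((u, w))]
    neighbours.discard(face_id)
    return neighbours


def is_nonmanifold_edge(edge, edge_faces):
    """An edge shared by more than two faces is treated as nonmanifold."""
    return len(edge_faces[frozenset(edge)]) > 2
```

    Precomputing the maps once makes each adjacency query a dictionary lookup, which is one plausible way to keep per-query cost low on large meshes.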

  6. Web Image Re-Ranking Using Query-Specific Semantic Signatures.

    Science.gov (United States)

    Wang, Xiaogang; Qiu, Shi; Liu, Ke; Tang, Xiaoou

    2014-04-01

    Image re-ranking, as an effective way to improve the results of web-based image search, has been adopted by current commercial search engines such as Bing and Google. Given a query keyword, a pool of images is first retrieved based on textual information. By asking the user to select a query image from the pool, the remaining images are re-ranked based on their visual similarities with the query image. A major challenge is that the similarities of visual features do not correlate well with images' semantic meanings, which interpret users' search intention. Recently, it has been proposed to match images in a semantic space that uses attributes or reference classes closely related to the semantic meanings of images as its basis. However, learning a universal visual semantic space to characterize highly diverse images from the web is difficult and inefficient. In this paper, we propose a novel image re-ranking framework, which automatically learns different semantic spaces for different query keywords offline. The visual features of images are projected into their related semantic spaces to get semantic signatures. At the online stage, images are re-ranked by comparing their semantic signatures obtained from the semantic space specified by the query keyword. The proposed query-specific semantic signatures significantly improve both the accuracy and efficiency of image re-ranking. The original visual features of thousands of dimensions can be projected to semantic signatures as short as 25 dimensions. Experimental results show that a 25-40 percent relative improvement has been achieved on re-ranking precisions compared with the state-of-the-art methods.
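
    The two stages described in this abstract (projecting visual features into a keyword-specific semantic space, then re-ranking by signature similarity) can be sketched roughly as follows. The abstract does not specify the classifier or the distance measure, so the linear reference-class scores, the softmax normalisation, and the L1 distance used here are assumptions, and all names are hypothetical.

```python
import math


def semantic_signature(features, reference_classes):
    """Project a visual feature vector onto a keyword-specific semantic space.

    `reference_classes` is a list of weight vectors, one per reference class
    (assumed to have been learned offline for the query keyword). The result
    is a short vector with one entry per reference class, normalised with a
    softmax so that the entries sum to 1.
    """
    scores = [sum(f * w for f, w in zip(features, weights))
              for weights in reference_classes]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def rerank(query_signature, pool_signatures):
    """Order image names by L1 distance between signatures (closest first).

    `pool_signatures` maps an image name to its semantic signature.
    """
    def distance(signature):
        return sum(abs(q - s) for q, s in zip(query_signature, signature))

    return sorted(pool_signatures, key=lambda name: distance(pool_signatures[name]))
```

    Because the comparison runs on the short signatures rather than the original high-dimensional features, the online step only touches as many numbers per image as there are reference classes.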

  7. A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data

    CERN Document Server

    Zimmermann, Antoine; Polleres, Axel; Straccia, Umberto

    2011-01-01

    We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increase in inconsistent and non-reliable meta-data on the web. We formalise the annotated language and the corresponding deductive system, and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism, as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains, making it possible to represent, e.g., temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language, AnQL, that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers) along with the formal definitions of their semantics.

  8. Mining Genotype-Phenotype Associations from Public Knowledge Sources via Semantic Web Querying.

    Science.gov (United States)

    Kiefer, Richard C; Freimuth, Robert R; Chute, Christopher G; Pathak, Jyotishman

    2013-01-01

    Gene Wiki Plus (GeneWiki+) and the Online Mendelian Inheritance in Man (OMIM) are publicly available resources for sharing information about disease-gene and gene-SNP associations in humans. While immensely useful to the scientific community, both resources are manually curated, thereby making the data entry and publication process time-consuming and, to some degree, error-prone. To this end, this study investigates Semantic Web technologies to validate existing and potentially discover new genotype-phenotype associations in GWP and OMIM. In particular, we demonstrate the applicability of SPARQL queries for identifying associations not explicitly stated for commonly occurring chronic diseases in GWP and OMIM, and report our preliminary findings for coverage, completeness, and validity of the associations. Our results highlight the benefits of Semantic Web querying technology to validate existing disease-gene associations as well as identify novel associations, although further evaluation and analysis are required before such information can be applied and used effectively.

  9. Head lice surveillance on a deregulated OTC-sales market: a study using web query data.

    Directory of Open Access Journals (Sweden)

    Johan Lindh

    Full Text Available The head louse, Pediculus humanus capitis, is an obligate ectoparasite that causes infestations of humans. Studies have demonstrated a correlation between sales figures for over-the-counter (OTC) treatment products and the number of humans with head lice. The deregulation of the Swedish pharmacy market on July 1, 2009, decreased the possibility to obtain complete sale figures and thereby the possibility to obtain yearly trends of head lice infestations. In the presented study we wanted to investigate whether web queries on head lice can be used as substitute for OTC sales figures. Via Google Insights for Search and Vårdguiden medical web site, the number of queries on "huvudlöss" (head lice) and "hårlöss" (lice in hair) were obtained. The analysis showed that both the Vårdguiden series and the Google series were statistically significant (p<0.001) when added separately, but if the Google series were already included in the model, the Vårdguiden series were not statistically significant (p = 0.5689). In conclusion, web queries can detect if there is an increase or decrease of head lice infested humans in Sweden over a period of years, and be as reliable a proxy as the OTC-sales figures.

  10. Head lice surveillance on a deregulated OTC-sales market: a study using web query data.

    Science.gov (United States)

    Lindh, Johan; Magnusson, Måns; Grünewald, Maria; Hulth, Anette

    2012-01-01

    The head louse, Pediculus humanus capitis, is an obligate ectoparasite that causes infestations of humans. Studies have demonstrated a correlation between sales figures for over-the-counter (OTC) treatment products and the number of humans with head lice. The deregulation of the Swedish pharmacy market on July 1, 2009, decreased the possibility to obtain complete sale figures and thereby the possibility to obtain yearly trends of head lice infestations. In the presented study we wanted to investigate whether web queries on head lice can be used as substitute for OTC sales figures. Via Google Insights for Search and Vårdguiden medical web site, the number of queries on "huvudlöss" (head lice) and "hårlöss" (lice in hair) were obtained. The analysis showed that both the Vårdguiden series and the Google series were statistically significant (p<0.001) when added separately, but if the Google series were already included in the model, the Vårdguiden series were not statistically significant (p = 0.5689). In conclusion, web queries can detect if there is an increase or decrease of head lice infested humans in Sweden over a period of years, and be as reliable a proxy as the OTC-sales figures.

  11. A Taxonomic Search Engine: Federating taxonomic databases using web services

    Directory of Open Access Journals (Sweden)

    Page Roderic DM

    2005-03-01

    Full Text Available Background: The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results: The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion: The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.
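
    The federated pattern this abstract describes (send one name to several databases, then summarise the answers in a consistent format) can be sketched as below. This is not the TSE code, which is written in PHP; the `NameRecord` type, the `federated_search` function, and the adapter interface are hypothetical, and each adapter is assumed to be a callable that returns (identifier, name) pairs for its source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    """One hit, normalised to a common shape regardless of its source."""
    source: str
    identifier: str
    name: str


def federated_search(name, adapters):
    """Query every source adapter and merge the hits into one list.

    `adapters` maps a source label to a callable that takes the search name
    and returns an iterable of (identifier, matched_name) pairs. A failing
    source is recorded in `failures` instead of aborting the whole search.
    """
    results = []
    failures = {}
    for source, adapter in adapters.items():
        try:
            for identifier, matched_name in adapter(name):
                results.append(NameRecord(source, str(identifier), matched_name))
        except Exception as exc:  # one unavailable source should not block the rest
            failures[source] = str(exc)
    return results, failures
```

    Keeping the per-source logic inside adapters means a new database can be added without changing the merging step, and a source that is down only reduces coverage rather than breaking the search.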

  12. Adding Conflict Resolution Features to a Query Language for Database Federations

    Directory of Open Access Journals (Sweden)

    Kai-Uwe Sattler

    2000-11-01

    Full Text Available A main problem of data integration is the treatment of conflicts caused by different modeling of real-world entities, different data models or simply by different representations of one and the same object. During the integration phase these conflicts have to be identified and resolved as part of the mapping between local and global schemata. Therefore, conflict resolution affects the definition of the integrated view as well as query transformation and evaluation. In this paper we present a SQL extension for defining and querying database federations. This language addresses in particular the resolution of integration conflicts by providing mechanisms for mapping attributes, restructuring relations as well as extended integration operations. Finally, the application of these resolution strategies is briefly explained by presenting a simple conflict resolution method.

  13. QueryArch3D: Querying and Visualising 3D Models of a Maya Archaeological Site in a Web-Based Interface

    Directory of Open Access Journals (Sweden)

    Giorgio Agugiaro

    2011-12-01

    Full Text Available Constant improvements in the field of surveying, computing and distribution of digital content are reshaping the way Cultural Heritage can be digitised and virtually accessed, even remotely via the web. A traditional 2D approach for data access, retrieval and exploration may generally suffice; however, more complex analyses concerning spatial and temporal features require 3D tools, which, in some cases, have not yet been implemented or are not yet generally commercially available. Efficient organisation and integration strategies applicable to the wide array of heterogeneous data in the field of Cultural Heritage represent a hot research topic nowadays. This article presents a visualisation and query tool (QueryArch3D) conceived to deal with multi-resolution 3D models. Geometric data are organised in successive levels of detail (LoD), provided with geometric and semantic hierarchies and enriched with attributes coming from external data sources. The visualisation and query front-end enables the 3D navigation of the models in a virtual environment, as well as the interaction with the objects by means of queries based on attributes or on geometries. The tool can be used as a standalone application, or served through the web. The characteristics of the research work, along with some implementation issues and the developed QueryArch3D tool, will be discussed and presented.

  14. Domainwise Web Page Optimization Based On Clustered Query Sessions Using Hybrid Of Trust And ACO For Effective Information Retrieval

    Directory of Open Access Journals (Sweden)

    Dr. Suruchi Chawla

    2015-08-01

    Full Text Available In this paper, a hybrid of Ant Colony Optimization (ACO) and trust is used for domainwise web page optimization in clustered query sessions for effective information retrieval. The trust of a web page identifies its degree of relevance in satisfying a specific information need of the user. The trusted web pages, when optimized using pheromone updates in ACO, identify the trusted colonies of web pages that are relevant to the user's information need in a given domain. Hence, in this paper the hybrid of trust and ACO is used on clustered query sessions to identify a larger number of relevant documents in a given domain in order to better satisfy the information need of the user. An experiment was conducted on a data set of web query sessions to test the effectiveness of the proposed approach in three selected domains (Academics, Entertainment and Sports), and the results confirm the improvement in the precision of search results.

  15. Overview of the TREC 2013 Federated Web Search Track

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Dolf; Nguyen, Dong; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb

  16. Overview of the TREC 2014 Federated Web Search Track

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Dolf; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging are again included in

  17. Resource Selection for Federated Search on the Web

    NARCIS (Netherlands)

    Nguyen, Dong; Demeester, Thomas; Trieschnigg, Dolf; Hiemstra, Djoerd

    2016-01-01

    A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web usi

  18. Resource Selection for Federated Search on the Web

    NARCIS (Netherlands)

    Nguyen, Dong-Phuong; Demeester, Thomas; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web

  19. Overview of the TREC 2013 Federated Web Search Track

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb

  20. The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories.

    Science.gov (United States)

    Weber, Griffin M; Murphy, Shawn N; McMurry, Andrew J; Macfadden, Douglas; Nigrin, Daniel J; Churchill, Susanne; Kohane, Isaac S

    2009-01-01

    The authors developed a prototype Shared Health Research Information Network (SHRINE) to identify the technical, regulatory, and political challenges of creating a federated query tool for clinical data repositories. Separate Institutional Review Boards (IRBs) at Harvard's three largest affiliated health centers approved use of their data, and the Harvard Medical School IRB approved building a Query Aggregator Interface that can simultaneously send queries to each hospital and display aggregate counts of the number of matching patients. Our experience creating three local repositories using the open source Informatics for Integrating Biology and the Bedside (i2b2) platform can be used as a road map for other institutions. The authors are actively working with the IRBs and regulatory groups to develop procedures that will ultimately allow investigators to obtain identified patient data and biomaterials through SHRINE. This will guide us in creating a future technical architecture that is scalable to a national level, compliant with ethical guidelines, and protective of the interests of the participating hospitals.
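
    The aggregator behaviour described in this abstract (send one query to each hospital simultaneously and show only aggregate counts of matching patients) can be sketched as follows. This is not the SHRINE or i2b2 code; `make_site`, `aggregate_counts`, and the dictionary-based query format are hypothetical stand-ins, and each site is modelled as a local function rather than a remote service.

```python
from concurrent.futures import ThreadPoolExecutor


def make_site(patients):
    """Wrap a site's local records behind a count-only interface.

    `patients` is a list of dictionaries (an assumed record format). The
    returned callable exposes only the number of matching patients, never
    the records themselves.
    """
    def count_matching(query):
        return sum(
            all(patient.get(key) == value for key, value in query.items())
            for patient in patients
        )
    return count_matching


def aggregate_counts(query, sites):
    """Send `query` to every site in parallel and collect aggregate counts.

    `sites` maps a site label to a count-only callable such as the one
    returned by `make_site`.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        futures = {label: pool.submit(count_fn, query)
                   for label, count_fn in sites.items()}
        per_site = {label: future.result() for label, future in futures.items()}
    return {"per_site": per_site, "total": sum(per_site.values())}
```

    Returning counts rather than rows keeps patient-level data inside each repository, which matches the aggregate-only display the abstract describes.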

  1. FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Zhou, Ke; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    2015-01-01

    This paper presents 'FedWeb Greatest Hits', a large new test collection for research in web information retrieval. As a combination and extension of the datasets used in the TREC Federated Web Search Track, this collection opens up new research possibilities on federated web search challenges, as we

  2. FedWeb greatest hits: presenting the new test collection for federated web search

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Dolf; Zhou, Ke; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    2015-01-01

    This paper presents 'FedWeb Greatest Hits', a large new test collection for research in web information retrieval. As a combination and extension of the datasets used in the TREC Federated Web Search Track, this collection opens up new research possibilities on federated web search challenges, as we

  3. FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Zhou, Ke; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    This paper presents 'FedWeb Greatest Hits', a large new test collection for research in web information retrieval. As a combination and extension of the datasets used in the TREC Federated Web Search Track, this collection opens up new research possibilities on federated web search challenges, as

  4. Overview of the TREC 2014 Federated Web Search Track

    OpenAIRE

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging are again included in FedWeb 2014, and we additionally introduced the task of vertical selection. Other new aspects are the required link between the Resource Selection and Results Merging, and the importance of diversi...

  5. Resource Selection for Federated Search on the Web

    OpenAIRE

    Nguyen, Dong Van; Demeester, Thomas; Trieschnigg, Dolf; Hiemstra, Djoerd

    2016-01-01

    A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web using a recently released test collection containing the results from more than a hundred real search engines, ranging from large general web search engines such as Google, Bing and Yahoo to small do...

  6. Overview of the TREC 2013 Federated Web Search Track

    OpenAIRE

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb 2013. The focus was on basic challenges in federated search: (1) resource selection, and (2) results merging. After an overview of the provided data collection and the relevance judgments for the ...

  7. DescribeX: A Framework for Exploring and Querying XML Web Collections

    CERN Document Server

    Rizzolo, Flavio

    2008-01-01

    This thesis introduces DescribeX, a powerful framework that is capable of describing arbitrarily complex XML summaries of web collections, providing support for more efficient evaluation of XPath workloads. DescribeX permits the declarative description of document structure using all axes and language constructs in XPath, and generalizes many of the XML indexing and summarization approaches in the literature. DescribeX supports the construction of heterogeneous summaries where different document elements sharing a common structure can be declaratively defined and refined by means of path regular expressions on axes, or axis path regular expressions (AxPREs). DescribeX can significantly help in the understanding of both the structure of complex, heterogeneous XML collections and the behaviour of XPath queries evaluated on them. Experimental results demonstrate the scalability of DescribeX summary refinements and stabilizations (the key enablers for tailoring summaries) with multi-gigabyte web collections. A com...

  8. Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance

    Science.gov (United States)

    Chan, Emily H.; Sahai, Vikram; Conrad, Corrie; Brownstein, John S.

    2011-01-01

    Background: A variety of obstacles including bureaucracy and lack of resources have interfered with timely detection and reporting of dengue cases in many endemic countries. Surveillance efforts have turned to modern data sources, such as Internet search queries, which have been shown to be effective for monitoring influenza-like illnesses. However, few have evaluated the utility of web search query data for other diseases, especially those of high morbidity and mortality or where a vaccine may not exist. In this study, we aimed to assess whether web search queries are a viable data source for the early detection and monitoring of dengue epidemics. Methodology/Principal Findings: Bolivia, Brazil, India, Indonesia and Singapore were chosen for analysis based on available data and adequate search volume. For each country, a univariate linear model was then built by fitting a time series of the fraction of Google search query volume for specific dengue-related queries from that country against a time series of official dengue case counts for a time-frame within 2003–2010. The specific combination of queries used was chosen to maximize model fit. Spurious spikes in the data were also removed prior to model fitting. The final models, fit using a training subset of the data, were cross-validated against both the overall dataset and a holdout subset of the data. All models were found to fit the data quite well, with validation correlations ranging from 0.82 to 0.99. Conclusions/Significance: Web search query data were found to be capable of tracking dengue activity in Bolivia, Brazil, India, Indonesia and Singapore. Whereas traditional dengue data from official sources are often not available until after some substantial delay, web search query data are available in near real-time. These data represent a valuable complement to assist with traditional dengue surveillance. PMID:21647308
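
    The modelling step described in this abstract (a univariate linear model fitted on a training subset and checked by correlation on a holdout subset) can be sketched as below. This is a generic ordinary least-squares fit, not the authors' code; the function names are hypothetical, and query selection and spike removal are left out.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def holdout_correlation(query_fraction, case_counts, n_train):
    """Fit on the first `n_train` points, then correlate predictions with
    observed case counts on the remaining (holdout) points."""
    intercept, slope = fit_line(query_fraction[:n_train], case_counts[:n_train])
    predicted = [intercept + slope * x for x in query_fraction[n_train:]]
    return pearson(predicted, case_counts[n_train:])
```

    Here `query_fraction` stands for the time series of the fraction of search volume and `case_counts` for the official case counts over the same periods; both are assumed to be aligned lists of numbers.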

  9. Using web search query data to monitor dengue epidemics: a new model for neglected tropical disease surveillance.

    Directory of Open Access Journals (Sweden)

    Emily H Chan

    2011-05-01

    Full Text Available BACKGROUND: A variety of obstacles including bureaucracy and lack of resources have interfered with timely detection and reporting of dengue cases in many endemic countries. Surveillance efforts have turned to modern data sources, such as Internet search queries, which have been shown to be effective for monitoring influenza-like illnesses. However, few have evaluated the utility of web search query data for other diseases, especially those of high morbidity and mortality or where a vaccine may not exist. In this study, we aimed to assess whether web search queries are a viable data source for the early detection and monitoring of dengue epidemics. METHODOLOGY/PRINCIPAL FINDINGS: Bolivia, Brazil, India, Indonesia and Singapore were chosen for analysis based on available data and adequate search volume. For each country, a univariate linear model was then built by fitting a time series of the fraction of Google search query volume for specific dengue-related queries from that country against a time series of official dengue case counts for a time-frame within 2003-2010. The specific combination of queries used was chosen to maximize model fit. Spurious spikes in the data were also removed prior to model fitting. The final models, fit using a training subset of the data, were cross-validated against both the overall dataset and a holdout subset of the data. All models were found to fit the data quite well, with validation correlations ranging from 0.82 to 0.99. CONCLUSIONS/SIGNIFICANCE: Web search query data were found to be capable of tracking dengue activity in Bolivia, Brazil, India, Indonesia and Singapore. Whereas traditional dengue data from official sources are often not available until after some substantial delay, web search query data are available in near real-time. These data represent a valuable complement to assist with traditional dengue surveillance.

  10. Age-related differences in the accuracy of web query-based predictions of influenza-like illness.

    Directory of Open Access Journals (Sweden)

    Alexander Domnich

    Full Text Available Web queries are now widely used for modeling, nowcasting and forecasting influenza-like illness (ILI). However, given that ILI attack rates vary significantly across ages, in terms of both magnitude and timing, little is known about whether the association between ILI morbidity and ILI-related queries is comparable across different age-groups. The present study aimed to investigate features of the association between ILI morbidity and ILI-related query volume from the perspective of age. Since Google Flu Trends is unavailable in Italy, Google Trends was used to identify entry terms that correlated highly with official ILI surveillance data. All-age and age-class-specific modeling was performed by means of linear models with generalized least-square estimation. Hold-out validation was used to quantify prediction accuracy. For purposes of comparison, predictions generated by exponential smoothing were computed. Five search terms showed high correlation coefficients of > .6. In comparison with exponential smoothing, the all-age query-based model correctly predicted the peak time and yielded a higher correlation coefficient with observed ILI morbidity (.978 vs. .929). However, query-based prediction of ILI morbidity was associated with a greater error. Age-class-specific query-based models varied significantly in terms of prediction accuracy. In the 0-4 and 25-44-year age-groups, these did well and outperformed exponential smoothing predictions; in the 15-24 and ≥ 65-year age-classes, however, the query-based models were inaccurate and highly overestimated peak height. In all but one age-class, peak timing predicted by the query-based models coincided with observed timing. The accuracy of web query-based models in predicting ILI morbidity rates could differ among ages. Greater age-specific detail may be useful in flu query-based studies in order to account for age-specific features of the epidemiology of ILI.

  11. A Query Language for Handling Big Observation Data Sets in the Sensor Web

    Science.gov (United States)

    Autermann, Christian; Stasch, Christoph; Jirka, Simon; Koppe, Roland

    2017-04-01

    The Sensor Web provides a framework for the standardized Web-based sharing of environmental observations and sensor metadata. While the issue of varying data formats and protocols is addressed by these standards, the fast growing size of observational data is imposing new challenges for the application of these standards. Most solutions for handling big observational datasets currently focus on remote sensing applications, while big in-situ datasets relying on vector features still lack a solid approach. Conventional Sensor Web technologies may not be adequate, as the sheer size of the data transmitted and the amount of metadata accumulated may render traditional OGC Sensor Observation Services (SOS) unusable. Besides novel approaches to store and process observation data in place, e.g. by harnessing big data technologies from mainstream IT, the access layer has to be amended to utilize and integrate these large observational data archives into applications and to enable analysis. For this, an extension to the SOS will be discussed that establishes a query language to dynamically process and filter observations at storage level, similar to the OGC Web Coverage Service (WCS) and its Web Coverage Processing Service (WCPS) extension. This will enable applications to request, e.g., spatially or temporally aggregated data sets at a resolution they are able to display or require. The approach will be developed and implemented in cooperation with the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, whose catalogue of data comprises marine observations of physical, chemical and biological phenomena from a wide variety of sensors, including mobile (like research vessels, aircraft or underwater vehicles) and stationary (like buoys or research stations). Observations are made with a high temporal resolution and the resulting time series may span multiple decades.

  12. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.

    Directory of Open Access Journals (Sweden)

    Asa Thibodeau

    2016-06-01

    Full Text Available Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: (1) building and visualizing chromatin interaction networks, (2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, (3) querying network components based on gene name or chromosome location, and (4) utilizing network-based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. QuIN's web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available under the GPLv3 license on GitHub: https://github.com/UcarLab/QuIN/.

  13. Linking Annual Prescription Volume of Antidepressants to Corresponding Web Search Query Data: A Possible Proxy for Medical Prescription Behavior?

    Science.gov (United States)

    Gahr, Maximilian; Uzelac, Zeljko; Zeiss, René; Connemann, Bernhard J; Lang, Dirk; Schönfeldt-Lecuona, Carlos

    2015-12-01

    Persons using the Internet to retrieve medical information generate large amounts of health-related data, which are increasingly used in modern health sciences. We analyzed the relation between annual prescription volumes (APVs) of several antidepressants with marketing approval in Germany and corresponding web search query data generated in Google to test whether web search query volume may be a proxy for medical prescription practice. We obtained APVs of several antidepressants related to corresponding prescriptions at the expense of the statutory health insurance in Germany from 2004 to 2013. Web search query data generated in Germany and related to defined search terms (active substance or brand name) were obtained with Google Trends. We calculated correlations (Pearson's r) between the APVs of each substance and the respective annual "search share" values; coefficients of determination (R²) were computed to determine the amount of variability shared by the 2 variables. Significant and strong correlations between substance-specific APVs and corresponding annual query volumes were found for each substance during the observational interval: agomelatine (r = 0.968, R² = 0.932, P = 0.01), bupropion (r = 0.962, R² = 0.925, P = 0.01), citalopram (r = 0.970, R² = 0.941, P = 0.01), escitalopram (r = 0.824, R² = 0.682, P = 0.01), fluoxetine (r = 0.885, R² = 0.783, P = 0.01), paroxetine (r = 0.801, R² = 0.641, P = 0.01), and sertraline (r = 0.880, R² = 0.689, P = 0.01). Although the data used did not allow an analysis with a higher temporal resolution (quarters, months), our results suggest that web search query volume may be a proxy for corresponding prescription behavior. However, further studies analyzing other pharmacologic agents and prescription data that facilitate an increased temporal resolution are needed to confirm this hypothesis.
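
    The two statistics named in this abstract are related in a simple way: for two variables, the coefficient of determination is the square of Pearson's r. The sketch below is only an illustration of that relationship; the yearly values are invented, and the variable names `apv` and `search_share` are assumptions rather than the study's data or code.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical yearly values (NOT from the paper): annual prescription
# volume of one substance and the matching annual "search share".
apv = [1.0, 1.4, 1.9, 2.5, 3.1]
search_share = [10, 15, 19, 26, 30]

r = pearson_r(apv, search_share)
r_squared = r ** 2  # share of variability common to both series

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")

# Squaring a reported coefficient gives the matching R^2 value,
# e.g. the bupropion figures quoted in the abstract:
print(round(0.962 ** 2, 3))
```

    With only ten annual data points per substance (2004 to 2013), estimates of this kind are sensitive to individual years, which is consistent with the authors' call for data at a higher temporal resolution.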

  14. Chinese college students’ Web querying behaviors: A case study of Peking University

    Institute of Scientific and Technical Information of China (English)

    QU Peng; LIU Chang; LAI Maosheng

    2010-01-01

    This study examined users’ querying behaviors based on a sample of 30 Chinese college students from Peking University. The authors designed 5 search tasks and each participant conducted two randomly selected search tasks during the experiment. The results show that when searching for pre-designed search tasks, users often have relatively clear goals and strategies before searching. When formulating their queries, users often select words from tasks, use concrete concepts directly, or extract "central words" or keywords. When reformulating queries, seven query reformulation types were identified from users’ behaviors, i.e. broadening, narrowing, issuing a new query, paralleling, changing search tools, reformulating syntax terms, and clicking on suggested queries. The results reveal that the search results and/or the contexts can also influence users’ querying behaviors.

  15. Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of Cerebrotendinous xanthomatosis.

    Science.gov (United States)

    Taboada, María; Martínez, Diego; Pilo, Belén; Jiménez-Escrig, Adriano; Robinson, Peter N; Sobrido, María J

    2012-07-31

    Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction. Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Ontology Web Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain named cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopt the standard logical model of an open world assumption.
This work demonstrates how semantic web technologies can be used to support flexible representation and computational inference mechanisms

  16. Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of cerebrotendinous xanthomatosis

    Directory of Open Access Journals (Sweden)

    Taboada María

    2012-07-01

    Full Text Available Abstract Background Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction. Methods Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Ontology Web Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain named cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. Results A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopt the standard logical model of an open world assumption.
Conclusions This work demonstrates how semantic web technologies can be used to support

  17. A Web 2.0 Application for Executing Queries and Services on Climatic Data

    Science.gov (United States)

    Abad-Mota, S.; Ruckhaus, E.; Garboza, A.; Tepedino, G.

    2007-12-01

    aggregation, hourly, daily, monthly, so that they can be provided to the user at the desired level. This means that additional caution has to be exercised in query answering, in order to distinguish between primary and derived data. On the other hand, a Web 2.0 application is being designed to provide a front-end to the repository. This design focuses on two important aspects: the use of metadata structures, and the definition of collaborative Web 2.0 features that can be integrated into a project of this nature. For a set of measurements, metadata descriptors include its quality, granularity and other dimension information. With these descriptors it is possible to establish relationships between different sets of measurements and provide scientists with efficient searching mechanisms that determine the related sets of measurements that contribute to a query answer. Unlike traditional applications for climatic data, our approach not only satisfies requirements of researchers specialized in this domain, but also those of anyone interested in this area; one of the objectives is to build an informal knowledge base that can be improved and consolidated with the usage of the system.

  18. Reducing Excessive Amounts of Data: Multiple Web Queries for Generation of Pun Candidates

    Directory of Open Access Journals (Sweden)

    Pawel Dybala

    2011-01-01

    Full Text Available Humor processing is still a less studied issue, both in NLP and AI. In this paper we contribute to this field. In our previous research we showed that adding a simple pun generator to a chatterbot can significantly improve its performance. The pun generator we used generated only puns based on words (not phrases). In this paper we introduce the next stage of the system's development: an algorithm allowing generation of phrasal pun candidates. We show that by using only the Internet (without any hand-made humor-oriented lexicons), it is possible to generate puns based on complex phrases. As the output list is often excessively long, we also propose a method for reducing the number of candidates by comparing two web-query-based rankings. The evaluation experiment showed that the system achieved an accuracy of 72.5% for finding proper candidates in general, and the reduction method allowed us to significantly shorten the candidates list. The parameters of the reduction algorithm are variable, so that the balance between the number of candidates and the quality of output can be manipulated according to needs.

  19. Research of Query Translation on Deep Web

    Institute of Scientific and Technical Information of China (English)

    邵秀丽; 李云龙; 张文龙

    2012-01-01

    A mapping mechanism for query translation is proposed, based on synonymous attributes and group attributes. It addresses how to achieve a more accurate mapping from the source query string to the target query string, and produces the translated query substring used for retrieval at each source site. The scheme was applied to 20 representative Deep Web sites selected from the domestic book domain, where it supported Deep Web search of these sites' book information reasonably well.

  20. Data federation in the Biomedical Informatics Research Network: tools for semantic annotation and query of distributed multiscale brain data.

    Science.gov (United States)

    Bug, William; Astahkov, Vadim; Boline, Jyl; Fennema-Notestine, Christine; Grethe, Jeffrey S; Gupta, Amarnath; Kennedy, David N; Rubin, Daniel L; Sanders, Brian; Turner, Jessica A; Martone, Maryann E

    2008-11-06

    The broadly defined mission of the Biomedical Informatics Research Network (BIRN, www.nbirn.net) is to better understand the causes of human disease and the specific ways in which animal models inform that understanding. To construct the community-wide infrastructure for gathering, organizing and managing this knowledge, BIRN is developing a federated architecture for linking multiple databases across sites contributing data and knowledge. Navigating across these distributed data sources requires a shared semantic scheme and supporting software framework to actively link the disparate repositories. At the core of this knowledge organization is BIRNLex, a formally-represented ontology facilitating data exchange. Source curators enable database interoperability by mapping their schema and data to BIRNLex semantic classes, thereby providing a means to cast BIRNLex-based queries against specific data sources in the federation. We will illustrate use of the source registration, term mapping, and query tools.

  1. KDS-CM: A Cache Mechanism Based on Top-K Data Source for Deep Web Query

    Institute of Scientific and Technical Information of China (English)

    KOU Yue; SHEN Derong; YU Ge; LI Dong; NIE Tiezheng

    2007-01-01

    Caching is an important technique to enhance the efficiency of query processing. Unfortunately, traditional caching mechanisms are not efficient for the deep Web because of storage space and dynamic maintenance limitations. In this paper, we present a cache mechanism for deep Web queries that is based on Top-K data sources (KDS-CM) instead of result records. By integrating techniques from IR and Top-K, a data reorganization strategy is presented to model KDS-CM. Some measures for cache management and optimization are also proposed to improve cache performance effectively. Experimental results show the benefits of KDS-CM in execution cost and dynamic maintenance when compared with various alternate strategies.

  2. The iMars WebGIS - Spatio-Temporal Data Queries and Single Image Map Web Services

    Science.gov (United States)

    Walter, Sebastian; Steikert, Ralf; Schreiner, Bjoern; Muller, Jan-Peter; van Gasselt, Stephan; Sidiropoulos, Panagiotis; Lanz-Kroechert, Julia

    2017-04-01

    Introduction: Web-based planetary image dissemination platforms usually show outline coverages of the data and offer querying for metadata as well as preview and download, e.g. the HRSC Mapserver (Walter & van Gasselt, 2014). Here we introduce a new approach for a system dedicated to change detection by simultaneous visualisation of single-image time series in a multi-temporal context. While the usual form of presenting multi-orbit datasets is the merge of the data into a larger mosaic, we want to stay with the single image as an important snapshot of the planetary surface at a specific time. In the context of the EU FP-7 iMars project we process and ingest vast amounts of automatically co-registered (ACRO) images. The base of the co-registration are the high precision HRSC multi-orbit quadrangle image mosaics, which are based on bundle-block-adjusted multi-orbit HRSC DTMs. Additionally we make use of the existing bundle-adjusted HRSC single images available at the PDS archives. A prototype demonstrating the presented features is available at http://imars.planet.fu-berlin.de. Multi-temporal database: In order to locate multiple coverage of images and select images based on spatio-temporal queries, we converge available coverage catalogs for various NASA imaging missions into a relational database management system with geometry support. We harvest available metadata entries during our processing pipeline using the Integrated Software for Imagers and Spectrometers (ISIS) software. Currently, this database contains image outlines from the MGS/MOC, MRO/CTX and the MO/THEMIS instruments with imaging dates ranging from 1996 to the present. For the MEx/HRSC data, we already maintain a database which we automatically update with custom software based on the VICAR environment. Web Map Service with time support: The MapServer software is connected to the database and provides Web Map Services (WMS) with time support based on the START_TIME image attribute.
It allows temporal

  3. Federated Space-Time Query for Earth Science Data Using OpenSearch Conventions

    Science.gov (United States)

    Lynnes, Chris; Beaumont, Bruce; Duerr, Ruth; Hua, Hook

    2009-01-01

    This slide presentation reviews a space-time query system that has been developed to assist users in finding Earth science data that fulfill researchers' needs. It reviews the reasons why finding Earth science data can be so difficult, and explains the workings of the Space-Time Query with OpenSearch and how this system can assist researchers in finding the required data. It also reviews the developments with client-server systems.
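
    As a rough illustration of how an OpenSearch-style space-time query can be formed, the sketch below fills a URL template with a keyword, a bounding box and a start time. The `{searchTerms}`, `{geo:box}`, `{time:start}` and `{time:end}` placeholders follow the general OpenSearch template syntax and its Geo and Time extensions; the endpoint `https://example.org/opensearch`, the query parameter names and the helper `fill_template` are hypothetical and are not taken from the presentation.

```python
import re
from urllib.parse import quote

# Hypothetical URL template; a trailing "?" marks an optional parameter.
TEMPLATE = (
    "https://example.org/opensearch?q={searchTerms}"
    "&bbox={geo:box?}&start={time:start?}&end={time:end?}"
)

def fill_template(template, values):
    """Replace {name} / {name?} placeholders with URL-encoded values."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            # Keep commas and colons readable in boxes and timestamps.
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")
    return re.sub(r"\{([^}?]+)(\?)?\}", substitute, template)

url = fill_template(
    TEMPLATE,
    {
        "searchTerms": "sea surface temperature",
        "geo:box": "-10,30,5,45",  # west,south,east,north
        "time:start": "2008-01-01T00:00:00Z",
    },
)
print(url)
```

    In a federated setting, each participating archive would publish its own template, so a client only needs this substitution step to address several providers with the same space-time constraints.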

  4. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases.

    Science.gov (United States)

    Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel

    2013-04-15

    In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.

  5. Robust Query Processing for Personalized Information Access on the Semantic Web

    DEFF Research Database (Denmark)

    Dolog, Peter; Stuckenschmidt, Heiner; Wache, Holger

    Research in Cooperative Query answering is triggered by the observation that users are often not able to correctly formulate queries to databases that return the intended result. Due to a lack of knowledge of the contents and the structure of a database, users will often only be able to provide v...

  6. An analysis of free-text queries for a multi-field web form

    NARCIS (Netherlands)

    Tjin-Kam-Jet, K.; Trieschnigg, D.; Hiemstra, D.

    2012-01-01

    We report how users interact with an experimental system that transforms single-field textual input into a multi-field query for an existing travel planner system. The experimental system was made publicly available and we collected over 30,000 queries from almost 12,000 users. From the free-text qu

  7. Typical Applications of jQuery in a Web Query Module for Wanxin Computer Room Management System

    Institute of Scientific and Technical Information of China (English)

    陈专红; 李焕

    2013-01-01

    Applying jQuery in the Web query module of the Wanxin computer room management system gave the module a better user experience, higher runtime efficiency and better interactive performance. The work offers some reference value for Web UI application developers.

  8. Web-based Visualization and Query of semantically segmented multiresolution 3D Models in the Field of Cultural Heritage

    Science.gov (United States)

    Auer, M.; Agugiaro, G.; Billen, N.; Loos, L.; Zipf, A.

    2014-05-01

    Many important Cultural Heritage sites have been studied over long periods of time by different researchers using different technical equipment, methods and intentions. This has led to huge amounts of heterogeneous "traditional" datasets and formats. The rising popularity of 3D models in the field of Cultural Heritage in recent years has brought additional data formats and makes it even more necessary to find solutions to manage, publish and study these data in an integrated way. The MayaArch3D project aims to realize such an integrative approach by establishing a web-based research platform bringing spatial and non-spatial databases together and providing visualization and analysis tools. In particular, the 3D components of the platform use hierarchical segmentation concepts to structure the data and to perform queries on semantic entities. This paper presents a database schema to organize not only segmented models but also different Levels-of-Detail and other representations of the same entity. It is further implemented in a spatial database which allows the storing of georeferenced 3D data. This enables organization and queries by semantic, geometric and spatial properties. As a service for the delivery of the segmented models, a standardization candidate of the Open Geospatial Consortium (OGC), the Web 3D Service (W3DS), has been extended to cope with the new database schema and deliver a web-friendly format for WebGL rendering. Finally, a generic user interface is presented which uses the segments as a navigation metaphor to browse and query the semantic segmentation levels and retrieve information from an external database of the German Archaeological Institute (DAI).

  9. Snippet-based relevance predictions for federated web search

    NARCIS (Netherlands)

    Demeester, Thomas; Nguyen, Dong; Trieschnigg, Dolf; Develder, Chris; Hiemstra, Djoerd

    2013-01-01

    How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and

  10. Snippet-based relevance predictions for federated web search

    OpenAIRE

    Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd

    2013-01-01

    How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and pages from a diverse set of actual Web search engines. A linear classifier is trained to predict the snippet-based user estimate of page relevance, but also, to predict the actual page relevance, ...

  11. Robust Query Processing for Personalized Information Access on the Semantic Web

    DEFF Research Database (Denmark)

    Dolog, Peter; Stuckenschmidt, Heiner; Wache, Holger

    and user preferences. We describe a framework for information access that combines query refinement and relaxation in order to provide robust, personalized access to heterogeneous RDF data as well as an implementation in terms of rewriting rules and explain its application in the context of e...

  12. Learning semantic query suggestions

    NARCIS (Netherlands)

    E. Meij; M. Bron; L. Hollink; B. Huurnink; M. de Rijke

    2009-01-01

    An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query and most popular search engines provide faci

  13. Collective spatial keyword querying

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2011-01-01

    With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group collectively satisfy a query. We define the problem of retrieving a group of spatial web objects such that the group's keywords cover the query...

  14. Categorical and Specificity Differences between User-Supplied Tags and Search Query Terms for Images. An Analysis of "Flickr" Tags and Web Image Search Queries

    Science.gov (United States)

    Chung, EunKyung; Yoon, JungWon

    2009-01-01

    Introduction: The purpose of this study is to compare characteristics and features of user supplied tags and search query terms for images on the "Flickr" Website in terms of categories of pictorial meanings and level of term specificity. Method: This study focuses on comparisons between tags and search queries using Shatford's categorization…

  15. Study of mobile web application development based on HTML5 and jQuery Mobile

    Institute of Scientific and Technical Information of China (English)

    覃凤萍

    2015-01-01

    With the rapidly growing popularity of smart devices such as the iPhone and Android phones, mobile web technology has become a new focus of attention, and traditional information and e-commerce websites are moving to mobile terminals in response to market demand. Developing mobile web applications with jQuery Mobile and HTML5 offers simple development, short release cycles, and cross-platform, cross-device support. This paper introduces and analyzes mobile web application development with jQuery Mobile and HTML5.

  16. CWI and TU Delft at TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks

    NARCIS (Netherlands)

    Bellogín Kouki, A.; Gebremeskel, G.G.; He, J.; Lin, J.J.P.; Said, A.; Samar, T.; Vries, A.P. de; Vuurens, J.B.P.

    2014-01-01

    This paper provides an overview of the work done at the Centrum Wiskunde & Informatica (CWI) and Delft University of Technology (TU Delft) for different tracks of TREC 2013. We participated in the Contextual Suggestion Track, the Federated Web Search Track, the Knowledge Base Acceleration (KBA) Trac

  17. NoSQL: collection document and cloud by using a dynamic web query form

    Science.gov (United States)

    Abdalla, Hemn B.; Lin, Jinzhao; Li, Guoquan

    2015-07-01

    MongoDB (from "humongous") is an open-source document database and the leading NoSQL database. NoSQL (Not Only SQL) databases are next-generation databases that are non-relational, distributed, open-source and horizontally scalable, and they provide a mechanism for the storage and retrieval of documents. Previously, we stored and retrieved data using SQL queries. Here, we use MongoDB, which means we do not use MySQL or SQL queries. Documents are imported directly into our drives and retrieved from them without SQL queries, using the IO BufferReader and BufferWriter: BufferReader imports document files into a folder (drive), and BufferWriter retrieves them from that folder or drive. Security is needed because documents stored in a local folder can be viewed and modified by anyone. To prevent this, the original document files are converted to another format; in this paper, a binary format is used. The documents are converted to binary and then stored directly in one of our folders, at which point the storage space provides a private key for accessing the file. Any other user who finds the document files sees only binary data, while the file's owner can view the original format using the personal key, after receiving the secret key from the cloud.

  18. An informatics supported web-based data annotation and query tool to expedite translational research for head and neck malignancies

    Directory of Open Access Journals (Sweden)

    Ridge-Hetrick Jennifer

    2009-11-01

    Background: The Specialized Program of Research Excellence (SPORE) in Head and Neck Cancer neoplasm virtual biorepository is a bioinformatics-supported system to incorporate data from various clinical, pathological, and molecular systems into a single architecture based on a set of common data elements (CDEs) that provides semantic and syntactic interoperability of data sets. Results: The various components of this annotation tool include the Development of Common Data Elements (CDEs) that are derived from College of American Pathologists (CAP) Checklist and North American Association of Central Cancer Registries (NAACR) standards. The Data Entry Tool is a portable and flexible Oracle-based data entry device, which is an easily mastered web-based tool. The Data Query Tool helps investigators and researchers to search de-identified information within the warehouse/resource through a "point and click" interface, thus enabling only the selected data elements to be essentially copied into a data mart using a multidimensional model from the warehouse's relational structure. The SPORE Head and Neck Neoplasm Database contains multimodal datasets that are accessible to investigators via an easy-to-use query tool. The database currently holds 6553 cases and 10607 tumor accessions. Among these, there are 965 metastatic, 4227 primary, 1369 recurrent, and 483 new primary cases. The data disclosure is strictly regulated by user's authorization. Conclusion: The SPORE Head and Neck Neoplasm Virtual Biorepository is a robust translational biomedical informatics tool that can facilitate basic science, clinical, and translational research. The Data Query Tool acts as a central source providing a mechanism for researchers to efficiently find clinically annotated datasets and biospecimens that are relevant to their research areas. The tool protects patient privacy by revealing only de-identified data in accordance with regulations and approvals of the IRB and

  19. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures.

    Science.gov (United States)

    Duan, Qiaonan; Flynn, Corey; Niepel, Mario; Hafner, Marc; Muhlich, Jeremy L; Fernandez, Nicolas F; Rouillard, Andrew D; Tan, Christopher M; Chen, Edward Y; Golub, Todd R; Sorger, Peter K; Subramanian, Aravind; Ma'ayan, Avi

    2014-07-01

    For the Library of Integrated Network-based Cellular Signatures (LINCS) project many gene expression signatures using the L1000 technology have been produced. The L1000 technology is a cost-effective method to profile gene expression in large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology. The LCB application is available at http://www.maayanlab.net/LINCS/LCB. Customized versions will be made part of the http://lincscloud.org and http://lincs.hms.harvard.edu websites.

  20. Semantic Web Query on e-Governance Data and Designing Ontology for Agriculture Domain

    Directory of Open Access Journals (Sweden)

    Swaran Lata

    2013-07-01

    Indian agriculture has made rapid progress on the agricultural front during the past three decades and is in a queue of the major producer in the world. But still it has long way to go and meet challenges ahead such as communication, resources, and availability at right time at right place. The web has had an amazing existence and it has been the driving force for a cause to grow information across boundaries, enabling effective communication and 24x7 service availability all leading to a digital information based economy that we have today. Despite that, its direct influence has reached to a small percentage of human population. Since localization populated with India and the applications are translated and adapted for Indian users. With the possible localization of spread raw formatted Indian government data, at different locations are thought to have integrated with each other using the internet web technology as – Semantic Web Network.

  1. Implementing a Federated Data Archive with Asynchronous Data Query, Gathering and Analysis Capabilities

    Science.gov (United States)

    Rankin, R.; Gordon, M.; Potter, R.

    2009-12-01

    Software architectures for data archives must be easily extensible to meet the requirements of new space science missions of increasing complexity and scope. Simultaneously, they must ensure that data is discoverable and easily usable by the research community. They must also respect the independence and rights of data providers, who may have digital rights management issues to enforce. The Canadian Space Science Data Portal (CSSDP) is facing these challenges by adopting a federated, portal-based, virtual observatory model, founded on Space Physics Archive Search and Extract (SPASE) metadata. The approach allows for the separation of the search, discovery and retrieval of pertinent data from its actual storage location, and the insulation of the researcher from the technical details of such data search, discovery, retrieval and usage. CSSDP implements a federation approach that allows data storage locations and methods to evolve separately from data access, and for data analysis capabilities to be offered through a common interface. This allows further independence of (and changes to) any archive in the federation, which is achieved without any change to CSSDP itself and without any impact on individual researchers. As CSSDP has successfully addressed core data storage and access challenges, it is now moving toward supporting improved data manipulation and analysis. CSSDP currently supports researcher collaboration and researcher "tagging" of relevant data, along with basic analytics; in the future, CSSDP will also allow researchers to insert scientific workflows and data analysis tools into their workspace. This paper will discuss how relevant features were implemented for CSSDP, the knowledge gained, the value delivered to the space science research community, and how to position data archives to achieve the same result in related disciplines.

  2. Federated queries of clinical data repositories: Scaling to a national network.

    Science.gov (United States)

    Weber, Griffin M

    2015-06-01

    Federated networks of clinical research data repositories are rapidly growing in size from a handful of sites to true national networks with more than 100 hospitals. This study creates a conceptual framework for predicting how various properties of these systems will scale as they continue to expand. Starting with actual data from Harvard's four-site Shared Health Research Information Network (SHRINE), the framework is used to imagine a future 4000 site network, representing the majority of hospitals in the United States. From this it becomes clear that several common assumptions of small networks fail to scale to a national level, such as all sites being online at all times or containing data from the same date range. On the other hand, a large network enables researchers to select subsets of sites that are most appropriate for particular research questions. Developers of federated clinical data networks should be aware of how the properties of these networks change at different scales and design their software accordingly.

  3. WebEQ: a web-GIS System to collect, display and query data for the management of the earthquake emergency in Central Italy

    Science.gov (United States)

    Carbone, Gianluca; Cosentino, Giuseppe; Pennica, Francesco; Moscatelli, Massimiliano; Stigliano, Francesco

    2017-04-01

    After the strong earthquakes that hit central Italy in recent months, the Center for Seismic Microzonation and its applications (CentroMS) was commissioned by the Italian Department of Civil Protection to conduct the study of seismic microzonation of the territories affected by the earthquake of August 24, 2016. As part of the activities of microzonation, IGAG CNR has created WebEQ, a management tool of the data that have been acquired by all participants (i.e., more than twenty research institutes and university departments). The data collection was organized and divided into sub-areas, assigned to working groups with multidisciplinary expertise in geology, geophysics and engineering. WebEQ is a web-GIS System that helps all the subjects involved in the data collection activities, through tools aimed at data uploading and validation, and with a simple GIS interface to display, query and download geographic data. WebEQ is contributing to the creation of a large database containing geographical data, both vector and raster, from various sources and types: Regional Technical Map; Geological and geomorphological maps; Data location maps; Maps of microzones homogeneous in seismic perspective and seismic microzonation maps; National strong motion network location. Data loading is done through simple input masks that ensure consistency with the database structure, avoiding possible errors and helping users to interact with the map through user-friendly tools. All the data are thematized through standardized symbologies and colors (Gruppo di lavoro MS 2008), in order to allow the easy interpretation by all users. The data download tools allow data exchange between working groups and the scientific community to benefit from the activities. The seismic microzonation activities are still ongoing. WebEQ is enabling easy management of large amounts of data and will form a basis for the development of tools for the management of the upcoming seismic emergencies.

  4. Semantic Web Query on E-Governance Data and Designing Ontology for Agriculture Domain

    Directory of Open Access Journals (Sweden)

    Swaran Lata

    2013-07-01

    Indian agriculture has made rapid progress on the agricultural front during the past three decades and is in a queue of the major producer in the world. But still it has long way to go and meet challenges ahead such as communication, resources, and availability at right time at right place. The web has had an amazing existence and it has been the driving force for a cause to grow information across boundaries, enabling effective communication and 24x7 service availability all leading to a digital information based economy that we have today. Despite that, its direct influence has reached to a small percentage of human population. Since localization populated with India and the applications are translated and adapted for Indian users. With the possible localization of spread raw formatted Indian government data, at different locations are thought to have integrated with each other using the internet web technology as – Semantic Web Network.

  5. Query for Semantic Web Service Based on SPARQL-DL

    Institute of Scientific and Technical Information of China (English)

    王海; 高岭; 范琳; 李增智

    2011-01-01

    Semantic Web service discovery is a hot topic in current web service research. Its core concerns are service descriptions and the corresponding discovery methods. Service descriptions can be divided into request descriptions and advertisement descriptions. Advertisement descriptions are usually complete and information-rich, whereas a request description is concerned only with some characteristics of a service and usually does not constitute a complete service description. Current discovery methods use the same mechanism to describe both requests and advertisements, and match them by comparing the identity or similarity of their corresponding parts. Constructing a complete, fictitious service description to serve as a request is neither reasonable nor easy to carry out, which limits the practicality of these methods. In this paper, we propose using the Semantic Web query language SPARQL-DL as the service request description language and OWL-S as the service advertisement description language, so that service discovery becomes a query against a knowledge base. Experiments confirm that the method is practical, simple, reliable and easy to use.

  6. The Research on Automatic Construction of Domain Model Based on Deep Web Query Interfaces

    Science.gov (United States)

    JianPing, Gu

    The integration of services is transparent, meaning that users no longer face millions of Web services, do not need to care where the required data are stored, and do not need to learn how to obtain these data. In this paper, we analyze the uncertainty of schema matching, and then propose a series of similarity measures. To reduce the cost of execution, we propose a type-based optimization method and a schema matching pruning method for numeric data. Based on the above analysis, we propose an uncertain schema matching method. The experiments demonstrate the effectiveness and efficiency of our method.

  7. Federated Search and the Library Web Site: A Study of Association of Research Libraries Member Web Sites

    Science.gov (United States)

    Williams, Sarah C.

    2010-01-01

    The purpose of this study was to investigate how federated search engines are incorporated into the Web sites of libraries in the Association of Research Libraries. In 2009, information was gathered for each library in the Association of Research Libraries with a federated search engine. This included the name of the federated search service and…

  9. Spatial Keyword Querying

    DEFF Research Database (Denmark)

    Cao, Xin; Chen, Lisi; Cong, Gao;

    2012-01-01

    The web is increasingly being used by mobile users. In addition, it is increasingly becoming possible to accurately geo-position mobile users and web content. This development gives prominence to spatial web data management. Specifically, a spatial keyword query takes a user location and user-sup...... different kinds of functionality as well as the ideas underlying their definition....

  10. Genes2GO: A web application for querying gene sets for specific GO terms.

    Science.gov (United States)

    Chawla, Konika; Kuiper, Martin

    2016-01-01

    Gene ontology annotations have become an essential resource for biological interpretations of experimental findings. The process of gathering basic annotation information in tables that link gene sets with specific gene ontology terms can be cumbersome, in particular if it requires above average computer skills or bioinformatics expertise. We have therefore developed Genes2GO, an intuitive R-based web application. Genes2GO uses the biomaRt package of Bioconductor in order to retrieve custom sets of gene ontology annotations for any list of genes from organisms covered by the Ensembl database. Genes2GO produces a binary matrix file, indicating for each gene the presence or absence of specific annotations for a gene. It should be noted that other GO tools do not offer this user-friendly access to annotations. Genes2GO is freely available and listed under http://www.semantic-systems-biology.org/tools/externaltools/.
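    The binary matrix described in this abstract can be illustrated with a small sketch. This is not Genes2GO code (the tool itself is R-based and uses biomaRt); the function name, gene labels and annotation sets below are hypothetical and serve only to show the output shape: one row per gene, one column per GO term, with 1 for presence and 0 for absence.

```python
def binary_annotation_matrix(annotations, genes, go_terms):
    """Return one row per gene with a 1/0 flag for each GO term."""
    return [
        [1 if term in annotations.get(gene, set()) else 0 for term in go_terms]
        for gene in genes
    ]


# Hypothetical toy input: gene -> set of GO term identifiers.
annotations = {
    "geneA": {"GO:0006915", "GO:0008283"},
    "geneB": {"GO:0008283"},
}
genes = ["geneA", "geneB", "geneC"]  # geneC has no annotations at all
go_terms = ["GO:0006915", "GO:0008283"]

matrix = binary_annotation_matrix(annotations, genes, go_terms)
for gene, row in zip(genes, matrix):
    print(gene, row)
```

    Genes without any annotation simply yield a row of zeros, which keeps the matrix rectangular for downstream tools.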

  11. Query recommendation for children

    NARCIS (Netherlands)

    Duarte Torres, Sergio; Hiemstra, Djoerd; Weber, Ingmar; Serdyukov, Pavel

    2012-01-01

    One of the biggest problems that children experience while searching the web occurs during the query formulation process. Children have been found to struggle formulating queries based on keywords given their limited vocabulary and their difficulty to choose the right keywords. In this work we propo

  12. WATERS Expert Query Tool

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Expert Query Tool is a web-based reporting tool using the EPA’s WATERS database. There are just three steps to using Expert Query: 1. View Selection – Choose what...

  13. Keyword-based Deep Web Database Query

    Institute of Scientific and Technical Information of China (English)

    丁传羽; 陈军华; 夏海峰

    2013-01-01

    The Deep Web contains vast amounts of information that existing search engines have difficulty reaching. How to fully obtain the valuable information in the Deep Web has become a problem. This paper proposes a keyword-based Deep Web database query method, which uses the Naive Bayes algorithm to classify keywords and log mining to gather statistics on sampled databases, and finally generates the SQL statement for the query. The method not only supports database queries across the many domains of the Deep Web, but can also be integrated with existing search engines to help users query quickly and efficiently.

  14. Query Interfaces Integrating on Deep Web

    Institute of Scientific and Technical Information of China (English)

    林爱群; 习万球

    2011-01-01

    Hidden Web databases contain far more searchable information than the Surface Web. If the query interfaces on the Deep Web are integrated into a unified query interface, the recall and precision of web information retrieval will be greatly improved. This paper discusses clustering analysis methods for the query schema integration problem. Integrating query interface schemas costs less than integrating the Deep Web data sources directly.

  15. A UDDI Search Engine for SVG Federated Medical Imaging Web Services

    Directory of Open Access Journals (Sweden)

    Sabah Mohammed

    2006-01-01

    With more and more medical web services appearing on the web, web service’s discovery mechanism becomes essential. UDDI is an online registry standard to facilitate the discovery of business partners and services. However, most medical imaging applications exist within their own protected domain and were never designed to participate and operate with other applications across the web. However, private UDDI registries in federated organizations should be able to share the service descriptions as well as to access them if they are authorized. The new initiatives on Federated Web Services Identity Management can resolve a range of both technical and political barriers to enable wide-scale participation and interoperation of separate domains into a singular, robust user experience. However, there is no widely acceptable standard for federated web services and most of the available vendor frameworks concentrate only on the security issue of the federation, leaving the issue of searching and discovering web services largely primitive. Federated web services security and web services searching are uniquely intertwined, mutually reliant on each other and are poised to finally solve a long-running problem in both IT and systems security. Traditional keyword search is insufficient for web services search as the very small text fragments in web services are unsuitable for keyword search and the underlying structure and semantics of the web service are not exploited. Engineering solutions that address the security and accessibility concerns of web services, however, is a challenging task. This article introduces an extension to the traditional UDDI that enables sophisticated types of searching based on a lightweight web services federated security infrastructure.

  16. Project Lefty: More Bang for the Search Query

    Science.gov (United States)

    Varnum, Ken

    2010-01-01

    This article describes the Project Lefty, a search system that, at a minimum, adds a layer on top of traditional federated search tools that will make the wait for results more worthwhile for researchers. At best, Project Lefty improves search queries and relevance rankings for web-scale discovery tools to make the results themselves more relevant…

  18. Drexel at TREC 2014 Federated Web Search Track

    Science.gov (United States)

    2014-11-01

    TREC 2014) held in Gaithersburg, Maryland, November 19-21, 2014. The conference was co-sponsored by the National Institute of Standards and... genre of content, e.g. news, blogs, encyclopedia, etc. The user’s query may have a strong indication of vertical intent, e.g. ”arrow icon”, which is

  19. Semantic Similarity-based Approach for Answering Imprecise Queries over Web Databases

    Institute of Scientific and Technical Information of China (English)

    孟祥福; 张霄雁; 马宗民; 张志艳

    2012-01-01

    To address the problem of answering imprecise queries over Web databases, this paper proposes a semantic similarity-based approach. For a given query, one or several queries whose similarity to it exceeds a given relaxation threshold are first found in the query history. The tuples that match these queries are then treated as the imprecise results of the current query. Finally, the result tuples are ranked by how well they satisfy the original query. Experimental results demonstrate that the proposed query similarity measure is stable and reasonable, and that the proposed imprecise query method achieves high recall and ranking accuracy.
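    The three steps in this abstract (find similar past queries, collect their matching tuples, rank by satisfaction of the original query) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the abstract does not define its semantic similarity measure, so a plain Jaccard overlap of attribute-value conditions stands in for it, and all function names, the threshold and the toy data are hypothetical.

```python
def jaccard(query_a, query_b):
    """Stand-in similarity: overlap of (attribute, value) conditions."""
    a, b = set(query_a.items()), set(query_b.items())
    return len(a & b) / len(a | b) if (a | b) else 1.0


def matches(row, query):
    """A tuple matches a query if it satisfies every condition exactly."""
    return all(row.get(attr) == value for attr, value in query.items())


def satisfaction(row, query):
    """Fraction of the original query's conditions that the tuple satisfies."""
    hits = sum(row.get(attr) == value for attr, value in query.items())
    return hits / (len(query) or 1)


def imprecise_query(query, history, rows, threshold=0.3):
    # Step 1: past queries whose similarity reaches the relaxation threshold.
    similar = [past for past in history if jaccard(query, past) >= threshold]
    # Step 2: tuples that match at least one of those similar queries.
    candidates = [row for row in rows if any(matches(row, q) for q in similar)]
    # Step 3: rank candidates by how well they satisfy the original query.
    return sorted(candidates, key=lambda row: satisfaction(row, query), reverse=True)


# Hypothetical toy data.
query = {"make": "Honda", "color": "red"}
history = [
    {"make": "Honda", "color": "blue"},
    {"make": "Honda"},
    {"make": "Ford", "color": "black"},
]
rows = [
    {"id": 1, "make": "Honda", "color": "red"},
    {"id": 2, "make": "Honda", "color": "blue"},
    {"id": 3, "make": "Ford", "color": "black"},
]
results = imprecise_query(query, history, rows)
print([row["id"] for row in results])
```

    Lowering the threshold admits more past queries and so widens the result set; the final ranking still keeps exact matches of the original query at the top.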

  20. Research on Enhancing Web User Experience with jQuery

    Institute of Scientific and Technical Information of China (English)

    侯海平

    2013-01-01

    More and more Internet companies are focusing on user experience, and jQuery, a lightweight JavaScript framework, emerged in response to this trend. It has great advantages for enhancing Web user experience, and its impact is gradually growing. This paper introduces the technical features of jQuery and analyzes how it can improve user experience.

  1. Overview of the TREC 2014 Federated Web Search Track

    Science.gov (United States)

    2014-11-01

    <description>You are looking for information about Squamous Cell Carcinoma (skin cancer). </description> <narrative>You have been diagnosed with squamous cell... dogs. For the rightmost topic 7222, no Key results were returned, although a number of HRel results were. The query route 666 appeared to be rather... ranking (rank correlation ρ = 0.95). Also, despite the clear absolute benefit of removing duplicates (with regard to the official metric nDCG@20), the

  2. Querying JSON Streams

    OpenAIRE

    Bo, Yang

    2010-01-01

    A data stream management system (DSMS) is similar to a database management system (DBMS) but can search data directly in on-line streams. Using its mediator-wrapper approach, the extensible database system, Amos II, allows different kinds of distributed data resource to be queried. It has been extended with a stream datatype to query possibly infinite streams, which provides DSMS functionality. Nowadays, more and more web applications start to offer their services in JSON format which is a te...

  3. AthMethPre: a web server for the prediction and query of mRNA m(6)A sites in Arabidopsis thaliana.

    Science.gov (United States)

    Xiang, Shunian; Yan, Zhangming; Liu, Ke; Zhang, Yaou; Sun, Zhirong

    2016-10-18

    N(6)-Methyladenosine (m(6)A) is the most prevalent and abundant modification in mRNA that has been linked to many key biological processes. High-throughput experiments have generated m(6)A-peaks across the transcriptome of A. thaliana, but the specific methylated sites were not assigned, which impedes the understanding of m(6)A functions in plants. Therefore, computational prediction of mRNA m(6)A sites becomes emergently important. Here, we present a method to predict the m(6)A sites for A. thaliana mRNA sequence(s). To predict the m(6)A sites of an mRNA sequence, we employed the support vector machine to build a classifier using the features of the positional flanking nucleotide sequence and position-independent k-mer nucleotide spectrum. Our method achieved good performance and was applied to a web server to provide service for the prediction of A. thaliana m(6)A sites. The server also provides a comprehensive database of predicted transcriptome-wide m(6)A sites and curated m(6)A-seq peaks from the literature for query and visualization. The AthMethPre web server is the first web server that provides a user-friendly tool for the prediction and query of A. thaliana mRNA m(6)A sites, which is freely accessible for public use at .

  4. EquiX-A Search and Query Language for XML.

    Science.gov (United States)

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  5. Moving Spatial Keyword Queries

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Jensen, Christian S.

    2013-01-01

    Web users and content are increasingly being geo-positioned. This development gives prominence to spatial keyword queries, which involve both the locations and textual descriptions of content. We study the efficient processing of continuously moving top-k spatial keyword (MkSK) queries over spatial...... text data. State-of-the-art solutions for moving queries employ safe zones that guarantee the validity of reported results as long as the user remains within the safe zone associated with a result. However, existing safe-zone methods focus solely on spatial locations and ignore text relevancy. We...

  6. jQuery cookbook

    CERN Document Server

    2010-01-01

    jQuery simplifies building rich, interactive web frontends. Getting started with this JavaScript library is easy, but it can take years to fully realize its breadth and depth; this cookbook shortens the learning curve considerably. With these recipes, you'll learn patterns and practices from 19 leading developers who use jQuery for everything from integrating simple components into websites and applications to developing complex, high-performance user interfaces. Ideal for newcomers and JavaScript veterans alike, jQuery Cookbook starts with the basics and then moves to practical use cases w

  7. Learning jQuery

    CERN Document Server

    Chaffer, Jonathan

    2013-01-01

    Step through each of the core concepts of the jQuery library, building an overall picture of its capabilities. Once you have thoroughly covered the basics, the book returns to each concept to cover more advanced examples and techniques. This book is for web designers who want to create interactive elements for their designs, and for developers who want to create the best user interface for their web applications. Basic JavaScript programming and knowledge of HTML and CSS are required. No knowledge of jQuery is assumed, nor is experience with any other JavaScript libraries.

  8. 75 FR 20400 - Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site

    Science.gov (United States)

    2010-04-19

    ... MANAGEMENT Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site... supporting documentation, may be obtained by contacting the San Antonio Services Branch, Office of Personnel... Science Foundation in accordance with the Federal Cyber Service Training and Education Initiative...

  9. On tourism information query system in Panjin based on WebGIS

    Institute of Scientific and Technical Information of China (English)

    杨帆; 任东风

    2016-01-01

    Based on the current state of tourism resources and tourism information query in Panjin, the paper establishes a tourism information query system for Panjin, developed in C# with Visual Studio 2010 as the development environment, Silverlight as the development framework, and Tianditu as the base map data. Besides basic map browsing, the system supports queries for scenic spots, catering, and similar information through classified query and fuzzy query. Multimedia content makes the scenic spots more expressive, and tourists can also look up transportation, accommodation, entertainment, and other travel-related information. The system is expected to further build up the new image of tourism development in Panjin.

  10. Attributes extraction of Deep Web query interface based on DOM

    Institute of Scientific and Technical Information of China (English)

    石龙; 强保华; 何倩; 吴春明; 谌超

    2012-01-01

    Query interface schema extraction is a precondition of Deep Web data integration. A query interface schema generally consists of a set of domain-related attributes, and each attribute is formed by a single element or a combination of multiple elements. Current research on attribute extraction mostly addresses the single-element case, and work on multi-element attributes is scarce. Targeting multi-element attribute extraction, a DOM-based method for query interface schema extraction is proposed. The method parses the query interface into a DOM and extracts the form elements based on the corresponding DOM nodes. It then applies two-phase clustering to the form elements, mines the combination relationships among them, and combines elements to form attributes. The method performs well on both single-element and multi-element attribute extraction, and the experimental results show that it is effective.
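
    The first step described above, pulling form elements out of a parsed query interface, can be sketched with Python's standard `html.parser`. This is a simplified illustration under stated assumptions: it walks parser events rather than building a full DOM tree, the sample form and the class name `FormElementCollector` are hypothetical, and the paper's two-phase clustering of elements into attributes is not reproduced.

```python
from html.parser import HTMLParser


class FormElementCollector(HTMLParser):
    """Collect input, select, and textarea elements that appear inside a form."""

    FORM_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.elements = []  # (tag, name attribute, type attribute)

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in self.FORM_TAGS:
            attr = dict(attrs)
            self.elements.append((tag, attr.get("name"), attr.get("type")))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_form_elements(html):
    collector = FormElementCollector()
    collector.feed(html)
    return collector.elements


# Hypothetical query interface: "year" is a multi-element attribute
# made of two select boxes, "title" is a single-element attribute.
SAMPLE = """
<form action="/search">
  <label>Title</label> <input type="text" name="title">
  <label>Year</label>
  <select name="year_from"><option>2000</option></select>
  <select name="year_to"><option>2012</option></select>
</form>
"""

elements = extract_form_elements(SAMPLE)
for element in elements:
    print(element)
```

    Grouping `year_from` and `year_to` into one "year" attribute is the part the clustering step would have to solve.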

  11. Application of Web Service in Querying Exit-Entry Information

    Institute of Scientific and Technical Information of China (English)

    陈桂芳; 邱旭华

    2006-01-01

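    Web Service is an emerging application communication and integration technology on the Internet. This paper first briefly introduces the Web Service architecture and the characteristics and advantages of Web Service technology, and then focuses on the application of Web Service in an exit-entry information query system.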

  12. Orthogonal Query Expansion

    CERN Document Server

    Ackerman, Margareta; Lopez-Ortiz, Alejandro

    2011-01-01

    Over the last fifteen years, web searching has seen tremendous improvements. Starting from a nearly random collection of matching pages in 1995, today, search engines tend to satisfy the user's informational need on well-formulated queries. One of the main remaining challenges is to satisfy the users' needs when they provide a poorly formulated query. When the pages matching the user's original keywords are judged to be unsatisfactory, query expansion techniques are used to alter the result set. These techniques find keywords that are similar to the keywords given by the user, which are then appended to the original query leading to a perturbation of the result set. However, when the original query is sufficiently ill-posed, the user's informational need is best met using entirely different keywords, and a small perturbation of the original result set is bound to fail. We propose a novel approach that is not based on the keywords of the original query. We intentionally seek out orthogonal queries, which are r...

  13. Research on Query Construction Method for Deep Web Data Crawling

    Institute of Scientific and Technical Information of China (English)

    林海伦; 杨晓刚; 熊锦华; 王元卓; 贾岩涛; 程学旗

    2015-01-01

    The large scale, heterogeneity, dynamic updating, and high noise of network big data pose a great challenge to knowledge acquisition. In particular, much website data is hidden in Web databases behind HTML forms. This Deep Web data can only be accessed dynamically by submitting form queries, so Web crawlers that collect resources by following hyperlinks cannot reach it, which reduces the coverage of the knowledge resources obtained. Crawling these data efficiently and making use of them is therefore challenging. This paper first analyzes existing query construction methods for Deep Web data acquisition in detail and introduces the methods that apply to each type of form. It then summarizes the advantages and limitations of the existing surfacing-based methods. Finally, it outlines future work to promote the further development of Deep Web crawling techniques.

  14. jQuery Mobile

    CERN Document Server

    Reid, Jon

    2011-01-01

    Native apps have distinct advantages, but the future belongs to mobile web apps that function on a broad range of smartphones and tablets. Get started with jQuery Mobile, the touch-optimized framework for creating apps that look and behave consistently across many devices. This concise book provides HTML5, CSS3, and JavaScript code examples, screen shots, and step-by-step guidance to help you build a complete working app with jQuery Mobile. If you're already familiar with the jQuery JavaScript library, you can use your existing skills to build cross-platform mobile web apps right now. This b

  15. Superfund Query

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Superfund Query allows users to retrieve data from the Comprehensive Environmental Response, Compensation, and Liability Information System (CERCLIS) database.

  16. Semantic web services for web databases

    CERN Document Server

    Ouzzani, Mourad

    2011-01-01

    Semantic Web Services for Web Databases introduces an end-to-end framework for querying Web databases using novel Web service querying techniques. This includes a detailed framework for the query infrastructure for Web databases and services. Case studies are covered in the last section of this book. Semantic Web Services For Web Databases is designed for practitioners and researchers focused on service-oriented computing and Web databases.

  17. Latent semantic analysis for query interfaces of deep web sites

    Institute of Scientific and Technical Information of China (English)

    茅琴娇; 冯博琴; 潘善亮

    2008-01-01

    Latent semantic analysis is a simple and effective way to further improve the efficiency of search engines and to support searching, indexing, and locating the large amount of useful information in the deep web. By applying latent semantic analysis to the form attributes of the query interfaces that serve as entrances to deep web sites, the latent semantic structure can be mined from the attributes and a degree of dimension reduction achieved. This semantic structure is used to infer the data content of the corresponding sites and to improve the similarity computation between sites. Experimental results show that latent semantic analysis corrects and improves the semantic understanding of form attributes of deep web sites, which compensates for some shortcomings of pure keyword matching. The approach can be used to find the sites most similar to a given site and to list sites with similar forms for a set of form attributes entered by the user.

  18. Query responses

    Directory of Open Access Journals (Sweden)

    Paweł Łupkowski

    2017-05-01

    Full Text Available In this article we consider the phenomenon of answering a query with a query. Although such answers are common, no large scale, corpus-based characterization exists, with the exception of clarification requests. After briefly reviewing different theoretical approaches on this subject, we present a corpus study of query responses in the British National Corpus and develop a taxonomy for query responses. We point at a variety of response categories that have not been formalized in previous dialogue work, particularly those relevant to adversarial interaction. We show that different response categories have significantly different rates of subsequent answer provision. We provide a formal analysis of the response categories in the framework of KoS.

  19. SCRY: Enabling quantitative reasoning in SPARQL queries

    NARCIS (Netherlands)

    Meroño-Peñuela, A.; Stringer, Bas; Loizou, Antonis; Abeln, Sanne; Heringa, Jaap

    2015-01-01

    The inability to include quantitative reasoning in SPARQL queries slows down the application of Semantic Web technology in the life sciences. SCRY, our SPARQL compatible service layer, improves this by executing services at query time and making their outputs query-accessible, generating RDF data on

  20. From scarcity to bounty: how Galateas can turn your scarce short queries into gold

    NARCIS (Netherlands)

    Segond, F.; Barbu, E.; Barsanti, I.; Kovachev, B.; Lagos, N.; Trevisan, M.; Vald, E.

    2012-01-01

    With the growth of digital libraries and the digital library federation in addition to partially unstructured collections of documents such as web sites, a large set of vendors are offering engines for retrieving content and metadata via search requests by the end user (queries). In most cases these

  2. HTML5 PivotViewer: high-throughput visualization and querying of image data on the web.

    Science.gov (United States)

    Taylor, Stephen; Noble, Roger

    2014-09-15

    Visualization and analysis of large numbers of biological images has generated a bottleneck in research. We present HTML5 PivotViewer, a novel, open source, platform-independent viewer making use of the latest web technologies that allows seamless access to images and associated metadata for each image. This provides a powerful method to allow end users to mine their data. Documentation, examples and links to the software are available from http://www.cbrg.ox.ac.uk/data/pivotviewer/. The software is licensed under GPLv2. © The Author 2014. Published by Oxford University Press.

  3. The estimate of the query capability of deep web interface

    Institute of Scientific and Technical Information of China (English)

    元书俊; 朱守中; 金灵芝

    2009-01-01

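    Information in deep web data sources can be accessed by submitting queries, so analyzing the query capability of a query interface is crucial. Based on the concept of atomic queries, this paper proposes a method that estimates the query capability of a deep web interface by identifying all atomic queries on the query interface.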

  4. The Design and Implementation of Mobile Query Web App Based on Hospital Information Platforms

    Institute of Scientific and Technical Information of China (English)

    谢颖夫; 张俊; 刘灵

    2016-01-01

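    At present, most hospital information queries rely on traditional information processing systems, which limits access to information anytime and anywhere. To address this, a mobile query Web App system based on the hospital information platform was developed using HTML5, jQuery Mobile, PhoneGap, and related technologies. The system effectively avoids the porting limitations that native iOS and Android apps impose on queries, and lets medical staff obtain hospital information in real time on mobile terminals.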

  5. Design and implementation of a catering industry Web App based on SSH and jQuery

    Institute of Scientific and Technical Information of China (English)

    张佳佳; 王杨; 韩力英

    2016-01-01

    To address the difficulty of maintaining and extending traditionally developed catering Web App platforms and their unsatisfactory user experience, this paper proposes a development scheme that integrates the SSH and jQuery frameworks. The scheme uses Windows as the development environment, Eclipse as the development tool, and Oracle as the database. It comprises a view layer, a business logic layer, and a data persistence layer, implemented by the JSP+jQuery, Struts+Spring, and Hibernate+Spring frameworks respectively. The results show that the scheme implements the main functions of registration and login, meal ordering, and take-out, achieves complete separation of the three layers, and improves the user experience.

  6. jQuery Pocket Reference

    CERN Document Server

    Flanagan, David

    2010-01-01

    "As someone who uses jQuery on a regular basis, it was surprising to discover how much of the library I'm not using. This book is indispensable for anyone who is serious about using jQuery for non-trivial applications."-- Raffaele Cecco, longtime developer of video games, including Cybernoid, Exolon, and Stormlord jQuery is the "write less, do more" JavaScript library. Its powerful features and ease of use have made it the most popular client-side JavaScript framework for the Web. This book is jQuery's trusty companion: the definitive "read less, learn more" guide to the library. jQuery P

  7. Instant jQuery selectors

    CERN Document Server

    De Rosa, Aurelio

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. Instant jQuery Selectors follows a simple how-to format with recipes aimed at making you well versed with the wide range of selectors that jQuery has to offer through a myriad of examples. Instant jQuery Selectors is for web developers who want to delve into jQuery from its very starting point: selectors. Even if you're already familiar with the framework and its selectors, you could find several tips and tricks that you aren't aware of, especially about performance and how jQuery ac

  8. Design and Implementation of Salary Query System Based on Web

    Institute of Scientific and Technical Information of China (English)

    李菁菁; 房芳; 王英

    2012-01-01

    Objective: To design and implement a Web-based distributed salary query system for hospital staff, in order to promote hospital informatization and the move to a "paperless office". Methods: SQL Server 2005 Enterprise Edition was chosen as the Web database. Web applications and Web services were created with ASP.NET using the general Web application template of the Microsoft .NET framework, and the Web pages were published through the Windows IIS component. Results: Over the hospital local area network, staff enter the IP address of the salary query server in a browser to view the details of their own salary. Conclusion: The system makes salary queries convenient for hospital staff and saves the manpower and material resources spent on distributing pay slips.

  9. Distributed Top-k Queries in E-commerce Environment

    Institute of Scientific and Technical Information of China (English)

    Jiang Zhan; Yiqing Song; Haixia Zhang

    2004-01-01

    This paper focuses on how to make distributed top-k queries in an e-commerce environment through web services. We first describe the query process in such an environment, then present an algorithm for processing such queries based on the query model we define. Experimental results show that the algorithm is efficient.

  10. Processing keyword queries under access limitations

    OpenAIRE

    Calì, Andrea; Martinenghi, D.; Torlone, R.

    2015-01-01

    The Deep Web is constituted by data accessible through Web pages, but not readily indexable by search engines, as they are returned in dynamic pages. In this paper we propose a framework for accessing Deep Web sources, represented as relational tables with so-called access limitations, with keyword-based queries. We formalize the notion of optimal answer and propose methods for query processing. To the best of our knowledge, ours is the first systematic approach to keyword search in such cont...

  11. Towards a free and federated social web: the case of Lorea

    Directory of Open Access Journals (Sweden)

    Alex Haché

    2012-06-01

    Full Text Available This paper tackles the limitations of web 2.0, the keys to overcoming them, and recent initiatives towards a free and federated social web. Among these, we focus on the case of Lorea, drawing on our direct experience with it.

  12. A web-based federated neuroinformatics model for surgical planning and clinical research applications in epilepsy.

    Science.gov (United States)

    Cao, Xinhua; Wong, Stephen T C; Hoo, Kent Soo; Tjandra, Donny; Fu, J C; Lowenstein, Daniel H

    2004-01-01

    There is an increasing need to efficiently share diverse clinical and image data among different clinics, labs, and departments of a medical center enterprise to facilitate better quality care and more effective clinical research. In this paper, we describe a web-based, federated information model as a viable technical solution with applications in medical refractory epilepsy and other neurological disorders. We describe four such online applications developed in a federated system prototype: surgical planning, image analysis, statistical data analysis, and dynamic extraction, transforming, and loading (ETL) of data from a heterogeneous collection of data sources into an epilepsy multimedia data warehouse (EMDW). The federated information system adopts a three-tiered architecture, consisting of a user-interface layer, an application logic layer, and a data service layer. We implemented two complementary federated information technologies, i.e., XML (eXtensible Markup Language) and CORBA (Common Object Request Broker Architecture), in the prototype to enable multimedia data exchange and brain images transmission. The preliminary results show that the federated prototype system provides a uniform interface, heterogeneous information integration and efficient data sharing for users in our institution who are concerned with the care of patients with epilepsy and who pursue research in this area.

  13. jQuery For Dummies

    CERN Document Server

    Beighley, Lynn

    2010-01-01

    Learn how jQuery can make your Web page or blog stand out from the crowd! jQuery is free, open source software that allows you to extend and customize Joomla!, Drupal, AJAX, and WordPress via plug-ins. Assuming no previous programming experience, Lynn Beighley takes you through the basics of jQuery from the very start. You'll discover how the jQuery library separates itself from other JavaScript libraries through its ease of use, compactness, and friendliness if you're a beginner programmer. Written in the easy-to-understand style of the For Dummies brand, this book demonstrates how you can a

  14. An Effective Information Retrieval for Ambiguous Query

    CERN Document Server

    Roul, R K

    2012-01-01

    A search engine returns thousands of web pages for a single user query, most of which are not relevant. In this context, effective information retrieval from the expanding web is a challenging task, particularly if the query is ambiguous. The major question is how to get the relevant pages for an ambiguous query. We propose an approach that improves the results for an ambiguous query by forming community vectors based on the association concept of data mining, using the vector space model and the freedictionary. We develop clusters by computing the similarity between community vectors and document vectors formed from the web pages extracted by the search engine. We use the Gensim package to implement the algorithm because of its simplicity and robust nature. Analysis shows that our approach is an effective way to form clusters for an ambiguous query.

  15. DEEP WEB DATA SOURCES CLASSIFICATION BASED ON TEXT VSM OF QUERY INTERFACE

    Institute of Scientific and Technical Information of China (English)

    石龙; 强保华; 谌超; 吴春明

    2013-01-01

    With the rapid development of Internet technology, the number of Web databases is huge and still growing fast. To organise and utilise the information hidden in Web databases effectively, the databases need to be classified and integrated by domain. Since the query interface on a Web page is the only channel for accessing a Web database, Deep Web data sources can be classified by classifying their query interfaces. This paper proposes a classification method based on a text VSM (Vector Space Model) of the query interface. The method first builds a vector space model from the text of the query interface, then trains one or more classifiers with typical data mining classification algorithms, and uses them to classify the domain to which each query interface belongs. Experimental results show that the proposed approach has good classification performance.
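
    The pipeline described above (interface text to vector space model to domain classifier) can be sketched in a few lines. The abstract does not say which classification algorithm was used, so this example substitutes a simple nearest-centroid classifier with cosine similarity; the training texts, domain labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of the visible text of a query interface."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labelled):
    """Sum the term vectors of each domain into one centroid per domain."""
    centroids = {}
    for text, domain in labelled:
        centroids.setdefault(domain, Counter()).update(tf_vector(text))
    return centroids


def classify(text, centroids):
    """Assign the domain whose centroid is most similar to the interface text."""
    vec = tf_vector(text)
    return max(centroids, key=lambda d: cosine(vec, centroids[d]))


# Hypothetical labelled interface texts (form labels joined into one string).
TRAINING = [
    ("title author isbn publisher", "books"),
    ("author keyword title format", "books"),
    ("departure city arrival city date passengers", "airfares"),
    ("from to departure date return date", "airfares"),
]

centroids = train(TRAINING)
label = classify("search by author or title", centroids)
print(label)
```

    A production system would normally add term weighting such as TF-IDF and a stronger classifier; the sketch only shows how interface text becomes a comparable vector.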

  16. Research and Discussion on the Impact of jQuery Plug-in on Web Application

    Institute of Scientific and Technical Information of China (English)

    邓佳源

    2016-01-01

    With the rapid development of the network, research on Web applications continues to deepen. Viewed objectively, as Web applications expand rapidly, the influence of plug-ins is decisive, and in many respects it has been strongly positive. On the other hand, with the proliferation of plug-ins, many Web applications have been considerably restricted: they have lost broad user groups and met user resistance in many respects, which leaves Web applications in a contradictory position. For this reason, jQuery plug-ins were applied in line with the characteristics and requirements of Web applications, which advanced the Web applications, curbed the overuse of plug-ins, and gave users a better experience. In the future, further research on the application of jQuery plug-ins should be carried out to improve Web applications.

  17. Implementation of Web Processing Services (WPS) over IPSL Earth System Grid Federation (ESGF) node

    Science.gov (United States)

    Kadygrov, Nikolay; Denvil, Sebastien; Carenton, Nicolas; Levavasseur, Guillaume; Hempelmann, Nils; Ehbrecht, Carsten

    2016-04-01

    The Earth System Grid Federation (ESGF) aims to provide access to climate data for the international climate community. ESGF is a system of distributed and federated nodes that dynamically interact with each other. An ESGF user may search and download climate data, geographically distributed over the world, from one common web interface and through a standardized API. With the continuous development of climate models and the beginning of the sixth phase of the Coupled Model Intercomparison Project (CMIP6), the amount of data available from ESGF will continuously increase during the next 5 years. IPSL holds a replica of the output of different global and regional climate models, observations and reanalysis data (CMIP5, CORDEX, obs4MIPs, etc.) that are available on the IPSL ESGF node. In order to let scientists analyse the models without downloading vast amounts of data, Web Processing Services (WPS) were installed at the IPSL compute node. The work is part of the CONVERGENCE project funded by the French National Research Agency (ANR). The PyWPS implementation of the Web Processing Service standard from the Open Geospatial Consortium (OGC), in the framework of the birdhouse software, is used. The processes can be run by the user remotely through a web-based WPS client or by using a command-line tool. All the calculations are performed on the server side, close to the data. If the models/observations are not available at IPSL, they are downloaded and cached by the WPS process from the ESGF network using the synda tool. The outputs of the WPS processes are available for download as plots, tar-archives or NetCDF files. We present the architecture of WPS at IPSL along with the processes for evaluation of the model performance, on-site diagnostics and post-analysis processing of the models output, e.g.: - regridding/interpolation/aggregation - ocgis (OpenClimateGIS) based polygon subsetting of the data - average seasonal cycle, multimodel mean, multimodel mean bias - calculation of the

  18. Deep Web query interface identification based on decision tree and link-similar

    Institute of Scientific and Technical Information of China (English)

    李雪玲; 施化吉; 兰均; 李星毅

    2011-01-01

    Existing methods for identifying Deep Web query interfaces produce many false positives and cannot effectively distinguish search engine interfaces. To address this, the paper proposes a Deep Web query interface identification method based on a decision tree and link similarity. The method selects important attributes by information gain ratio and builds a decision tree to pre-classify interface forms, identifying the interfaces with distinct features. A link-similarity-based method then makes a second decision on the interfaces not yet identified, accurately recognising genuine query interfaces and excluding search engine interfaces. Experimental results show that the method effectively distinguishes search engine interfaces and improves classification precision and recall.
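
    The attribute-selection criterion mentioned above, information gain ratio, is the standard C4.5-style measure: information gain divided by the entropy of the split itself. A minimal sketch follows; the form features (`has_password`, `many_fields`), the labels, and the sample rows are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` with respect to `target`, divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info else 0.0


# Hypothetical form features: login forms tend to carry a password field.
FORMS = [
    {"has_password": "yes", "many_fields": "no", "label": "other"},
    {"has_password": "yes", "many_fields": "yes", "label": "other"},
    {"has_password": "no", "many_fields": "yes", "label": "query"},
    {"has_password": "no", "many_fields": "no", "label": "query"},
]

best = max(["has_password", "many_fields"], key=lambda a: gain_ratio(FORMS, a, "label"))
print(best)
```

    A decision tree builder would apply this choice recursively, splitting on the best attribute at each node.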

  19. A novel adaptive Cuckoo search for optimal query plan generation.

    Science.gov (United States)

    Gomathi, Ramalingam; Sharmila, Dhandapani

    2014-01-01

    The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To enhance the efficiency in the execution time for querying large RDF graphs, the evolving metaheuristic algorithms become an alternative to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plans for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.

  20. A Novel Adaptive Cuckoo Search for Optimal Query Plan Generation

    Directory of Open Access Journals (Sweden)

    Ramalingam Gomathi

    2014-01-01

    Full Text Available The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To enhance the efficiency in the execution time for querying large RDF graphs, the evolving metaheuristic algorithms become an alternative to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plans for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.

  1. Head First jQuery

    CERN Document Server

    Benedetti, Ryan

    2011-01-01

    Want to add more interactivity and polish to your websites? Discover how jQuery can help you build complex scripting functionality in just a few lines of code. With Head First jQuery, you'll quickly get up to speed on this amazing JavaScript library by learning how to navigate HTML documents while handling events, effects, callbacks, and animations. By the time you've completed the book, you'll be incorporating Ajax apps, working seamlessly with HTML and CSS, and handling data with PHP, MySQL and JSON. If you want to learn, and understand, how to create interactive web pages, unobtrusive scrip

  2. Enhancing Recall in Semantic Querying

    DEFF Research Database (Denmark)

    Rouces, Jacobo

    2013-01-01

    RDF and SPARQL are currently state-of-the-art W3C standards to respectively represent and query structured information, especially when information from different sources must be federated. However, there are various reasons for which the same knowledge can be modeled in RDF graphs that are both ...

  3. jQuery for ASPNET Developers

    CERN Document Server

    Brinkman, Joe

    2009-01-01

    This Wrox Blox teaches you how to use jQuery with your ASP.NET-based websites.  jQuery greatly simplifies JavaScript development and allows you to create highly interactive and responsive websites using the latest JavaScript and AJAX techniques. The author walks you through the jQuery API using a simple ASP.NET MVC application to highlight major topics, and shows how you can apply jQuery to your own applications. After learning the basics of using jQuery, you'll discover how easy it is to use within your own ASP.NET projects.  Whether you are using WebForms or the MVC framework, jQuery will gr

  4. GQL: Extending XQuery to Query GML Documents

    Institute of Scientific and Technical Information of China (English)

    GUAN Jihong; ZHU Fubao; ZHOU Jiaogen; NIU Liping

    2006-01-01

    GML is becoming the de facto standard for electronic data exchange among the applications of Web and distributed geographic information systems. However, the conventional query languages (e.g. SQL and its extended versions) are not suitable for direct querying and updating of GML documents. Even the effective approaches working well with XML could not guarantee good results when applied to GML documents. Although XQuery is a powerful standard query language for XML, it is not proposed for querying spatial features, which constitute the most important components in GML documents. We propose GQL, a query language specification to support spatial queries over GML documents by extending XQuery. The data model, algebra, and formal semantics as well as various spatial functions and operations of GQL are presented in detail.

  5. Query-By-Keywords (QBK): Query Formulation Using Semantics and Feedback

    Science.gov (United States)

    Telang, Aditya; Chakravarthy, Sharma; Li, Chengkai

    The staples of information retrieval have been querying and search, respectively, for structured and unstructured repositories. Processing queries over known, structured repositories (e.g., Databases) has been well-understood, and search has become ubiquitous when it comes to unstructured repositories (e.g., Web). Furthermore, searching structured repositories has been explored to a limited extent. However, there is not much work in querying unstructured sources. We argue that querying unstructured sources is the next step in performing focused retrievals. This paper proposed a new approach to generate queries from search-like inputs for unstructured repositories. Instead of burdening the user with schema details, we believe that pre-discovered semantic information in the form of taxonomies, relationship of keywords based on context, and attribute & operator compatibility can be used to generate query skeletons. Furthermore, progressive feedback from users can be used to improve the accuracy of query skeletons generated.

  6. Performance Oriented Query Processing In GEO Based Location Search Engines

    CERN Document Server

    Umamaheswari, M

    2010-01-01

    Geographic location search engines allow users to constrain and order search results in an intuitive manner by focusing a query on a particular geographic region. Geographic search technology, also called location search, has recently received significant interest from major search engine companies. Academic research in this area has focused primarily on techniques for extracting geographic knowledge from the web. In this paper, we study the problem of efficient query processing in scalable geographic search engines. Query processing is a major bottleneck in standard web search engines, and the main reason for the thousands of machines used by the major engines. Geographic search engine query processing is different in that it requires a combination of text and spatial data processing techniques. We propose several algorithms for efficient query processing in geographic search engines, integrate them into an existing web search query processor, and evaluate them on large sets of real data and query traces.

  7. From Questions to Queries

    Directory of Open Access Journals (Sweden)

    M. Drlík

    2007-12-01

    Full Text Available The extension of (Internet) databases forces everyone to become more familiar with techniques of data storage and retrieval because users’ success often depends on their ability to pose right questions and to be able to interpret their answers. University programs pay more attention to developing database programming skills than to data exploitation skills. To educate our students to become “database users”, the authors intensively exploit supportive tools simplifying the production of database elements as tables, queries, forms, reports, web pages, and macros. Video sequences demonstrating “standard operations” for completing them have been prepared to enhance out-of-classroom learning. The use of SQL and other professional tools is reduced to the cases when the wizards are unable to generate the intended construct.

  8. Approximate dictionary queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Gasieniec, Leszek

    1996-01-01

    Given a set of n binary strings of length m each, we consider the problem of answering d-queries. Given a binary query string of length m, a d-query is to report if there exists a string in the set within Hamming distance d of the query string. We present a data structure of size O(nm) supporting 1-queries in ti...
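
    To make the d-query definition concrete, the following is a minimal sketch of a naive linear scan, assuming strings are represented as Python `str` values of equal length. It is only a baseline that answers the same question; it is not the data structure described in the abstract, and the names `hamming` and `d_query` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def d_query(dictionary: list[str], query: str, d: int) -> bool:
    """Report whether some stored string is within Hamming distance d of query.

    Naive baseline: scans all n strings and compares m symbols each.
    """
    return any(hamming(s, query) <= d for s in dictionary)


# Illustrative data (not from the paper).
words = ["0000", "1111", "1010"]
print(d_query(words, "1011", 1))  # "1010" differs from the query in one position
print(d_query(words, "0110", 1))  # every stored string differs in two positions
```

    The scan costs O(nm) time per query, which is the cost an index for d-queries aims to avoid.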

  9. Hybrid Filtering in Semantic Query Processing

    Science.gov (United States)

    Jeong, Hanjo

    2011-01-01

    This dissertation presents a hybrid filtering method and a case-based reasoning framework for enhancing the effectiveness of Web search. Web search may not reflect user needs, intent, context, and preferences, because today's keyword-based search is lacking semantic information to capture the user's context and intent in posing the search query.…

  10. Query Classification and Study of University Students' Search Trends

    Science.gov (United States)

    Maabreh, Majdi A.; Al-Kabi, Mohammed N.; Alsmadi, Izzat M.

    2012-01-01

    Purpose: This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet. Design/methodology/approach: The web log files were collected from one of the higher…

  11. Query Intent Disambiguation of Keyword-Based Semantic Entity Search in Dataspaces

    Institute of Scientific and Technical Information of China (English)

    Dan Yang; De-Rong Shen; Ge Yu; Yue Kou; Tie-Zheng Nie

    2013-01-01

    Keyword query has attracted much research attention due to its simplicity and wide applications. The inherent ambiguity of keyword query is prone to unsatisfied query results. Moreover, some existing techniques on Web query, keyword query in relational databases and XML databases cannot be completely applied to keyword query in dataspaces. So we propose KeymanticES, a novel keyword-based semantic entity search mechanism in dataspaces which combines both keyword query and semantic query features. And we focus on query intent disambiguation problem and propose a novel three-step approach to resolve it. Extensive experimental results show the effectiveness and correctness of our proposed approach.

  12. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    is desirable. This report presents a complete foundation for such OLAP-XML federations. This includes a prototypical query engine, a simplified query semantics based on previous work, and a complete physical algebra which enables precise modeling of the execution tasks of an OLAP-XML query. Effective algebra-based and cost-based query optimization and implementation are also proposed, as well as the execution techniques. Finally, experiments with the prototypical query engine w.r.t. federation performance, optimization effectiveness, and feasibility suggest that our approach, unlike the physical integration...

  13. Optimizing Temporal Queries

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2003-01-01

    Recent research in the area of temporal databases has proposed a number of query languages that vary in their expressive power and the semantics they provide to users. These query languages represent a spectrum of solutions to the tension between clean semantics and efficient evaluation. Often, these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the-art relational products. This paper presents an optimization technique that produces more efficient translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages.

  15. An Effective Query Routing Strategy over P2P Web Search

    Institute of Scientific and Technical Information of China (English)

    王振华; 李妹芳; 申德荣; 于戈

    2011-01-01

    Effective multi-keyword query routing is a key problem in P2P Web search. A novel query processing strategy based on the benefit-cost ratio is proposed. The method builds on a DHT-based P2P overlay and takes into account the correlation of keywords as well as the coverage and overlap among peers. Min-wise independent permutations are applied for overlap detection, so redundant routing to the same results is avoided. The experimental results show that the method significantly reduces query time while improving recall and precision.
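
    As an illustration of overlap detection with min-wise permutations, here is a minimal MinHash-style sketch. It assumes each peer's holdings are a non-empty set of string identifiers and approximates min-wise independent permutations with random linear maps modulo a prime; the function names, parameters, and sample data are hypothetical, and the paper's actual routing and benefit-cost computation are not reproduced.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the modulus of the linear maps


def _base_hash(item: str) -> int:
    """Deterministic integer hash of an item (stable across processes)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PRIME


def make_hashers(k: int, seed: int = 0) -> list[tuple[int, int]]:
    """Draw k random maps x -> (a*x + b) mod PRIME, each a permutation of Z_PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def signature(items: set[str], hashers: list[tuple[int, int]]) -> list[int]:
    """For each map, keep the minimum image over the (non-empty) item set."""
    bases = [_base_hash(x) for x in items]
    return [min((a * h + b) % PRIME for h in bases) for a, b in hashers]


def estimated_resemblance(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of agreeing minima; estimates |A ∩ B| / |A ∪ B|."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Illustrative peers: 50 shared documents out of 150 distinct ones.
hashers = make_hashers(512)
peer_a = {f"doc{i}" for i in range(100)}
peer_b = {f"doc{i}" for i in range(50, 150)}
estimate = estimated_resemblance(signature(peer_a, hashers), signature(peer_b, hashers))
print(f"estimated overlap: {estimate:.2f} (true resemblance is 1/3)")
```

    Peers would only need to exchange the fixed-size signatures, not their full document lists, to judge whether routing a query to both is likely to return mostly the same results.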

  16. Language engineering for the Semantic Web: a digital library for endangered languages. Endangered languages, Ontology, Digital library, Multimedia, EMELD, Intelligent querying and retrieval, ImageSpace

    Directory of Open Access Journals (Sweden)

    Lu Shiyong

    2004-01-01

    Full Text Available In this paper, we describe the effort undertaken at Wayne State University to preserve endangered languages using the state-of-the-art information technologies. In particular, we discuss the issues involved in such an effort, and present the architecture of a distributed digital library for endangered languages which will contain various data of endangered languages in the forms of text, image, video, audio and include advanced tools for intelligent cataloguing, indexing, searching and browsing information on languages and language analysis. We use various Semantic Web technologies such as XML, OLAC, ontologies so that our digital library becomes a useful linguistic resource on the Semantic Web.

  17. A Deep Web Query Interface Matching Approach Based on Evidence Theory and Task Assignment

    Institute of Scientific and Technical Information of China (English)

    董永权; 李庆忠; 丁艳辉; 张永新

    2011-01-01

    To address the limitations of existing query interface matching approaches, namely the difficulty of setting matcher weights and the lack of effective handling of matching decisions, a Deep Web query interface matching approach based on evidence theory and task assignment (ETTA-IM) is proposed. First, an improved D-S evidence theory is used to automatically combine the results of multiple matchers, so the weight of each matcher need not be set by hand and human involvement is reduced. Then, the one-to-one matching decision is converted into an extended task assignment problem, which selects a proper correspondence in the target query interface for each attribute of the source query interface. Finally, based on the one-to-one matching results, heuristic rules over the tree structure are used to make one-to-many matching decisions. Experimental results show that the ETTA-IM approach achieves high precision and recall.
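
    For readers unfamiliar with D-S evidence theory, the sketch below shows the standard Dempster rule of combination applied to two matchers. It is not the improved variant used by ETTA-IM, which the abstract does not specify; the frame {"match", "nomatch"}, the mass values, and the function name `combine` are assumptions made purely for illustration.

```python
from itertools import product

Mass = dict[frozenset, float]  # maps a subset of the frame to its belief mass


def combine(m1: Mass, m2: Mass) -> Mass:
    """Standard Dempster rule: multiply masses, pool by intersection, renormalise."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {subset: w / (1.0 - conflict) for subset, w in combined.items()}


MATCH = frozenset({"match"})
NOMATCH = frozenset({"nomatch"})
UNKNOWN = MATCH | NOMATCH  # the whole frame: "cannot tell"

# Hypothetical outputs of two matchers for one attribute pair.
name_matcher = {MATCH: 0.6, UNKNOWN: 0.4}
type_matcher = {MATCH: 0.7, NOMATCH: 0.1, UNKNOWN: 0.2}

fused = combine(name_matcher, type_matcher)
for subset, weight in fused.items():
    print(sorted(subset), round(weight, 3))
```

    Because each matcher can leave mass on the whole frame, an uncertain matcher weakens the fused belief without any hand-tuned weight, which is the property the abstract relies on.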

  18. Keeping Dublin Core Simple: Cross-Domain Discovery or Resource Description?; First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging E-Book Environment; Interoperability: Digital Rights Management and the Emerging EBook Environment; Searching the Deep Web: Direct Query Engine Applications at the Department of Energy.

    Science.gov (United States)

    Lagoze, Carl; Neylon, Eamonn; Mooney, Stephen; Warnick, Walter L.; Scott, R. L.; Spence, Karen J.; Johnson, Lorrie A.; Allen, Valerie S.; Lederman, Abe

    2001-01-01

    Includes four articles that discuss Dublin Core metadata, digital rights management and electronic books, including interoperability; and directed query engines, a type of search engine designed to access resources on the deep Web that is being used at the Department of Energy. (LRW)

  20. jQuery 2.0 animation techniques beginner's guide

    CERN Document Server

    Culpepper, Adam

    2013-01-01

    This book is a guide to help you create attractive web page animations using jQuery. Written in a friendly and engaging style, this book is designed to be placed alongside your computer as a mentor. If you are a web designer or a frontend developer, or if you want to learn how to animate the user interface of your web applications with jQuery, this book is for you. Experience with jQuery or JavaScript would be helpful, but a solid knowledge base of HTML and CSS is assumed.

  1. Investigating metrics of geospatial web services: The case of a CEOS federated catalog service for earth observation data

    Science.gov (United States)

    Han, Weiguo; Di, Liping; Yu, Genong; Shao, Yuanzheng; Kang, Lingjun

    2016-07-01

    Geospatial Web Services (GWS) make geospatial information and computing resources discoverable and accessible over the Web. Among them, Open Geospatial Consortium (OGC) standards-compliant data, catalog and processing services are most popular, and have been widely adopted and leveraged in geospatial research and applications. The GWS metrics, such as visit count, average processing time, and user distribution, are important to evaluate their overall performance and impacts. However, these metrics, especially of federated catalog service, have not been systematically evaluated and reported to relevant stakeholders from the point of view of service providers. Taking an integrated catalog service for earth observation data as an example, this paper describes metrics information retrieval, organization, and representation of a catalog service federation. An extensible and efficient log file analyzer is implemented to retrieve a variety of service metrics from the log file and store analysis results in an easily programmable format. An Ajax powered Web portal is built to provide stakeholders, sponsors, developers, partners, and other types of users with specific and relevant insights into metrics information in an interactive and informative form. The deployed system has provided useful information for periodical reports, service delivery, and decision support. The proposed measurement strategy and analytics framework can be a guidance to help GWS providers evaluate their services.

  2. Web Page Recommendation Using Web Mining

    Directory of Open Access Journals (Sweden)

    Modraj Bhavsar

    2014-07-01

    Full Text Available On the World Wide Web, various kinds of content are generated in huge amounts, so web recommendation has become an important part of web applications for giving relevant results to users. Different kinds of web recommendations are made available to users every day, including image, video, audio, query suggestion, and web page recommendations. In this paper we aim to provide a framework for web page recommendation. (1) First, we describe the basics of web mining and the types of web mining. (2) We then give details of each web mining technique. (3) Finally, we propose the architecture for personalized web page recommendation.

  3. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    . In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine as well as the query evaluation techniques. Performance experiments with a prototypical implementation suggest...

  4. Elements of a Spatial Web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2010-01-01

    and are relevant to a text argument. An important element in enabling such queries is to be able to rank spatial web objects. Another is to be able to determine the relevance of an object to a query. Yet another is to enable the efficient processing of such queries. The talk covers recent results on spatial web...

  5. Mining tree-query associations in graphs

    CERN Document Server

    Hoekx, Eveline

    2010-01-01

    New applications of data mining, such as in biology, bioinformatics, or sociology, are faced with large datasets structured as graphs. We introduce a novel class of tree-shaped patterns called tree queries, and present algorithms for mining tree queries and tree-query associations in a large data graph. Novel about our class of patterns is that they can contain constants, and can contain existential nodes which are not counted when determining the number of occurrences of the pattern in the data graph. Our algorithms have a number of provable optimality properties, which are based on the theory of conjunctive database queries. We propose a practical, database-oriented implementation in SQL, and show that the approach works in practice through experiments on data about food webs, protein interactions, and citation analysis.

  6. A Query Language for Formal Mathematical Libraries

    CERN Document Server

    Rabe, Florian

    2012-01-01

    One of the most promising applications of mathematical knowledge management is search: Even if we restrict attention to the tiny fragment of mathematics that has been formalized, the amount exceeds the comprehension of an individual human. Based on the generic representation language MMT, we introduce the mathematical query language QMT: It combines simplicity, expressivity, and scalability while avoiding a commitment to a particular logical formalism. QMT can integrate various search paradigms such as unification, semantic web, or XQuery style queries, and QMT queries can span different mathematical libraries. We have implemented QMT as a part of the MMT API. This combination provides a scalable indexing and query engine that can be readily applied to any library of mathematical knowledge. While our focus here is on libraries that are available in a content markup language, QMT naturally extends to presentation and narration markup languages.

  7. Web Query Expansion and Refinement using Query - Level Clustering

    Directory of Open Access Journals (Sweden)

    Senthil Kumar N

    2013-04-01

    Full Text Available The objectives of this paper are to open a new dimension in Internet searching and to bring core semantic strategies to the forefront to add value to the search process. Precisely, “the search must be what the user wishes, not what the user types”. To understand the intricacy of the search process, we observed that vocabulary contradiction and mismatch problems during retrieval can lead to irrelevant document matching. Generally, a term or vocabulary mismatch can happen in a search iteration only if the terms are not present in the fetched documents. Many techniques have been proposed, such as library science, pseudo relevance feedback, and later semantic indexing, where all the algorithms tend to meet the stated objectives but do not deal with an alternate process. Hence we have proposed a technique that addresses these pitfalls and devised a new mechanism to handle the mismatch problem. By bringing the semantic aspects and the word order of sentences to the core, we have arrived at a proper solution to the sentence or term mismatch problem.

  8. Spatial information semantic query based on SPARQL

    Science.gov (United States)

    Xiao, Zhifeng; Huang, Lei; Zhai, Xiaofang

    2009-10-01

    How can the efficiency of spatial information inquiries be enhanced in today's fast-growing information age? We are rich in geospatial data but poor in up-to-date geospatial information and knowledge that are ready to be accessed by public users. This paper adopts an approach for querying spatial semantics by building a Web Ontology Language (OWL) format ontology and introducing SPARQL Protocol and RDF Query Language (SPARQL) to search spatial semantic relations. It is important to establish spatial semantics that support effective spatial reasoning for performing semantic queries. Compared to earlier keyword-based and information retrieval techniques that rely on syntax, we use semantic approaches in our spatial query system. Semantic approaches need to be developed by ontology, so we use OWL to describe spatial information extracted from the large-scale map of Wuhan. Spatial information expressed by ontology with formal semantics is available to machines for processing and to people for understanding. The approach is illustrated by introducing a case study for using SPARQL to query geo-spatial ontology instances of Wuhan. The paper shows that making use of SPARQL to search OWL ontology instances can ensure the result's accuracy and applicability. The result also indicates that constructing a geo-spatial semantic query system has positive effects on forming spatial query and retrieval.

  9. Evaluation of Query Generators for Entity Search Engines

    CERN Document Server

    Endrullis, Stefan; Rahm, Erhard

    2010-01-01

    Dynamic web applications such as mashups need efficient access to web data that is only accessible via entity search engines (e.g. product or publication search engines). However, most current mashup systems and applications only support simple keyword searches for retrieving data from search engines. We propose the use of more powerful search strategies building on so-called query generators. For a given set of entities query generators are able to automatically determine a set of search queries to retrieve these entities from an entity search engine. We demonstrate the usefulness of query generators for on-demand web data integration and evaluate the effectiveness and efficiency of query generators for a challenging real-world integration scenario.

  10. How Do Search Engines Handle Chinese Queries?

    Directory of Open Access Journals (Sweden)

    Hong Cui

    2005-10-01

    Full Text Available The use of languages other than English has been growing exponentially on the Web. However, the major search engines have been lagging behind in providing indexes and search features to handle these languages. This article explores the characteristics of the Chinese language and how queries in this language are handled by different search engines. Queries were entered in two major search engines (Google and AlltheWeb and two search engines developed for Chinese (Sohu and Baidu. Criteria such as handling word segmentation, number of retrieved documents, and correct display and identification of Chinese characters were used to examine how the search engines handled the queries. The results showed that the performance of the two major search engines was not on a par with that of the search engines developed for Chinese.

  11. Query Language for Complex Similarity Queries

    CERN Document Server

    Budikova, Petra; Zezula, Pavel

    2012-01-01

    For complex data types such as multimedia, traditional data management methods are not suitable. Instead of attribute matching approaches, access methods based on object similarity are becoming popular. Recently, this resulted in an intensive research of indexing and searching methods for the similarity-based retrieval. Nowadays, many efficient methods are already available, but using them to build an actual search system still requires specialists that tune the methods and build the system manually. Several attempts have already been made to provide a more convenient high-level interface in a form of query languages for such systems, but these are limited to support only basic similarity queries. In this paper, we propose a new language that allows to formulate content-based queries in a flexible way, taking into account the functionality offered by a particular search engine in use. To ensure this, the language is based on a general data model with an abstract set of operations. Consequently, the language s...

  12. Digital Investigations of an Archaeological Smart Point Cloud: a Real Time Web-Based Platform to Manage the Visualisation of Semantical Queries

    Science.gov (United States)

    Poux, F.; Neuville, R.; Hallot, P.; Van Wersch, L.; Luczfalvy Jancsó, A.; Billen, R.

    2017-05-01

    While virtual copies of the real world tend to be created faster than ever through point clouds and derivatives, their working proficiency by all professionals' demands adapted tools to facilitate knowledge dissemination. Digital investigations are changing the way cultural heritage researchers, archaeologists, and curators work and collaborate to progressively aggregate expertise through one common platform. In this paper, we present a web application in a WebGL framework accessible on any HTML5-compatible browser. It allows real time point cloud exploration of the mosaics in the Oratory of Germigny-des-Prés, and emphasises the ease of use as well as performances. Our reasoning engine is constructed over a semantically rich point cloud data structure, where metadata has been injected a priori. We developed a tool that directly allows semantic extraction and visualisation of pertinent information for the end users. It leads to efficient communication between actors by proposing optimal 3D viewpoints as a basis on which interactions can grow.

  13. DIGITAL INVESTIGATIONS OF AN ARCHAEOLOGICAL SMART POINT CLOUD: A REAL TIME WEB-BASED PLATFORM TO MANAGE THE VISUALISATION OF SEMANTICAL QUERIES

    Directory of Open Access Journals (Sweden)

    F. Poux

    2017-05-01

    Full Text Available While virtual copies of the real world tend to be created faster than ever through point clouds and derivatives, their working proficiency by all professionals’ demands adapted tools to facilitate knowledge dissemination. Digital investigations are changing the way cultural heritage researchers, archaeologists, and curators work and collaborate to progressively aggregate expertise through one common platform. In this paper, we present a web application in a WebGL framework accessible on any HTML5-compatible browser. It allows real time point cloud exploration of the mosaics in the Oratory of Germigny-des-Prés, and emphasises the ease of use as well as performances. Our reasoning engine is constructed over a semantically rich point cloud data structure, where metadata has been injected a priori. We developed a tool that directly allows semantic extraction and visualisation of pertinent information for the end users. It leads to efficient communication between actors by proposing optimal 3D viewpoints as a basis on which interactions can grow.

  14. Web Usage Mining Analysis of Federated Search Tools for Egyptian Scholars

    Science.gov (United States)

    Mohamed, Khaled A.; Hassan, Ahmed

    2008-01-01

    Purpose: This paper aims to examine the behaviour of the Egyptian scholars while accessing electronic resources through two federated search tools. The main purpose of this article is to provide guidance for federated search tool technicians and support teams about user issues, including the need for training. Design/methodology/approach: Log…

  16. Ontology Based Queries - Investigating a Natural Language Interface

    NARCIS (Netherlands)

    van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.

    2010-01-01

    In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.

  17. A Novel Query Method on Web Educational Resources Based on Semantic Theme Similarity

    Institute of Scientific and Technical Information of China (English)

    王杨; 尤科本; 吴梦婷; 虞威; 陈付龙; 赵传信

    2014-01-01

    To address the difficulty that users of traditional Web education have in obtaining highly available educational resources, and to overcome the defects of keyword search, a Web educational resource query method based on semantic theme similarity is proposed. The method builds an Ontology Concept Semantic Network (OCSN). On this basis, a concept retrieval method based on semantic theme similarity matching is designed: before retrieval, educational resources are organized into the ontology concept semantic network according to their semantics and themes; a vertical search engine for discovering Web educational resources based on semantic characteristics is then set up; and a similarity function satisfying the required conditions is constructed to map the corresponding semantic distance into a similarity, which effectively improves query efficiency. Experimental results show that the method can improve the precision and recall of Web educational resource retrieval.

  18. Efficient Web Service Query Approach Based on Concept Relaxation

    Institute of Scientific and Technical Information of China (English)

    欧伟杰; 曾承; 项小明; 彭智勇; 李德毅

    2011-01-01

    With the development of cloud computing, service-oriented applications on the Internet are growing rapidly, and composite services built on cloud services are emerging in large numbers on open platforms. This makes it a significant challenge for users to locate the services they need quickly and precisely. Although traditional service query approaches have made considerable progress in recall and precision, they are not suited to large-scale service discovery in a dynamic Internet environment. This article proposes a similarity-based service query approach based on concept relaxation, which exploits the semantic relations between concepts in a hierarchical ontology. By computing the influence of unrelated concepts and services on the query result, it improves both the quality and the efficiency of service queries. Experiments show that the proposed approach outperforms traditional ones and satisfies the scalability requirements of service discovery. The method has been implemented in an on-demand service platform.

  19. Indexing for summary queries

    DEFF Research Database (Denmark)

    Yi, Ke; Wang, Lu; Wei, Zhewei

    2014-01-01

    ), of a particular attribute of these records. Aggregation queries are especially useful in business intelligence and data analysis applications where users are interested not in the actual records, but some statistics of them. They can also be executed much more efficiently than reporting queries, by embedding...

  20. Mastering jQuery

    CERN Document Server

    Libby, Alex

    2015-01-01

    If you are a developer who is already familiar with using jQuery and wants to push your skill set further, then this book is for you. The book assumes an intermediate knowledge level of jQuery, JavaScript, HTML5, and CSS.

  2. A Novel Ranking Algorithm of Query Words Stored in QIIIEP Server

    Directory of Open Access Journals (Sweden)

    Dilip Kumar Sharma

    2010-11-01

    This paper proposes a novel algorithm for ranking the query words stored in the QIIIEP server, which are used to post queries that extract content from the deep web (Sharma and Sharma, 2009). These words can either be collected by the auto query words extraction module or be submitted by the webmasters of third-party sites. The paper analyzes existing algorithms for ranking query words and suggests an improved algorithm that includes newer ranking parameters. An elaborate analysis of the concept of query word ranking is carried out so as to arrive at an improved algorithm with enhanced efficiency that conforms to global standards. The proposed algorithm analyzes the context of a web page with respect to the supplied keywords, together with the frequency of simultaneous occurrence of the same keyword on the surface web, to assign each query word a numerical weight that measures its relative importance within the set.

  3. SPARQL Assist Language-Neutral Query Composer

    CERN Document Server

    McCarthy, Luke; Wilkinson, Mark

    2010-01-01

    SPARQL query composition is difficult for the lay-person or even the experienced bioinformatician in cases where the data model is unfamiliar. Established best-practices and internationalization concerns dictate that semantic web ontologies should use terms with opaque identifiers, further complicating the task. We present SPARQL Assist: a web application that addresses these issues by providing context-sensitive type-ahead completion to existing web forms. Ontological terms are suggested using their labels and descriptions, leveraging existing XML support for internationalization and language-neutrality.

  4. Declarative Visualization Queries

    Science.gov (United States)

    Pinheiro da Silva, P.; Del Rio, N.; Leptoukh, G. G.

    2011-12-01

    In an ideal interaction with machines, scientists may prefer to write declarative queries saying "what" they want from a machine rather than write code stating "how" the machine is going to address the user request. For example, in relational databases, users have long relied on specifying queries using Structured Query Language (SQL), a declarative language to request data results from a database management system. In the context of visualizations, we see that users are still writing code based on complex visualization toolkit APIs. With the goal of improving the scientists' experience of using visualization technology, we have applied this query-answering pattern to a visualization setting, where scientists specify what visualizations they want generated using a declarative SQL-like notation. A knowledge-enhanced management system ingests the query and knows the following: (1) how to translate the query into visualization pipelines; and (2) how to execute the visualization pipelines to generate the requested visualization. We define visualization queries as declarative requests for visualizations specified in an SQL-like language. Visualization queries specify what category of visualization to generate (e.g., volumes, contours, surfaces) as well as associated display attributes (e.g., color and opacity), without regard for implementation, thus allowing scientists to remain partially unaware of a wide range of visualization toolkit (e.g., Generic Mapping Tools and Visualization Toolkit) specific implementation details. Implementation details are only a concern for our knowledge-based visualization management system, which uses both the information specified in the query and knowledge about visualization toolkit functions to construct visualization pipelines. Knowledge about the use of visualization toolkits includes what data formats the toolkit operates on, what formats they output, and what views they can generate. Visualization knowledge, which is not

  5. TAP Service Federation Factory

    Science.gov (United States)

    Hume, A. C.; Krause, A.; Holliman, M.; Mann, R. G.; Noddle, K.; Voutsinas, S.

    2012-09-01

    This paper describes a prototype federation service for multiple TAP endpoints. Users can create a new TAP resource that allows them to query the federation as if all tables were in a single database.

  6. Range-Clustering Queries

    OpenAIRE

    Abrahamsen, Mikkel; de Berg, Mark; Buchin, Kevin; Mehr, Mehran; Mehrabi, Ali D.

    2017-01-01

    In a geometric $k$-clustering problem the goal is to partition a set of points in $\mathbb{R}^d$ into $k$ subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set $S$: given a query box $Q$ and an integer $k>2$, compute an optimal $k$-clustering for $S\setminus Q$. We obtain the following results. We present a general method to compute a $(1+\epsilon)$-approximation to a range-clustering query, ...

  7. Benchmarking Query Execution Robustness

    Science.gov (United States)

    Wiener, Janet L.; Kuno, Harumi; Graefe, Goetz

    Benchmarks that focus on running queries on a well-tuned database system ignore a long-standing problem: adverse runtime conditions can cause database system performance to vary widely and unexpectedly. When the query execution engine does not exhibit resilience to these adverse conditions, addressing the resultant performance problems can contribute significantly to the total cost of ownership for a database system in over-provisioning, lost efficiency, and increased human administrative costs. For example, focused human effort may be needed to manually invoke workload management actions or fine-tune the optimization of specific queries.

  8. Joint Top-K Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Cong, Gao

    2012-01-01

    Web users and content are increasingly being geopositioned, and increased focus is being given to serving local content in response to web queries. This development calls for spatial keyword queries that take into account both the locations and textual descriptions of content. We study...... keyword queries. Empirical studies show that the proposed solution is efficient on real data sets. We also offer analytical studies on synthetic data sets to demonstrate the efficiency of the proposed solution.

  9. Query Recommendation by Coupling Personalization with Clustering for Search Engine

    Directory of Open Access Journals (Sweden)

    Dhiliphanrajkumar.Thambidurai

    2016-11-01

    Internet and web search engines have become an important part of day-to-day life. For a single user query, thousands of web pages may be retrieved, yet most of them are irrelevant. A major problem for search engines is that user queries are usually short and ambiguous, and so are not sufficient to capture the user's precise needs. Long result lists also force users to spend a large amount of time searching for the desired results. To overcome these problems, an approach is developed that captures users' click-through and bookmarking data to provide personalized query recommendation. The Google API is used to retrieve results. Experimental results show that the proposed method provides better query recommendations than existing query suggestion methods.

  10. Deep Web query interface schema matching based on matching degree and semantic similarity

    Institute of Scientific and Technical Information of China (English)

    冯永; 张洋

    2012-01-01

    Query interface schema matching is a key step in Deep Web data integration. Dual Correlated Mining (DCM) makes effective use of association mining to solve complex interface schema matching problems, but it suffers from inefficiency and inaccuracy in matching. This paper therefore presents a new method based on matching degree and semantic similarity. The method first uses a correlation matrix to store the association relationships among attributes, then applies matching degree to calculate the degree of correlation between attributes, and finally uses semantic similarity to compute the similarity of candidate matches and so ensure the accuracy of the final results. Experimental results on the BAMM data sets of the University of Illinois show that the proposed method has higher precision and efficiency than DCM and improved DCM, indicating that it handles query interface schema matching problems well.

  11. Localized Geometric Query Problems

    CERN Document Server

    Augustine, John; Maheshwari, Anil; Nandy, Subhas C; Roy, Sasanka; Sarvattomananda, Swami

    2011-01-01

    A new class of geometric query problems is studied in this paper. We are required to preprocess a set of geometric objects $P$ in the plane, so that for any arbitrary query point $q$, the largest circle that contains $q$ but does not contain any member of $P$ can be reported efficiently. The geometric sets that we consider are point sets and boundaries of simple polygons.

  12. Design of the city public transportation query system based on Web 2.0

    Institute of Scientific and Technical Information of China (English)

    何成万; 张慧; 陈艳兰; 严小环

    2009-01-01

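    This paper presents the design and implementation of a public transportation information query system based on Web 2.0. The system implements management of route and stop data as well as transfer queries, and it can display bus routes, stops, and related information on Google Maps in the browser.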

  13. Inverse Queries For Multidimensional Spaces

    CERN Document Server

    Bernecker, Thomas; Kriegel, Hans-Peter; Mamoulis, Nikos; Renz, Matthias; Zhang, Shiming; Züfle, Andreas

    2011-01-01

    Traditional spatial queries return, for a given query object $q$, all database objects that satisfy a given predicate, such as epsilon range and $k$-nearest neighbors. This paper defines and studies inverse spatial queries, which, given a subset of database objects $Q$ and a query predicate, return all objects which, if used as query objects with the predicate, contain $Q$ in their result. We first show a straightforward solution for answering inverse spatial queries for any query predicate. Then, we propose a filter-and-refinement framework that can be used to improve efficiency. We show how to apply this framework on a variety of inverse queries, using appropriate space pruning strategies. In particular, we propose solutions for inverse epsilon range queries, inverse $k$-nearest neighbor queries, and inverse skyline queries. Our experiments show that our framework is significantly more efficient than naive approaches.
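
    The definition of an inverse query can be made concrete with a brute-force baseline for the epsilon-range case. This is only a sketch of the definition, not the paper's algorithm or its filter-and-refinement framework; the function names, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
from math import dist

def range_query(db, q, eps):
    """Forward query: all database points within distance eps of q."""
    return {p for p in db if dist(p, q) <= eps}

def inverse_range_query(db, query_set, eps):
    """Naive inverse eps-range query (hypothetical helper).

    Keeps every object o whose own eps-range result contains all
    objects of query_set. Cost is O(n^2) distance computations.
    """
    wanted = set(query_set)
    return [o for o in db if wanted <= range_query(db, o, eps)]

# Illustrative data: three nearby points and one far away.
db = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
result = inverse_range_query(db, [(1.0, 0.0), (0.0, 1.0)], eps=1.0)
```

    Only the origin has both query objects inside its range here, so it is the single object returned. A pruning strategy would aim to avoid running the forward query for every object.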

  14. Usare WebDewey

    OpenAIRE

    Baldi, Paolo

    2016-01-01

    This presentation shows how to use the WebDewey tool. Features of WebDewey. Italian WebDewey compared with American WebDewey. Querying Italian WebDewey. Italian WebDewey and MARC21. Italian WebDewey and UNIMARC. Numbers, captions, "equivalente verbale": Dewey decimal classification in Italian catalogues. Italian WebDewey and Nuovo soggettario. Italian WebDewey and LCSH. Italian WebDewey compared with printed version of Italian Dewey Classification (22. edition): advantages and disadvantages o...

  15. jQuery for designers beginner's guide

    CERN Document Server

    MacLees, Natalie

    2014-01-01

    A step-by-step guide that spices up your web pages and designs them in the way you want using the most widely used JavaScript library, jQuery. The beginner-friendly and easy-to-understand approach of the book will help you get to grips with jQuery in no time. If you know the fundamentals of HTML and CSS, and want to extend your knowledge by learning to use JavaScript, then this is just the book for you. jQuery makes JavaScript straightforward and approachable - you'll be surprised at how easy it can be to add animations and special effects to your beautifully designed pages.

  16. jQuery Mobile Up and Running

    CERN Document Server

    Firtman, Maximiliano

    2012-01-01

    Would you like to build one mobile web application that works on iPad and Kindle Fire as well as iPhone and Android smartphones? This introductory guide to jQuery Mobile shows you how. Through a series of hands-on exercises, you'll learn the best ways to use this framework's many interface components to build customizable, multiplatform apps. You don't need any programming skills or previous experience with jQuery to get started. By the time you finish this book, you'll know how to create responsive, Ajax-based interfaces that work on a variety of smartphones and tablets, using jQuery Mobile

  17. Query strategy for sequential ontology debugging

    CERN Document Server

    Shchekotykhina, Kostyantyn; Fleiss, Philipp; Rodler, Patrick

    2011-01-01

    Debugging of ontologies is an important prerequisite for their wide-spread application, especially in areas that rely upon everyday users to create and maintain knowledge bases, as in the case of the Semantic Web. Recent approaches use diagnosis methods to identify causes of inconsistent or incoherent ontologies. However, in most debugging scenarios these methods return many alternative diagnoses, thus placing the burden of fault localization on the user. This paper demonstrates how the target diagnosis can be identified by performing a sequence of observations, that is, by querying an oracle about entailments of the target ontology. We exploit a-priori probabilities of typical user errors to formulate information-theoretic concepts for query selection. Our evaluation showed that the proposed method significantly reduces the number of required queries compared to myopic strategies. We experimented with different probability distributions of user errors and different qualities of the a-priori probabilities. Ou...
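
    The idea of information-theoretic query selection can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes every candidate diagnosis predicts a definite yes or no for each query, that the prior probabilities sum to 1, and that the oracle always answers correctly. All names and numbers are hypothetical.

```python
from math import log2

def entropy(probs):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

def expected_entropy(priors, yes_set):
    """Expected entropy over diagnoses after the oracle answers a query.

    priors:  dict mapping diagnosis name -> prior probability
    yes_set: diagnoses that predict the answer "yes" (assumption:
             all other diagnoses predict "no")
    """
    p_yes = sum(p for d, p in priors.items() if d in yes_set)
    total = 0.0
    for p_answer, keep in ((p_yes, True), (1.0 - p_yes, False)):
        if p_answer <= 0:
            continue  # this answer cannot occur
        posterior = [p / p_answer
                     for d, p in priors.items() if (d in yes_set) == keep]
        total += p_answer * entropy(posterior)
    return total

def best_query(priors, queries):
    """Pick the query whose answer is expected to leave least uncertainty."""
    return min(queries, key=lambda name: expected_entropy(priors, queries[name]))

priors = {"D1": 0.5, "D2": 0.25, "D3": 0.25}
queries = {"q1": {"D1"}, "q2": {"D2"}, "q3": {"D1", "D2", "D3"}}
choice = best_query(priors, queries)
```

    In this toy setting the selected query is the one that splits the probability mass most evenly, while a query on which all diagnoses agree (q3) leaves the uncertainty unchanged.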

  18. A Study of Library Databases by Translating Those SQL Queries into Relational Algebra and Generating Query Trees

    Directory of Open Access Journals (Sweden)

    Santhi Lasya

    2011-09-01

    Even in this World Wide Web era, where a mouse click gives unrestricted access to many articles and books, the role of an organized library is immense. Effective software is vital for managing the various functions of a library, and the foundation of such software is the underlying database access and the queries used. Library databases are therefore the use case for this study. The paper starts by considering a basic ER model of a typical library relational database and lists the basic use cases of a library management system. The next part deals with the SQL queries used to perform certain functions in a library database management system; along with the queries, reports are generated for some of the use cases. The final section forms the crux of the study and dwells on the concepts of query processing and query optimization in the relational database domain. The queries are analyzed by translating each one into a relational algebra expression and generating a query tree for it. Transforming the algebra expression provides a way to optimize the query, and generating the query tree leads to the cheapest-cost plan.
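
    A query tree of the kind described can be sketched as a small data structure. The schema (Book, Loan), the SQL statement in the comment, and all class and function names below are assumptions for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Relation:
    name: str

@dataclass
class Select:
    condition: str
    child: "Node"

@dataclass
class Project:
    columns: tuple
    child: "Node"

@dataclass
class Join:
    condition: str
    left: "Node"
    right: "Node"

Node = Union[Relation, Select, Project, Join]

def render(node, depth=0):
    """Return the query tree as indented text lines, root first."""
    pad = "  " * depth
    if isinstance(node, Relation):
        return [pad + node.name]
    if isinstance(node, Select):
        return [pad + f"select[{node.condition}]"] + render(node.child, depth + 1)
    if isinstance(node, Project):
        cols = ", ".join(node.columns)
        return [pad + f"project[{cols}]"] + render(node.child, depth + 1)
    return ([pad + f"join[{node.condition}]"]
            + render(node.left, depth + 1)
            + render(node.right, depth + 1))

# Hypothetical query:
#   SELECT b.title FROM Book b JOIN Loan l ON b.id = l.book_id
#   WHERE l.member_id = 42
# The selection is placed below the join, a common heuristic rewrite.
tree = Project(("title",),
               Join("Book.id = Loan.book_id",
                    Relation("Book"),
                    Select("member_id = 42", Relation("Loan"))))
lines = render(tree)
```

    Placing the selection beneath the join means fewer Loan rows reach the join, which is the sort of rewrite a cost-based comparison of alternative trees would typically favor.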

  19. A Dynamic Extension of ATLAS Run Query Service

    CERN Document Server

    Buliga, Alexandru

    2015-01-01

    The ATLAS RunQuery is a primarily web-based service for the ATLAS community to access meta information about the data taking in a concise format. In order to provide a better user experience, the service was moved to use a new technology, involving concepts such as: Web Sockets, on demand data, client-side scripting, memory caching and parallelizing execution.

  20. Virtual Solar Observatory Distributed Query Construction

    Science.gov (United States)

    Gurman, J. B.; Dimitoglou, G.; Bogart, R.; Davey, A.; Hill, F.; Martens, P.

    2003-01-01

    Through a prototype implementation (Tian et al., this meeting) the VSO has already demonstrated the capability of unifying geographically distributed data sources following the Web Services paradigm and utilizing mechanisms such as the Simple Object Access Protocol (SOAP). So far, four participating sites (Stanford, Montana State University, National Solar Observatory and the Solar Data Analysis Center) permit Web-accessible, time-based searches that allow browse access to a number of diverse data sets. Our latest work includes the extension of the simple, time-based queries to include numerous other searchable observation parameters. For VSO users, this extended functionality enables more refined searches. For the VSO, it is a proof of concept that more complex, distributed queries can be effectively constructed and that results from heterogeneous, remote sources can be synthesized and presented to users as a single, virtual data product.

  1. Web Services in Earth Science Data Systems: Realities of Brokering, Chaining and Federating Services

    Science.gov (United States)

    Burnett, M.; Seablom, M.; Pfister, R.

    2004-12-01

    will have access to the complete provenance of the newly generated product. The user receives only the data that she needs in a form that is optimized for her study in rapid turnaround. Today, technology is emerging to begin to meet this vision. Web Services technology is a key part of this evolving solution. There are still challenges to face. This presentation will discuss Web Services technology and its application to achieve this vision as well as the realities of the challenges and issues we still face in making the vision a reality.

  2. XPath Whole Query Optimization

    CERN Document Server

    Maneth, Sebastian

    2010-01-01

    Previous work reports on SXSI, a fast XPath engine that executes tree automata over compressed XML indexes. Here, the reasons why SXSI is so fast are investigated. It is shown that tree automata can be used as a general framework for fine-grained XML query optimization. We define the "relevant nodes" of a query as those nodes that a minimal automaton must touch in order to answer the query. This notion allows many subtrees to be skipped during execution and, with the help of particular tree indexes, even allows internal nodes of the tree to be skipped. We efficiently approximate runs over relevant nodes by means of on-the-fly removal of alternation and non-determinism of (alternating) tree automata. We also introduce many implementation techniques that allow us to efficiently evaluate tree automata, even in the absence of special indexes. Through extensive experiments, we demonstrate the impact of the different optimization techniques.

  3. Code query by example

    Science.gov (United States)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  4. QUESEM: Towards building a Meta Search Service utilizing Query Semantics

    Directory of Open Access Journals (Sweden)

    Neelam Duhan

    2011-01-01

    Current Web search engines are built to serve the needs of all users, independent of the special needs of any individual. Documents are returned by matching queries against the available documents, with no emphasis on the semantics of the query. As a result, the returned information is often very large and inaccurate, which increases user-perceived latency. In this paper, a Semantic Search Service is developed to help users gather relevant documents more efficiently than with traditional Web search engines. The approach relies on online web resources such as dictionary-based sites to retrieve the possible semantics of the query keywords, which are stored in a definition repository. The service works as a meta-layer above a keyword-based search engine: it generates sub-queries based on the different meanings of the user query and sends them to the keyword-based search engine to perform the Web search. This approach relieves the user of the effort of finding the desired information content and improves search quality for certain types of complex queries. Experiments indicate its efficiency, as it results in a reduced search space.
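
    The sub-query generation step might look roughly like the sketch below. It is an assumption-laden illustration rather than the QUESEM implementation: the definition repository is modelled as a plain dictionary from keyword to sense labels, and the function name and sample data are hypothetical.

```python
from itertools import product

def sub_queries(query, definitions):
    """Generate one sub-query per combination of keyword senses.

    definitions: dict mapping a lowercase keyword to a list of sense
                 labels (a stand-in for the definition repository).
    Keywords without an entry are passed through unchanged.
    """
    options = []
    for word in query.split():
        senses = definitions.get(word.lower())
        if senses:
            options.append([f"{word} {sense}" for sense in senses])
        else:
            options.append([word])
    return [" ".join(combo) for combo in product(*options)]

# Illustrative repository with one ambiguous keyword.
definitions = {"jaguar": ["animal", "car"]}
result = sub_queries("jaguar speed", definitions)
```

    Each generated string would then be sent to the underlying keyword-based engine, so the user can choose the result group that matches the intended meaning.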

  5. Querying the Web with Local Intent

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    movement. For example, the paper's proposal supports spatial aggregation and utilizes the topology of indoor spaces to achieve better performance. The paper reports on empirical studies with real and synthetic data that offer insights into the functional and computational aspects of its proposal....

  6. Extending OLAP Querying to External Object

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Shoshani, Arie; Gu, Junmin

    inherent in data in nonstandard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, multi-model federated system...... that enables OLAP users to exploit simultaneously the features of OLAP and object systems. The system allows data to be handled using the most appropriate data model and technology: OLAP systems for dimensional data and object database systems for more complex, general data. Additionally, physical data...... integration can be avoided. As a vehicle for demonstrating the capabilities of the system, a prototypical OLAP language is defined and extended to naturally support queries that involve data in object databases. The language permits selection criteria that reference object data, queries that return...

  7. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol....... In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized...

  8. TREC 2013 Web Track Overview

    Science.gov (United States)

    2014-01-30

    topics designed to represent more specific, less frequent, possibly more difficult queries. To retain the Web flavor of queries in this track, we retain...like queries, but a number had potentially multiple answers (e.g. [dark chocolate health benefits]). This led to many pages being partially relevant

  9. Object-Extended OLAP Querying

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Gu, Junmin; Shoshani, Arie

    2009-01-01

    On-line analytical processing (OLAP) systems based on a dimensional view of data have found widespread use in business applications and are being used increasingly in non-standard applications. These systems provide good performance and ease-of-use. However, the complex structures and relationships inherent in data in non-standard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, "multi-model" federated...... with performance measurements that show that the approach is a viable alternative to a physically integrated data warehouse.

  10. Object-Extended OLAP Querying

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Gu, Junmin; Shoshani, Arie

    2009-01-01

    inherent in data in non-standard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, "multi-model" federated...... system that enables OLAP users to exploit simultaneously the features of OLAP and object systems. The system allows data to be handled using the most appropriate data model and technology: OLAP systems for dimensional data and object database systems for more complex, general data. This allows data...... analysis on the OLAP data to be significantly enriched by the use of additional object data. Additionally, physical integration of the OLAP and the object data can be avoided. As a vehicle for demonstrating the capabilities of the system, a prototypical OLAP language is defined and extended to naturally...

  11. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2004-01-01

    is desirable. In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine. Performance experiments with a prototypical implementation suggest that the performance for OLAP...

  13. User perspectives on query difficulty

    DEFF Research Database (Denmark)

    Lioma, Christina; Larsen, Birger; Schütze, Hinrich

    2011-01-01

    The difficulty of a user query can affect the performance of Information Retrieval (IR) systems. What makes a query difficult and how one may predict this is an active research area, focusing mainly on factors relating to the retrieval algorithm, to the properties of the retrieval data, or to statistical and linguistic features of the queries that may render them difficult. This work addresses query difficulty from a different angle, namely the users’ own perspectives on query difficulty. Two research questions are asked: (1) Are users aware that the query they submit to an IR system may...... for synthesising the user-assessed causes of query difficulty through opinion fusion into an overall assessment of query difficulty. The resulting assessments of query difficulty are found to agree notably more to the TREC categories than the direct user assessments....

  14. Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Chen, Lisi; Jensen, Christian S.; Wu, Dingming

    2013-01-01

    an all-around survey of 12 state-of-the-art geo-textual indices. We propose a benchmark that enables the comparison of the spatial keyword query performance. We also report on the findings obtained when applying the benchmark to the indices, thus uncovering new insights that may guide index...

  15. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    We present here an approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach is ...

  16. XIRAF: Ultimate Forensic Querying

    NARCIS (Netherlands)

    Alink, W.; Bhoedjang, R.; Vries, A.P. de; Boncz, P.A.

    2006-01-01

    This paper describes a novel, XML-based approach towards managing and querying forensic traces extracted from digital evidence. This approach has been implemented in XIRAF, a prototype system for forensic analysis. XIRAF systematically applies forensic analysis tools to evidence files (e.g., hard di

  17. Query Driven Visualization

    CERN Document Server

    Buddelmeijer, Hugo

    2011-01-01

    The request driven way of deriving data in Astro-WISE is extended to a query driven way of visualization. This allows scientists to focus on the science they want to perform, because all administration of their data is automated. This can be done over an abstraction layer that enhances control and flexibility for the scientist.

  18. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    -computer interaction. The special track covers some specific and, typically, newer fields, namely: environmental scanning for strategic early warning; generating linguistic descriptions of data; advances in fuzzy querying and fuzzy databases: theory and applications; fusion and ensemble techniques for on-line learning on data streams; and intelligent information extraction from texts.

  19. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 12th International Conference on Flexible Query Answering Systems, FQAS 2017, held in London, UK, in June 2017. The 21 full papers presented in this book together with 4 short papers were carefully reviewed and selected from 43 submissions...

  20. Querying Schemas With Access Restrictions

    CERN Document Server

    Benedikt, Michael; Ley, Clemens

    2012-01-01

    We study verification of systems whose transitions consist of accesses to a Web-based data-source. An access is a lookup on a relation within a relational database, fixing values for a set of positions in the relation. For example, a transition can represent access to a Web form, where the user is restricted to filling in values for a particular set of fields. We look at verifying properties of a schema describing the possible accesses of such a system. We present a language where one can describe the properties of an access path, and also specify additional restrictions on accesses that are enforced by the schema. Our main property language, AccLTL, is based on a first-order extension of linear-time temporal logic, interpreting access paths as sequences of relational structures. We also present a lower-level automaton model, A-automata, which AccLTL specifications can compile into. We show that AccLTL and A-automata can express static analysis problems related to "querying with limited access patterns" that h...

  1. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim Mansour

    2017-05-07

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has, generally, focused on finite concept classes or toy problems, with a limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method while the second algorithm is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and to synthesize queries, afterward, via spectral decomposition. Moreover, I also describe how these algorithms can be extended to other settings as well, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distribution provides reliable predictions with a high probability. This algorithm is, then, justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this

  2. Federated Giovanni: A Distributed Web Service for Analysis and Visualization of Remote Sensing Data

    Science.gov (United States)

    Lynnes, Chris

    2014-01-01

    The Geospatial Interactive Online Visualization and Analysis Interface (Giovanni) is a popular tool for users of the Goddard Earth Sciences Data and Information Services Center (GES DISC) and has been in use for over a decade. It provides a wide variety of algorithms and visualizations to explore large remote sensing datasets without having to download the data and without having to write readers and visualizers for it. Giovanni is now being extended to enable its capabilities at other data centers within the Earth Observing System Data and Information System (EOSDIS). This Federated Giovanni will allow four other data centers to add and maintain their data within Giovanni on behalf of their user community. Those data centers are the Physical Oceanography Distributed Active Archive Center (PO.DAAC), MODIS Adaptive Processing System (MODAPS), Ocean Biology Processing Group (OBPG), and Land Processes Distributed Active Archive Center (LP DAAC). Three tiers are supported: Tier 1 (GES DISC-hosted) gives the remote data center a data management interface to add and maintain data, which are provided through the Giovanni instance at the GES DISC. Tier 2 packages Giovanni up as a virtual machine for distribution to and deployment by the other data centers. Data variables are shared among data centers by sharing documents from the Solr database that underpins Giovanni's data management capabilities. However, each data center maintains their own instance of Giovanni, exposing the variables of most interest to their user community. Tier 3 is a Shared Source model, in which the data centers cooperate to extend the infrastructure by contributing source code.

  3. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets. Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit

  4. Processing Incomplete Query Specifications in a Context-Dependent Reasoning Framework

    Directory of Open Access Journals (Sweden)

    Neli P. Zlatareva

    2013-04-01

    Search is the most prominent web service, which is about to change dramatically with the transition to the Semantic Web. Semantic Web applications are expected to deal with complex conjunctive queries, and such queries cannot always be completely and precisely defined. Current Semantic Web reasoners built upon Description Logics have limited processing power in such environments. We discuss some of their limitations, and show how an alternative logical framework utilizing context-dependent rules can be extended to handle incomplete or imprecise query specifications.

  5. Query log analysis of an electronic health record search engine.

    Science.gov (United States)

    Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A

    2011-01-01

    We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users' information-seeking behavior. The results suggest that information needs in medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR.

  6. HIDDEN WEB EXTRACTOR DYNAMIC WAY TO UNCOVER THE DEEP WEB

    OpenAIRE

    DR. ANURADHA; BABITA AHUJA

    2012-01-01

    In this era of digital tsunami of information on the web, everyone is completely dependent on the WWW for information retrieval. This has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scene. The web databases are hidden behind the query interfaces. In this paper, we propose a Hidden Web Extractor (HWE) that can automatically discover and download data from the Hidden Web databases. ...

  7. User perspectives on query difficulty

    DEFF Research Database (Denmark)

    Lioma, Christina; Larsen, Birger; Schütze, Hinrich

    2011-01-01

    The difficulty of a user query can affect the performance of Information Retrieval (IR) systems. What makes a query difficult and how one may predict this is an active research area, focusing mainly on factors relating to the retrieval algorithm, to the properties of the retrieval data, or to statistical and linguistic features of the queries that may render them difficult. This work addresses query difficulty from a different angle, namely the users’ own perspectives on query difficulty. Two research questions are asked: (1) Are users aware that the query they submit to an IR system may...

  8. Regular paths in SparQL: querying the NCI Thesaurus.

    Science.gov (United States)

    Detwiler, Landon T; Suciu, Dan; Brinkley, James F

    2008-11-06

    OWL, the Web Ontology Language, provides syntax and semantics for representing knowledge for the semantic web. Many of the constructs of OWL have a basis in the field of description logics. While the formal underpinnings of description logics have led to a highly computable language, it has come at a cognitive cost. OWL ontologies are often unintuitive to readers lacking a strong logic background. In this work we describe GLEEN, a regular path expression library, which extends the RDF query language SparQL to support complex path expressions over OWL and other RDF-based ontologies. We illustrate the utility of GLEEN by showing how it can be used in a query-based approach to defining simpler, more intuitive views of OWL ontologies. In particular we show how relatively simple GLEEN-enhanced SparQL queries can create views of the OWL version of the NCI Thesaurus that match the views generated by the web-based NCI browser.

  9. Mashups over the Deep Web

    Science.gov (United States)

    Hornung, Thomas; Simon, Kai; Lausen, Georg

    Combining information from different Web sources often results in a tedious and repetitive process, e.g. even simple information requests might require iterating over a result list of one Web query and using each single result as input for a subsequent query. One approach to such chained queries is data-centric mashups, which allow the data flow to be modeled visually as a graph, where the nodes represent the data sources and the edges the data flow.

  10. Enabling Incremental Query Re-Optimization

    Science.gov (United States)

    Liu, Mengmeng; Ives, Zachary G.; Loo, Boon Thau

    2017-01-01

    As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations. PMID:28659658

  11. Enabling Incremental Query Re-Optimization.

    Science.gov (United States)

    Liu, Mengmeng; Ives, Zachary G; Loo, Boon Thau

    2016-01-01

    As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations.

  12. COMPLEX QUERY AND METADATA

    OpenAIRE

    Nakatoh, Tetsuya; Omori, Keisuke; Yamada, Yasuhiro; Hirokawa, Sachio

    2003-01-01

    We are developing a search system, DAISEn, which integrates multiple search engines and generates a metasearch engine automatically. The target search engines of DAISEn are not general search engines, but search engines specialized in some area. Integration of such engines yields efficiency and quality. There are search engines of a new type which accept complex queries and return structured data. Integration of such search engines is much harder than that of simple search engines which accept ...

  13. Querying genomic databases

    Energy Technology Data Exchange (ETDEWEB)

    Baehr, A.; Hagstrom, R.; Joerg, D.; Overbeek, R.

    1991-09-01

    A natural-language interface has been developed that retrieves genomic information by using a simple subset of English. The interface spares the biologist from the task of learning database-specific query languages and computer programming. Currently, the interface deals with the E. coli genome. It can, however, be readily extended and shows promise as a means of easy access to other sequenced genomic databases as well.

  14. A Semantic Graph Query Language

    Energy Technology Data Exchange (ETDEWEB)

    Kaplan, I L

    2006-10-16

    Semantic graphs can be used to organize large amounts of information from a number of sources into one unified structure. A semantic query language provides a foundation for extracting information from the semantic graph. The graph query language described here provides a simple, powerful method for querying semantic graphs.

  15. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung

    2013-08-26

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model, query language, and query execution engine presented earlier. Deco's objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco's query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco's query optimizer include a cost model distinguishing between "free" existing data versus paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco's query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

  16. Mastering jQuery mobile

    CERN Document Server

    Lambert, Chip

    2015-01-01

    You've started down the path of jQuery Mobile, now begin mastering some of jQuery Mobile's higher level topics. Go beyond jQuery Mobile's documentation and master one of the hottest mobile technologies out there. Previous JavaScript and PHP experience can help you get the most out of this book.

  17. A query index for continuous queries on RFID streaming data

    Institute of Scientific and Technical Information of China (English)

    Jaekwan PARK; Bonghee HONG; Chaehoon BAN

    2008-01-01

    RFID middleware collects and filters RFID streaming data to process applications' requests called continuous queries, because they are executed continuously during tag movement. Several approaches to building an index on queries rather than data records, called a query index, have been proposed to evaluate continuous queries over streaming data. EPCglobal proposed an Event Cycle Specification (ECSpec) model, which is a de facto standard query interface for RFID applications. Continuous queries based on ECSpec consist of a large number of segments that represent the query conditions. The problem when using any of the existing query indexes on these continuous queries is that it takes a long time to build the index, because it is necessary to insert a large number of segments into the index. To solve this problem, we propose a transform method that converts a group of segments into compressed data. We also propose an efficient query index scheme for the transformed space. Compared with existing query indexes, the proposed index performs better on various datasets.

  18. Dynamic Query Optimization Approach for Semantic Database Grid

    Institute of Scientific and Technical Information of China (English)

    Xiao-Qing Zheng; Hua-Jun Chen; Zhao-Hui Wu; Yu-Xin Mao

    2006-01-01

    Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartGrid Ⅱ is the implemented database grid system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query processing in order to minimize the costs and/or response time associated with obtaining the answer to a query in a distributed database system, the database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be the consequences of autonomy and heterogeneity of database nodes in database grid. Therefore, more challenges have arisen for query optimization in database grid than in traditional distributed databases. Following this observation, the design of a query optimizer in DartGrid Ⅱ is presented, and a heuristic, dynamic and parallel query optimization approach to processing queries in database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.

  19. Airport Status Web Service

    Data.gov (United States)

    Department of Transportation — A web service that allows end-users the ability to query the current known delays in the National Airspace System as well as the current weather from NOAA by airport...

  20. HIDDEN WEB EXTRACTOR DYNAMIC WAY TO UNCOVER THE DEEP WEB

    Directory of Open Access Journals (Sweden)

    DR. ANURADHA

    2012-06-01

    In this era of digital tsunami of information on the web, everyone is completely dependent on the WWW for information retrieval. This has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scene. The web databases are hidden behind the query interfaces. In this paper, we propose a Hidden Web Extractor (HWE) that can automatically discover and download data from the Hidden Web databases. Since the only “entry point” to a Hidden Web site is a query interface, the main challenge that a Hidden Web Extractor has to face is how to automatically generate meaningful queries for the unlimited number of website pages.

  1. Bayesian Query-Focused Summarization

    CERN Document Server

    Daumé, Hal

    2009-01-01

    We present BayeSum (for "Bayesian summarization"), a model for sentence extraction in query-focused summarization. BayeSum leverages the common case in which multiple documents are relevant to a single query. Using these documents as reinforcement for query terms, BayeSum is not afflicted by the paucity of information in short queries. We show that approximate inference in BayeSum is possible on large data sets and results in a state-of-the-art summarization system. Furthermore, we show how BayeSum can be understood as a justified query expansion technique in the language modeling for IR framework.

  2. Tvorba internetových aplikací s využitím framework jQuery

    OpenAIRE

    OKTÁBEC, Michal

    2010-01-01

    This work deals with using the jQuery framework for creating web pages. It first explains what the jQuery framework is and how it differs from classic JavaScript. The following parts of the work focus on the syntax of jQuery and the ways of using it. The goal of this work is to create the first Czech user guide.

  3. The IRIS Federator: Accessing Seismological Data Across Data Centers

    Science.gov (United States)

    Trabant, C. M.; Van Fossen, M.; Ahern, T. K.; Weekly, R. T.

    2015-12-01

    In 2013 the International Federation of Digital Seismograph Networks (FDSN) approved a specification for web service interfaces for accessing seismological station metadata, time series and event parameters. Since then, a number of seismological data centers have implemented FDSN service interfaces, with more implementations in development. We have developed a new system called the IRIS Federator which leverages this standardization and provides the scientific community with a service for easy discovery and access of seismological data across FDSN data centers. These centers are located throughout the world and this work represents one model of a system for data collection across geographic and political boundaries. The main components of the IRIS Federator are a catalog of time series metadata holdings at each data center and a web service interface for searching the catalog. The service interface is designed to support client-side federated data access, a model in which the client (software run by the user) queries the catalog and then collects the data from each identified center. By default the results are returned in a format suitable for direct submission to those web services, but could also be formatted in a simple text format for general data discovery purposes. The interface will remove any duplication of time series channels between data centers according to a set of business rules by default, however a user may request results with all duplicate time series entries included. We will demonstrate how client-side federation is being incorporated into some of the DMC's data access tools. We anticipate further enhancement of the IRIS Federator to improve data discovery in various scenarios and to improve usefulness to communities beyond seismology. Data centers with FDSN web services: http://www.fdsn.org/webservices/ The IRIS Federator query interface: http://service.iris.edu/irisws/fedcatalog/1/
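
    The client-side federation model described in this record (query a catalog, then collect data from each identified center) can be sketched roughly as follows. This is only an illustration under stated assumptions: the catalog text layout, the header keys, the example.org URLs and the function name plan_requests are all hypothetical and are not taken from the IRIS Federator documentation, and the "first center listed wins" rule merely stands in for the business rules the abstract mentions.

```python
from typing import Dict, List

# Hypothetical catalog response: one block per data center, a header line
# naming the service to contact, then one request line per channel.
SAMPLE_CATALOG = """\
DATACENTER=CENTER_A
DATASELECTSERVICE=http://a.example.org/dataselect/
NET1 STA1 00 BHZ 2015-01-01T00:00:00 2015-01-02T00:00:00

DATACENTER=CENTER_B
DATASELECTSERVICE=http://b.example.org/dataselect/
NET2 STA9 -- HHZ 2015-01-01T00:00:00 2015-01-02T00:00:00
NET1 STA1 00 BHZ 2015-01-01T00:00:00 2015-01-02T00:00:00
"""


def plan_requests(catalog_text: str, drop_duplicates: bool = True) -> Dict[str, List[str]]:
    """Group channel request lines by the service URL that should serve them."""
    plan: Dict[str, List[str]] = {}
    seen = set()      # request lines already assigned to an earlier center
    service = None    # service URL of the block currently being read
    for raw in catalog_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            if key == "DATACENTER":
                service = None          # a new block starts; wait for its service URL
            elif key == "DATASELECTSERVICE":
                service = value
                plan.setdefault(service, [])
            continue
        if service is None:
            continue                    # request line without a known service; skip it
        if drop_duplicates and line in seen:
            continue                    # assumed rule: the first center listed wins
        seen.add(line)
        plan[service].append(line)
    return plan


if __name__ == "__main__":
    for url, lines in plan_requests(SAMPLE_CATALOG).items():
        print(url, len(lines))
```

    A real client would then submit each group of lines to the corresponding service and merge the returned time series; that network step is left out here so that the sketch stays self-contained.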

  4. Instant Cassandra query language

    CERN Document Server

    Singh, Amresh

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. It's an Instant Starter guide. Instant Cassandra Query Language is great for those who are working with Cassandra databases and who want to either learn CQL to check data from the console or build serious applications using CQL. If you're looking for something that helps you get started with CQL in record time and you hate the idea of learning a new language syntax, then this book is for you.

  5. Maximum Spanning Tree Model on Personalized Web Based Collaborative Learning in Web 3.0

    OpenAIRE

    Padma, S; Seshasaayee, Ananthi

    2012-01-01

    Web 3.0 is an evolving extension of the current web environment. Information in web 3.0 can be collaborated and communicated when queried. Web 3.0 architecture provides an excellent learning experience to the students. Web 3.0 is 3D, media centric and semantic. Web based learning has been on the rise in recent days. Web 3.0 has intelligent agents as tutors to collect and disseminate the answers to the queries by the students. Completely Interactive learner's query determine the customization of...

  6. A Querying Method over RDF-ized Health Level Seven v2.5 Messages Using Life Science Knowledge Resources

    Science.gov (United States)

    2016-01-01

    Background: Health level seven version 2.5 (HL7 v2.5) is a widespread messaging standard for information exchange between clinical information systems. By applying Semantic Web technologies for handling HL7 v2.5 messages, it is possible to integrate large-scale clinical data with life science knowledge resources. Objective: To show the feasibility of a querying method over large-scale resource description framework (RDF)-ized HL7 v2.5 messages using publicly available drug databases. Methods: We developed a method to convert HL7 v2.5 messages into RDF. We also converted five kinds of drug databases into RDF and provided explicit links between the corresponding items among them. With those linked drug data, we then developed a method for query expansion to search the clinical data using semantic information on drug classes along with four types of temporal patterns. For evaluation purposes, medication orders and laboratory test results for a 3-year period at the University of Tokyo Hospital were used, and the query execution times were measured. Results: Approximately 650 million RDF triples for medication orders and 790 million RDF triples for laboratory test results were converted. Taking three types of query in use cases for detecting adverse events of drugs as an example, we confirmed that these queries could be represented in SPARQL Protocol and RDF Query Language (SPARQL) using our methods, and compared them with conventional query expressions. The measurement results confirm that the query time is feasible and increases logarithmically or linearly with the amount of data and without diverging. Conclusions: The proposed methods enabled query expressions that separate knowledge resources and clinical data, thereby suggesting the feasibility for improving the usability of clinical data by enhancing the knowledge resources. We also demonstrate that when HL7 v2.5 messages are automatically converted into RDF, searches are still possible through SPARQL without

  7. Geospatial semantic web

    CERN Document Server

    Zhang, Chuanrong; Li, Weidong

    2015-01-01

    This book covers key issues related to Geospatial Semantic Web, including geospatial web services for spatial data interoperability; geospatial ontology for semantic interoperability; ontology creation, sharing, and integration; querying knowledge and information from heterogeneous data source; interfaces for Geospatial Semantic Web, VGI (Volunteered Geographic Information) and Geospatial Semantic Web; challenges of Geospatial Semantic Web; and development of Geospatial Semantic Web applications. This book also describes state-of-the-art technologies that attempt to solve these problems such as WFS, WMS, RDF, OWL, and GeoSPARQL, and demonstrates how to use the Geospatial Semantic Web technologies to solve practical real-world problems such as spatial data interoperability.

  8. An Efficient Algorithm for Query Transformation in Semantic Query Optimization

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Semantic query optimization (SQO) is a comparatively recent approach for transforming a given query into an equivalent alternative query using matching rules, in order to select an optimal query based on the costs of executing alternative queries. The key aspect of the algorithm proposed here is that previously proposed SQO techniques can be considered equally in a uniform cost model, so that optimization opportunities are not missed. At the same time, the authors used the implication closure to guarantee that no matched rule is lost. The authors implemented their algorithm for the optimization of decomposed sub-queries in local databases in Multi-Database Integrator (MDBI), which is a multidatabase project. The experimental results verify that this algorithm is effective in the process of SQO.

  9. Optimizing Phylogenetic Queries for Performance.

    Science.gov (United States)

    Jamil, Hasan M

    2017-08-24

    The vast majority of phylogenetic databases do not support declarative querying through which their contents can be flexibly and conveniently accessed, and the template based query interfaces they support do not allow arbitrary speculative queries. They therefore also do not support query optimization leveraging unique phylogeny properties. While a small number of graph query languages such as XQuery, Cypher and GraphQL exist for computer savvy users, most are too general and complex to be useful for biologists, and too inefficient for large phylogeny querying. In this paper, we discuss a recently introduced visual query language, called PhyQL, that leverages phylogeny specific properties to support essential and powerful constructs for a large class of phylogenetic queries. We develop a range of pruning aids, and propose a substantial set of query optimization strategies using these aids suitable for large phylogeny querying. A hybrid optimization technique that exploits a set of indices and "graphlet" partitioning is discussed. A "fail soonest" strategy is used to avoid hopeless processing and is shown to produce dividends. Possible novel optimization techniques yet to be explored are also discussed.

  10. An Optimal Labeling Scheme for Ancestry Queries

    OpenAIRE

    2009-01-01

    An ancestry labeling scheme assigns labels (bit strings) to the nodes of rooted trees such that ancestry queries between any two nodes in a tree can be answered merely by looking at their corresponding labels. The quality of an ancestry labeling scheme is measured by its label size, that is the maximal number of bits in a label of a tree node. In addition to its theoretical appeal, the design of efficient ancestry labeling schemes is motivated by applications in web search engines. For this p...
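
    To make the idea of answering ancestry queries from labels alone concrete, here is a minimal sketch of the classic interval (entry/exit time) labeling. It is not the optimal scheme this record refers to, and the tree, node names, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

Label = Tuple[int, int]  # (entry time, exit time) from a depth-first traversal


def interval_labels(children: Dict[str, List[str]], root: str) -> Dict[str, Label]:
    """Assign each node an (entry, exit) interval using an iterative DFS."""
    labels: Dict[str, Label] = {}
    entry: Dict[str, int] = {}
    clock = 0
    stack = [(root, False)]
    while stack:
        node, done = stack.pop()
        if done:
            # All descendants have been visited, so close the interval.
            labels[node] = (entry[node], clock)
            clock += 1
            continue
        entry[node] = clock
        clock += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(a: Label, b: Label) -> bool:
    """True if the node labeled a is a proper ancestor of the node labeled b."""
    return a[0] < b[0] and b[1] < a[1]


# Hypothetical example tree.
tree = {"root": ["x", "y"], "x": ["x1", "x2"], "y": []}
labels = interval_labels(tree, "root")
```

    The ancestry test only compares two labels and never touches the tree. Each label here uses two integers, so roughly 2 log n bits; the label size is exactly the quantity that schemes such as the one in this record aim to reduce.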

  11. Cooperative Answering of Fuzzy Queries

    Institute of Scientific and Technical Information of China (English)

    Narjes Hachani; Mohamed Ali Ben Hassine; Hanène Chettaoui; Habib Ounelli

    2009-01-01

    The majority of existing information systems deal with crisp data through crisp database systems. Traditional Database Management Systems (DBMS) have not taken imprecision into account, which leads to a lack of flexibility. The reason is that queries retrieve only elements that precisely match the given Boolean query. That is, an element belongs to the result if the query is true for this element; otherwise, no answers are returned to the user. The aim of this paper is to present a cooperative approach to handling empty answers of fuzzy conjunctive queries by referring to the Formal Concept Analysis (FCA) theory and fuzzy logic. We present an architecture which combines FCA and databases. The processing of fuzzy queries allows detecting the minimal reasons for empty answers. We also use the concept lattice in order to provide the user with the nearest answers in the case of a query failure.

  12. Harvesting all matching information to a given query from a deep website

    NARCIS (Netherlands)

    Khelghati, Mohammadreza; Hiemstra, Djoerd; Keulen, van Maurice; Armano, Giuliano; Bozzon, Alessandro; Giuliani, Alessandro

    2015-01-01

    In this paper, the goal is harvesting all documents matching a given (entity) query from a deep web source. The objective is to retrieve all information about for instance "Denzel Washington", "Iran Nuclear Deal", or "FC Barcelona" from data hidden behind web forms. Policies of web search engines us

  13. Harvesting All Matching Information To A Given Query From a Deep Website

    NARCIS (Netherlands)

    Khelghati, Mohammadreza; Hiemstra, Djoerd; van Keulen, Maurice; Armano, Giuliano; Bozzon, Alessandro; Giuliani, Alessandro

    In this paper, the goal is harvesting all documents matching a given (entity) query from a deep web source. The objective is to retrieve all information about for instance "Denzel Washington", "Iran Nuclear Deal", or "FC Barcelona" from data hidden behind web forms. Policies of web search engines

  14. Web Personalization Using Web Mining

    Directory of Open Access Journals (Sweden)

    Ms. Kavita D. Satokar

    2010-03-01

    The information on the web is growing dramatically. Users have to spend a lot of time on the web finding the information they are interested in. Today, the traditional search engines do not give users enough personalized help but provide the user with a lot of irrelevant information. In this paper, we present a personalized Web search system, which can help users to get the relevant web pages based on their selection from the domain list. Thus, users can obtain a set of domains of interest and the web pages from the system. The system is based on features extracted from hyperlinks, such as anchor terms or URL tokens. Our methodology uses an innovative weighted URL Rank algorithm based on the domains the user is interested in and the user query.

  15. Ranking Queries on Uncertain Data

    CERN Document Server

    Hua, Ming

    2011-01-01

    Uncertain data is inherent in many important applications, such as environmental surveillance, market analysis, and quantitative economics research. Due to the importance of those applications and rapidly increasing amounts of uncertain data collected and accumulated, analyzing large collections of uncertain data has become an important task. Ranking queries (also known as top-k queries) are often natural and useful in analyzing uncertain data. Ranking Queries on Uncertain Data discusses the motivations/applications, challenging problems, the fundamental principles, and the evaluation algorith

  16. Research Issues in Mobile Querying

    DEFF Research Database (Denmark)

    Breunig, M.; Jensen, Christian Søndergaard; Klein, M.

    2004-01-01

    This document reports on key aspects of the discussions conducted within the working group. In particular, the document aims to offer a structured and somewhat digested summary of the group's discussions. The document first offers concepts that enable characterization of "mobile queries" as well as the types of systems that enable such queries. It explores the notion of context in mobile queries. The document ends with a few observations, mainly regarding challenges....

  17. Optimizing queries in distributed systems

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2006-01-01

    This research presents the main elements of query optimization in distributed systems. First, the data architecture corresponding to the system-level architecture in a distributed environment is presented. Then the architecture of a distributed database management system (DDBMS) is described at the conceptual level, followed by a presentation of the distributed query execution steps on these information systems. The research ends with a presentation of some aspects of distributed database query optimization and the strategies used for it.

  18. Smart Query Answering for Marine Sensor Data

    Directory of Open Access Journals (Sweden)

    Paulo de Souza

    2011-03-01

    We review existing query answering systems for sensor data. We then propose an extended query answering approach termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses a query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query is beneficial for building data and information systems for marine sensor networks.

  19. Data Caching for XML Query

    Institute of Scientific and Technical Information of China (English)

    SU Fei; CI Lin-lin; ZHU Li-ping; ZHAO Xin-xin

    2006-01-01

    In order to apply the technique of data cache to extensible markup language (XML) database system, the XML-cache system to support data cache for XQuery is presented. According to the character of XML, the queries with nesting are normalized to facilitate the following operation. Based on the idea of incomplete tree, using the document type definition (DTD) schema tree and conditions from normalized XQuery, the results of previous queries are maintained to answer new queries, at the same time, the remainder queries are sent to XML database at the back. The results of experiment show all applications supported by XML database can use this technique to cache data for future use.

  20. Query recommendation in the information domain of children

    NARCIS (Netherlands)

    Duarte Torres, Sergio Raúl; Hiemstra, Djoerd; Weber, Ingmar; Serdyukov, Pavel

    2014-01-01

    Children represent an increasing part of web users. One of the key problems that hamper their search experience is their limited vocabulary, their difficulty to use the right keywords, and the inappropriateness of general- purpose query suggestions. In this work, we propose a method that uses tags f

  1. Learning from the History of Distributed Query Processing

    DEFF Research Database (Denmark)

    Betz, Heiko; Gropengießer, Francis; Hose, Katja

    2012-01-01

    The vision of the Semantic Web has triggered the development of various new applications and opened up new directions in research. Recently, much effort has been put into the development of techniques for query processing over Linked Data. Being based upon techniques originally developed for dist...

  2. Learning from the History of Distributed Query Processing

    DEFF Research Database (Denmark)

    Betz, Heiko; Gropengießer, Francis; Hose, Katja;

    2012-01-01

    The vision of the Semantic Web has triggered the development of various new applications and opened up new directions in research. Recently, much effort has been put into the development of techniques for query processing over Linked Data. Being based upon techniques originally developed for dist...

  3. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol...... format and illustrate use cases in the KorAP project....

  4. Using Advanced Search Operators on Web Search Engines.

    Science.gov (United States)

    Jansen, Bernard J.

    Studies show that the majority of Web searchers enter extremely simple queries, so a reasonable system design approach would be to build search engines to compensate for this user characteristic. One hundred representative queries were selected from the transaction log of a major Web search service. These 100 queries were then modified using the…

  5. Traitor: associating concepts using the world wide web

    NARCIS (Netherlands)

    Drijfhout, Wanno; Oliver, Jundt; Wevers, Lesley; Hiemstra, Djoerd

    2013-01-01

    We use Common Crawl's 25TB data set of web pages to construct a database of associated concepts using Hadoop. The database can be queried through a web application with two query interfaces. A textual interface allows searching for similarities and differences between multiple concepts using a query

  6. Using Advanced Search Operators on Web Search Engines.

    Science.gov (United States)

    Jansen, Bernard J.

    Studies show that the majority of Web searchers enter extremely simple queries, so a reasonable system design approach would be to build search engines to compensate for this user characteristic. One hundred representative queries were selected from the transaction log of a major Web search service. These 100 queries were then modified using the…

  7. Traitor: associating concepts using the world wide web

    NARCIS (Netherlands)

    Drijfhout, Wanno; Oliver, J.; Oliver, Jundt; Wevers, L.; Hiemstra, Djoerd

    2013-01-01

    We use Common Crawl's 25TB data set of web pages to construct a database of associated concepts using Hadoop. The database can be queried through a web application with two query interfaces. A textual interface allows searching for similarities and differences between multiple concepts using a query

  8. Query sensitive comparative summarization of search results using concept based segmentation

    CERN Document Server

    Chitra, P; Sarukesi, K

    2012-01-01

    Query sensitive summarization aims at providing the users with the summary of the contents of single or multiple web pages based on the search query. This paper proposes a novel idea of generating a comparative summary from a set of URLs from the search result. User selects a set of web page links from the search result produced by search engine. Comparative summary of these selected web sites is generated. This method makes use of HTML DOM tree structure of these web pages. HTML documents are segmented into set of concept blocks. Sentence score of each concept block is computed with respect to the query and feature keywords. The important sentences from the concept blocks of different web pages are extracted to compose the comparative summary on the fly. This system reduces the time and effort required for the user to browse various web sites to compare the information. The comparative summary of the contents would help the users in quick decision making.

  9. Foundations of Traversal Based Query Execution over Linked Data (Extended Version)

    CERN Document Server

    Hartig, Olaf

    2011-01-01

    The World Wide Web currently evolves into a Web of Linked Data where content providers publish and link data as they have done with hypertext for the last 20 years. We understand this emerging dataspace as an infinitely large, globally distributed database which is, at best, partially known to query execution systems. To tap the full potential of the Web, such a system must be able to answer a query using data from initially unknown data sources. A discovery of such data can be achieved by traversing data links during the execution of the query. This paper provides a formal foundation for executing queries based on link traversal. We model a Web of Linked Data as an infinite structure of documents that contain triple-based data and that are interlinked via this data. We present a query model that introduces a well-defined semantics for conjunctive queries over data in such a Web. Furthermore, we define an abstract execution model for integrating the traversal of data links into the query execution proc...
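
    The idea of discovering data by traversing links during query execution can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the in-memory WEB dictionary stands in for dereferencing URIs, the ex: identifiers are made up, and the traversal is a plain breadth-first walk followed by pattern matching, which is much simpler than the formal execution model of the paper.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

# Hypothetical in-memory "Web": each URI dereferences to a document of triples.
WEB: Dict[str, List[Triple]] = {
    "ex:alice": [("ex:alice", "ex:knows", "ex:bob")],
    "ex:bob": [("ex:bob", "ex:knows", "ex:carol"), ("ex:bob", "ex:name", "Bob")],
    "ex:carol": [("ex:carol", "ex:name", "Carol")],
    "ex:dave": [("ex:dave", "ex:name", "Dave")],  # never linked, so never found
}


def traverse(seed: str, max_docs: int = 100) -> Set[Triple]:
    """Collect triples reachable from the seed by following URIs in the data."""
    seen: Set[str] = set()
    found: Set[Triple] = set()
    queue = deque([seed])
    # max_docs bounds the walk, since the real Web is treated as unbounded.
    while queue and len(seen) < max_docs:
        uri = queue.popleft()
        if uri in seen or uri not in WEB:
            continue
        seen.add(uri)
        for triple in WEB[uri]:
            found.add(triple)
            queue.extend(t for t in triple if t in WEB and t not in seen)
    return found


def match(triples: Set[Triple], pattern: Pattern) -> List[Triple]:
    """Return the triples matching a pattern in which None is a wildcard."""
    return sorted(
        t for t in triples if all(p is None or p == v for p, v in zip(pattern, t))
    )


names = match(traverse("ex:alice"), (None, "ex:name", None))
```

    Starting from ex:alice, the sketch finds the names of ex:bob and ex:carol but not ex:dave, because no discovered document links to it. This reflects the point made in the abstract that the answer depends on which sources are reachable from the initially known data.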

  10. The EarthServer Federation: State, Role, and Contribution to GEOSS

    Science.gov (United States)

    Merticariu, Vlad; Baumann, Peter

    2016-04-01

    The intercontinental EarthServer initiative has established a European datacube platform with proven scalability: known databases exceed 100 TB, and single queries have been split across more than 1,000 cloud nodes. Its service interface being rigorously based on the OGC "Big Geo Data" standards, Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS), a series of clients can dock into the services, ranging from open-source OpenLayers and QGIS over open-source NASA WorldWind to proprietary ESRI ArcGIS. Datacube fusion in a "mix and match" style is supported by the platform technology, the rasdaman Array Database System, which transparently federates queries so that users simply approach any node of the federation to access any data item, internally optimized for minimal data transfer. Notably, rasdaman is part of GEOSS GCI. NASA is contributing its Web WorldWind virtual globe for user-friendly data extraction, navigation, and analysis. Integrated datacube / metadata queries are contributed by CITE. Current federation members include ESA (managed by MEEO sr.l.), Plymouth Marine Laboratory (PML), the European Centre for Medium-Range Weather Forecast (ECMWF), Australia's National Computational Infrastructure, and Jacobs University (adding in Planetary Science). Further data centers have expressed interest in joining. We present the EarthServer approach, discuss its underlying technology, and illustrate the contribution this datacube platform can make to GEOSS.

  11. Usability of XML Query Languages

    NARCIS (Netherlands)

    Graaumans, J.P.M.

    2005-01-01

    The eXtensible Markup Language (XML) is a markup language which enables re-use of information. Specific query languages for XML are developed to facilitate this. There are large differences between history, design goal, and syntax of the XML query languages. However, in practice these languages are

  12. The Semantics of Query Modification

    NARCIS (Netherlands)

    Hollink, V.; Tsikrika, T.; Vries, A.P. de

    2010-01-01

    We present a method that exploits `linked data' to determine semantic relations between consecutive user queries. Our method maps queries onto concepts in linked data and searches the linked data graph for direct or indirect relations between the concepts. By comparing relations between large number

  13. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return...

  14. Priming the Query Specification Process.

    Science.gov (United States)

    Toms, Elaine G.; Freund, Luanne

    2003-01-01

    Tests the use of questions as a technique in the query specification process. Using a within-subjects design, 48 people interacted with a modified Google interface to solve four information problems in four domains. Half the tasks were entered as typical keyword queries, and half as questions or statements. Results suggest the typical search box…

  15. jQuery UI cookbook

    CERN Document Server

    Boduch, Adam

    2013-01-01

    Filled with a practical collection of recipes, jQuery UI Cookbook is full of clear, step-by-step instructions that will help you harness the powerful UI framework in jQuery. Depending on your needs, you can dip in and out of the Cookbook and its recipes, or follow the book from start to finish. If you are a jQuery UI developer looking to improve your existing applications, extract ideas for your new application, or to better understand the overall widget architecture, then jQuery UI Cookbook is a must-have for you. The reader should at least have a rudimentary understanding of what jQuery UI is

  16. Automatic Question Answering from Web Documents

    Institute of Scientific and Technical Information of China (English)

    LI Xin; HU Dawei; LI Huan; HAO Tianyong; CHEN Enhong; LIU Wenyin

    2007-01-01

    A passage retrieval strategy for web-based question answering (QA) systems is proposed in our QA system. It firstly analyzes the question based on semantic patterns to obtain its syntactic and semantic information and then form initial queries. The queries are used to retrieve documents from the World Wide Web (WWW) using the Google search engine. The queries are then rewritten to form queries for passage retrieval in order to improve the precision. The relations between keywords in the question are employed in our query rewrite method. The experimental result on the question set of the TREC-2003 passage task shows that our system performs well for factoid questions.

  17. Query auto completion in information retrieval

    NARCIS (Netherlands)

    Cai, Fei

    2016-01-01

    Query auto completion is an important feature embedded into today's search engines. It can help users formulate queries which other people have searched for when he/she finishes typing the query prefix. Today's most sophisticated query auto completion approaches are based on the collected query logs
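
    As a rough illustration of completion based on query logs, the sketch below ranks candidate completions for a prefix by how often they occur in a log. This is only a popularity baseline under assumed names and data; the abstract above is truncated, so it should not be read as the approach studied in this record.

```python
from collections import Counter
from typing import List


def complete(prefix: str, log: List[str], k: int = 3) -> List[str]:
    """Return up to k logged queries starting with prefix, most frequent first."""
    counts = Counter(query for query in log if query.startswith(prefix))
    # Sort by descending frequency, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:k]]


# Hypothetical query log.
query_log = ["python list", "python dict", "python list", "pytorch", "java"]
suggestions = complete("py", query_log, k=2)
```

    The alphabetical tie-break keeps the output deterministic when two candidates are equally frequent.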

  18. In-context query reformulation for failing SPARQL queries

    Science.gov (United States)

    Viswanathan, Amar; Michaelis, James R.; Cassidy, Taylor; de Mel, Geeth; Hendler, James

    2017-05-01

    Knowledge bases for decision support systems are growing increasingly complex, through continued advances in data ingest and management approaches. However, humans do not possess the cognitive capabilities to retain a bird's-eye view of such knowledge bases, and may end up issuing unsatisfiable queries to such systems. This work focuses on the implementation of a query reformulation approach for graph-based knowledge bases, specifically designed to support the Resource Description Framework (RDF). The reformulation approach presented is instance- and schema-aware. Thus, in contrast to relaxation techniques found in the state-of-the-art, the presented approach produces in-context query reformulation.

  19. The Query-commit Problem

    CERN Document Server

    Molinaro, Marco

    2011-01-01

    In the query-commit problem we are given a graph where edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if the queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy which maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating. In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain in polynomial time an optimal querying strategy when the input graph is sparse. Next we turn our attentions to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy which matches almost all nodes. This resu...

  20. Instant jQuery Flot visual data analysis

    CERN Document Server

    Peiris, Brian

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. A quick, instruction-based guide full of examples that details the various aspects of Flot and how users can apply it to data groups for interactive data representation techniques. If you are a data visualization developer, mapping and presentation software developer, or anyone with an interest in jQuery visualization, this book is ideal for you. If you have a working knowledge of jQuery and JavaScript, you can use this book to add sophisticated visualizations to your web applicat

  1. Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

    DEFF Research Database (Denmark)

    Jakobsen, Kim Ahlstrøm; Andersen, Alex B.; Hose, Katja

    2015-01-01

    In today’s data-driven world, analytical querying, typically based on the data cube concept, is the cornerstone of answering important business questions and making data-driven decisions. Traditionally, the underlying analytical data was mostly internal to the organization and stored in relational data warehouses and data cubes. Today, external data sources are essential for analytics and, as the Semantic Web gains popularity, more and more external sources are available in native RDF. With the recent SPARQL 1.1 standard, performing analytical queries over RDF data sources has finally become...

  2. Efficient Path Query and Reasoning Method Based on Rare Axis

    Institute of Scientific and Technical Information of China (English)

    姜洋; 冯志勇; 王鑫; 马晓宁

    2015-01-01

    A new concept of rare axis based on statistical facts is proposed, and an evaluation algorithm is designed thereafter. For the nested regular expressions containing rare axes, the proposed algorithm can reduce its evaluation complexity from polynomial time to nearly linear time. The distributed technique is also employed to construct the navigation axis indexes for resource description framework (RDF) graph data. Experiment results in DrugBank and BioGRID show that this method can improve the query efficiency significantly while ensuring the accuracy and meet the query requirements on Web-scale RDF graph data.

  3. Multi-Dimensional Path Queries

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    We present the path-relationship model that supports multi-dimensional data modeling and querying. A path-relationship database is composed of sets of paths and sets of relationships. A path is a sequence of related elements (atoms, paths, and sets of paths). A relationship is a binary path...... to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments...

  4. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  5. The role of economics in the QUERI program: QUERI Series

    Directory of Open Access Journals (Sweden)

    Smith Mark W

    2008-04-01

    Background: The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods: We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results: Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Conclusion: Economics appears to play an important role in QUERI implementation studies only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  6. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    In today’s OLAP systems, integrating fast changing data physically into a cube is complex and time-consuming. Our solution, the “OLAP-XML Federation System,” makes it possible to reference the fast changing data in XML format in OLAP queries without physical integration. In this paper, we introdu...

  7. Sexual information seeking on web search engines.

    Science.gov (United States)

    Spink, Amanda; Koricich, Andrew; Jansen, B J; Cole, Charles

    2004-02-01

    Sexual information seeking is an important element within human information behavior. Seeking sexually related information on the Internet takes many forms and channels, including chat rooms discussions, accessing Websites or searching Web search engines for sexual materials. The study of sexual Web queries provides insight into sexually-related information-seeking behavior, of value to Web users and providers alike. We qualitatively analyzed queries from logs of 1,025,910 Alta Vista and AlltheWeb.com Web user queries from 2001. We compared the differences in sexually-related Web searching between Alta Vista and AlltheWeb.com users. Differences were found in session duration, query outcomes, and search term choices. Implications of the findings for sexual information seeking are discussed.

  8. Semantic – Based Querying Using Ontology in Relational Database of Library Management System

    Directory of Open Access Journals (Sweden)

    Ayesha Banu

    2011-11-01

    The traditional Web stores a huge amount of data in the form of Relational Databases (RDB), as they are good at storing objects and the relationships between them. Relational Databases are dynamic in nature, which allows bringing tables together and helps the user to search for related material across multiple tables. RDBs are scalable and can expand as the data grows. The RDB uses a Structured Query Language called SQL to access the databases for several data retrieval purposes. The world is moving today from the syntactic form to the semantic form, and the Web is also taking its new form as the Semantic Web. The Structured Query of the RDB on the Web can be a Semantic Query on the Semantic Web. SPARQL is the query language recommended by W3C for the RDF (Resource Description Framework). RDF is a directed, labeled graph data format for representing information in the Web and is a very important layer of the Semantic Web Architecture. In this paper we consider the Library Management System (LMS) database, taking some tuples of the LMS Relational Schema. We discuss how the RDF code is scripted and validated using RDF Validator and how RDF Triples are generated. Later we give the graphical representation of the RDF triples and see the process of extracting ontology from the RDF Schema and application of the Semantic Query.
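
    How a relational tuple might be turned into RDF triples can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the paper: the example.org namespace, the Book table, its columns, and the mapping convention (one subject per row, one predicate per column, all values as plain literals) are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

BASE = "http://example.org/lms/"  # hypothetical namespace for the LMS schema
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def row_to_triples(table: str, key: str, row: Dict[str, object]) -> List[Triple]:
    """Map one relational row to triples: one subject per row, one predicate per column."""
    subject = f"<{BASE}{table}/{row[key]}>"
    triples: List[Triple] = [(subject, RDF_TYPE, f"<{BASE}{table}>")]
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is already in the subject URI; NULLs yield no triple
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append((subject, f"<{BASE}{table}#{column}>", f'"{literal}"'))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples in N-Triples style, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical tuple from a Book table.
book = {"book_id": 42, "title": "Database Systems", "author": "C. J. Date"}
book_triples = row_to_triples("Book", "book_id", book)
```

    Calling to_ntriples(book_triples) yields three statements about the subject <http://example.org/lms/Book/42>, which could then be loaded into a triple store and queried with SPARQL.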

  9. XML Multidimensional Modelling and Querying

    CERN Document Server

    Boucher, Serge; Zimányi, Esteban

    2009-01-01

    As XML becomes ubiquitous and XML storage and processing becomes more efficient, the range of use cases for these technologies widens daily. One promising area is the integration of XML and data warehouses, where an XML-native database stores multidimensional data and processes OLAP queries written in the XQuery interrogation language. This paper explores issues arising in the implementation of such a data warehouse. We first compare approaches for multidimensional data modelling in XML, then describe how typical OLAP queries on these models can be expressed in XQuery. We then show how, regardless of the model, the grouping features of XQuery 1.1 improve performance and readability of these queries. Finally, we evaluate the performance of query evaluation in each modelling choice using the eXist database, which we extended with a grouping clause implementation.

  10. Schedule Sales Query Raw Data

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  11. Web Service: MedlinePlus

    Science.gov (United States)

    ... an alternate method of accessing MedlinePlus data. Base URL https://wsearch.nlm.nih.gov/ws/query Please ... the Web service. All special characters must be URL encoded. Spaces may be replaced by '+' signs, which ...
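
    The encoding rule mentioned above can be shown with a short sketch that builds a request URL without sending it. The parameter names db and term and the default value healthTopics are assumptions, since the snippet elides the parameter list; only the base URL is taken from the record.

```python
from urllib.parse import urlencode

BASE_URL = "https://wsearch.nlm.nih.gov/ws/query"


def build_query_url(term: str, db: str = "healthTopics") -> str:
    """Build a query URL; 'db' and 'term' are assumed parameter names."""
    # urlencode percent-encodes special characters and turns spaces into '+'.
    return f"{BASE_URL}?{urlencode({'db': db, 'term': term})}"


url = build_query_url("heart attack & stroke")
```

    Here the spaces in the search term become '+' signs and the ampersand becomes %26, so it is not mistaken for a parameter separator.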

  12. TLGM-QL: Analytical Query Language of Web Data Based on Graph Model

    Institute of Scientific and Technical Information of China (English)

    马强; 陶导; 钱卫宁; 周傲英

    2009-01-01

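    With the rapid growth of the World Wide Web in scale and applications, storing and using Web data effectively has become a major challenge for many research areas in computer science. To address these pressing needs, this paper introduces a new Web analysis tool, TLGM-QL (tagged and labeled graph model query language). Users only need to write descriptive, SQL-like analytical query statements to obtain analysis results over Web data organized as a graph. Users need not be concerned with the underlying implementation: the system turns a TLGM-QL query into a physical execution plan, assigns it to a cluster for highly parallel execution, and finally returns the query results.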

  13. Ontological Queries: Rewriting and Optimization (Extended Version)

    CERN Document Server

    Gottlob, Georg; Pieris, Andreas

    2011-01-01

    Ontological queries are evaluated against an ontology rather than directly on a database. The evaluation and optimization of such queries is an intriguing new problem for database research. In this paper we discuss two important aspects of this problem: query rewriting and query optimization. Query rewriting consists of the compilation of an ontological query into an equivalent query against the underlying relational database. The focus here is on soundness and completeness. We review previous results and present a new rewriting algorithm for rather general types of ontological constraints. In particular, we show how a conjunctive query against an ontology can be compiled into a union of conjunctive queries against the underlying database. Ontological query optimization, in this context, attempts to improve this process so as to produce possibly small and cost-effective UCQ rewritings for an input query. We review existing optimization methods, and propose an effective new method that works for linear Datalog+/-...
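
    The compilation of an ontological query into a union of queries over the database can be illustrated for a very restricted case. The sketch below handles only atomic queries over unary predicates with simple inclusion constraints, which is far less general than the rewriting algorithm of the paper; the ontology, predicate names, and data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical ontology: each pair (sub, sup) states that every sub is a sup.
INCLUSIONS: List[Tuple[str, str]] = [
    ("Professor", "Teacher"),
    ("Lecturer", "Teacher"),
    ("FullProfessor", "Professor"),
]

# Hypothetical database: stored facts per unary predicate.
DATABASE: Dict[str, Set[str]] = {
    "Teacher": {"ann"},
    "Lecturer": {"bob"},
    "FullProfessor": {"eve"},
    "Student": {"sam"},
}


def rewrite(predicate: str) -> List[str]:
    """Return every predicate whose stored facts entail the queried predicate."""
    result = [predicate]
    frontier = [predicate]
    while frontier:
        current = frontier.pop()
        for sub, sup in INCLUSIONS:
            if sup == current and sub not in result:
                result.append(sub)
                frontier.append(sub)
    return result


def answer(predicate: str, database: Dict[str, Set[str]]) -> Set[str]:
    """Evaluate the rewritten union directly over the database, without reasoning."""
    answers: Set[str] = set()
    for disjunct in rewrite(predicate):
        answers |= database.get(disjunct, set())
    return answers


teachers = answer("Teacher", DATABASE)
```

    The query for Teacher is rewritten into a union of four atomic queries, so eve is returned even though the database stores her only as a FullProfessor. The size of this union is what optimization methods like those discussed in the abstract try to keep small.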

  14. Federal Fleet Report

    Data.gov (United States)

    General Services Administration — Annual report of Federal agencies' motor vehicle fleet data collected in the Federal Automotive Statistical Tool (FAST), a web-based reporting tool cosponsored by...

  15. Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

    DEFF Research Database (Denmark)

    Jakobsen, Kim Ahlstrøm; Andersen, Alex B.; Hose, Katja

    2015-01-01

    In today’s data-driven world, analytical querying, typically based on the data cube concept, is the cornerstone of answering important business questions and making data-driven decisions. Traditionally, the underlying analytical data was mostly internal to the organization and stored in relational data warehouses and data cubes. Today, external data sources are essential for analytics and, as the Semantic Web gains popularity, more and more external sources are available in native RDF. With the recent SPARQL 1.1 standard, performing analytical queries over RDF data sources has finally become feasible. However, unlike their relational counterparts, RDF data cube stores lack optimizations that enable fast querying. In this paper, we present an approach to optimizing RDF data cubes that is based on three novel cube patterns that optimize RDF data cubes, as well as associated algorithms...

  16. A Quantum Query Expansion Approach for Session Search

    Directory of Open Access Journals (Sweden)

    Peng Zhang

    2016-04-01

    Recently, Quantum Theory (QT) has been employed to advance the theory of Information Retrieval (IR). Various analogies between QT and IR have been established. Among them, a typical one is applying the idea of photon polarization in IR tasks, e.g., for document ranking and query expansion. In this paper, we aim to further extend this work by constructing a new superposed state of each document in the information need space, based on which we can incorporate the quantum interference idea in query expansion. We then apply the new quantum query expansion model to session search, which is a typical Web search task. Empirical evaluation on the large-scale Clueweb12 dataset has shown that the proposed model is effective in the session search tasks, demonstrating the potential of developing novel and effective IR models based on intuitions and formalisms of QT.

  17. The NIF LinkOut broker: a web resource to facilitate federated data integration using NCBI identifiers.

    Science.gov (United States)

    Marenco, Luis; Ascoli, Giorgio A; Martone, Maryann E; Shepherd, Gordon M; Miller, Perry L

    2008-09-01

    This paper describes the NIF LinkOut Broker (NLB) that has been built as part of the Neuroscience Information Framework (NIF) project. The NLB is designed to coordinate the assembly of links to neuroscience information items (e.g., experimental data, knowledge bases, and software tools) that are (1) accessible via the Web, and (2) related to entries in the National Center for Biotechnology Information's (NCBI's) Entrez system. The NLB collects these links from each resource and passes them to the NCBI which incorporates them into its Entrez LinkOut service. In this way, an Entrez user looking at a specific Entrez entry can LinkOut directly to related neuroscience information. The information stored in the NLB can also be utilized in other ways. A second approach, which is operational on a pilot basis, is for the NLB Web server to create dynamically its own Web page of LinkOut links for each NCBI identifier in the NLB database. This approach can allow other resources (in addition to the NCBI Entrez) to LinkOut to related neuroscience information. The paper describes the current NLB system and discusses certain design issues that arose during its implementation.

  18. The NIF LinkOut Broker: A Web Resource to Facilitate Federated Data Integration using NCBI Identifiers

    Science.gov (United States)

    Ascoli, Giorgio A.; Martone, Maryann E.; Shepherd, Gordon M.; Miller, Perry L.

    2009-01-01

    This paper describes the NIF LinkOut Broker (NLB) that has been built as part of the Neuroscience Information Framework (NIF) project. The NLB is designed to coordinate the assembly of links to neuroscience information items (e.g., experimental data, knowledge bases, and software tools) that are (1) accessible via the Web, and (2) related to entries in the National Center for Biotechnology Information’s (NCBI’s) Entrez system. The NLB collects these links from each resource and passes them to the NCBI which incorporates them into its Entrez LinkOut service. In this way, an Entrez user looking at a specific Entrez entry can LinkOut directly to related neuroscience information. The information stored in the NLB can also be utilized in other ways. A second approach, which is operational on a pilot basis, is for the NLB Web server to create dynamically its own Web page of LinkOut links for each NCBI identifier in the NLB database. This approach can allow other resources (in addition to the NCBI Entrez) to LinkOut to related neuroscience information. The paper describes the current NLB system and discusses certain design issues that arose during its implementation. PMID:18975149

  19. Integrity Constraint Checking in Federated Databases

    NARCIS (Netherlands)

    Grefen, Paul; Widom, Jennifer

    1996-01-01

    A federated database is comprised of multiple interconnected databases that cooperate in an autonomous fashion. Global integrity constraints are very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional const

  20. The Navigational Power of Web Browsers

    NARCIS (Netherlands)

    Bielecki, M.; Hidders, J.; Paredaens, J.; Spielmann, M.; Tyszkiewicz, J.; Van den Bussche, J.

    2010-01-01

    We investigate the computational capabilities of Web browsers, when equipped with a standard finite automaton. We observe that Web browsers are Turing-complete. We introduce the notion of a navigational problem, and investigate the complexity of solving Web queries and navigational problems by Web br

  1. Beginning ASP.NET Web Pages with WebMatrix

    CERN Document Server

    Brind, Mike

    2011-01-01

    Learn to build dynamic web sites with Microsoft WebMatrix. Microsoft WebMatrix is designed to make developing dynamic ASP.NET web sites much easier. This complete Wrox guide shows you what it is, how it works, and how to get the best from it right away. It covers all the basic foundations and also introduces HTML, CSS, and Ajax using jQuery, giving beginning programmers a firm foundation for building dynamic web sites. Examines how WebMatrix is expected to become the new recommended entry-level tool for developing web sites using ASP.NET. Arms beginning programmers, students, and educators with al

  2. Effective Density Queries of Continuously Moving Objects

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Lin, D.; Ooi, B.C.

    2006-01-01

    In this paper, we study a newly emerging type of query on moving objects: the density query. This query locates regions in the data space where the density of the objects is high. This type of query is especially useful in Location Based Services (LBS). For example, in a traffic...
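    A minimal sketch of the idea of locating dense regions, under strong simplifying assumptions: object positions are taken as a single snapshot (motion is ignored), regions are cells of a fixed grid, and "dense" means at least `min_count` objects in a cell. The function name and parameters are hypothetical, and this is not the method of the paper.

```python
from collections import Counter


def dense_cells(points, cell_size, min_count):
    """Return the grid cells that contain at least `min_count` points.

    A cell is identified by its integer (column, row) index on a square
    grid with side length `cell_size`.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    return sorted(cell for cell, n in counts.items() if n >= min_count)


# Snapshot of object positions (made-up values).
positions = [(0.5, 0.5), (0.7, 0.2), (0.9, 0.9), (3.1, 3.2), (5.5, 0.1)]
print(dense_cells(positions, cell_size=1.0, min_count=3))  # [(0, 0)]
```

    A fixed grid can miss a dense cluster that straddles a cell boundary, which is one reason a practical density query needs a more careful definition of a region than this sketch uses.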

  3. Ad-Hoc Queries over Document Collections - A Case Study

    Science.gov (United States)

    Löser, Alexander; Lutter, Steffen; Düssel, Patrick; Markl, Volker

    We discuss the novel problem of supporting analytical business intelligence queries over web-based textual content, e.g., BI-style reports based on hundreds of thousands of documents from an ad-hoc web search result. Neither conventional search engines nor conventional Business Intelligence and ETL tools address this problem, which lies at the intersection of their capabilities. "Google Squared" and our system GOOLAP.info are examples of such systems. They execute information extraction methods over one or several document collections at query time and integrate extracted records into a common view or tabular structure. Frequent extraction and object resolution failures cause incomplete records that cannot be joined into a record answering the query. Our focus is the identification of join-reordering heuristics that maximize the number of complete records answering a structured query. With respect to given costs for document extraction, we propose two novel join operations: the multi-way CJ-operator joins records from multiple relationships extracted from a single document, and the two-way join operator DJ ensures data density by removing incomplete records from results. In a preliminary case study we observe that our join-reordering heuristics improve result size and record density and lower execution costs.

  4. Privacy Preserving Moving KNN Queries

    CERN Document Server

    Hashem, Tanzima; Zhang, Rui

    2011-01-01

    We present a novel approach that protects trajectory privacy of users who access location-based services through a moving k nearest neighbor (MkNN) query. An MkNN query continuously returns the k nearest data objects for a moving user (query point). Simply updating a user's imprecise location such as a region instead of the exact position to a location-based service provider (LSP) cannot ensure privacy of the user for an MkNN query: continuous disclosure of regions enables the LSP to follow a user's trajectory. We identify the problem of trajectory privacy that arises from the overlap of consecutive regions while requesting an MkNN query and provide the first solution to this problem. Our approach allows a user to specify the confidence level that represents a bound of how much more the user may need to travel than the actual kth nearest data object. By hiding a user's required confidence level and the required number of nearest data objects from an LSP, we develop a technique to prevent the LSP from tracking...

  5. Dynamic Planar Range Maxima Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Tsakalidis, Konstantinos

    2011-01-01

    We consider the dynamic two-dimensional maxima query problem. Let P be a set of n points in the plane. A point is maximal if it is not dominated by any other point in P. We describe two data structures that support the reporting of the t maximal points that dominate a given query point, and allow... in the worst case. The data structure also supports the more general query of reporting the maximal points among the points that lie in a given 3-sided orthogonal range unbounded from above in the same complexity. We can support 4-sided queries in O(log^2 n + t) worst case time, and O(log^2 n) worst case update time, using O(n log n) space, where t is the size of the output. This improves the worst case deletion time of the dynamic rectangular visibility query problem from O(log^3 n) to O(log^2 n). We adapt the data structure to the RAM model with word size w, where the coordinates of the points...
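    The query itself, as opposed to the dynamic data structures the paper contributes, can be sketched statically. The sketch below assumes that p dominates q when p is at least as large in both coordinates and p differs from q; it recomputes the maxima from scratch with a right-to-left sweep and supports no efficient insertions or deletions. All function names are hypothetical.

```python
def dominates(p, q):
    """p dominates q if p is at least as large in both coordinates and p != q."""
    return p[0] >= q[0] and p[1] >= q[1] and p != q


def maximal_points(points):
    """Return the maximal points of a 2D point set.

    Sweep from right to left: a point is maximal exactly when its y value
    exceeds every y value seen so far.
    """
    result = []
    best_y = float("-inf")
    for p in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if p[1] > best_y:
            result.append(p)
            best_y = p[1]
    return result


def maxima_dominating(points, q):
    """Report the maximal points of `points` that dominate the query point q."""
    return [p for p in maximal_points(points) if dominates(p, q)]


P = [(1, 5), (2, 3), (4, 4), (5, 1), (3, 2)]
print(maximal_points(P))             # [(5, 1), (4, 4), (1, 5)]
print(maxima_dominating(P, (2, 2)))  # [(4, 4)]
```

    Each call sorts the whole set, so a query costs O(n log n) here. The bounds quoted in the abstract are what a dedicated dynamic structure achieves instead.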

  6. Ontology-based geospatial data query and integration

    Science.gov (United States)

    Zhao, T.; Zhang, C.; Wei, M.; Peng, Z.-R.

    2008-01-01

    Geospatial data sharing is an increasingly important subject as large amounts of data are produced by a variety of sources, stored in incompatible formats, and accessible through different GIS applications. Past efforts to enable sharing have produced standardized data formats such as GML and data access protocols such as Web Feature Service (WFS). While these standards help client applications gain access to heterogeneous data stored in different formats from diverse sources, the usability of the access is limited due to the lack of data semantics encoded in the WFS feature types. Past research has used ontology languages to describe the semantics of geospatial data, but ontology-based queries cannot be applied directly to legacy data stored in databases or shapefiles, or to feature data in WFS services. This paper presents a method to enable ontology queries on spatial data available from WFS services and on data stored in databases. We do not create ontology instances explicitly and thus avoid the problems of data replication. Instead, user queries are rewritten to WFS getFeature requests and SQL queries to the database. The method also has the benefit of being able to use existing database, WFS, and GML tools while enabling queries based on ontology semantics. © 2008 Springer-Verlag Berlin Heidelberg.
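    The rewriting step described above can be sketched as a lookup from an ontology class to the physical source that holds its instances, followed by generation of either a WFS GetFeature request or an SQL query. Everything in the mapping table (class names, endpoint, feature type, table name) is hypothetical, and the sketch handles only "all instances of a class" queries. The key-value parameters follow the WFS 1.1.0 convention; the paper's actual rewriting rules are not reproduced here.

```python
from urllib.parse import urlencode

# Hypothetical mapping from ontology classes to physical data sources.
MAPPINGS = {
    "hydro:River": {
        "kind": "wfs",
        "endpoint": "https://example.org/wfs",
        "type_name": "topo:rivers",
    },
    "trans:Road": {"kind": "sql", "table": "roads"},
}


def rewrite(ontology_class: str, max_features: int = 10) -> str:
    """Rewrite 'all instances of ontology_class' for the mapped source."""
    mapping = MAPPINGS[ontology_class]
    if mapping["kind"] == "wfs":
        params = {
            "service": "WFS",
            "version": "1.1.0",
            "request": "GetFeature",
            "typeName": mapping["type_name"],
            "maxFeatures": max_features,
        }
        return mapping["endpoint"] + "?" + urlencode(params)
    # Table names come from the trusted mapping table, not from user input.
    return f"SELECT * FROM {mapping['table']} LIMIT {int(max_features)}"


print(rewrite("hydro:River"))
print(rewrite("trans:Road"))
```

    Because only the query is rewritten, no ontology instances have to be materialized, which matches the abstract's point about avoiding data replication.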

  7. Query optimization for graph analytics on linked data using SPARQL

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Seokyong [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lim, Seung-Hwan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sukumar, Sreenivas R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Vatsavai, Ranga Raju [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.
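    One of the graph operations named above, counting triangles, can be sketched over a list of (subject, predicate, object) triples. This is plain Python that mimics the three-way join a triple-pattern query would perform. It is not the SPARQL-based implementation from the paper, it treats every triple as an undirected edge, and the sample data is made up.

```python
from itertools import combinations


def count_triangles(triples):
    """Count triangles in the undirected graph induced by the triples.

    Predicates are ignored and self-loops are skipped. Each triangle is
    counted once, at its smallest vertex.
    """
    neighbours = {}
    for s, _p, o in triples:
        if s == o:
            continue
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)

    count = 0
    for a in neighbours:
        larger = sorted(n for n in neighbours[a] if n > a)
        for b, c in combinations(larger, 2):
            if c in neighbours[b]:
                count += 1
    return count


triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "knows", "d"),
]
print(count_triangles(triples))  # 1
```

    Restricting the search to neighbours larger than the current vertex is a small instance of the "removing redundant computations" theme: without it every triangle would be found six times.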

  8. Data management on the spatial web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2012-01-01

    ... billion web queries are issued that have local intent and target spatial web objects. These are points of interest with a web presence, and they thus have locations as well as textual descriptions. This development has given prominence to spatial web data management, an area ripe with new and exciting opportunities and challenges. The research community has embarked on inventing and supporting new query functionality for the spatial web. Different kinds of spatial web queries return objects that are near a location argument and are relevant to a text argument. To support such queries, it is important... functionality enabled by the setting. Further, the talk offers insight into the data management techniques capable of supporting such functionality....

  9. Bottom-up mining of XML query patterns to improve XML querying

    Institute of Scientific and Technical Information of China (English)

    Yi-jun BEI; Gang CHEN; Jin-xiang DONG; Ke CHEN

    2008-01-01

    Querying XML data is a computationally expensive process due to the complex nature of both the XML data and the XML queries. In this paper we propose an approach to expedite XML query processing by caching the results of frequent queries. We discover frequent query patterns from user-issued queries using an efficient bottom-up mining approach called VBUXMiner. VBUXMiner consists of two main steps. First, all queries are merged into a summary structure named "compressed global tree guide" (CGTG). Second, a bottom-up traversal scheme based on the CGTG is employed to generate frequent query patterns. We use the frequent query patterns in a cache mechanism to improve the XML query performance. Experimental results show that our proposed mining approach outperforms the previous mining algorithms for XML queries, such as XQPMinerTID and FastXMiner, and that by caching the results of frequent query patterns, XML query performance can be dramatically improved.

  10. jQuery 2.0 development cookbook

    CERN Document Server

    Revill, Leon

    2014-01-01

    Taking a recipe-based approach, this book presents numerous practical examples that you can use directly in your applications. The book covers the essential issues you will face while developing your web applications and gives you solutions to them. The recipes in this book are written in a manner that rapidly takes you from beginner to expert level. This book is for web developers of all skill levels. Although some knowledge of JavaScript, HTML, and CSS is required, this Cookbook will teach jQuery newcomers all the basics required to move on to the more complex examples of this book, which wil

  11. The JAVA-based DICOM query interface DicoSE.

    Science.gov (United States)

    Prinz, Michael; Fischer, Georg; Schuster, Ernst

    2005-03-01

    DICOM 3 is a very elaborate standard for the communication between medical image devices. It is published in several parts by the National Electrical Manufacturers Association (NEMA). To adequately visualize the data structure defined in parts 3, 5 and 6 of the DICOM standard, we implemented the web based Dicom Search Engine (DicoSE). It allows for querying the DICOM standard data dictionary for defined data fields and visualizes the topology of the data which is inherently present in DICOM datasets. For the administration of the underlying data a web based administration interface is provided. The service is entirely based on freely available software.

  12. Query Results Clustering by Extending SPARQL with CLUSTER BY

    Science.gov (United States)

    Ławrynowicz, Agnieszka

    Dynamic clustering of search results has proved useful in the Web context, where the user often does not know the granularity of the search results in advance. The goal of this paper is to provide a declarative way of invoking dynamic clustering of the results of queries submitted over Semantic Web data. To achieve this goal, the paper proposes an approach that extends SPARQL with clustering abilities. The approach introduces a new statement, CLUSTER BY, into the SPARQL grammar and proposes semantics for such an extension.

  13. JavaScript & jQuery The Missing Manual

    CERN Document Server

    McFarland, David

    2011-01-01

    JavaScript lets you supercharge your HTML with animation, interactivity, and visual effects, but many web designers find the language hard to learn. This jargon-free guide covers JavaScript basics and shows you how to save time and effort with the jQuery library of prewritten JavaScript code. You'll soon be building web pages that feel and act like desktop programs, without having to do much programming. The important stuff you need to know: Make your pages interactive. Create JavaScript events that react to visitor actions. Use animations and effects. Build drop-down navigation menus, pop-ups

  14. Condorcet query engine: A query engine for coordinated index terms

    NARCIS (Netherlands)

    van der Vet, P.E.; Mars, Nicolaas

    1999-01-01

    On-line information retrieval systems often offer their users some means to tune the query to match the level of granularity of the information request. Users can be offered a far greater range of possibilities, however, if documents are indexed with coordinated index concepts. Coordinated index

  15. Querying and Manipulating Temporal Databases

    Directory of Open Access Journals (Sweden)

    Mohamed Mkaouar

    2011-03-01

    Many works have focused, for over twenty-five years, on the integration of the time dimension in databases (DB). However, the standard SQL3 does not yet allow easy definition, manipulation and querying of temporal DBs. In this paper, we study how we can simplify querying and manipulating temporal facts in SQL3, using a model that integrates time in a native manner. To do this, we propose new keywords and syntax to define different temporal versions for many relational operators and functions used in SQL. It then becomes possible to perform various queries and updates appropriate to temporal facts. We illustrate the use of these proposals on many examples from a real application.

  16. Webspace query formulation: an overview

    NARCIS (Netherlands)

    Zwol, van Roelof; Apers, Peter M.G.

    2001-01-01

    To find information on the World-Wide Web (WWW), two approaches are generally followed. Browsing the web from a specific starting point, or web-site map, is called search by divergence. The second approach, search by convergence, is followed when using a search engine. Most search engines use a informat

  17. Maximum Spanning Tree Model on Personalized Web Based Collaborative Learning in Web 3.0

    CERN Document Server

    Padma, S

    2012-01-01

    Web 3.0 is an evolving extension of the current web environment. Information in Web 3.0 can be collaborated on and communicated when queried. Web 3.0 architecture provides an excellent learning experience for students. Web 3.0 is 3D, media-centric and semantic. Web-based learning has been on the rise recently. Web 3.0 has intelligent agents as tutors that collect and disseminate answers to students' queries. Fully interactive learner queries determine the customization of the intelligent tutor. This paper analyses the attributes of the Web 3.0 learning environment. A maximum spanning tree model for personalized web-based collaborative learning is designed.

  18. Digital Earth Watch (DEW): How Mobile Apps Are Paving The Way Towards A Federated Web-Services Architecture For Citizen Science

    Science.gov (United States)

    Carrera, F.; Schloss, A. L.; Guerin, S.; Beaudry, J.; Pickle, J.

    2011-12-01

    interacts with the UNH server via APIs (Application Programming Interfaces) that were created to allow bi-directional machine-to-machine interaction between the mobile device and the web site. Thus, the principal functions that a user can perform on the web site, such as finding post sites on a map and viewing and adding picture sets, are available on the smartphone. The development of the APIs not only makes it possible to communicate with our own mobile app but, more importantly, opens the door for other computer systems to interact directly with our server. Our ongoing discussions with the National Phenology Network and Project Budburst have highlighted the potential (and perhaps the need) for the creation of a distributed web-service architecture whereby each national program exposes its key functionalities not only to its own mobile phone apps, but also to other organizations, in a federated system of servers, all supporting citizen-based digital earth watch programs.

  19. Cross Language Information Retrieval Model for Discovering WSDL Documents Using Arabic Language Query

    Directory of Open Access Journals (Sweden)

    Prof. Dr. Torkey I. Sultan

    2013-09-01

    Web service discovery is the process of finding a suitable Web service for a given user’s query by analyzing the Web service’s WSDL content and finding the best match for the query. The service query must be written in the same language as the WSDL, for example English. Cross-Language Information Retrieval (CLIR) techniques are not used in the Web service discovery process. The absence of CLIR methods limits the search language to English keywords only, which raises the following question: “How do people who do not know English find a Web service?” This paper proposes applying CLIR techniques and IR methods to support a bilingual Web service discovery process; the second language proposed here is Arabic. Text mining techniques were applied to the WSDL content and the user’s query to prepare them for the CLIR methods. The proposed model was tested on a curated catalogue of Life Science Web Services, http://www.biocatalogue.org/, and was used to solve the research problem with 99.87% accuracy and 95.06 precision.

  20. CLASCN: Candidate Network Selection for Efficient Top-k Keyword Queries over Databases

    Institute of Scientific and Technical Information of China (English)

    Jun Zhang; Zhao-Hui Peng; Shan Wang; Hui-Jing Nie

    2007-01-01

    Keyword Search Over Relational Databases (KSORD) enables casual or Web users to easily access databases through free-form keyword queries. Improving the performance of KSORD systems is a critical issue in this area. In this paper, a new approach, CLASCN (Classification, Learning And Selection of Candidate Network), is developed to efficiently perform top-k keyword queries in schema-graph-based online KSORD systems. In this approach, the Candidate Networks (CNs) from trained keyword queries or executed user queries are classified and stored in the databases, and top-k results from the CNs are learned for constructing CN Language Models (CNLMs). The CNLMs are used to compute the similarity scores between a new user query and the CNs from the query. The CNs with relatively large similarity scores, which are the most promising ones to produce top-k results, are selected and executed. Currently, CLASCN is only applicable to past queries and New All-keyword-Used (NAU) queries, which are frequently submitted queries. Extensive experiments also show the efficiency and effectiveness of our CLASCN approach.

  1. Preference Elicitation in Prioritized Skyline Queries

    CERN Document Server

    Mindolin, Denis

    2010-01-01

    Preference queries incorporate the notion of binary preference relation into relational database querying. Instead of returning all the answers, such queries return only the best answers, according to a given preference relation. Preference queries are a fast growing area of database research. Skyline queries constitute one of the most thoroughly studied classes of preference queries. A well known limitation of skyline queries is that skyline preference relations assign the same importance to all attributes. In this work, we study p-skyline queries that generalize skyline queries by allowing varying attribute importance in preference relations. We perform an in-depth study of the properties of p-skyline preference relations. In particular, we study the problems of containment and minimal extension. We apply the obtained results to the central problem of the paper: eliciting relative importance of attributes. Relative importance is implicit in the constructed p-skyline preference relation. The elicitation is ba...

  2. Scalable Social Coordination using Enmeshed Queries

    CERN Document Server

    Chen, Jianjun; Varghese, George

    2012-01-01

    Social coordination allows users to move beyond awareness of their friends to efficiently coordinating physical activities with others. While specific forms of social coordination can be seen in tools such as Evite, Meetup and Groupon, we introduce a more general model using what we call enmeshed queries. An enmeshed query allows users to declaratively specify an intent to coordinate by specifying social attributes such as the desired group size and who/what/when, and the database returns matching queries. Enmeshed queries are continuous, but new queries (and not data) answer older queries; the variable group size also makes enmeshed queries different from entangled queries, publish-subscribe systems, and dating services. We show that even offline group coordination using enmeshed queries is NP-hard. We then introduce efficient heuristics that use selective indices such as location and time to reduce the space of possible matches; we also add refinements such as delayed evaluation and using the relative...

  3. Improving Web Page Retrieval using Search Context from Clicked Domain Names

    NARCIS (Netherlands)

    Li, Rongmei

    2009-01-01

    Search context is a crucial factor that helps to understand a user’s information need in ad-hoc Web page retrieval. A query log of a search engine contains rich information on issued queries and their corresponding clicked Web pages. The clicked data implies its relevance to the query and can be use

  4. Improving Web Page Retrieval using Search Context from Clicked Domain Names

    NARCIS (Netherlands)

    Li, R.

    Search context is a crucial factor that helps to understand a user’s information need in ad-hoc Web page retrieval. A query log of a search engine contains rich information on issued queries and their corresponding clicked Web pages. The clicked data implies its relevance to the query and can be

  5. Query Expansion Using Heterogeneous Thesauri.

    Science.gov (United States)

    Mandala, Rila; Tokunaga, Takenobu; Tanaka, Hozumi

    2000-01-01

    Proposes a method to improve the performance of information retrieval systems by expanding queries using heterogeneous thesauri. Experiments show that using heterogeneous thesauri with an appropriate weighting method results in better retrieval performance than using only one type of thesaurus. (Author/LRW)

  6. Accomplishing Deterministic XML Query Optimization

    Institute of Scientific and Technical Information of China (English)

    Dun-Ren Che

    2005-01-01

    As the popularity of XML (eXtensible Markup Language) keeps growing rapidly, the management of XML compliant structured-document databases has become a very interesting and compelling research area. Query optimization for XML structured documents stands out as one of the most challenging research issues in this area because of the much enlarged optimization (search) space, which is a consequence of the intrinsic complexity of the underlying data model of XML data. We therefore propose to apply deterministic transformations to query expressions to prune the search space as aggressively as possible and quickly reach a sufficiently improved alternative (if not the optimal one) for each incoming query expression. This idea is not just exciting but practically attainable. This paper first provides an overview of our optimization strategy, and then focuses on the key implementation issues of our rule-based transformation system for XML query optimization in a database environment. The performance results we obtained from experimentation show that our approach is a valid and effective one.

  8. Querying Large Biological Network Datasets

    Science.gov (United States)

    Gulsoy, Gunhan

    2013-01-01

    New experimental methods have resulted in an increasing amount of genetic interaction data being generated every day. Biological networks are used to store the genetic interaction data gathered. The increasing amount of available data requires fast, large-scale analysis methods. Therefore, we address the problem of querying large biological network datasets.…

  9. Explanations for Skyline Query Results

    DEFF Research Database (Denmark)

    Chester, Sean; Assent, Ira

    2015-01-01

    Skyline queries are a well-studied problem for multidimensional data, wherein points are returned to the user iff no other point is preferable across all attributes. This leaves only the points most likely to appeal to an arbitrary user. However, some dominated points may still be interesting, an...
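    A minimal sketch of a skyline computation, using the usual dominance definition as an assumption: q dominates p when q is at least as good in every attribute and strictly better in at least one. Smaller values are taken to be better, and the hotel data is made up. The quadratic scan is for illustration only and does not address the explanation problem the paper studies.

```python
def dominates(q, p):
    """q dominates p: no worse in every attribute, strictly better in one.

    Smaller attribute values are assumed to be preferable.
    """
    return all(a <= b for a, b in zip(q, p)) and any(a < b for a, b in zip(q, p))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (price, distance to the beach in km): both should be small.
hotels = [(50, 3.0), (80, 1.0), (60, 3.5), (90, 0.5), (85, 2.0)]
print(skyline(hotels))  # [(50, 3.0), (80, 1.0), (90, 0.5)]
```

    In this data the hotel (60, 3.5) is excluded because (50, 3.0) is both cheaper and closer. Identifying which points exclude a given dominated point is the kind of question an explanation for a skyline result has to answer.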

  10. Logical Querying of Relational Databases

    Directory of Open Access Journals (Sweden)

    Luminita Pistol

    2016-12-01

    Full Text Available This paper aims to demonstrate the usefulness of formal logic and lambda calculus in database programming. After a short introduction to propositional and first-order logic, we dynamically implement a small database and translate some SQL queries into filtered Java 8 streams, enhanced with the Tuples facilities from the jOOλ library.

  11. Large Catalogue Query Performance in Relational Databases

    Science.gov (United States)

    Power, Robert A.

    2007-05-01

    The performance of the MySQL and Oracle database systems has been compared for a selection of astronomy queries using large catalogues of up to a billion objects. The queries tested are those expected from the astronomy community: general database queries, cone searches, neighbour finding and cross matching. The catalogue preparation, SQL query formulation and database performance are presented. Most of the general queries perform adequately when appropriate indexes are present in the database. Each system performs well for cone search queries when the Hierarchical Triangular Mesh spatial index is used. Neighbour finding and cross matching are not well supported in a database environment when compared to software specifically developed to solve these problems.

  12. Digging Deeper: The Deep Web.

    Science.gov (United States)

    Turner, Laura

    2001-01-01

    Focuses on the Deep Web, defined as Web content in searchable databases of the type that can be found only by direct query. Discusses the problems of indexing; inability to find information not indexed in the search engine's database; and metasearch engines. Describes 10 sites created to access online databases or directly search them. Lists ways…

  14. Optimizing Temporal Queries: Efficient Handling of Duplicates

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2001-01-01

    Recent research in the area of temporal databases has proposed a number of query languages that vary in their expressive power and the semantics they provide to users. These query languages represent a spectrum of solutions to the tension between clean semantics and efficient evaluation. Often, these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the-art relational products. This paper presents an optimization technique that produces more efficient translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages.

  16. Format SPARQL Query Results into HTML Report

    Directory of Open Access Journals (Sweden)

    Dr Sunitha Abburu

    2013-07-01

    Full Text Available SPARQL is one of the powerful query languages for querying semantic data. It is recognized by the W3C as a query language for RDF. Several query result formats have been defined for it, such as CSV, TSV and XML. These formats are not attractive or easy to read and understand, and they require additional transformations or tool support to present the query result in a user-readable form. The main aim of this paper is to propose a method to build an HTML report dynamically from SPARQL query results. This allows SPARQL query results to be displayed easily as an attractive, understandable HTML report without the support of any additional or external tools or transformations.
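
The idea of turning SPARQL results into an HTML report can be sketched as follows. This is only an illustration, not the method of the paper: it assumes the results arrive in the W3C SPARQL 1.1 JSON results layout (`head.vars` and `results.bindings`), and the function name `sparql_json_to_html` and the sample data are hypothetical.

```python
from html import escape


def sparql_json_to_html(results: dict) -> str:
    """Render a SPARQL JSON result document as a plain HTML table."""
    variables = results["head"]["vars"]
    bindings = results["results"]["bindings"]

    # One header cell per projected variable.
    header = "".join(f"<th>{escape(v)}</th>" for v in variables)

    body_rows = []
    for binding in bindings:
        cells = []
        for v in variables:
            # Unbound variables are absent from a binding; show an empty cell.
            value = binding.get(v, {}).get("value", "")
            cells.append(f"<td>{escape(value)}</td>")
        body_rows.append("<tr>" + "".join(cells) + "</tr>")

    return (
        "<table>"
        f"<thead><tr>{header}</tr></thead>"
        f"<tbody>{''.join(body_rows)}</tbody>"
        "</table>"
    )


# Hypothetical sample result with one unbound variable in the second row.
sample = {
    "head": {"vars": ["name", "homepage"]},
    "results": {
        "bindings": [
            {
                "name": {"type": "literal", "value": "Alice <A&B>"},
                "homepage": {"type": "uri", "value": "http://example.org/alice"},
            },
            {"name": {"type": "literal", "value": "Bob"}},
        ]
    },
}

html_report = sparql_json_to_html(sample)
print(html_report)
```

Escaping every value before it is placed in a cell keeps literals that contain markup characters from breaking the report.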

  17. Distributed Deep Web Search

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien

    2013-01-01

    The World Wide Web contains billions of documents (and counting); hence, it is likely that some document will contain the answer or content you are searching for. While major search engines like Bing and Google often manage to return relevant results to your query, there are plenty of situations in

  18. Distributed deep web search

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien-Tsoi Theodorus Egbert

    2013-01-01

    The World Wide Web contains billions of documents (and counting); hence, it is likely that some document will contain the answer or content you are searching for. While major search engines like Bing and Google often manage to return relevant results to your query, there are plenty of situations in

  19. Deep web content monitoring

    NARCIS (Netherlands)

    Khelghati, Mohammadreza

    2016-01-01

    In this thesis, we investigate the path towards a focused web harvesting approach which can automatically and efficiently query websites, navigate through results, download data, store it and track data changes over time. Such an approach can also facilitate users to access a complete collection of

  20. Retrieving top-k prestige-based relevant spatial web objects

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2010-01-01

    The location-aware keyword query returns ranked objects that are near a query location and that have textual descriptions that match query keywords. This query occurs inherently in many types of mobile and traditional web services and applications, e.g., Yellow Pages and Maps services. Previous...... of prestige-based relevance to capture both the textual relevance of an object to a query and the effects of nearby objects. Based on this, a new type of query, the Location-aware top-k Prestige-based Text retrieval (LkPT) query, is proposed that retrieves the top-k spatial web objects ranked according...... to both prestige-based relevance and location proximity. We propose two algorithms that compute LkPT queries. Empirical studies with real-world spatial data demonstrate that LkPT queries are more effective in retrieving web objects than a previous approach that does not consider the effects of nearby...

  1. Broadcast-Based Spatial Queries

    Institute of Scientific and Technical Information of China (English)

    Kwang-Jin Park; Moon-Bae Song; Chong-Sun Hwang

    2005-01-01

    Indexing techniques have been developed for wireless data broadcast environments, in order to conserve the scarce power resources of the mobile clients. However, the use of interleaved index segments in a broadcast cycle increases the average access latency for the clients. In this paper, the broadcast-based spatial query processing methods (BBS) are presented for location-based services. In the BBS, broadcast data objects are sorted sequentially based on their locations, and the server broadcasts the location-dependent data along with an index segment. Then, a sequential prefetching and caching scheme is designed to reduce the query response time. The performance of this scheme is investigated in relation to various environmental variables, such as the distributions of the data objects, the average speed of the clients and the size of the service area.

  2. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    A new language is introduced for describing hypotheses about fluctuations of measurable properties in streams of timestamped data, and as prime example, we consider trends of emotions in the constantly flowing stream of Twitter messages. The language, called EmoEpisodes, has a precise semantics that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return instantiations for topics and emotions as well as time intervals that provide the largest deflections in this measurement. Experiments are performed on a selection of Twitter data to demonstrate the usefulness of the approach.

  3. Benchmarking Performance of Web Service Operations

    OpenAIRE

    Zhang, Shuai

    2011-01-01

    Web services are often used for retrieving data from servers providing information of different kinds. A data providing web service operation returns collections of objects for a given set of arguments without any side effects. In this project a web service benchmark (WSBENCH) is developed to simulate the performance of web service calls. Web service operations are specified as SQL statements. The function generator of WSBENCH converts user specified SQL queries into functions and automatical...

  4. Comprendre le Web caché

    OpenAIRE

    Senellart, Pierre

    2007-01-01

    The hidden Web (also known as deep or invisible Web), that is, the part of the Web not directly accessible through hyperlinks, but through HTML forms or Web services, is of great value, but difficult to exploit. We discuss a process for the fully automatic discovery, syntactic and semantic analysis, and querying of hidden-Web services. We propose first a general architecture that relies on a semi-structured warehouse of imprecise (probabilistic) content. We provide a detailed complexity analy...

  5. Lightweight query authentication on streams

    OpenAIRE

    2014-01-01

    We consider a stream outsourcing setting, where a data owner delegates the management of a set of disjoint data streams to an untrusted server. The owner authenticates his streams via signatures. The server processes continuous queries on the union of the streams for clients trusted by the owner. Along with the results, the server sends proofs of result correctness derived from the owner's signatures, which are easily verifiable by the clients. We design novel constructions for a collection o...

  6. Building interactive queries with LINQPad

    CERN Document Server

    Finot, Sébastien

    2013-01-01

    A step-by-step practical guide that will introduce you to LINQPad's key features, thereby helping you to query databases interactively. This book is aimed at C#/.Net developers who wish to learn LINQ programming and leverage the easy way of using LINQPad. No prior knowledge of LINQ or LINQPad is expected. A basic knowledge of SQL and XML is required for some chapters.

  7. Flexible Query Answering Systems 2006

    DEFF Research Database (Denmark)

    submissions, relating to the topic of users posing queries and systems producing answers. The papers cover the fields: Database Management, Information Retrieval, Domain Modeling, Knowledge Representation and Ontologies, Knowledge Discovery and Data Mining, Artificial Intelligence, Classical and Non-classical Logics, Computational Linguistics and Natural Language Processing, Multimedia Information Systems, and Human--Computer Interaction, including reports of interesting applications. We wish to thank the contributors for their excellent papers and the referees, publisher, and sponsors for their effort...

  8. Predecessor queries in dynamic integer sets

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting

    1997-01-01

    We consider the problem of maintaining a set of n integers in the range 0..2^w-1 under the operations of insertion, deletion, predecessor queries, minimum queries and maximum queries on a unit cost RAM with word size w bits. Let f (n) be an arbitrary nondecreasing smooth function satisfying n...
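
To make the five operations concrete, here is a minimal baseline sketch. It is not the data structure of the paper: it keeps the integers in a sorted Python list, so queries take logarithmic time but insertion and deletion take linear time. The class name `SortedIntSet` is hypothetical, and the predecessor of x is assumed here to mean the largest stored element strictly smaller than x.

```python
import bisect


class SortedIntSet:
    """Baseline dynamic integer set backed by a sorted list."""

    def __init__(self):
        self._items = []

    def insert(self, x):
        i = bisect.bisect_left(self._items, x)
        # Skip duplicates so the structure behaves as a set.
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            self._items.pop(i)

    def predecessor(self, x):
        """Largest stored element strictly smaller than x, or None."""
        i = bisect.bisect_left(self._items, x)
        return self._items[i - 1] if i > 0 else None

    def minimum(self):
        return self._items[0] if self._items else None

    def maximum(self):
        return self._items[-1] if self._items else None


s = SortedIntSet()
for value in (5, 1, 9, 5):
    s.insert(value)
print(s.predecessor(6), s.minimum(), s.maximum())
```

The sorted list is chosen only for brevity; the bounds studied in the abstract concern much faster word-RAM structures.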

  9. Heuristics-based query optimisation for SPARQL

    NARCIS (Netherlands)

    P. Tsialiamanis (Petros); E. Sidirourgos (Eleftherios); I. Fundulaki; V. Christophides; P.A. Boncz (Peter)

    2012-01-01

    Query optimization in RDF Stores is a challenging problem as SPARQL queries typically contain many more joins than equivalent relational plans, and hence lead to a large join order search space. In such cases, cost-based query optimization often is not possible. One practical reason for

  10. Research of the Deep Web Query Interface Integration Based on the Frequent Structure

    Institute of Scientific and Technical Information of China (English)

    赵晓蓉; 周锦程; 王丹

    2014-01-01

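    As the Web keeps growing, massive amounts of information are "hidden deep" in all kinds of online databases, and users can obtain the data only through query interfaces; this content is called the Deep Web. Integrating the Deep Web data of a given domain is therefore essential, and the integration of query interfaces is a key subproblem. Query interface integration consists of two steps, schema matching and schema integration; this work focuses on determining the attribute layout of the integrated query interface. The huge number of query interfaces in the Deep Web, together with their dynamic and heterogeneous nature, makes this problem very challenging. The structure of each query interface is modeled as a tree, and the integrated query interface tree is then built by mining frequent schema subtrees, so that it satisfies the structural and ordering constraints among attributes to the greatest possible extent. The algorithm has low time complexity and scales well, and experiments integrating the query interfaces of eight domains demonstrate its effectiveness.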

  11. Query Adaptive Image Retrieval System

    Directory of Open Access Journals (Sweden)

    Amruta Dubewar

    2014-03-01

    Full Text Available Images play a crucial role in fields such as art galleries, medicine, journalism and entertainment. The increasing use of image acquisition and data storage technologies has enabled the creation of large image databases. It is therefore necessary to develop an information management system that manages these collections efficiently and retrieves the required images from them. This paper proposes a query adaptive image retrieval system (QAIRS) that retrieves from a database the images similar to a query image specified by the user. The goal of this system is to support image retrieval based on content properties such as colour and texture, usually encoded into feature vectors. In this system, colour features are extracted by techniques such as colour moments, colour histograms and autocorrelograms, and texture features are extracted using Gabor wavelets. A hashing technique embeds the high-dimensional image features into Hamming space, where search can be performed using the Hamming distance between compact hash codes. The system returns the images with the minimum Hamming distance to the query image.
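
The final retrieval step, search by Hamming distance over compact hash codes, can be sketched as below. The feature extraction and hashing stages are not shown; the 8-bit codes, the file names, and the function names are made up for illustration only.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def nearest_by_hamming(query_code: int, database: dict) -> tuple:
    """Return (name, distance) of the stored code closest to the query."""
    return min(
        ((name, hamming_distance(query_code, code)) for name, code in database.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical 8-bit hash codes for three stored images.
codes = {
    "sunset.jpg": 0b10110010,
    "forest.jpg": 0b01001101,
    "beach.jpg": 0b10110110,
}

query = 0b10110011
best_match = nearest_by_hamming(query, codes)
print(best_match)
```

Because the distance is a XOR followed by a bit count, a linear scan over short codes stays cheap even for large collections.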

  12. Design and evaluation of a NoSQL database for storing and querying RDF data

    Directory of Open Access Journals (Sweden)

    Kanda Runapongsa Saikaew

    2014-12-01

    Full Text Available Currently the amount of web data has increased excessively. Its metadata is widely used in order to fully exploit web information resources. This creates a need for Semantic Web technology that can quickly analyze such big data. Resource Description Framework (RDF) is a standard for describing web resources. In this paper, we propose a method to exploit a NoSQL database, specifically MongoDB, to store and query RDF data. We choose MongoDB to represent NoSQL databases because it is one of the most popular high-performance NoSQL databases. We evaluate the proposed design and implementation using the Berlin SPARQL Benchmark, which is one of the most widely accepted benchmarks for comparing the performance of RDF storage systems. We compare three database systems: Apache Jena TDB (native RDF store), MySQL (relational database), and our proposed system with MongoDB (NoSQL database). Based on the analysis of the experimental results, our proposed system outperforms the other database systems for most queries when the data set is small. However, for a larger data set, MongoDB performs well for queries with simple operators while MySQL offers an efficient solution for complex queries. The results of this work can provide some guidelines for choosing an appropriate RDF database system and for applying a NoSQL database to storing and querying RDF data.

  13. Indexing and Retrieval for the Web.

    Science.gov (United States)

    Rasmussen, Edie M.

    2003-01-01

    Explores current research on indexing and ranking as retrieval functions of search engines on the Web. Highlights include measuring search engine stability; evaluation of Web indexing and retrieval; Web crawlers; hyperlinks for indexing and ranking; ranking for metasearch; document structure; citation indexing; relevance; query evaluation;…

  14. Preventing SQL Injection through Automatic Query Sanitization with ASSIST

    CERN Document Server

    Mui, Raymond; 10.4204/EPTCS.35.3

    2010-01-01

    Web applications are becoming an essential part of our everyday lives. Many of our activities are dependent on the functionality and security of these applications. As the scale of these applications grows, injection vulnerabilities such as SQL injection are major security challenges for developers today. This paper presents the technique of automatic query sanitization to automatically remove SQL injection vulnerabilities in code. In our technique, a combination of static analysis and program transformation is used to automatically instrument web applications with sanitization code. We have implemented this technique in a tool named ASSIST (Automatic and Static SQL Injection Sanitization Tool) for protecting Java-based web applications. Our experimental evaluation showed that our technique is effective against SQL injection vulnerabilities and has a low overhead.
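
For readers unfamiliar with the vulnerability itself, the sketch below shows a query built by string concatenation being subverted, next to the same lookup with parameter binding. It does not reproduce ASSIST, which instruments Java code automatically; parameter binding is swapped in here as a standard manual mitigation, and the table and values are invented for the example.

```python
import sqlite3

# In-memory database with two hypothetical user rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-controlled input that closes the string literal and adds a tautology.
malicious = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + malicious + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the input is bound as a value and never parsed as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(leaked), len(safe))
```

The concatenated query returns every row because the injected condition is always true, while the bound query looks for a user literally named `nobody' OR '1'='1` and finds none.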

  15. Boolean queries for news monitoring: Suggesting new query terms to expert users

    NARCIS (Netherlands)

    Verberne, S.; Wabeke, T.; Kaptein, R.

    2016-01-01

    In this paper, we evaluate query suggestion for Boolean queries in a news monitoring system. Users of this system receive news articles that match their running query on a daily basis. Because the news for a topic continuously changes, the queries need regular updating. We first investigated the

  16. Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank

    Directory of Open Access Journals (Sweden)

    Pathak Jyotishman

    2012-12-01

    Full Text Available Abstract Background The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form “biobanks” where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypotheses generation. Results In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped for Type 2 Diabetes and Hypothyroidism to discover gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries. Conclusions This study demonstrates how Semantic Web technologies can be applied in conjunction with clinical data stored in EHRs to accurately identify subjects with specific diseases and phenotypes, and identify genotype-phenotype associations.

  17. Truth Space Method for Caching Database Queries

    Directory of Open Access Journals (Sweden)

    S. V. Mosin

    2015-01-01

    Full Text Available We propose a new method of client-side data caching for relational databases with a central server and distant clients. Data are loaded into the client cache based on queries executed on the server. Every query has a corresponding DB table, the result of the query execution. These queries have a special form called "universal relational query", based on three fundamental relational algebra operations: selection, projection and natural join. Such a form is the closest one to natural language, and the majority of database search queries can be expressed in this way. Besides, this form allows us to analyze query correctness by checking the lossless join property. A subsequent query may be executed in a client's local cache if we can determine that the query result is entirely contained in the cache. For this we compare the truth spaces of the logical restrictions in a new user's query and in the queries whose results are held in the cache. Such a comparison can be performed analytically, without the need for additional database queries. This method may also be used to identify the data missing from the cache and to execute the query on the server only for these data. The analytical approach is used here as well, which distinguishes our paper from existing technologies. We propose four theorems for testing the required conditions. The conditions of the first and third theorems allow us to establish the existence of the required data in the cache. The second and fourth theorems state the conditions for executing queries with the cache only. The problem of keeping cached data up to date is not discussed in this paper. However, it can be solved by cataloging queries on the server and serving them by triggers in background mode. The article is published in the author's wording.
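
The containment test at the heart of this idea can be illustrated with a deliberately simplified model. This is not the paper's theorems: it assumes a single table, ignores projection and joins, and represents each query's restriction as a conjunction of closed ranges, one per column. The function name `is_contained` and the column names are hypothetical.

```python
def is_contained(new_query: dict, cached_query: dict) -> bool:
    """True if every row selected by new_query is also selected by cached_query.

    Each query maps a column name to an inclusive (low, high) range;
    a column that is absent is unrestricted.
    """
    for column, (c_low, c_high) in cached_query.items():
        if column not in new_query:
            # The cached result was filtered on this column, but the new
            # query is not, so rows may be missing from the cache.
            return False
        q_low, q_high = new_query[column]
        if q_low < c_low or q_high > c_high:
            return False
    return True


cached = {"age": (18, 65)}

answerable = is_contained({"age": (30, 40), "salary": (1000, 2000)}, cached)
too_wide = is_contained({"age": (10, 40)}, cached)
unrelated = is_contained({"salary": (0, 1)}, cached)
print(answerable, too_wide, unrelated)
```

The comparison uses only the two restrictions, so no database access is needed to decide whether the cache can answer the new query.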

  18. A solution of spatial query processing and query optimization for spatial databases

    Institute of Scientific and Technical Information of China (English)

    YUAN Jie; XIE Kun-qing; MA Xiu-jun; ZHANG Min; SUN Le-bin

    2004-01-01

    Recently, attention has been focused on spatial query language which is used to query spatial databases. A design of spatial query language has been presented in this paper by extending the standard relational database query language SQL. It recognizes the significantly different requirements of spatial data handling and overcomes the inherent problems of the application of conventional database query languages. This design is based on an extended spatial data model, including the spatial data types and the spatial operators on them. The processing and optimization of spatial queries have also been discussed in this design. In the end, an implementation of this design is given in a spatial query subsystem.

  19. Cache-Based Aggregate Query Shipping: An Efficient Scheme of Distributed OLAP Query Processing

    Institute of Scientific and Technical Information of China (English)

    Hua-Ming Liao; Guo-Shun Pei

    2008-01-01

    Our study introduces a novel distributed query plan refinement phase in an enhanced architecture of a distributed query processing engine (DQPE). Query plan refinement generates a potentially efficient distributed query plan using the reusable aggregate query shipping (RAQS) approach. The approach improves response time at the cost of pre-processing time. If the overheads cannot be compensated by reuse of query results, RAQS is no longer favorable. Therefore a global cost estimation model is employed to select the proper operators: RR_Agg, R_Agg, or R_Scan. For the purpose of reusing results of queries with aggregate functions in distributed query processing, a multi-level hybrid view caching (HVC) scheme is introduced. The scheme retains the advantages of partial match and aggregate query results caching. With our solution, evaluations with distributed TPC-H queries show significant improvement in average response time.

  20. Towards Automatic Improvement of Patient Queries in Health Retrieval Systems

    Directory of Open Access Journals (Sweden)

    Nesrine KSENTINI

    2016-07-01

    Full Text Available With the adoption of health information technology for clinical health, e-health is becoming usual practice today. Users of this technology find it difficult to seek information relevant to their needs due to the increasing amount of clinical and medical data on the web and their lack of knowledge of medical jargon. In this regard, a method is described that better serves users' needs by automatically adding to their queries new related terms which appear in the same context as the original query, in order to improve the final search results. This method is based on the assessment of semantic relationships, defined by a proposed statistical method, between a set of terms or keywords. Experiments were performed on the CLEF-eHealth-2015 database and the obtained results show the effectiveness of our proposed method.

  1. Structured Query Language for Virtual Observatory

    CERN Document Server

    Shirasaki, Yuji; Ohishi, Masatoshi; Mizumoto, Yoshihiko; Tanaka, Masahiro; Honda, Satoshi; Oe, Masafumi; Yasuda, Naoki; Masunaga, Yoshifumi

    2004-01-01

    Currently two query languages are defined as standards for the Virtual Observatory (VO). Astronomical Data Query Language (ADQL) is used for catalog data queries and Simple Image Access Protocol (SIAP) is used for image data queries. As a result, when we query each data service, we need to know in advance which language is supported and then construct a query accordingly. The constructs of SIAP are simple, but they have limited capability. For example, there is no way to specify multiple regions in one query, and it is difficult to specify complex query conditions. In this paper, we propose a unified query language for any kind of astronomical database on the basis of SQL99. SQL is a query language optimized for table data, so to apply SQL to image and spectrum data sets, the data structure needs to be mapped to a table-like structure. We present the specification of this query language and an example of the architecture for the database system.

  2. Advanced Query and Data Mining Capabilities for MaROS

    Science.gov (United States)

    Wang, Paul; Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Hy, Franklin H.

    2013-01-01

    The Mars Relay Operational Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay network. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. As part of MaROS, the innovators have developed and implemented a feature set that operates on several levels of the software architecture. This new feature is an advanced querying capability through either the Web-based user interface, or through a back-end REST interface to access all of the data gathered from the network. This software is not meant to replace the REST interface, but to augment and expand the range of available data. The current REST interface provides specific data that is used by the MaROS Web application to display and visualize the information; however, the returned information from the REST interface has typically been pre-processed to return only a subset of the entire information within the repository, particularly only the information that is of interest to the GUI (graphical user interface). The new, advanced query and data mining capabilities allow users to retrieve the raw data and/or to perform their own data processing. The query language used to access the repository is a restricted subset of the structured query language (SQL) that can be built safely from the Web user interface, or entered as freeform SQL by a user. The results are returned in a CSV (Comma Separated Values) format for easy exporting to third party tools and applications that can be used for data mining or user-defined visualization and interpretation. This is the first time that a service is capable of providing access to all cross-project relay data from a single Web resource. 
    Because MaROS contains the data for a variety of missions from the Mars network, which span both NASA and ESA, the software also establishes an access control list (ACL) on each data record

  3. Aggregating Queries Against Large Inventories of Remotely Accessible Data

    Science.gov (United States)

    Gallagher, J. H. R.; Fulker, D. W.

    2016-12-01

    Those seeking to discover data for a specific purpose often encounter search results that are so large as to be useless without computing assistance. This situation arises, with increasing frequency, in part because repositories contain ever greater numbers of granules, and their granularities may well be poorly aligned or even orthogonal to the data-selection needs of the user. This presentation describes a recently developed service for simultaneously querying large lists of OPeNDAP-accessible granules to extract specified data. The specifications include a richly expressive set of data-selection criteria—applicable to content as well as metadata—and the service has been tested successfully against lists naming hundreds of thousands of granules. Querying such numbers of local files (i.e., granules) on a desktop or laptop computer is practical (by using a scripting language, e.g.), but this practicality is diminished when the data are remote and thus best accessed through a Web-services interface. In these cases, which are increasingly common, scripted queries can take many hours because of inherent network latencies. Furthermore, communication dropouts can add fragility to such scripts, yielding gaps in the acquired results. In contrast, OPeNDAP's new aggregated-query services enable data discovery in the context of very large inventory sizes. These capabilities have been developed for use with OPeNDAP's Hyrax server, which is an open-source realization of DAP (for "Data Access Protocol," a specification widely used in NASA, NOAA and other data-intensive contexts). These aggregated-query services exhibit good response times (on the order of seconds, not hours) even for inventories that list hundreds of thousands of source granules.

  4. An Optimal Labeling Scheme for Ancestry Queries

    CERN Document Server

    Fraigniaud, Pierre

    2009-01-01

    An ancestry labeling scheme assigns labels (bit strings) to the nodes of rooted trees such that ancestry queries between any two nodes in a tree can be answered merely by looking at their corresponding labels. The quality of an ancestry labeling scheme is measured by its label size, that is the maximal number of bits in a label of a tree node. In addition to its theoretical appeal, the design of efficient ancestry labeling schemes is motivated by applications in web search engines. For this purpose, even small improvements in the label size are important. In fact, the literature about this topic is interested in the exact label size rather than just its order of magnitude. As a result, following the proposal of a simple interval-based ancestry scheme with label size $2\log_2 n$ bits (Kannan et al., STOC '88), a considerable amount of work was devoted to improve the bound on the size of a label. The current state of the art upper bound is $\log_2 n + O(\sqrt{\log n})$ bits (Abiteboul et al., SODA '02) which is...
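
A simple interval-based scheme of the kind mentioned above can be sketched as follows. Each node is labeled with its pre-order number and the largest pre-order number in its subtree, so a label consists of two integers below n, roughly $2\log_2 n$ bits. This is the textbook construction written for illustration, not the improved scheme of the paper; the function names and the example tree are hypothetical.

```python
def interval_labels(children: dict, root) -> dict:
    """Label each node with (pre-order number, largest pre-order number in its subtree)."""
    labels = {}
    enter = {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, visited = stack.pop()
        if visited:
            # All descendants are numbered by now, so counter - 1 is the
            # largest pre-order number inside this node's subtree.
            labels[node] = (enter[node], counter - 1)
        else:
            enter[node] = counter
            counter += 1
            stack.append((node, True))
            for child in reversed(children.get(node, [])):
                stack.append((child, False))
    return labels


def is_ancestor(label_u: tuple, label_v: tuple) -> bool:
    """True if u is an ancestor of v, using only the two labels.

    A node counts as its own ancestor in this sketch.
    """
    return label_u[0] <= label_v[0] and label_v[1] <= label_u[1]


# Hypothetical tree: r has children a and b; a has children c and d.
tree = {"r": ["a", "b"], "a": ["c", "d"], "b": []}
labels = interval_labels(tree, "r")
print(labels)
print(is_ancestor(labels["a"], labels["c"]), is_ancestor(labels["b"], labels["c"]))
```

The query never touches the tree: interval containment between the two labels decides ancestry, which is the property a labeling scheme must provide.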

  5. Query Migration from Object Oriented World to Semantic World

    Directory of Open Access Journals (Sweden)

    Nassima Soussi

    2016-06-01

    In recent decades, the object-oriented approach has taken a large share of the database market, aiming to design and implement structured, reusable software by composing independent elements into high-performance programs. Meanwhile, the mass of information stored on the web grows daily at a dizzying speed, so the current web faces the problem of building a bridge that eases data access between different applications and systems and helps users find relevant, exact information. In addition, all existing approaches to rewriting object-oriented query languages into SPARQL rely on a model transformation process to guarantee the mapping. These reasons prompted us to write this paper, which bridges an important gap between these two heterogeneous worlds (the object-oriented world and the semantic web world) by proposing the first provably semantics-preserving OQL-to-SPARQL translation algorithm for each element of an OQL query (SELECT clause, FROM clause, FILTER constraint, implicit/explicit joins, and union/intersection SELECT queries).

  6. Record Matching Over Query Results Using Fuzzy Ontological Document Clustering

    Directory of Open Access Journals (Sweden)

    V.Vijayaraja

    2011-02-01

    Record matching is an essential step in duplicate detection, as it identifies records that represent the same real-world entity. Supervised record matching methods require users to provide training data and therefore cannot be applied to web databases, where query results are generated on the fly. To overcome this problem, a new record matching method named Unsupervised Duplicate Elimination (UDE) is proposed for identifying and eliminating duplicates among records in dynamic query results. The idea of this paper is to adjust the weights of record fields when calculating similarities among records. Two classifiers, a weight component similarity summing classifier and a support vector machine classifier, are employed iteratively with UDE; the first classifier uses the weight set to match records from different data sources. With the matched records as the positive set and non-duplicate records as the negative set, the second classifier identifies new duplicates. Then, a new methodology to automatically interpret and cluster knowledge documents using an ontology schema is presented. Moreover, a fuzzy logic control approach is used to match suitable document cluster(s) for given patents based on their derived ontological semantic webs. Thus, this paper takes advantage of the similarity among records from web databases and solves the online duplicate detection problem.

  7. Supercharged JavaScript Graphics with HTML5 canvas, jQuery, and More

    CERN Document Server

    Cecco, Raffaele

    2011-01-01

    With HTML5 and improved web browser support, JavaScript has become the tool of choice for creating high-performance web graphics. This fast-paced book shows you how to use JavaScript, jQuery, DHTML, and HTML5's Canvas element to create rich web applications for computers and mobile devices. By following real-world examples, experienced web developers learn fun and useful approaches to arcade games, DHTML effects, business dashboards, and other applications. This book serves complex subjects in easily digestible pieces, and each topic acts as a foundation for the next. Tackle JavaScript opti

  8. Monitoring nearest neighbor queries with cache strategies

    Institute of Scientific and Technical Information of China (English)

    PAN Peng; LU Yan-sheng

    2007-01-01

    The problem of continuously monitoring multiple K-nearest neighbor (K-NN) queries over dynamic object and query datasets is valuable for many location-based applications. A practical method is to partition the data space into grid cells, with both the object table and the query table indexed by this grid structure, and to solve the problem by periodically joining cells of objects with the queries whose influence regions intersect those cells. In the worst case, all cells of objects will be accessed once. Object and query cache strategies are proposed to further reduce the I/O cost. With the object cache strategy, queries that remain static in the current processing cycle seldom incur I/O cost and can be returned quickly. The main I/O cost comes from moving queries; the query cache strategy restricts their search regions by using the current results of queries held in the main memory buffer. The queries can share not only the accessing of object pages, but also their influence regions. A theoretical analysis of the expected I/O cost is presented, and in the experimental results the I/O cost is about 40% that of the SEA-CNN method.
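
    The grid partitioning that this abstract builds on can be illustrated with a small in-memory sketch. It is not the paper's method: it omits the cache strategies, the periodic join, and all I/O accounting, and the names `build_grid` and `knn` as well as the uniform square cells are assumptions made for the example. The search visits rings of cells around the query's cell and stops once no unvisited cell can hold a closer object.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Index object ids by the square grid cell (side `cell`) containing them.
    `points` maps an object id to its (x, y) position."""
    grid = defaultdict(list)
    for pid, (x, y) in points.items():
        grid[(math.floor(x / cell), math.floor(y / cell))].append(pid)
    return grid


def knn(points, grid, cell, q, k):
    """Return the ids of the k objects nearest to query point q, closest first."""
    cx, cy = math.floor(q[0] / cell), math.floor(q[1] / cell)
    # Farthest ring (Chebyshev distance in cells) that holds any object.
    max_r = max((max(abs(gx - cx), abs(gy - cy)) for gx, gy in grid), default=-1)
    found = []  # (distance, id) pairs, kept sorted and cut to k entries
    for r in range(max_r + 1):
        for gx in range(cx - r, cx + r + 1):
            for gy in range(cy - r, cy + r + 1):
                if max(abs(gx - cx), abs(gy - cy)) != r:
                    continue  # only the cells on ring r are new
                for pid in grid.get((gx, gy), ()):
                    found.append((math.dist(q, points[pid]), pid))
        found = sorted(found)[:k]
        # Every object outside rings 0..r is at least r * cell away from q.
        if len(found) == k and found[-1][0] <= r * cell:
            break
    return [pid for _, pid in found]
```

    A monitoring system along the lines of the abstract would rerun or incrementally update such lookups each cycle; this sketch answers a single snapshot query only.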

  9. A Survey on Web Search Results Personalization

    Directory of Open Access Journals (Sweden)

    Blessy Thomas

    2015-10-01

    The Web is a huge information repository covering almost every topic in which a human user could be interested. As the size and richness of information on the web increase, the diversity and complexity of the tasks users try to perform also increase. With the overwhelming volume of information on the web, finding relevant information for a specific query or topic is becoming increasingly difficult. Nowadays, web search is a commonly performed task on the internet. Users get a variety of related information for their queries. To provide more relevant and effective results to the user, personalization techniques are used. Personalized web search refers to search that is tailored specifically to a person's interests by incorporating information about the query provided. Two general types of approaches to personalizing search results are modifying the user's query and re-ranking search results. Several personalized web search techniques are based on web contents, web link structure, browsing history, user profiles and user queries. This paper presents a survey of various personalization techniques.

  10. Os empresários, a política e a web: mapeando as atividades políticas nos portais das federações de indústrias brasileiras / Entrepreneurs, politics and the Web: mapping political activities in the web portals of the industry federation

    Directory of Open Access Journals (Sweden)

    Sérgio Braga

    2009-08-01

    This paper's main objective is to present the results of research on the "informatization level" of the web portals of the industry federations of 27 Brazilian subnational units, especially a map of the political activities carried out by the members who make use of such portals. The main objective gives rise to two specific objectives: (a) to present a proposal for measuring the informatization level and Internet usage of the local chapters of the industry federation and other similar initiatives by the business class, aimed at evaluating how these entities use the Web to promote their activities and interact with the general public; (b) to elaborate and apply theoretical and methodological tools for the analysis of such information, especially to evaluate the effectiveness of the Internet as a tool for both organization of political action and public engagement by the business class.

  11. An intelligent method for geographic Web search

    Science.gov (United States)

    Mei, Kun; Yuan, Ying

    2008-10-01

    As the electronically available information on the World-Wide Web grows explosively, finding relevant information becomes increasingly difficult for search engine users. In this paper we discuss how to constrain web queries geographically. A number of search queries are associated with geographical locations, either explicitly or implicitly. Accurately and effectively detecting the locations that search queries are truly about has a huge potential impact on increasing search relevance, bringing better targeted search results, and improving search user satisfaction. Our approach focuses both on the way geographic information is extracted from the web and, as far as we can tell, on the way it is integrated into query processing. This paper gives an overview of a spatially aware search engine for semantic querying of web documents. It also illustrates algorithms for extracting locations from web documents and query requests, using location ontologies to encode and reason about the formal semantics of geographic web search. Based on a real-world scenario of tourism guide search, the application of our approach shows that geographic information retrieval can be efficiently supported.

  12. CALA:A Web Analysis Algorithm Combined with Content Correlation Analysis Method

    Institute of Scientific and Technical Information of China (English)

    ZHANG Ling(张岭); MA FanYuan(马范援); YE YunMing(叶允明); CHEN JianGuo(陈建国)

    2003-01-01

    Web hyperlink structure analysis algorithms play a significant role in improving the precision of Web information retrieval. Current link algorithms employ an iteration function to compute the Web resource weight. The major drawback of this approach is that every Web document has a fixed rank which is independent of Web queries. This paper proposes an improved algorithm that dynamically ranks the quality and the relevance of a page according to the user's query. The experiments show that it improves on the current link analysis algorithm.

  13. A study of Web search trends

    Directory of Open Access Journals (Sweden)

    Bernard J. Jansen

    2004-12-01

    This article provides an overview of recent research conducted from 1997 to 2003 that explored how people search the Web. The article reports selected findings from many research studies conducted by the co-authors of the paper from 1997 to 2003 using large-scale Web query transaction logs provided by commercial Web companies, including Excite, Alta Vista, Ask Jeeves, and AlltheWeb.com. The many studies are also synthesized in the recent book "Web Search: Public Searching of the Web" by Amanda Spink and Bernard J. Jansen (Kluwer Academic Publishers). The researchers examined the topics of Web searches; how users search the Web using terms in queries during search sessions; and the diverse types of searches, including searches for medical, sexual, e-commerce, multimedia, and other information. Key findings include changes in search topics since 1997, including a shift from entertainment to e-commerce queries. Further findings show little change in many aspects of Web searching from 1997-2003, including query and search session length. The studies also show more complex Web search behaviors by a minority of users who conduct multitasking and successive searches.

  14. Expanding user’s query with tag-neighbors for effective medical information retrieval

    DEFF Research Database (Denmark)

    Durao, Frederico; Bayyapu, Karunakar Reddy; Xu, Guandong;

    2014-01-01

    Medical information is a natural human demand. Existing search engines on the Web often are unable to handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. Under-specified queries often lead to undesirable search results that do not contain the information needed. To overcome the limitations of under-specified queries, we utilize tags to enhance information retrieval capabilities by expanding users’ original queries with context-relevant information. We compute a set of significant tag neighbor candidates based on the neighbor frequency and weight, and utilize the qualified tag neighbors to expand an entry query. The proposed approach is evaluated by using MedWorm medical article collection and results show considerable precision improvements over state...
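
    The expansion step described in this abstract can be illustrated with a small sketch. The abstract does not give the weighting formula, so plain co-occurrence counts stand in for "neighbor frequency and weight" here; the function names `tag_neighbors` and `expand_query` and the parameters `top_n` and `min_count` are assumptions made for the example, not the authors' implementation.

```python
from collections import Counter


def tag_neighbors(tagged_docs):
    """For every tag, count how often each other tag appears on the same document."""
    neighbors = {}
    for tags in tagged_docs:
        unique = set(tags)
        for tag in unique:
            neighbors.setdefault(tag, Counter()).update(unique - {tag})
    return neighbors


def expand_query(terms, neighbors, top_n=2, min_count=2):
    """Append up to `top_n` neighbor tags that co-occur with the query terms
    at least `min_count` times (an assumed qualification rule)."""
    pooled = Counter()
    for term in terms:
        pooled.update(neighbors.get(term, Counter()))
    expanded = list(terms)
    # Most frequent neighbors first; ties broken alphabetically for determinism.
    for tag, count in sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0])):
        if len(expanded) >= len(terms) + top_n:
            break
        if count >= min_count and tag not in expanded:
            expanded.append(tag)
    return expanded
```

    With three tagged articles in which "headache" and "migraine" co-occur twice, the one-term query "headache" would be expanded with "migraine" and nothing else, since the other neighbors fall below the assumed threshold.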

  15. Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion

    CERN Document Server

    Selke, Joachim; Balke, Wolf-Tilo

    2012-01-01

    By incorporating human workers into the query execution process crowd-enabled databases facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases by flexible query-driven schema expansion, allowing the addition of new attributes to the database at query time. However, the number of crowd-sourced mini-tasks to fill in missing values may often be prohibitively large and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: By exploiting user ratings we build perceptual spaces, i.e., highly-compressed representations of opinions, impressions, and perceptions of large numbers of users. Using few training samples obtained by expert crowd sourcing, we then can extract all missing data automatically from ...

  16. Medical Information Retrieval Enhanced with User's Query Expanded with Tag-Neighbors

    DEFF Research Database (Denmark)

    Durao, Frederico; Bayyapu, Karunakar Reddy; Xu, Guandong

    2013-01-01

    Under-specified queries often lead to undesirable search results that do not contain the information needed. This problem gets worse when it comes to medical information, a natural human demand everywhere. Existing search engines on the Web often are unable to handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. To overcome the limitations of under-specified queries, we utilize tags to enhance information retrieval capabilities by expanding users’ original queries with context-relevant information. We compute a set of significant tag neighbor candidates based on the neighbor frequency and weight, and utilize the qualified tag neighbors to expand an entry query. The proposed approach is evaluated by using MedWorm medical article collection...

  17. Adding query privacy to robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2012-01-01

    Interest in anonymous communication over distributed hash tables (DHTs) has increased in recent years. However, almost all known solutions solely aim at achieving sender or requestor anonymity in DHT queries. In many application scenarios, it is crucial that the queried key remains secret from intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without compromising spam resistance. Although our OT-based approach can work over any DHT, we concentrate on robust DHTs that can tolerate Byzantine faults and resist spam. We choose the best-known robust DHT construction, and employ an efficient OT protocol well-suited for achieving our goal of obtaining query...

  18. Entity-based Stochastic Analysis of Search Results for Query Expansion and Results Re-Ranking

    Science.gov (United States)

    2015-11-20

    Mining and Semantics (WIMS’14). ACM, June 2014. [3] P. Fafalios, M. Baritakis, and Y. Tzitzikas. Exploiting linked data for open and configurable named...Salampasis, and Y. Tzitzikas. Web searching with entity mining at query time. In 5th Information Retrieval Facility Conference, 2012. [5] P. Fafalios...International Conference on Semantic Computing (ICSC 2014), 2014. [8] T. Heath and C. Bizer. Linked Data: Evolving the web into a global data space. Synthesis

  19. Hybrid Schema Matching for Deep Web

    Science.gov (United States)

    Chen, Kerui; Zuo, Wanli; He, Fengling; Chen, Yongheng

    Schema matching is the process of identifying semantic mappings, or correspondences, between two or more schemas. Schema matching is a first step and a critical part of data integration. For deep web schema matching, most research focuses only on the query interface and rarely pays attention to the abundant schema information contained in query result pages. This paper proposes a mixed schema matching technique that combines attributes appearing in the query structures and query results of different data sources and mines the matched schemas within them. Experimental results demonstrate the effectiveness of this method in improving the accuracy of schema matching.

  20. jQuery Tools UI Library

    CERN Document Server

    Libby, Alex

    2012-01-01

    A practical tutorial with powerful yet simple projects that are quick to implement. This book is aimed at developers who have prior jQuery knowledge, but may not have any prior experience with jQuery Tools. It is possible that they may have started with the basics of jQuery Tools, but want to learn more about how it can be used, as well as get ideas for future projects.

  1. Evaluating Multidimensional Queries by Diamond Dicing

    CERN Document Server

    Webb, Hazel; Lemire, Daniel

    2010-01-01

    Queries that constrain multiple dimensions simultaneously are difficult to express and compute efficiently in both Structured Query Language (SQL) and multidimensional languages. We introduce the diamond cube operator to facilitate the expression of one such class of multidimensional query. We have developed, implemented and tested algorithms to compute diamonds on both real and synthetic large data sets. We show that our custom implementation is more than twenty-five times faster, on a large data set, than popular database engines.

  2. PiCO QL: A software library for runtime interactive queries on program data

    Directory of Open Access Journals (Sweden)

    Marios Fragkoulis

    2016-01-01

    PiCO QL is an open source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system’s or application’s data structures, which is queryable through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. PiCO QL makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source code analyser.

  3. PiCO QL: A software library for runtime interactive queries on program data

    Science.gov (United States)

    Fragkoulis, Marios; Spinellis, Diomidis; Louridas, Panos

    PiCO QL is an open source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system's or application's data structures, which is queryable through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. PiCO QL makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source code analyser.

  4. Mining and Querying Multimedia Data

    Science.gov (United States)

    2011-09-29

    SAT1.5GB data set as the ground truth. By randomly choosing a small number of these labels as input and leaving out remaining ones for evaluation, we...Record, 30:37–46, May 2001. ISSN 0163-5808. [3] Eugene Agichtein, Eric Brill, and Susan Dumais. Improving web search ranking by incorporating user...06, pages 19–26, 2006. [4] Eugene Agichtein, Eric Brill, Susan Dumais, and Robert Ragno. Learning user interaction models for predicting web search

  5. Modeling and Analyze the Deep Web: Surfacing Hidden Value

    OpenAIRE

    Suneet Kumar; Anuj Kumar Yadav; Rakesh Bharati; Rani Choudhary

    2011-01-01

    Focused web crawlers have recently emerged as an alternative to the well-established web search engines. While the well-known focused crawlers retrieve relevant web-pages, there are various applications which target whole websites instead of single web-pages. For example, companies are represented by websites, not by individual web-pages. To answer queries targeted at Websites, web directories are an established solution. In this paper, we introduce a novel focused website crawler to employ t...

  6. Accurate And Efficient Crawling The Deep Web: Surfacing Hidden Value

    OpenAIRE

    Suneet Kumar; Anuj Kumar Yadav; Rakesh Bharti; Rani Choudhary

    2011-01-01

    Searching Focused web crawlers have recently emerged as an alternative to the well-established web search engines. While the well-known focused crawlers retrieve relevant web-pages, there are various applications which target whole websites instead of single web-pages. For example, companies are represented by websites, not by individual web-pages. To answer queries targeted at websites, web directories are an established solution. In this paper, we introduce a novel focused website crawler t...

  7. Queries with Guarded Negation (full version)

    CERN Document Server

    Barany, Vince; Otto, Martin

    2012-01-01

    A well-established and fundamental insight in database theory is that negation (also known as complementation) tends to make queries difficult to process and difficult to reason about. Many basic problems are decidable and admit practical algorithms in the case of unions of conjunctive queries, but become difficult or even undecidable when queries are allowed to contain negation. Inspired by recent results in finite model theory, we consider a restricted form of negation, guarded negation. We introduce a fragment of SQL, called GN-SQL, as well as a fragment of Datalog with stratified negation, called GN-Datalog, that allow only guarded negation, and we show that these query languages are computationally well behaved, in terms of testing query containment, query evaluation, open-world query answering, and boundedness. GN-SQL and GN-Datalog subsume a number of well known query languages and constraint languages, such as unions of conjunctive queries, monadic Datalog, and frontier-guarded tgds. In addition, an a...

  8. Oceanographic ontology-based spatial knowledge query

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The construction of oceanographic ontologies is fundamental to the "digital ocean". Therefore, after introducing the new concept of oceanographic ontology, an oceanographic ontology-based spatial knowledge query (OOBSKQ) method was proposed and developed. Because the method uses natural language to describe query conditions and the query result is highly integrated knowledge, it can provide users with direct answers while hiding the complicated computation and reasoning processes, and it achieves intelligent, automatic oceanographic spatial information query at the level of knowledge and semantics. A case study of a resource and environmental application in a bay shows the implementation process of the method and its feasibility and usefulness.

  9. Querying moving objects detected by sensor networks

    CERN Document Server

    Bestehorn, Markus

    2012-01-01

    Declarative query interfaces to Sensor Networks (SN) have become a commodity. These interfaces allow access to SN deployed for collecting data using relational queries. However, SN are not confined to data collection, but may track object movement, e.g., wildlife observation or traffic monitoring. While relational approaches are well suited for data collection, research on "Moving Object Databases" (MOD) has shown that relational operators are unsuitable to express information needs on object movement, i.e., spatio-temporal queries. "Querying Moving Objects Detected by Sensor Networks" studi

  10. Topic Level Disambiguation for Weak Queries

    Directory of Open Access Journals (Sweden)

    Zhang, Hui

    2013-09-01

    Despite limited success, today's information retrieval (IR) systems are not intelligent or reliable. IR systems return poor search results when users formulate their information needs into incomplete or ambiguous queries (i.e., weak queries). Therefore, one of the main challenges in modern IR research is to provide consistent results across all queries by improving the performance on weak queries. However, existing IR approaches such as query expansion are not overly effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries that are typically short. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, the proposed study implemented a novel topic detection that leveraged both the language model and structural knowledge of Wikipedia and systematically evaluated the effect of query disambiguation and topic-based retrieval approaches on TREC collections. The results not only confirm the effectiveness of the proposed topic detection and topic-based retrieval approaches but also demonstrate that query disambiguation does not improve IR as expected.

  11. Effective Density Queries of Continuously Moving Objects

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Lin, D.; Ooi, B.C.

    2006-01-01

    In this paper, we study a newly emerging type of queries on moving objects - the density query. Basically, this query locates regions in the data space where the density of the objects is high. This type of queries is especially useful in Location Based Services (LBS). For example, in a traffic control system, we need to identify the places that are or would be affected by a traffic jam, and report this information to drivers so that they can choose a less congested route. As a naive way to solve the problem is prohibitively expensive, we first introduce a framework which makes the problem...

  12. Adding Query Privacy to Robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2011-01-01

    ...intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without... of obtaining query privacy over robust DHTs. Finally, we compare the performance of our privacy-preserving protocols with their more privacy-invasive counterparts. We observe that there is no increase in the message complexity and only a small overhead in the computational complexity.

  13. Database queries and constraints via lifting problems

    CERN Document Server

    Spivak, David I

    2012-01-01

    Previous work has shown a tight relationship between databases and categories. In the present paper we extend that connection to show that certain queries and constraints correspond to the algebro-topological notion of lifting problems. In our formulation, each so-called SPARQL graph pattern query corresponds to a lifting problem, and each solution to the query corresponds to a lift. We interpret constraints within the same formalism and then investigate some formal properties of queries and constraints, e.g. their behavior under data migration functors.

  14. Comparative analysis of online health queries originating from personal computers and smart devices on a consumer health information portal.

    Science.gov (United States)

    Jadhav, Ashutosh; Andrews, Donna; Fiksdal, Alexander; Kumbamu, Ashok; McCormick, Jennifer B; Misitano, Andrew; Nelsen, Laurie; Ryu, Euijung; Sheth, Amit; Wu, Stephen; Pathak, Jyotishman

    2014-07-04

    The number of people using the Internet and mobile/smart devices for health information seeking is increasing rapidly. Although the user experience for online health information seeking varies with the device used, for example, smart devices (SDs) like smartphones/tablets versus personal computers (PCs) like desktops/laptops, very few studies have investigated how online health information seeking behavior (OHISB) may differ by device. The objective of this study is to examine differences in OHISB between PCs and SDs through a comparative analysis of large-scale health search queries submitted through Web search engines from both types of devices. Using the Web analytics tool, IBM NetInsight OnDemand, and based on the type of devices used (PCs or SDs), we obtained the most frequent health search queries between June 2011 and May 2013 that were submitted on Web search engines and directed users to the Mayo Clinic's consumer health information website. We performed analyses on "Queries with considering repetition counts (QwR)" and "Queries without considering repetition counts (QwoR)". The dataset contains (1) 2.74 million and 3.94 million QwoR, respectively for PCs and SDs, and (2) more than 100 million QwR for both PCs and SDs. We analyzed structural properties of the queries (length of the search queries, usage of query operators and special characters in health queries), types of search queries (keyword-based, wh-questions, yes/no questions), categorization of the queries based on health categories and information mentioned in the queries (gender, age-groups, temporal references), misspellings in the health queries, and the linguistic structure of the health queries. Query strings used for health information searching via PCs and SDs differ by almost 50%. The most searched health categories are "Symptoms" (1 in 3 search queries), "Causes", and "Treatments & Drugs". 
The distribution of search queries for different health categories differs with the device used for

  15. Extending Climate Analytics-As-a-Service to the Earth System Grid Federation

    Science.gov (United States)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; McInerney, M.; Nadeau, D.; Li, J.; Strong, S.; Thompson, J. H.

    2015-12-01

    We are building three extensions to prior-funded work on climate analytics-as-a-service that will benefit the Earth System Grid Federation (ESGF) as it addresses the Big Data challenges of future climate research: (1) We are creating a cloud-based, high-performance Virtual Real-Time Analytics Testbed supporting a select set of climate variables from six major reanalysis data sets. This near real-time capability will enable advanced technologies like the Cloudera Impala-based Structured Query Language (SQL) query capabilities and Hadoop-based MapReduce analytics over native NetCDF files while providing a platform for community experimentation with emerging analytic technologies. (2) We are building a full-featured Reanalysis Ensemble Service comprising monthly means data from six reanalysis data sets. The service will provide a basic set of commonly used operations over the reanalysis collections. The operations will be made accessible through NASA's climate data analytics Web services and our client-side Climate Data Services (CDS) API. (3) We are establishing an Open Geospatial Consortium (OGC) WPS-compliant Web service interface to our climate data analytics service that will enable greater interoperability with next-generation ESGF capabilities. The CDS API will be extended to accommodate the new WPS Web service endpoints as well as ESGF's Web service endpoints. These activities address some of the most important technical challenges for server-side analytics and support the research community's requirements for improved interoperability and improved access to reanalysis data.

  16. Query Optimization for Deductive Databases

    Institute of Scientific and Technical Information of China (English)

    周傲英; 施伯乐

    1995-01-01

    A systematic, efficient compilation method for query evaluation of Deductive Databases (DeDB) is proposed in this paper. In order to eliminate redundancy and to minimize the potentially relevant facts, which are two key issues to the efficiency of a DeDB, the compilation process is decomposed into two phases. The first is the pre-compilation phase, which is responsible for the minimization of the potentially relevant facts. The second, which we refer to as the general compilation phase, is responsible for the elimination of redundancy. The rule/goal graph devised by J. D. Ullman is appropriately extended and used as a uniform formalism. Two general algorithms corresponding to the two phases respectively are described intuitively and formally.

  17. Using search query surveillance to monitor tax avoidance and smoking cessation following the United States' 2009 "SCHIP" cigarette tax increase.

    Science.gov (United States)

    Ayers, John W; Ribisl, Kurt; Brownstein, John S

    2011-03-16

    Smokers can use the web to continue or quit their habit. Online vendors sell reduced or tax-free cigarettes lowering smoking costs, while health advocates use the web to promote cessation. We examined how smokers' tax avoidance and smoking cessation Internet search queries were motivated by the United States' (US) 2009 State Children's Health Insurance Program (SCHIP) federal cigarette excise tax increase and two other state specific tax increases. Google keyword searches among residents in a taxed geography (US or US state) were compared to an untaxed geography (Canada) for two years around each tax increase. Search data were normalized to a relative search volume (RSV) scale, where the highest search proportion was labeled 100 with lesser proportions scaled by how they relatively compared to the highest proportion. Changes in RSV were estimated by comparing means during and after the tax increase to means before the tax increase, across taxed and untaxed geographies. The SCHIP tax was associated with an 11.8% (95% confidence interval [95%CI], 5.7 to 17.9; ptax levels in Canada during the months after the tax. Tax avoidance searches increased 27.9% (95%CI, 15.9 to 39.9; ptax compared to Canada, respectively, suggesting avoidance is the more pronounced and durable response. Trends were similar for state-specific tax increases but suggest strong interactive processes across taxes. When the SCHIP tax followed Florida's tax, versus not, it promoted more cessation and avoidance searches. Efforts to combat tax avoidance and increase cessation may be enhanced by using interventions targeted and tailored to smokers' searches. Search query surveillance is a valuable real-time, free and public method, that may be generalized to other behavioral, biological, informational or psychological outcomes manifested online.
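
    The relative search volume (RSV) scaling described in this abstract can be illustrated with a minimal Python sketch. It assumes the input is a plain list of search proportions (or counts) per time period; the function names `relative_search_volume` and `mean_change_percent` are hypothetical and are not taken from the study, and the simple before/after comparison is a simplification of the taxed-versus-untaxed analysis the authors describe.

```python
def relative_search_volume(proportions):
    """Scale values so the largest becomes 100 and the rest keep their ratio to it."""
    peak = max(proportions)
    if peak <= 0:
        raise ValueError("at least one proportion must be positive")
    return [100.0 * p / peak for p in proportions]


def mean_change_percent(before, after):
    """Percent change of the mean RSV in the 'after' window relative to 'before'.

    Hypothetical helper: the study also contrasts a taxed geography with an
    untaxed one, which this sketch does not model.
    """
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return 100.0 * (mean_after - mean_before) / mean_before


if __name__ == "__main__":
    rsv = relative_search_volume([2, 4, 1])
    print(rsv)
    print(mean_change_percent([50, 50], [60, 50]))
```

    Dividing by the peak keeps the series comparable across geographies with very different absolute search volumes, which is the reason such a scale is used.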

  18. Searchable Data Vault: Encrypted Queries in Secure Distributed Cloud Storage

    Directory of Open Access Journals (Sweden)

    Geong Sen Poh

    2017-05-01

    Cloud storage services allow users to efficiently outsource their documents anytime and anywhere. Such convenience, however, leads to privacy concerns. While storage providers may not read users’ documents, attackers may possibly gain access by exploiting vulnerabilities in the storage system. Documents may also be leaked by curious administrators. A simple solution is for the user to encrypt all documents before submitting them. This method, however, makes it impossible to efficiently search for documents as they are all encrypted. To resolve this problem, we propose a multi-server searchable symmetric encryption (SSE) scheme and construct a system called the searchable data vault (SDV). A unique feature of the scheme is that it allows an encrypted document to be divided into blocks and distributed to different storage servers so that no single storage provider has a complete document. By incorporating the scheme, the SDV protects the privacy of documents while allowing for efficient private queries. It utilizes a web interface and a controller that manages user credentials, query indexes and submission of encrypted documents to cloud storage services. It is also the first system that enables a user to simultaneously outsource and privately query documents from a few cloud storage services. Our preliminary performance evaluation shows that this feature introduces acceptable computation overheads when compared to submitting documents directly to a cloud storage service.

  19. Querying Large Physics Data Sets Over an Information Grid

    Institute of Scientific and Technical Information of China (English)

    Nigel Baker; Zsolt Kovacs; et al.

    2001-01-01

    Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects such as the role of 'middleware' for Grids are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially being employed for the construction of CMS detectors.

  20. Stratification-Based Outlier Detection over the Deep Web

    OpenAIRE

    Xuefeng Xian; Pengpeng Zhao; Victor S. Sheng; Ligang Fang; Caidong Gu; Yuanfeng Yang; Zhiming Cui

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribu...

  2. Mining the human phenome using semantic web technologies: a case study for Type 2 Diabetes.

    Science.gov (United States)

    Pathak, Jyotishman; Kiefer, Richard C; Bielinski, Suzette J; Chute, Christopher G

    2012-01-01

    The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped with Type 2 Diabetes for discovering gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.

  3. Querying Business Process Models with VMQL

    DEFF Research Database (Denmark)

    Störrle, Harald; Acretoaie, Vlad

    2013-01-01

    The Visual Model Query Language (VMQL) has been invented with the objectives (1) to make it easier for modelers to query models effectively, and (2) to be universally applicable to all modeling languages. In previous work, we have applied VMQL to UML, and validated the first of these two claims. ...

  4. Path Minima Queries in Dynamic Weighted Trees

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao

    2011-01-01

    In the path minima problem on a tree, each edge is assigned a weight and a query asks for the edge with minimum weight on a path between two nodes. For the dynamic version of the problem, where the edge weights can be updated, we give data structures that achieve optimal query time...

  5. Meet Charles, big data query advisor

    NARCIS (Netherlands)

    Sellam, T.; Kersten, M.

    2013-01-01

    In scientific data management and business analytics, the most informative queries are a holy grail. Data collection becomes increasingly simpler, yet data exploration gets significantly harder. Exploratory querying is likely to return an empty or an overwhelming result set. On the other hand, data

  7. Quantum associative memory with improved distributed queries

    CERN Document Server

    Njafa, J -P Tchapet; Woafo, Paul

    2012-01-01

    The paper proposes an improved quantum associative algorithm with distributed query based on model proposed by Ezhov et al. We introduce two modifications of the query that optimized data retrieval of correct multi-patterns simultaneously for any rate of the number of the recognition pattern on the total patterns. Simulation results are given.

  8. Efficient caching for constrained skyline queries

    DEFF Research Database (Denmark)

    Mortensen, Michael Lind; Chester, Sean; Assent, Ira

    2015-01-01

    Constrained skyline queries retrieve all points that optimize some user’s preferences subject to orthogonal range constraints, but at significant computational cost. This paper is the first to propose caching to improve constrained skyline query response time. Because arbitrary range constraints ...
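
    To make the notion of a constrained skyline concrete, here is a minimal Python sketch of the naive (uncached) computation. It assumes that smaller values are preferred in every dimension and that the range constraints form an axis-aligned box; both are assumptions for illustration, and the function names `dominates` and `constrained_skyline` are hypothetical. The caching technique that the paper proposes is not implemented here.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly better in one.

    Assumes smaller values are better (an illustrative convention).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def constrained_skyline(points, lower, upper):
    """Return the points inside the box [lower, upper] that no other in-box point dominates."""
    in_range = [
        p for p in points
        if all(lo <= x <= hi for x, lo, hi in zip(p, lower, upper))
    ]
    return [p for p in in_range if not any(dominates(q, p) for q in in_range)]


if __name__ == "__main__":
    pts = [(1, 5), (2, 2), (3, 3), (5, 1), (0, 0)]
    print(constrained_skyline(pts, lower=(1, 1), upper=(5, 5)))
```

    This quadratic scan recomputes everything for each new box, which is the kind of repeated cost that motivates reusing earlier results.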

  9. Web-based resources for critical care education.

    Science.gov (United States)

    Kleinpell, Ruth; Ely, E Wesley; Williams, Ged; Liolios, Antonios; Ward, Nicholas; Tisherman, Samuel A

    2011-03-01

    To identify, catalog, and critically evaluate Web-based resources for critical care education. A multilevel search strategy was utilized. Literature searches were conducted (from 1996 to September 30, 2010) using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature with the terms "Web-based learning," "computer-assisted instruction," "e-learning," "critical care," "tutorials," "continuing education," "virtual learning," and "Web-based education." The Web sites of relevant critical care organizations (American College of Chest Physicians, American Society of Anesthesiologists, American Thoracic Society, European Society of Intensive Care Medicine, Society of Critical Care Medicine, World Federation of Societies of Intensive and Critical Care Medicine, American Association of Critical Care Nurses, and World Federation of Critical Care Nurses) were reviewed for the availability of e-learning resources. Finally, Internet searches and e-mail queries to critical care medicine fellowship program directors and members of national and international acute/critical care listserves were conducted to 1) identify the use of and 2) review and critique Web-based resources for critical care education. To ensure credibility of Web site information, Web sites were reviewed by three independent reviewers on the basis of the criteria of authority, objectivity, authenticity, accuracy, timeliness, relevance, and efficiency in conjunction with suggested formats for evaluating Web sites in the medical literature. Literature searches using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature resulted in >250 citations. Those pertinent to critical care provide examples of the integration of e-learning techniques, the development of specific resources, reports of the use of types of e-learning, including interactive tutorials, case studies, and simulation, and reports of student or learner satisfaction, among other general

  10. Exploring features for automatic identification of news queries through query logs

    Institute of Scientific and Technical Information of China (English)

    Xiaojuan ZHANG; Jian LI

    2014-01-01

    Purpose: Existing research on predicting queries with news intent has tried to extract classification features from external knowledge bases. This paper presents how to apply features extracted from query logs for automatic identification of news queries without using any external resources. Design/methodology/approach: First, we manually labeled 1,220 news queries from Sogou.com. Based on the analysis of these queries, we then identified three features of news queries in terms of query content, time of query occurrence and user click behavior. Afterwards, we used 12 effective features proposed in the literature as a baseline and conducted experiments based on the support vector machine (SVM) classifier. Finally, we compared the impacts of the features used in this paper on the identification of news queries. Findings: Compared with baseline features, the F-score improved from 0.6414 to 0.8368 after the use of the three newly identified features, among which the burst point (bst) was the most effective in predicting news queries. In addition, query expression (qes) was more useful than query terms, and among the click behavior-based features, news URL was the most effective one. Research limitations: Analyses based on features extracted from query logs might produce limited results. The segmentation tool used in this study has been more widely applied to long texts than to short queries. Practical implications: The research will be helpful for general-purpose search engines to address search intents for news events. Originality/value: Our approach provides a new and different perspective in recognizing queries with news intent without such large news corpora as blogs or Twitter.

  11. Adding Query Privacy to Robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2011-01-01

    Interest in anonymous communication over distributed hash tables (DHTs) has increased in recent years. However, almost all known solutions solely aim at achieving sender or requestor anonymity in DHT queries. In many application scenarios, it is crucial that the queried key remains secret from...... intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... compromising spam resistance. Although our OT-based approach can work over any DHT, we concentrate on communication over robust DHTs that can tolerate Byzantine faults and resist spam. We choose the best-known robust DHT construction, and employ an efficient OT protocol well-suited for achieving our goal...

  12. Ensuring Query Compatibility with Evolving XML Schemas

    CERN Document Server

    Genevès, Pierre; Quint, Vincent

    2008-01-01

    During the life cycle of an XML application, both schemas and queries may change from one version to another. Schema evolutions may affect query results and potentially the validity of produced data. Nowadays, a challenge is to assess and accommodate the impact of these changes in rapidly evolving XML applications. This article proposes a logical framework and tool for verifying forward/backward compatibility issues involving schemas and queries. First, it allows analyzing relations between schemas. Second, it allows XML designers to identify queries that must be reformulated in order to produce the expected results across successive schema versions. Third, it allows examining more precisely the impact of schema changes over queries, therefore facilitating their reformulation.

  13. Dreamweaver CS6 HTML5, CSS3, responsive design, and jQuery

    CERN Document Server

    Karlins, David

    2013-01-01

    This book combines accessible, clear, engaging, and candid reference material, advice, and shortcuts with substantial step-by-step instructions for creating a wide range of HTML5 and CSS3 designs and page content in Dreamweaver. This book is geared towards experienced Dreamweaver web designers migrating to HTML5 and jQuery. It also targets web designers new to Dreamweaver who want to jump with two feet into the most current web design tools and features. While focused primarily on Dreamweaver CS5.5, the book includes content of value to readers using older versions of Dreamweaver with directions

  14. COEUS: “semantic web in a box” for biomedical applications

    Directory of Open Access Journals (Sweden)

    Lopes Pedro

    2012-12-01

    Background: As the “omics” revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter’s complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. Results: COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a “semantic web in a box” approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. Conclusions: The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.

  15. Algorithms for effective querying of compound graph-based pathway databases

    Directory of Open Access Journals (Sweden)

    Demir Emek

    2009-11-01

    Background: Graph-based pathway ontologies and databases are widely used to represent data about cellular processes. This representation makes it possible to programmatically integrate cellular networks and to investigate them using the well-understood concepts of graph theory in order to predict their structural and dynamic properties. An extension of this graph representation, namely hierarchically structured or compound graphs, in which a member of a biological network may recursively contain a sub-network of a somehow logically similar group of biological objects, provides many additional benefits for analysis of biological pathways, including reduction of complexity by decomposition into distinct components or modules. In this regard, it is essential to effectively query such integrated large compound networks to extract the sub-networks of interest with the help of efficient algorithms and software tools. Results: Towards this goal, we developed a querying framework, along with a number of graph-theoretic algorithms from simple neighborhood queries to shortest paths to feedback loops, that is applicable to all sorts of graph-based pathway databases, from PPIs (protein-protein interactions) to metabolic and signaling pathways. The framework is unique in that it can account for compound or nested structures and ubiquitous entities present in the pathway data. In addition, the queries may be related to each other through "AND" and "OR" operators, and can be recursively organized into a tree, in which the result of one query might be a source and/or target for another, to form more complex queries. The algorithms were implemented within the querying component of a new version of the software tool PATIKAweb (Pathway Analysis Tool for Integration and Knowledge Acquisition) and have proven useful for answering a number of biologically significant questions for large graph-based pathway databases. Conclusion: The PATIKA Project Web site is

  16. Evaluating web search engines

    CERN Document Server

    Lewandowski, Dirk

    2011-01-01

    Every month, more than 130 billion queries worldwide are entered into the search boxes of general-purpose web search engines (ComScore, 2010). This enormous number shows that web searching is not only a large business, but also that many people rely on the search engines' results when researching information. A goal of all search engine evaluation efforts is to generate better systems. This goal is of major importance to the search engine vendors who can directly apply evaluation results to develop better ranking algorithms.

  17. WEB BASED TRANSLATION OF CHINESE ORGANIZATION NAME

    Institute of Scientific and Technical Information of China (English)

    Yang Muyun; Liu Daxin; Zhao Tiejun; Qi Haoliang; Lin Kaiming

    2009-01-01

    A web-based translation method for Chinese organization names is proposed. After analyzing the structure of Chinese organization names, the methods of bilingual query formulation and maximum entropy based translation re-ranking are suggested to retrieve the English translation from the web via a public search engine. The experiments on Chinese university names demonstrate the validity of this approach.

  18. Integrating Data Warehouses with Web Data

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query and retrieve web data, and their application to data warehouses. The paper addresses the problem of integrating...

  19. Retrieving top-k prestige-based relevant spatial web objects

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2010-01-01

    of prestige-based relevance to capture both the textual relevance of an object to a query and the effects of nearby objects. Based on this, a new type of query, the Location-aware top-k Prestige-based Text retrieval (LkPT) query, is proposed that retrieves the top-k spatial web objects ranked according...... to both prestige-based relevance and location proximity. We propose two algorithms that compute LkPT queries. Empirical studies with real-world spatial data demonstrate that LkPT queries are more effective in retrieving web objects than a previous approach that does not consider the effects of nearby...

  20. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  1. The RCSB Protein Data Bank: redesigned web site and web services.

    Science.gov (United States)

    Rose, Peter W; Beran, Bojan; Bi, Chunxiao; Bluhm, Wolfgang F; Dimitropoulos, Dimitris; Goodsell, David S; Prlic, Andreas; Quesada, Martha; Quinn, Gregory B; Westbrook, John D; Young, Jasmine; Yukich, Benjamin; Zardecki, Christine; Berman, Helen M; Bourne, Philip E

    2011-01-01

    The RCSB Protein Data Bank (RCSB PDB) web site (http://www.pdb.org) has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.

  2. Keyword search in the Deep Web

    OpenAIRE

    Calì, Andrea; Martinenghi, D.; Torlone, R.

    2015-01-01

    The Deep Web is constituted by data accessible through Web pages, but not readily indexable by search engines, as they are returned in dynamic pages. In this paper we propose a framework for accessing Deep Web sources, represented as relational tables with so-called access limitations, with keyword-based queries. We formalize the notion of optimal answer and investigate methods for query processing. To our knowledge, this problem has never been studied in a systematic way.

  3. Hidden Page WebCrawler Model for Secure Web Pages

    Directory of Open Access Journals (Sweden)

    K. F. Bharati

    2013-03-01

    The traditional search engines available over the internet are dynamic in searching for relevant content on the web. A search engine has some constraints, such as obtaining the requested data from varied sources where the data relevancy is exceptional. Web crawlers are designed to move only along specific paths of the web and are kept from other paths that are secured or, at times, restricted due to the apprehension of threats. It is possible to design a web crawler that is capable of penetrating paths of the web not reachable by traditional web crawlers, in order to get a better solution in terms of data, time and relevancy for a given search query. The paper makes use of a newer parser and indexer to arrive at a novel web crawler and a framework to support it. The proposed web crawler is designed to handle Hyper Text Transfer Protocol Secure (HTTPS) based websites and web pages that need authentication to view and index. The user has to fill in a search form, and his/her credentials will be used by the web crawler to authenticate with the secure web server. Once it is indexed, the secure web server will be inside the web crawler’s accessible zone

  4. Complexity of temporal query abduction in DL-Lite

    CSIR Research Space (South Africa)

    Klarman, S

    2014-07-01

    ...and Temporal Query Language, based on the combination of LTL with conjunctive queries. In this defined setting, we study the complexity of temporal query abduction, assuming different restrictions on the problem and minimality criteria for abductive solutions...

  5. Query Through Heterogeneous Ontologies Using Association Matrix

    Institute of Scientific and Technical Information of China (English)

    KANG Da-zhou; XU Bao-wen; LU Jian-jiang; WANG Peng; LI Yan-hui

    2004-01-01

    This paper introduces the definition and calculation of the association matrix between ontologies. It uses the association matrix to describe the relations between concepts in different ontologies and uses concept vectors to represent queries; it then combines these vectors with the association matrix in order to rewrite queries. This paper proposes a simple method of querying through heterogeneous ontologies using the association matrix. This method is based on the correctness of approximate information filtering theory; it is simple to implement and is expected to run quite fast.
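
    One plausible reading of the query rewriting step is a vector-matrix product, sketched below in Python. This is an assumption for illustration, not the authors' stated algorithm: the sketch supposes that entry `association[i][j]` holds the degree to which source-ontology concept `i` is associated with target-ontology concept `j`, and that a query is a vector of weights over source concepts. The name `rewrite_query` and the sample values are hypothetical.

```python
def rewrite_query(query_vector, association):
    """Map a query over source concepts to weights over target concepts.

    Computes the product query_vector x association, where association has
    one row per source concept and one column per target concept (assumed layout).
    """
    if len(query_vector) != len(association):
        raise ValueError("query length must equal the number of matrix rows")
    n_target = len(association[0])
    return [
        sum(q * row[j] for q, row in zip(query_vector, association))
        for j in range(n_target)
    ]


if __name__ == "__main__":
    # Two source concepts, three target concepts (made-up association degrees).
    association = [
        [0.8, 0.2, 0.0],
        [0.0, 0.5, 0.5],
    ]
    print(rewrite_query([1, 0], association))
```

    Under this reading, rewriting costs one pass over the matrix per query, which is consistent with the abstract's expectation that the method runs fast.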

  6. Instant MDX queries for SQL Server 2012

    CERN Document Server

    Emond, Nicholas

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. This short, focused guide is a great way to get started with writing MDX queries. New developers can use this book as a reference for how to use functions and the syntax of a query as well as how to use Calculated Members and Named Sets. This book is great for new developers who want to learn the MDX query language from scratch and install SQL Server 2012 with Analysis Services.

  7. Relative aggregation operator in database fuzzy querying

    Directory of Open Access Journals (Sweden)

    Luminita DUMITRIU

    2005-12-01

    Fuzzy selection criteria for querying relational databases include vague terms; they usually refer to linguistic values from the attribute linguistic domains, defined as fuzzy sets. Generally, when a vague query is processed, the definitions of vague terms must already exist in a knowledge base. But there are also cases when vague terms must be dynamically defined, when a particular operation is used to aggregate simple criteria in a complex selection. The paper presents a new aggregation operator and the corresponding algorithm to evaluate the fuzzy query.

  8. Provenance Storage, Querying, and Visualization in PBase

    Energy Technology Data Exchange (ETDEWEB)

    Kianmajd, Parisa [University of California, Davis; Ludascher, Bertram [University of California, Davis; Missier, Paolo [Newcastle University, UK; Chirigati, Fernando [New York University; Wei, Yaxing [ORNL; Koop, David [New York University; Dey, Saumen [University of California, Davis

    2015-01-01

    We present PBase, a repository for scientific workflows and their corresponding provenance information that facilitates the sharing of experiments among the scientific community. PBase is interoperable since it uses ProvONE, a standard provenance model for scientific workflows. Workflows and traces are stored in RDF, and with the support of SPARQL and the tree cover encoding, the repository provides a scalable infrastructure for querying the provenance data. Furthermore, through its user interface, it is possible to: visualize workflows and execution traces; visualize reachability relations within these traces; issue SPARQL queries; and visualize query results.

  9. Query Load Balancing For Visible Object Extraction

    DEFF Research Database (Denmark)

    Bukauskas, Linas; Bøhlen, Michael Hanspeter

    2004-01-01

    Interactive visual data explorations impose rigid real-time requirements on the extraction of visible objects. Often these requirements are met by deploying powerful hardware that maintains the entire data set in huge main memory structures. In this paper we propose an approach that retrieves...... objects along the path. The visible objects are retrieved incrementally, and it is possible to precisely control the query load and the number of retrieved objects. The minimal distance path method issues frequent queries and retrieves the lowest possible number of objects at each query point. The end...

  10. Evaluating Trajectory Queries over Imprecise Location Data

    DEFF Research Database (Denmark)

    Xie, Scott, Xike; Cheng, Reynold; Yiu, Man Lung

    2012-01-01

    Trajectory queries, which retrieve nearby objects for every point of a given route, can be used to identify alerts of potential threats along a vessel route, or monitor the adjacent rescuers to a travel path. However, the locations of these objects (e.g., threats, succours) may not be precisely......, the query is quite time-consuming, since all the points on the trajectory are considered. In this paper, we study how to efficiently evaluate trajectory queries over imprecise location data, by proposing a new concept called the u-bisector. In general, the u-bisector is an extension of bisector to handle...

  11. KnockoutJS web development

    CERN Document Server

    Farrar, John

    2015-01-01

    This book is for web developers and designers who work with HTML and JavaScript to help them manage data and interactivity with data using KnockoutJS. Knowledge about jQuery will be useful but is not necessary.

  12. A Framework for Transparently Accessing Deep Web Sources

    Science.gov (United States)

    Dragut, Eduard Constantin

    2010-01-01

    An increasing number of Web sites expose their content via query interfaces, many of them offering the same type of products/services (e.g., flight tickets, car rental/purchasing). They constitute the so-called "Deep Web". Accessing the content on the Deep Web has been a long-standing challenge for the database community. For a user interested in…

  14. Schedule Sales Query Report Generation System

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  15. Clean Air Markets - Compliance Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Compliance Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://ampd.epa.gov/ampd/. The Compliance module provides...

  16. Business information query expansion through semantic network

    Science.gov (United States)

    Gong, Zhiguo; Muyeba, Maybin; Guo, Jingzhi

    2010-02-01

    In this article, we propose a method for business information query expansion. In our approach, hypernym/hyponym and synonym relations in WordNet are used as the basic expansion rules. Then we use WordNet Lexical Chains and WordNet semantic similarity to assign terms in the same query into different groups with respect to their semantic similarities. For each group, we expand the highest terms in the WordNet hierarchies with hypernyms and synonyms, the lowest terms with hyponyms and synonyms, and all other terms with only synonyms. In this way, the contradictions caused by full expansion can be well controlled. Furthermore, we use a collection-related term semantic network to further improve the expansion performance. Our experiment reveals that our solution for query expansion can improve the query performance dramatically.

  17. Medical Expenditure Panel Survey (MEPS) Query Tool

    Data.gov (United States)

    U.S. Department of Health & Human Services — MEPSnet HC Query Tool MEPSnet/Household Component provides easy access to nationally representative statistics of health care use, expenditures, sources of payment,...

  18. Range Query Processing in Multidisk Systems

    Institute of Scientific and Technical Information of China (English)

    李建中

    1992-01-01

    In order to reduce the disk access time, a database can be stored on several simultaneously accessible disks. In this paper, we are concerned with the dynamic d-attribute database allocation problem for range queries. An allocation method, called the coordinate modulo allocation method, is proposed to allocate data in a d-attribute database among disks so that the maximum disk accessing concurrency can be achieved for range queries. Our analysis and experiments show that the method achieves the optimum or near-optimum parallelism for range queries. The paper offers the conditions under which the method is optimal. The worst case bounds of the performance of the method are also given. In addition, the parallel algorithm for processing range queries is described at the end of the paper. The method has been used in the statistical and scientific database management system which is being designed by us.
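
    The abstract does not state the allocation formula. As a rough illustration only, the sketch below assumes the usual coordinate-modulo formulation, in which a grid cell with integer coordinates (i1, ..., id) is placed on disk (i1 + ... + id) mod M. The function names and the example grid are hypothetical.

```python
from collections import Counter
from itertools import product


def disk_for_cell(cell, num_disks):
    """Assumed rule: sum of the cell's coordinates, modulo the disk count."""
    return sum(cell) % num_disks


def disk_load(ranges, num_disks):
    """Count how many cells of a range query fall on each disk.

    ranges is a list of inclusive (low, high) bounds, one per attribute.
    """
    counts = Counter()
    axes = (range(lo, hi + 1) for lo, hi in ranges)
    for cell in product(*axes):
        counts[disk_for_cell(cell, num_disks)] += 1
    return counts


# A 4 x 4 range query over a 2-attribute grid, spread over 4 disks.
load = disk_load([(0, 3), (0, 3)], num_disks=4)
for disk in sorted(load):
    print("disk", disk, "cells", load[disk])
```

    Under this assumed rule, the 16 cells of the example query are spread evenly, four per disk, so all four disks can be read concurrently.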

  19. Efficient Probabilistic Inference with Partial Ranking Queries

    CERN Document Server

    Huang, Jonathan; Guestrin, Carlos E

    2012-01-01

    Distributions over rankings are used to model data in various settings such as preference analysis and political elections. The factorial size of the space of rankings, however, typically forces one to make structural assumptions, such as smoothness, sparsity, or probabilistic independence about these underlying distributions. We approach the modeling problem from the computational principle that one should make structural assumptions which allow for efficient calculation of typical probabilistic queries. For ranking models, "typical" queries predominantly take the form of partial ranking queries (e.g., given a user's top-k favorite movies, what are his preferences over remaining movies?). In this paper, we argue that riffled independence factorizations proposed in recent literature [7, 8] are a natural structural assumption for ranking distributions, allowing for particularly efficient processing of partial ranking queries.

  20. Mobile Information Access with Spoken Query Answering

    DEFF Research Database (Denmark)

    Brøndsted, Tom; Larsen, Henrik Legind; Larsen, Lars Bo

    2006-01-01

    This paper addresses the problem of information and service accessibility in mobile devices with limited resources. A solution is developed and tested through a prototype that applies state-of-the-art Distributed Speech Recognition (DSR) and knowledge-based Information Retrieval (IR) processing...... for spoken query answering. For the DSR part, a configurable DSR system is implemented on the basis of the ETSI-DSR advanced front-end and the SPHINX IV recognizer. For the knowledge-based IR part, a distributed system solution is developed for fast retrieval of the most relevant documents, with a text...... window focused over the part which most likely contains an answer to the query. The two systems are integrated into a full spoken query answering system. The prototype can answer queries and questions within the chosen football (soccer) test domain, but the system has the flexibility for being ported...

  1. Querying temporal databases via OWL 2 QL

    CSIR Research Space (South Africa)

    Klarman, S

    2014-06-01

    SQL:2011, the most recently adopted version of the SQL query language, has unprecedentedly standardized the representation of temporal data in relational databases. Following the successful paradigm of ontology-based data access, we develop a...

  2. Search Result Diversification Based on Query Facets

    Institute of Scientific and Technical Information of China (English)

    胡莎; 窦志成; 王晓捷; 继荣

    2015-01-01

    In search engines, different users may search for different information by issuing the same query. To satisfy more users with limited search results, search result diversification re-ranks the results to cover as many user intents as possible. Most existing intent-aware diversification algorithms recognize user intents as subtopics, each of which is usually a word, a phrase, or a piece of description. In this paper, we leverage query facets to understand user intents in diversification, where each facet contains a group of words or phrases that explain an underlying intent of a query. We generate subtopics based on query facets and propose faceted diversification approaches. Experimental results on the public TREC 2009 dataset show that our faceted approaches outperform state-of-the-art diversification models.

  3. Clean Air Markets - Allowances Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Allowances Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://camddataandmaps.epa.gov/gdm/index.cfm. The Allowances...

  4. An Agent-Based Focused Crawling Framework for Topic- and Genre-Related Web Document Discovery

    OpenAIRE

    Pappas, Nikolaos; Katsimpras, Georgios; Stamatatos, Efstathios

    2012-01-01

    The discovery of web documents about certain topics is an important task for web-based applications including web document retrieval, opinion mining and knowledge extraction. In this paper, we propose an agent-based focused crawling framework able to retrieve topic- and genre-related web documents. Starting from a simple topic query, a set of focused crawler agents explore in parallel topic-specific web paths using dynamic seed URLs that belong to certain web genres and are collected from web...

  5. The challenge of automated tutoring in Web-based learning environments for information retrieval instruction

    OpenAIRE

    Sormunen, Eero; Pennanen, Sami

    2004-01-01

    The need to enhance information literacy education increases demand for effective Web-based learning environments for information retrieval instruction. The paper introduces the Query Performance Analyser, a unique instructional tool for information retrieval learning environments. On top of an information retrieval system and within a given search assignment, the Query Performance Analyser supports learning by instantly visualizing achieved query performance. Although the Query Performance A...

  6. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen

    2015-08-01

    Distributed RDF systems partition data across multiple computer nodes. Partitioning is typically based on heuristics that minimize inter-node communication and it is performed in an initial, data pre-processing phase. Therefore, the resulting partitions are static and do not adapt to changes in the query workload; as a result, existing systems are unable to consistently avoid communication for queries that are not favored by the initial data partitioning. Furthermore, for very large RDF knowledge bases, the partitioning phase becomes prohibitively expensive, leading to high startup costs. In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems become online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-seconds. In this demonstration, the audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.
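
    To make the subject-based hash partitioning idea concrete, here is a minimal sketch. It is not AdHash's implementation: the choice of CRC32 as the hash, the function names, and the sample triples are all assumptions made for illustration. It shows only why joins on a shared subject need no communication, since every triple with the same subject lands on the same worker.

```python
import zlib


def worker_for(subject, num_workers):
    """Pick a worker from a deterministic hash of the subject string."""
    return zlib.crc32(subject.encode("utf-8")) % num_workers


def partition(triples, num_workers):
    """Split (subject, predicate, object) triples across workers by subject."""
    parts = [[] for _ in range(num_workers)]
    for s, p, o in triples:
        parts[worker_for(s, num_workers)].append((s, p, o))
    return parts


# Hypothetical sample data.
triples = [
    ("ex:alice", "ex:worksAt", "ex:cern"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:kaust"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:worksAt", "ex:ornl"),
]

parts = partition(triples, num_workers=3)
for worker_id, part in enumerate(parts):
    print(worker_id, part)
```

    A pattern such as "?x ex:worksAt ?org . ?x ex:knows ?y" joins on the subject ?x, so each worker can answer it from its own partition; joins on other positions would still need data exchange, which is the case the abstract's locality-aware planner and adaptive redistribution are meant to address.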

  7. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    With wireless communications and geo-positioning being widely available, it becomes possible to offer new e-services that provide mobile users with information about other mobile objects. This paper concerns active, ordered k-nearest neighbor queries for query and data objects that are moving...... for the nearest neighbor search in the prototype is presented in detail. In addition, the paper reports on results from experiments with the prototype system....

  8. SWORS: a system for the efficient retrieval of relevant spatial web objects

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2012-01-01

    Spatial web objects that possess both a geographical location and a textual description are gaining in prevalence. This gives prominence to spatial keyword queries that exploit both location and textual arguments. Such queries are used in many web services such as yellow pages and maps services....

  10. Managing and querying whole slide images

    Science.gov (United States)

    Wang, Fusheng; Oh, Tae W.; Vergara-Niedermayr, Cristobal; Kurc, Tahsin; Saltz, Joel

    2012-02-01

    High-resolution pathology images provide rich information about the morphological and functional characteristics of biological systems, and are transforming the field of pathology into a new era. To facilitate the use of digital pathology imaging for biomedical research and clinical diagnosis, it is essential to manage and query both whole slide images (WSI) and analytical results generated from images, such as annotations made by humans and computed features and classifications made by computer algorithms. There are unique requirements on modeling, managing and querying whole slide images, including compatibility with standards, scalability, support of image queries at multiple granularities, and support of integrated queries between images and derived results from the images. In this paper, we present our work on developing the Pathology Image Database System (PIDB), which is a standard oriented image database to support retrieval of images, tiles, regions and analytical results, image visualization and experiment management through a unified interface and architecture. The system is deployed for managing and querying whole slide images for In Silico brain tumor studies at Emory University. PIDB is generic and open source, and can be easily used to support other biomedical research projects. It has the potential to be integrated into a Picture Archiving and Communications System (PACS) with powerful query capabilities to support pathology imaging.

  11. Implementing Graph Pattern Queries on a Relational Database

    Energy Technology Data Exchange (ETDEWEB)

    Kaplan, I L; Abdulla, G M; Brugger, S T; Kohn, S R

    2007-12-26

    When a graph database is implemented on top of a relational database, queries in the graph query language are translated into relational SQL queries. Graph pattern queries are an important feature of a graph query language. Translating graph pattern queries into single SQL statements results in very poor query performance. By taking into account the pattern query structure and generating multiple SQL statements, pattern query performance can be dramatically improved. The performance problems encountered with the single SQL statements generated for pattern queries reflect a problem in the SQL query planner and optimizer. Addressing this problem would allow relational databases to better support semantic graph databases. Relational database systems that provide good support for graph databases may also be more flexible platforms for data warehouses.

  12. k-Nearest Neighbor Query Processing Algorithms for a Query Region in Road Networks

    Institute of Scientific and Technical Information of China (English)

    Hyeong-Il Kim; Jae-Woo Chang

    2013-01-01

    Recent development of wireless communication technologies and the popularity of smart phones are making location-based services (LBS) popular. However, sending queries to LBS servers with users' exact locations may threaten the privacy of users. Therefore, there has been much research on generating a cloaked query region for user privacy protection. Consequently, an efficient query processing algorithm for a query region is required. In this paper, we propose k-nearest neighbor (k-NN) query processing algorithms for a query region in road networks. To efficiently retrieve k-NN points of interest (POIs), we make use of the Island index. We also propose a method that generates an adaptive Island index to improve the query processing performance and storage usage. Finally, we show by our performance analysis that our k-NN query processing algorithms outperform the existing k-Range Nearest Neighbor (kRNN) algorithm in terms of network expansion cost and query processing time.

  13. QuerySpaces on Hadoop for the ATLAS EventIndex

    CERN Document Server

    Hrivnac, Julius; The ATLAS collaboration; Cranshaw, Jack; Favareto, Andrea; Prokoshin, Fedor; Glasman, Claudia; Toebbicke, Rainer

    2015-01-01

    A Hadoop-based implementation of the adaptive query engine serving as the back-end for the ATLAS EventIndex. The QuerySpaces implementation handles both original data and search results providing fast and efficient mechanisms for new user queries using already accumulated knowledge for optimization. Detailed descriptions and statistics about user requests are collected in HBase tables and HDFS files. Requests are associated to their results and a graph of relations between them is created to be used to find the most efficient way of providing answers to new requests. The environment is completely transparent to users and is accessible over several command-line interfaces, a Web Service and a programming API.

  14. QuerySpaces on Hadoop for the ATLAS EventIndex

    CERN Document Server

    Hrivnac, Julius; The ATLAS collaboration; Cranshaw, Jack; Glasman, Claudia; Favareto, Andrea; Prokoshin, Fedor

    2015-01-01

    Hadoop-based implementation of the adaptive query engine serving as the back-end for the ATLAS EventIndex. The QuerySpaces implementation handles both original data and search results providing fast and efficient mechanisms for new user queries using already accumulated knowledge for optimisation. Detailed description and statistics about user requests are collected in HBase tables and HDFS files. Requests are associated to their results and a graph of relations between them is created to be used to find the most efficient way of providing answers to new requests. The environment is completely transparent to users and is accessible over several command-line interfaces, a Web Service and a programming API.

  15. Infodemiology of status epilepticus: A systematic validation of the Google Trends-based search queries.

    Science.gov (United States)

    Bragazzi, Nicola Luigi; Bacigaluppi, Susanna; Robba, Chiara; Nardone, Raffaele; Trinka, Eugen; Brigo, Francesco

    2016-02-01

    People increasingly use Google looking for health-related information. We previously demonstrated that in English-speaking countries most people use this search engine to obtain information on status epilepticus (SE) definition, types/subtypes, and treatment. Now, we aimed at providing a quantitative analysis of SE-related web queries. This analysis represents an advancement, with respect to what was already previously discussed, in that the Google Trends (GT) algorithm has been further refined and correlational analyses have been carried out to validate the GT-based query volumes. Google Trends-based SE-related query volumes were well correlated with information concerning causes and pharmacological and nonpharmacological treatments. Google Trends can provide both researchers and clinicians with data on realities and contexts that are generally overlooked and underexplored by classic epidemiology. In this way, GT can foster new epidemiological studies in the field and can complement traditional epidemiological tools.

  16. Graph Mining Meets the Semantic Web

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Lim, Seung-Hwan [ORNL

    2015-01-01

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on 6 real world data sets and show graph mining algorithms (that have a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.

  17. Stratification-Based Outlier Detection over the Deep Web

    Directory of Open Access Journals (Sweden)

    Xuefeng Xian

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is to develop a new data mining method for outlier detection over deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed in this paper with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in deep web.

  18. Stratification-Based Outlier Detection over the Deep Web.

    Science.gov (United States)

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is to develop a new data mining method for outlier detection over deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed in this paper with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in deep web.

  19. The WebStand Project

    CERN Document Server

    Nguyen, Benjamin; Colazzo, Dario; Vion, Antoine; Manolescu, Ioana; Senellart, Pierre

    2010-01-01

    In this paper we present the state of advancement of the French ANR WebStand project. The objective of this project is to construct a customizable XML based warehouse platform to acquire, transform, analyze, store, query and export data from the web, in particular mailing lists, with the final intention of using this data to perform sociological studies focused on social groups of the World Wide Web, with a specific emphasis on the temporal aspects of this data. We are currently using this system to analyze the standardization process of the W3C, through its social network of standard setters.

  20. ECADS: An Efficient Approach for Accessing Data and Query Workload

    Directory of Open Access Journals (Sweden)

    Rakesh Malvi

    2016-12-01

    In the current scenario a huge amount of data is published on the web; because the data comes from various sources, it is heterogeneous in nature. Data extraction is one of the major tasks in data mining. Various techniques for data extraction have been proposed in the past, such as Collaborative Adaptive Data Sharing (CADS), pay-as-you-go, etc. The drawback of these techniques is that they cannot provide a global solution for the user: to get accurate search results, the user needs to know all the details of whatever he wants to search for. In this paper we propose a new searching technique, the “Enhanced Collaborative Adaptive Data Sharing Platform (ECADS)”, in which predefined queries are provided to the user to search data. In this technique some keywords related to the domain are provided to the user for an efficient data extraction task. These keywords help the user write proper queries to search data in an efficient way. In this way it provides an accurate, time-efficient, and global search technique. A comparative analysis of the existing and proposed techniques is presented in the results and analysis section, which shows that the proposed technique performs better than the existing technique.