WorldWideScience

Sample records for web query federation

  1. Federated query processing for the semantic web

    CERN Document Server

    Buil-Aranda, C

    2014-01-01

    During the last years, the amount of RDF data has increased exponentially over the Web, exposed via SPARQL endpoints. These SPARQL endpoints allow users to direct SPARQL queries to the RDF data. Federated SPARQL query processing allows to query several of these RDF databases as if they were a single one, integrating the results from all of them. This is a key concept in the Web of Data and it is also a hot topic in the community. Besides of that, the W3C SPARQL-WG has standardized it in the new Recommendation SPARQL 1.1.This book provides a formalisation of the W3C proposed recommendation. Thi

  2. A journey to Semantic Web query federation in the life sciences.

    Science.gov (United States)

    Cheung, Kei-Hoi; Frost, H Robert; Marshall, M Scott; Prud'hommeaux, Eric; Samwald, Matthias; Zhao, Jun; Paschke, Adrian

    2009-10-01

    As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and Vocabulary of Interlinked Datasets have emerged in recent years. In addition to the data warehouse construction, these technological approaches can be used to support dynamic query federation. As a community effort, the BioRDF task force, within the Semantic Web for Health Care and Life Sciences Interest Group, is exploring how these emerging approaches can be utilized to execute distributed queries across different neuroscience data sources. We have created two health care and life science knowledge bases. We have explored a variety of Semantic Web approaches to describe, map, and dynamically query multiple datasets. We have demonstrated several federation approaches that integrate diverse types of information about neurons and receptors that play an important role in basic, clinical, and translational neuroscience research. Particularly, we have created a prototype receptor explorer which uses OWL mappings to provide an integrated list of receptors and executes individual queries against different SPARQL endpoints. We have also employed the AIDA Toolkit, which is directed at groups of knowledge workers who cooperatively search, annotate, interpret, and enrich large collections of heterogeneous documents from diverse locations. We have explored a tool called "FeDeRate", which enables a global SPARQL query to be decomposed into subqueries against the remote databases offering either SPARQL or SQL query interfaces. Finally, we have explored how to use the vocabulary of interlinked Datasets (voiD) to create metadata for describing datasets exposed as Linked Data URIs or SPARQL endpoints. We have demonstrated the use of a set of novel and state-of-the-art Semantic Web technologies in support of a neuroscience query

  3. Querying on Federated Sensor Networks

    Directory of Open Access Journals (Sweden)

    Zuhal Can

    2016-09-01

    Full Text Available A Federated Sensor Network (FSN is a network of geographically distributed Wireless Sensor Networks (WSNs called islands. For querying on an FSN, we introduce the Layered Federated Sensor Network (L-FSN Protocol. For layered management, L-FSN provides communication among islands by its inter-island querying protocol by which a query packet routing path is determined according to some path selection policies. L-FSN allows autonomous management of each island by island-specific intra-island querying protocols that can be selected according to island properties. We evaluate the applicability of L-FSN and compare the L-FSN protocol with various querying protocols running on the flat federation model. Flat federation is a method to federate islands by running a single querying protocol on an entire FSN without distinguishing communication among and within islands. For flat federation, we select a querying protocol from geometrical, hierarchical cluster-based, hash-based, and tree-based WSN querying protocol categories. We found that a layered federation of islands by L-FSN increases the querying performance with respect to energy-efficiency, query resolving distance, and query resolving latency. Moreover, L-FSN’s flexibility of choosing intra-island querying protocols regarding the island size brings advantages on energy-efficiency and query resolving latency.

  4. Semantic Web Technologies and Big Data Infrastructures: SPARQL Federated Querying of Heterogeneous Big Data Stores

    OpenAIRE

    Konstantopoulos, Stasinos; Charalambidis, Angelos; Mouchakis, Giannis; Troumpoukis, Antonis; Jakobitsch, Jürgen; Karkaletsis, Vangelis

    2016-01-01

    The ability to cross-link large scale data with each other and with structured Semantic Web data, and the ability to uniformly process Semantic Web and other data adds value to both the Semantic Web and to the Big Data community. This paper presents work in progress towards integrating Big Data infrastructures with Semantic Web technologies, allowing for the cross-linking and uniform retrieval of data stored in both Big Data infrastructures and Semantic Web data. The technical challenges invo...

  5. Responsive web design with jQuery

    CERN Document Server

    Carlos, Gilberto

    2013-01-01

    Responsive Web Design with jQuery follows a standard tutorial-based approach, covering various aspects of responsive web design by building a comprehensive website.""Responsive Web Design with jQuery"" is aimed at web designers who are interested in building device-agnostic websites. You should have a grasp of standard HTML, CSS, and JavaScript development, and have a familiarity with graphic design. Some exposure to jQuery and HTML5 will be beneficial but isn't essential.

  6. Web development with jQuery

    CERN Document Server

    York, Richard

    2015-01-01

    Newly revised and updated resource on jQuery's many features and advantages Web Development with jQuery offers a major update to the popular Beginning JavaScript and CSS Development with jQuery from 2009. More than half of the content is new or updated, and reflects recent innovations with regard to mobile applications, jQuery mobile, and the spectrum of associated plugins. Readers can expect thorough revisions with expanded coverage of events, CSS, AJAX, animation, and drag and drop. New chapters bring developers up to date on popular features like jQuery UI, navigation, tables, interacti

  7. Web-Based Distributed XML Query Processing

    NARCIS (Netherlands)

    Smiljanic, M.; Feng, L.; Jonker, Willem; Blanken, Henk; Grabs, T.; Schek, H-J.; Schenkel, R.; Weikum, G.

    2003-01-01

    Web-based distributed XML query processing has gained in importance in recent years due to the widespread popularity of XML on the Web. Unlike centralized and tightly coupled distributed systems, Web-based distributed database systems are highly unpredictable and uncontrollable, with a rather

  8. Executing SPARQL Queries over the Web of Linked Data

    Science.gov (United States)

    Hartig, Olaf; Bizer, Christian; Freytag, Johann-Christoph

    The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.

  9. Deep web query interface understanding and integration

    CERN Document Server

    Dragut, Eduard C; Yu, Clement T

    2012-01-01

    There are millions of searchable data sources on the Web and to a large extent their contents can only be reached through their own query interfaces. There is an enormous interest in making the data in these sources easily accessible. There are primarily two general approaches to achieve this objective. The first is to surface the contents of these sources from the deep Web and add the contents to the index of regular search engines. The second is to integrate the searching capabilities of these sources and support integrated access to them. In this book, we introduce the state-of-the-art tech

  10. Linking Health Records for Federated Query Processing

    Directory of Open Access Journals (Sweden)

    Dewri Rinku

    2016-07-01

    Full Text Available A federated query portal in an electronic health record infrastructure enables large epidemiology studies by combining data from geographically dispersed medical institutions. However, an individual’s health record has been found to be distributed across multiple carrier databases in local settings. Privacy regulations may prohibit a data source from revealing clear text identifiers, thereby making it non-trivial for a query aggregator to determine which records correspond to the same underlying individual. In this paper, we explore this problem of privately detecting and tracking the health records of an individual in a distributed infrastructure. We begin with a secure set intersection protocol based on commutative encryption, and show how to make it practical on comparison spaces as large as 1010 pairs. Using bigram matching, precomputed tables, and data parallelism, we successfully reduced the execution time to a matter of minutes, while retaining a high degree of accuracy even in records with data entry errors. We also propose techniques to prevent the inference of identifier information when knowledge of underlying data distributions is known to an adversary. Finally, we discuss how records can be tracked utilizing the detection results during query processing.

  11. Path Index Based Keywords to SPARQL Query Transformation for Semantic Data Federations

    Directory of Open Access Journals (Sweden)

    Thilini Cooray

    2016-06-01

    Full Text Available Semantic web is a highly emerging research domain. Enhancing the ability of keyword query processing on Semantic Web data provides a huge support for familiarizing the usefulness of Semantic Web to the general public. Most of the existing approaches focus on just user keyword matching to RDF graphs and output the connecting elements as results. Semantic Web consists of SPARQL query language which can process queries more accurately and efficiently than general keyword matching. There are only about a couple of approaches available for transforming keyword queries to SPARQL. They basically rely on real time graph traversals? for identifying subgraphs which can connect user keywords. Those approaches are either limited to query processing on a single data store or a set of interlinked data sets. They have not focused on query processing on a federation of independent data sets which belongs to the same domain. This research proposes a Path Index based approach eliminating real time graph traversal for transforming keyword queries to SPARQL. We have introduced an ontology alignment based approach for keyword query transforming on a federation of RDF data stored using multiple heterogeneous vocabularies. Evaluation shows that the proposed approach have the ability to generate SPARQL queries which can provide highly relevant results for user keyword queries. The Path Index based query transformation approach has also achieved high efficiency compared to the existing approach.

  12. Federated querying architecture with clinical & translational health IT application.

    Science.gov (United States)

    Livne, Oren E; Schultz, N Dustin; Narus, Scott P

    2011-10-01

    We present a software architecture that federates data from multiple heterogeneous health informatics data sources owned by multiple organizations. The architecture builds upon state-of-the-art open-source Java and XML frameworks in innovative ways. It consists of (a) federated query engine, which manages federated queries and result set aggregation via a patient identification service; and (b) data source facades, which translate the physical data models into a common model on-the-fly and handle large result set streaming. System modules are connected via reusable Apache Camel integration routes and deployed to an OSGi enterprise service bus. We present an application of our architecture that allows users to construct queries via the i2b2 web front-end, and federates patient data from the University of Utah Enterprise Data Warehouse and the Utah Population database. Our system can be easily adopted, extended and integrated with existing SOA Healthcare and HL7 frameworks such as i2b2 and caGrid.

  13. Parallelizing Federated SPARQL Queries in Presence of Replicated Data

    DEFF Research Database (Denmark)

    Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala

    2017-01-01

    Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication aware federated query engines mainly focus on pruning sources during the source selection and query decomposition in order to reduce intermediate result...

  14. Federated query services provided by the Seamless SAR Archive project

    Science.gov (United States)

    Baker, S.; Bryson, G.; Buechler, B.; Meertens, C. M.; Crosby, C. J.; Fielding, E. J.; Nicoll, J.; Youn, C.; Baru, C.

    2013-12-01

    The NASA Advancing Collaborative Connections for Earth System Science (ACCESS) seamless synthetic aperture radar (SAR) archive (SSARA) project is a 2-year collaboration between UNAVCO, the Alaska Satellite Facility (ASF), the Jet Propulsion Laboratory (JPL), and OpenTopography at the San Diego Supercomputer Center (SDSC) to design and implement a seamless distributed access system for SAR data and derived data products (i.e. interferograms). A major milestone for the first year of the SSARA project was a unified application programming interface (API) for SAR data search and results at ASF and UNAVCO (WInSAR and EarthScope data archives) through the use of simple web services. A federated query service was developed using the unified APIs, providing users a single search interface for both archives (http://www.unavco.org/ws/brokered/ssara/sar/search). A command line client that utilizes this new service is provided as an open source utility for the community on GitHub (https://github.com/bakerunavco/SSARA). Further API development and enhancements added more InSAR specific keywords and quality control parameters (Doppler centroid, faraday rotation, InSAR stack size, and perpendicular baselines). To facilitate InSAR processing, the federated query service incorporated URLs for DEM (from OpenTopography) and tropospheric corrections (from the JPL OSCAR service) in addition to the URLs for SAR data. This federated query service will provide relevant QC metadata for selecting pairs of SAR data for InSAR processing and all the URLs necessary for interferogram generation. Interest from the international community has prompted an effort to incorporate other SAR data archives (the ESA Virtual Archive 4 and the DLR TerraSAR-X_SSC Geohazard Supersites and Natural Laboratories collections) into the federated query service which provide data for researchers outside the US and North America.

  15. SkyQuery - A Prototype Distributed Query and Cross-Matching Web Service for the Virtual Observatory

    Science.gov (United States)

    Thakar, A. R.; Budavari, T.; Malik, T.; Szalay, A. S.; Fekete, G.; Nieto-Santisteban, M.; Haridas, V.; Gray, J.

    2002-12-01

    We have developed a prototype distributed query and cross-matching service for the VO community, called SkyQuery, which is implemented with hierarchichal Web Services. SkyQuery enables astronomers to run combined queries on existing distributed heterogeneous astronomy archives. SkyQuery provides a simple, user-friendly interface to run distributed queries over the federation of registered astronomical archives in the VO. The SkyQuery client connects to the portal Web Service, which farms the query out to the individual archives, which are also Web Services called SkyNodes. The cross-matching algorithm is run recursively on each SkyNode. Each archive is a relational DBMS with a HTM index for fast spatial lookups. The results of the distributed query are returned as an XML DataSet that is automatically rendered by the client. SkyQuery also returns the image cutout corresponding to the query result. SkyQuery finds not only matches between the various catalogs, but also dropouts - objects that exist in some of the catalogs but not in others. This is often as important as finding matches. We demonstrate the utility of SkyQuery with a brown-dwarf search between SDSS and 2MASS, and a search for radio-quiet quasars in SDSS, 2MASS and FIRST. The importance of a service like SkyQuery for the worldwide astronomical community cannot be overstated: data on the same objects in various archives is mapped in different wavelength ranges and looks very different due to different errors, instrument sensitivities and other peculiarities of each archive. Our cross-matching algorithm preforms a fuzzy spatial join across multiple catalogs. This type of cross-matching is currently often done by eye, one object at a time. A static cross-identification table for a set of archives would become obsolete by the time it was built - the exponential growth of astronomical data means that a dynamic cross-identification mechanism like SkyQuery is the only viable option. SkyQuery was funded by a

  16. Error Checking for Chinese Query by Mining Web Log

    Directory of Open Access Journals (Sweden)

    Jianyong Duan

    2015-01-01

    Full Text Available For the search engine, error-input query is a common phenomenon. This paper uses web log as the training set for the query error checking. Through the n-gram language model that is trained by web log, the queries are analyzed and checked. Some features including query words and their number are introduced into the model. At the same time data smoothing algorithm is used to solve data sparseness problem. It will improve the overall accuracy of the n-gram model. The experimental results show that it is effective.

  17. Improving Web Search for Difficult Queries

    Science.gov (United States)

    Wang, Xuanhui

    2009-01-01

    Search engines have now become essential tools in all aspects of our life. Although a variety of information needs can be served very successfully, there are still a lot of queries that search engines can not answer very effectively and these queries always make users feel frustrated. Since it is quite often that users encounter such "difficult…

  18. The effect of query complexity on Web searching results

    Directory of Open Access Journals (Sweden)

    B.J. Jansen

    2000-01-01

    Full Text Available This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc. and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.

  19. SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

    Science.gov (United States)

    Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

    2014-08-15

    Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.

  20. Parasol: An Architecture for Cross-Cloud Federated Graph Querying

    Energy Technology Data Exchange (ETDEWEB)

    Lieberman, Michael; Choudhury, Sutanay; Hughes, Marisa; Patrone, Dennis; Hider, Sandy; Piatko, Christine; Chapman, Matthew; Marple, JP; Silberberg, David

    2014-06-22

    Large scale data fusion of multiple datasets can often provide in- sights that examining datasets individually cannot. However, when these datasets reside in different data centers and cannot be collocated due to technical, administrative, or policy barriers, a unique set of problems arise that hamper querying and data fusion. To ad- dress these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol’s design is flexible and requires only minimal assumptions for participant clouds. Query optimization techniques are also described that are compatible with Parasol’s lightweight architecture. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.

  1. Web page sorting algorithm based on query keyword distance relation

    Science.gov (United States)

    Yang, Han; Cui, Hong Gang; Tang, Hao

    2017-08-01

    In order to optimize the problem of page sorting, according to the search keywords in the web page in the relationship between the characteristics of the proposed query keywords clustering ideas. And it is converted into the degree of aggregation of the search keywords in the web page. Based on the PageRank algorithm, the clustering degree factor of the query keyword is added to make it possible to participate in the quantitative calculation. This paper proposes an improved algorithm for PageRank based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.

  2. jQuery mobile web development essentials

    CERN Document Server

    Camden, Raymond

    2013-01-01

    Packed with practical examples, code, and screenshots, this book will show you how to create mobile optimized sites using the easiest, most practical HTML/JavaScript framework available today.If you are a web developer looking to create mobile optimized websites then this book is for you. Basic knowledge of HTML is required. Some familiarity with JavaScript will help, but is not required.

  3. BioFed: federated query processing over life sciences linked open data.

    Science.gov (United States)

    Hasnain, Ali; Mehmood, Qaiser; Sana E Zainab, Syeda; Saleem, Muhammad; Warren, Claude; Zehra, Durre; Decker, Stefan; Rebholz-Schuhmann, Dietrich

    2017-03-15

    endpoint's availability based on the EndpointData graph. Our evaluation of BioFed against FedX is based on 20 heterogeneous federated SPARQL queries and shows competitive execution performance in comparison to FedX, which can be attributed to the provision of provenance information for the source selection. Developing and testing federated query engines for life sciences data is still a challenging task. According to our findings, it is advantageous to optimise the source selection. The cataloguing of SPARQL endpoints, including type and property indexing, leads to efficient querying of data resources over the Web of Data. This could even be further improved through the use of ontologies, e.g., for abstract normalisation of query terms.

  4. Data management and query processing in semantic web databases

    CERN Document Server

    Groppe, Sven

    2011-01-01

    The Semantic Web, which is intended to establish a machine-understandable Web, is currently changing from being an emerging trend to a technology used in complex real-world applications. A number of standards and techniques have been developed by the World Wide Web Consortium (W3C), e.g., the Resource Description Framework (RDF), which provides a general method for conceptual descriptions for Web resources, and SPARQL, an RDF querying language. Recent examples of large RDF data with billions of facts include the UniProt comprehensive catalog of protein sequence, function and annotation data, t

  5. Ensemble learned vaccination uptake prediction using web search queries

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official...... vaccine records show that our method predicts vaccination uptake eff?ectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields comparative performance. To our knowledge, this is the ?first study to predict vaccination uptake...

  6. Developing responsive web applications with Ajax and jQuery

    CERN Document Server

    Patel, Sandeep Kumar

    2014-01-01

    This book is a standard tutorial for web application developers presented in a comprehensive, step-by-step manner to explain the nuances involved. It has an abundance of code and examples supporting explanations of each feature. This book is intended for Java developers wanting to create rich and responsive applications using AJAX. Basic experience of using jQuery is assumed.

  7. The Odyssey Approach for Optimizing Federated SPARQL Queries

    DEFF Research Database (Denmark)

    Montoya, Gabriela; Skaf-Molli, Hala; Hose, Katja

    2017-01-01

    . Nevertheless, these plans may still exhibit a high number of intermediate results or high execution times because of heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore...

  8. Designing a federated multimedia information system on the semantic web

    NARCIS (Netherlands)

    Vdovják, R.; Barna, P.; Houben, G.J.P.M.; Eder, J.; Missikoff, M.

    2003-01-01

    A federated Web-based multimedia information system on one hand gathers its data from various Web sources, on the other hand offers the end-user a rich semantics describing its content and a user-friendly environment for expressing queries over its data. There are three essential ingredients to

  9. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Directory of Open Access Journals (Sweden)

    Toni U Wagner

    Full Text Available Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straight forward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  10. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Science.gov (United States)

    Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred

    2011-01-01

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straight forward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  11. A study of medical and health queries to web search engines.

    Science.gov (United States)

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i). comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii). comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii). medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i). a small percentage of web queries are medical or health related, (ii). the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii). over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggests some implications for the use of the general web search engines when seeking medical/health information.

  12. Web search queries can predict stock market volumes.

    Science.gov (United States)

    Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

    2012-01-01

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has lead to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which enable us to investigate also the user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.

  13. Web search queries can predict stock market volumes.

    Directory of Open Access Journals (Sweden)

    Ilaria Bordino

    Full Text Available We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has lead to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which enable us to investigate also the user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.

  14. Drexel at TREC 2014 Federated Web Search Track

    Science.gov (United States)

    2014-11-01

    of its input RS results. 1. INTRODUCTION Federated Web Search is the task of searching multiple search engines simultaneously and combining their...or distributed properly[5]. The goal of RS is then, for a given query, to select only the most promising search engines from all those available. Most...result pages of 149 search engines . 4000 queries are used in building the sample set. As a part of the Vertical Selection task, search engines are

  15. TopFed: TCGA tailored federated query processing and linking to LOD.

    Science.gov (United States)

    Saleem, Muhammad; Padmanabhuni, Shanmukha S; Ngomo, Axel-Cyrille Ngonga; Iqbal, Aftab; Almeida, Jonas S; Decker, Stefan; Deus, Helena F

    2014-01-01

    The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer related molecular analysis, to be exploited by bioinformaticians towards advancing cancer knowledge. However, devising bioinformatics applications to analyse such large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files. Therefore, it is making it difficult to enable virtual data integration in order to collect the critical co-variates necessary for analysis. We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Format (RDF), link it to relevant datasets in the Linked Open Data (LOD) cloud and further propose an efficient data distribution strategy to host the resulting 20.4 billion triples data via several SPARQL endpoints. Having the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these SPARQL endpoints by proposing a TCGA tailored federated SPARQL query processing engine named TopFed. We compare TopFed with a well established federation engine FedX in terms of source selection and query execution time by using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall) with query execution time equal to one third to that of FedX. With TopFed, we aim to offer biomedical scientists a single-point-of-access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA as the amount and diversity of data exceeds the ability of local resources to handle its retrieval and

  16. What Snippets Say About Pages in Federated Web Search

    NARCIS (Netherlands)

    Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd; Hou, Yuexian; Nie, Jian-Yun; Sun, Le; Wang, Bo; Zhang, Peng

    2012-01-01

    What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research

  17. Web-based topology queries on a BIM model

    DEFF Research Database (Denmark)

    Rasmussen, Mads Holten; Hviid, Christian Anker; Karlshøj, Jan

    Building Information Modeling (BIM) is in the industry often confused with 3D-modeling regardless that the potential of modeling information goes way beyond performing clash detections on geometrical objects occupying the same physical space. Lately, several research projects have tried to change...... that by extending BIM with information using linked data technologies. However, when showing information alone the strong communication benefits of 3D are neglected, and a practical way of connecting the two worlds is currently missing. In this paper, we present a prototype of a visual query interface running...... is to establish a baseline for discussion of the general design choices that have been considered, and the developed application further serves as a proof of concept for combining BIM model data with a knowledge graph and potentially other sources of Linked Open Data, in a simple web interface....

  18. Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Mølbak, Kåre; Cox, Ingemar Johansson

    2017-01-01

    Inuenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since inuenza...

  19. Query transformations and their role in Web searching by the members of the general public

    Directory of Open Access Journals (Sweden)

    Martin Whittle

    2006-01-01

    Full Text Available Introduction. This paper reports preliminary research in a primarily experimental study of how the general public search for information on the Web. The focus is on the query transformation patterns that characterise searching. Method. In this work, we have used transaction logs from the Excite search engine to develop methods for analysing query transformations that should aid the analysis of our ongoing experimental work. Our methods involve the use of similarity techniques to link queries with the most similar previous query in a train. The resulting query transformations are represented as a list of codes representing a whole search. Analysis. It is shown how query transformation sequences can be represented as graphical networks and some basic statistical results are shown. A correlation analysis is performed to examine the co-occurrence of Boolean and quotation mark changes with the syntactic changes. Results. A frequency analysis of the occurrence of query transformation codes is presented. The connectivity of graphs obtained from the query transformation is investigated and found to follow an exponential scaling law. The correlation analysis reveals a number of patterns that provide some interesting insights into Web searching by the general public. Conclusion. We have developed analytical methods based on query similarity that can be applied to our current experimental work with volunteer subjects. The results of these will form part of a database with the aim of developing an improved understanding of how the public search the Web.

  20. Memory-Aware Query Routing in Interactive Web-based Information Systems

    NARCIS (Netherlands)

    F. Waas; M.L. Kersten (Martin)

    2001-01-01

    textabstractQuery throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications:

  1. Federated queries of clinical data repositories: the sum of the parts does not equal the whole.

    Science.gov (United States)

    Weber, Griffin M

    2013-06-01

    In 2008 we developed a shared health research information network (SHRINE), which for the first time enabled research queries across the full patient populations of four Boston hospitals. It uses a federated architecture, where each hospital returns only the aggregate count of the number of patients who match a query. This allows hospitals to retain control over their local databases and comply with federal and state privacy laws. However, because patients may receive care from multiple hospitals, the result of a federated query might differ from what the result would be if the query were run against a single central repository. This paper describes the situations when this happens and presents a technique for correcting these errors. We use a one-time process of identifying which patients have data in multiple repositories by comparing one-way hash values of patient demographics. This enables us to partition the local databases such that all patients within a given partition have data at the same subset of hospitals. Federated queries are then run separately on each partition independently, and the combined results are presented to the user. Using theoretical bounds and simulated hospital networks, we demonstrate that once the partitions are made, SHRINE can produce more precise estimates of the number of patients matching a query. Uncertainty in the overlap of patient populations across hospitals limits the effectiveness of SHRINE and other federated query tools. Our technique reduces this uncertainty while retaining an aggregate federated architecture.

  2. Bat-Inspired Algorithm Based Query Expansion for Medical Web Information Retrieval.

    Science.gov (United States)

    Khennak, Ilyes; Drias, Habiba

    2017-02-01

    With the increasing amount of medical data available on the Web, looking for health information has become one of the most widely searched topics on the Internet. Patients and people of several backgrounds are now using Web search engines to acquire medical information, including information about a specific disease, medical treatment or professional advice. Nonetheless, due to a lack of medical knowledge, many laypeople have difficulties in forming appropriate queries to articulate their inquiries, which deem their search queries to be imprecise due the use of unclear keywords. The use of these ambiguous and vague queries to describe the patients' needs has resulted in a failure of Web search engines to retrieve accurate and relevant information. One of the most natural and promising method to overcome this drawback is Query Expansion. In this paper, an original approach based on Bat Algorithm is proposed to improve the retrieval effectiveness of query expansion in medical field. In contrast to the existing literature, the proposed approach uses Bat Algorithm to find the best expanded query among a set of expanded query candidates, while maintaining low computational complexity. Moreover, this new approach allows the determination of the length of the expanded query empirically. Numerical results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient compared to the baseline.

  3. Mandatory Class 1 Federal Areas Web Service

    Data.gov (United States)

    U.S. Environmental Protection Agency — This web service contains the following layers: Mandatory Class 1 Federal Area polygons and Mandatory Class 1 Federal Area labels in the United States. The polygon...

  4. On the Querying for Places on the Mobile Web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2011-01-01

    The web is undergoing a fundamental transformation: it is becoming mobile and is acquiring a spatial dimension. Thus, the web is increasingly being used from mobile devices, notably smartphones, that can be geo-positioned using GPS or technologies that exploit wireless communication networks...

  5. Analysing Twitter and web queries for flu trend prediction.

    Science.gov (United States)

    Santos, José Carlos; Matos, Sérgio

    2014-05-07

    Social media platforms encourage people to share diverse aspects of their daily life. Among these, shared health related information might be used to infer health status and incidence rates for specific conditions or symptoms. In this work, we present an infodemiology study that evaluates the use of Twitter messages and search engine query logs to estimate and predict the incidence rate of influenza like illness in Portugal. Based on a manually classified dataset of 2704 tweets from Portugal, we selected a set of 650 textual features to train a Naïve Bayes classifier to identify tweets mentioning flu or flu-like illness or symptoms. We obtained a precision of 0.78 and an F-measure of 0.83, based on cross validation over the complete annotated set. Furthermore, we trained a multiple linear regression model to estimate the health-monitoring data from the Influenzanet project, using as predictors the relative frequencies obtained from the tweet classification results and from query logs, and achieved a correlation ratio of 0.89 (puser-generated content have mostly focused on the english language. Our results further validate those studies and show that by changing the initial steps of data preprocessing and feature extraction and selection, the proposed approaches can be adapted to other languages. Additionally, we investigated whether the predictive model created can be applied to data from the subsequent flu season. In this case, although the prediction result was good, an initial phase to adapt the regression model could be necessary to achieve more robust results.

  6. Effective Filtering of Query Results on Updated User Behavioral Profiles in Web Mining

    Directory of Open Access Journals (Sweden)

    S. Sadesh

    2015-01-01

    Full Text Available Web with tremendous volume of information retrieves result for user related queries. With the rapid growth of web page recommendation, results retrieved based on data mining techniques did not offer higher performance filtering rate because relationships between user profile and queries were not analyzed in an extensive manner. At the same time, existing user profile based prediction in web data mining is not exhaustive in producing personalized result rate. To improve the query result rate on dynamics of user behavior over time, Hamilton Filtered Regime Switching User Query Probability (HFRS-UQP framework is proposed. HFRS-UQP framework is split into two processes, where filtering and switching are carried out. The data mining based filtering in our research work uses the Hamilton Filtering framework to filter user result based on personalized information on automatic updated profiles through search engine. Maximized result is fetched, that is, filtered out with respect to user behavior profiles. The switching performs accurate filtering updated profiles using regime switching. The updating in profile change (i.e., switches regime in HFRS-UQP framework identifies the second- and higher-order association of query result on the updated profiles. Experiment is conducted on factors such as personalized information search retrieval rate, filtering efficiency, and precision ratio.

  7. Index Compression and Efficient Query Processing in Large Web Search Engines

    Science.gov (United States)

    Ding, Shuai

    2013-01-01

    The inverted index is the main data structure used by all the major search engines. Search engines build an inverted index on their collection to speed up query processing. As the size of the web grows, the length of the inverted list structures, which can easily grow to hundreds of MBs or even GBs for common terms (roughly linear in the size of…

  8. haploR: an R package for querying web-based annotation tools.

    Science.gov (United States)

    Zhbannikov, Ilya Y; Arbeev, Konstantin; Ukraintseva, Svetlana; Yashin, Anatoliy I

    2017-01-01

    We developed haploR , an R package for querying web based genome annotation tools HaploReg and RegulomeDB. haploR gathers information in a data frame which is suitable for downstream bioinformatic analyses. This will facilitate post-genome wide association studies streamline analysis for rapid discovery and interpretation of genetic associations.

  9. Exploration of Web Users' Search Interests through Automatic Subject Categorization of Query Terms.

    Science.gov (United States)

    Pu, Hsiao-tieh; Yang, Chyan; Chuang, Shui-Lung

    2001-01-01

    Proposes a mechanism that carefully integrates human and machine efforts to explore Web users' search interests. The approach consists of a four-step process: extraction of core terms; construction of subject taxonomy; automatic subject categorization of query terms; and observation of users' search interests. Research findings are proved valuable…

  10. Spatial Search Techniques for Mobile 3D Queries in Sensor Web Environments

    Directory of Open Access Journals (Sweden)

    James D. Carswell

    2013-03-01

    Full Text Available Developing mobile geo-information systems for sensor web applications involves technologies that can access linked geographical and semantically related Internet information. Additionally, in tomorrow’s Web 4.0 world, it is envisioned that trillions of inexpensive micro-sensors placed throughout the environment will also become available for discovery based on their unique geo-referenced IP address. Exploring these enormous volumes of disparate heterogeneous data on today’s location and orientation aware smartphones requires context-aware smart applications and services that can deal with “information overload”. 3DQ (Three Dimensional Query is our novel mobile spatial interaction (MSI prototype that acts as a next-generation base for human interaction within such geospatial sensor web environments/urban landscapes. It filters information using “Hidden Query Removal” functionality that intelligently refines the search space by calculating the geometry of a three dimensional visibility shape (Vista space at a user’s current location. This 3D shape then becomes the query “window” in a spatial database for retrieving information on only those objects visible within a user’s actual 3D field-of-view. 3DQ reduces information overload and serves to heighten situation awareness on constrained commercial off-the-shelf devices by providing visibility space searching as a mobile web service. The effects of variations in mobile spatial search techniques in terms of query speed vs. accuracy are evaluated and presented in this paper.

  11. Manually Classifying User Search Queries on an Academic Library Web Site

    Science.gov (United States)

    Chapman, Suzanne; Desai, Shevon; Hagedorn, Kat; Varnum, Ken; Mishra, Sonali; Piacentine, Julie

    2013-01-01

    The University of Michigan Library wanted to learn more about the kinds of searches its users were conducting through the "one search" search box on the Library Web site. Library staff conducted two investigations. A preliminary investigation in 2011 involved the manual review of the 100 most frequently occurring queries conducted…

  12. Utility of Web search query data in testing theoretical assumptions about mephedrone.

    Science.gov (United States)

    Kapitány-Fövény, Máté; Demetrovics, Zsolt

    2017-05-01

    With growing access to the Internet, people who use drugs and traffickers started to obtain information about novel psychoactive substances (NPS) via online platforms. This paper aims to analyze whether a decreasing Web interest in formerly banned substances-cocaine, heroin, and MDMA-and the legislative status of mephedrone predict Web interest about this NPS. Google Trends was used to measure changes of Web interest on cocaine, heroin, MDMA, and mephedrone. Google search results for mephedrone within the same time frame were analyzed and categorized. Web interest about classic drugs found to be more persistent. Regarding geographical distribution, location of Web searches for heroin and cocaine was less centralized. Illicit status of mephedrone was a negative predictor of its Web search query rates. The connection between mephedrone-related Web search rates and legislative status of this substance was significantly mediated by ecstasy-related Web search queries, the number of documentaries, and forum/blog entries about mephedrone. The results might provide support for the hypothesis that mephedrone's popularity was highly correlated with its legal status as well as it functioned as a potential substitute for MDMA. Google Trends was found to be a useful tool for testing theoretical assumptions about NPS. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Federal health web sites: current & future roles.

    Science.gov (United States)

    Cronin, Carol

    2002-09-01

    An examination of the current and possible future roles of federal health Web sites, this paper provides an overview of site categories, functions, target audiences, marketing approaches, knowledge management, and evaluation strategies. It concludes with a look at future opportunities and challenges for the federal government in providing health information online.

  14. High-performance web services for querying gene and variant annotation.

    Science.gov (United States)

    Xin, Jiwen; Mark, Adam; Afrasiabi, Cyrus; Tsueng, Ginger; Juchler, Moritz; Gopal, Nikhil; Stupp, Gregory S; Putman, Timothy E; Ainscough, Benjamin J; Griffith, Obi L; Torkamani, Ali; Whetzel, Patricia L; Mungall, Christopher J; Mooney, Sean D; Su, Andrew I; Wu, Chunlei

    2016-05-06

    Efficient tools for data management and integration are essential for many aspects of high-throughput biology. In particular, annotations of genes and human genetic variants are commonly used but highly fragmented across many resources. Here, we describe MyGene.info and MyVariant.info, high-performance web services for querying gene and variant annotation information. These web services are currently accessed more than three million times permonth. They also demonstrate a generalizable cloud-based model for organizing and querying biological annotation information. MyGene.info and MyVariant.info are provided as high-performance web services, accessible at http://mygene.info and http://myvariant.info . Both are offered free of charge to the research community.

  15. Profile-IQ: Web-based data query system for local health department infrastructure and activities.

    Science.gov (United States)

    Shah, Gulzar H; Leep, Carolyn J; Alexander, Dayna

    2014-01-01

    To demonstrate the use of National Association of County & City Health Officials' Profile-IQ, a Web-based data query system, and how policy makers, researchers, the general public, and public health professionals can use the system to generate descriptive statistics on local health departments. This article is a descriptive account of an important health informatics tool based on information from the project charter for Profile-IQ and the authors' experience and knowledge in design and use of this query system. Profile-IQ is a Web-based data query system that is based on open-source software: MySQL 5.5, Google Web Toolkit 2.2.0, Apache Commons Math library, Google Chart API, and Tomcat 6.0 Web server deployed on an Amazon EC2 server. It supports dynamic queries of National Profile of Local Health Departments data on local health department finances, workforce, and activities. Profile-IQ's customizable queries provide a variety of statistics not available in published reports and support the growing information needs of users who do not wish to work directly with data files for lack of staff skills or time, or to avoid a data use agreement. Profile-IQ also meets the growing demand of public health practitioners and policy makers for data to support quality improvement, community health assessment, and other processes associated with voluntary public health accreditation. It represents a step forward in the recent health informatics movement of data liberation and use of open source information technology solutions to promote public health.

  16. Web Exploration Tools for a Fast Federated Optical Survey Database

    Science.gov (United States)

    Humphreys, Roberta M.

    2000-01-01

    We implemented several new web-based tools to improve the efficiency and versatility of access to the APS Catalog of the POSS I (Palomar Observatory-National Geographic Sky Survey) and its associated image database. The most important addition was a federated database system to link the APS Catalog and image database into one Internet-accessible database. With the FDBS, the queries and transactions on the integrated database are performed as if it were a single database. We installed Myriad the FDBS developed by Professor Jaideep Srivastava and members of his group in the University of Minnesota Computer Science Department. It is the first system to provide schema integration, query processing and optimization, and transaction management capabilities in a single framework. The attached figure illustrates the Myriad architecture. The FDBS permits horizontal access to the data, not just vertical. For example, for the APS, queries can be made not only by sky position, but also by any parameter present in either of the databases. APS users will be able to produce an image of all the blue galaxies and stellar sources for comparison with x-ray source error ellipses from AXAF (X Ray Astrophysics Facility) (Chandra) for example. The FDBS is now available as a beta release with the appropriate query forms at our web site. While much of our time was occupied with adapting Myriad to the APS environment, we also made major changes in Star Base, our DBMS for the Catalog, at the web interface to improve its efficiency for issuing and processing queries. Star Base is now three times faster for large queries. Improvements were also made at the web end of the image database for faster access; although work still needs to be done to the image database itself for more efficient return with the FDBS. During the past few years, we made several improvements to the database pipeline that creates the individual plate databases queries by StarBase. The changes include improved positions

  17. A Topological Framework for Interactive Queries on 3D Models in the Web

    Directory of Open Access Journals (Sweden)

    Mauro Figueiredo

    2014-01-01

    Full Text Available Several technologies exist to create 3D content for the web. With X3D, WebGL, and X3DOM, it is possible to visualize and interact with 3D models in a web browser. Frequently, three-dimensional objects are stored using the X3D file format for the web. However, there is no explicit topological information, which makes it difficult to design fast algorithms for applications that require adjacency and incidence data. This paper presents a new open source toolkit TopTri (Topological model for Triangle meshes for Web3D servers that builds the topological model for triangular meshes of manifold or nonmanifold models. Web3D client applications using this toolkit make queries to the web server to get adjacent and incidence information of vertices, edges, and faces. This paper shows the application of the topological information to get minimal local points and iso-lines in a 3D mesh in a web browser. As an application, we present also the interactive identification of stalactites in a cave chamber in a 3D web browser. Several tests show that even for large triangular meshes with millions of triangles, the adjacency and incidence information is returned in real time making the presented toolkit appropriate for interactive Web3D applications.

  18. A Topological Framework for Interactive Queries on 3D Models in the Web

    Science.gov (United States)

    Figueiredo, Mauro; Rodrigues, José I.; Silvestre, Ivo; Veiga-Pires, Cristina

    2014-01-01

    Several technologies exist to create 3D content for the web. With X3D, WebGL, and X3DOM, it is possible to visualize and interact with 3D models in a web browser. Frequently, three-dimensional objects are stored using the X3D file format for the web. However, there is no explicit topological information, which makes it difficult to design fast algorithms for applications that require adjacency and incidence data. This paper presents a new open source toolkit TopTri (Topological model for Triangle meshes) for Web3D servers that builds the topological model for triangular meshes of manifold or nonmanifold models. Web3D client applications using this toolkit make queries to the web server to get adjacent and incidence information of vertices, edges, and faces. This paper shows the application of the topological information to get minimal local points and iso-lines in a 3D mesh in a web browser. As an application, we present also the interactive identification of stalactites in a cave chamber in a 3D web browser. Several tests show that even for large triangular meshes with millions of triangles, the adjacency and incidence information is returned in real time making the presented toolkit appropriate for interactive Web3D applications. PMID:24977236

  19. Mining Genotype-Phenotype Associations from Public Knowledge Sources via Semantic Web Querying.

    Science.gov (United States)

    Kiefer, Richard C; Freimuth, Robert R; Chute, Christopher G; Pathak, Jyotishman

    2013-01-01

    Gene Wiki Plus (GeneWiki+) and the Online Mendelian Inheritance in Man (OMIM) are publicly available resources for sharing information about disease-gene and gene-SNP associations in humans. While immensely useful to the scientific community, both resources are manually curated, thereby making the data entry and publication process time-consuming, and to some degree, error-prone. To this end, this study investigates Semantic Web technologies to validate existing and potentially discover new genotype-phenotype associations in GWP and OMIM. In particular, we demonstrate the applicability of SPARQL queries for identifying associations not explicitly stated for commonly occurring chronic diseases in GWP and OMIM, and report our preliminary findings for coverage, completeness, and validity of the associations. Our results highlight the benefits of Semantic Web querying technology to validate existing disease-gene associations as well as identify novel associations although further evaluation and analysis is required before such information can be applied and used effectively.

  20. QueryArch3D: Querying and Visualising 3D Models of a Maya Archaeological Site in a Web-Based Interface

    Directory of Open Access Journals (Sweden)

    Giorgio Agugiaro

    2011-12-01

    Full Text Available Constant improvements in the field of surveying, computing and distribution of digital-content are reshaping the way Cultural Heritage can be digitised and virtually accessed, even remotely via web. A traditional 2D approach for data access, exploration, retrieval and exploration may generally suffice, however more complex analyses concerning spatial and temporal features require 3D tools, which, in some cases, have not yet been implemented or are not yet generally commercially available. Efficient organisation and integration strategies applicable to the wide array of heterogeneous data in the field of Cultural Heritage represent a hot research topic nowadays. This article presents a visualisation and query tool (QueryArch3D conceived to deal with multi-resolution 3D models. Geometric data are organised in successive levels of detail (LoD, provided with geometric and semantic hierarchies and enriched with attributes coming from external data sources. The visualisation and query front-end enables the 3D navigation of the models in a virtual environment, as well as the interaction with the objects by means of queries based on attributes or on geometries. The tool can be used as a standalone application, or served through the web. The characteristics of the research work, along with some implementation issues and the developed QueryArch3D tool will be discussed and presented.

  1. Domainwise Web Page Optimization Based On Clustered Query Sessions Using Hybrid Of Trust And ACO For Effective Information Retrieval

    Directory of Open Access Journals (Sweden)

    Dr. Suruchi Chawla

    2015-08-01

    Full Text Available Abstract In this paper hybrid of Ant Colony OptimizationACO and trust has been used for domainwise web page optimization in clustered query sessions for effective Information retrieval. The trust of the web page identifies its degree of relevance in satisfying specific information need of the user. The trusted web pages when optimized using pheromone updates in ACO will identify the trusted colonies of web pages which will be relevant to users information need in a given domain. Hence in this paper the hybrid of Trust and ACO has been used on clustered query sessions for identifying more and more relevant number of documents in a given domain in order to better satisfy the information need of the user. Experiment was conducted on the data set of web query sessions to test the effectiveness of the proposed approach in selected three domains Academics Entertainment and Sports and the results confirm the improvement in the precision of search results.

  2. Adding Conflict Resolution Features to a Query Language for Database Federations

    Directory of Open Access Journals (Sweden)

    Kai-Uwe Sattler

    2000-11-01

    Full Text Available A main problem of data integration is the treatment of conflicts caused by different modeling of realworld entities, different data models or simply by different representations of one and the same object. During the integration phase these conflicts have to be identified and resolved as part of the mapping between local and global schemata. Therefore, conflict resolution affects the definition of the integrated view as well as query transformation and evaluation, in this paper we present a SQL extension for defining and querying database federations. This language addresses in particular the resolution of integration conflicts by providing mechanisms for mapping attributes, restructuring relations as well as extended integration operations. Finally, the application of these resolution strategies is briefly explained by presenting a simple conflict resolution method.

  3. A web-based data-querying tool based on ontology-driven methodology and flowchart-based model.

    Science.gov (United States)

    Ping, Xiao-Ou; Chung, Yufang; Tseng, Yi-Ju; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-10-08

    Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (ie, "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting

  4. A Taxonomic Search Engine: Federating taxonomic databases using web services

    Directory of Open Access Journals (Sweden)

    Page Roderic DM

    2005-03-01

    Full Text Available Abstract Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata for each name. Conclusion The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  5. A Taxonomic Search Engine: federating taxonomic databases using web services.

    Science.gov (United States)

    Page, Roderic D M

    2005-03-09

    The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.

  6. The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories.

    Science.gov (United States)

    Weber, Griffin M; Murphy, Shawn N; McMurry, Andrew J; Macfadden, Douglas; Nigrin, Daniel J; Churchill, Susanne; Kohane, Isaac S

    2009-01-01

    The authors developed a prototype Shared Health Research Information Network (SHRINE) to identify the technical, regulatory, and political challenges of creating a federated query tool for clinical data repositories. Separate Institutional Review Boards (IRBs) at Harvard's three largest affiliated health centers approved use of their data, and the Harvard Medical School IRB approved building a Query Aggregator Interface that can simultaneously send queries to each hospital and display aggregate counts of the number of matching patients. Our experience creating three local repositories using the open source Informatics for Integrating Biology and the Bedside (i2b2) platform can be used as a road map for other institutions. The authors are actively working with the IRBs and regulatory groups to develop procedures that will ultimately allow investigators to obtain identified patient data and biomaterials through SHRINE. This will guide us in creating a future technical architecture that is scalable to a national level, compliant with ethical guidelines, and protective of the interests of the participating hospitals.

  7. Mining Genotype-Phenotype Associations from Public Knowledge Sources via Semantic Web Querying

    Science.gov (United States)

    Kiefer, Richard C.; Freimuth, Robert R.; Chute, Christopher G; Pathak, Jyotishman

    Gene Wiki Plus (GeneWiki+) and the Online Mendelian Inheritance in Man (OMIM) are publicly available resources for sharing information about disease-gene and gene-SNP associations in humans. While immensely useful to the scientific community, both resources are manually curated, thereby making the data entry and publication process time-consuming, and to some degree, error-prone. To this end, this study investigates Semantic Web technologies to validate existing and potentially discover new genotype-phenotype associations in GWP and OMIM. In particular, we demonstrate the applicability of SPARQL queries for identifying associations not explicitly stated for commonly occurring chronic diseases in GWP and OMIM, and report our preliminary findings for coverage, completeness, and validity of the associations. Our results highlight the benefits of Semantic Web querying technology to validate existing disease-gene associations as well as identify novel associations although further evaluation and analysis is required before such information can be applied and used effectively. PMID:24303249

  8. Age-related differences in the accuracy of web query-based predictions of influenza-like illness.

    Directory of Open Access Journals (Sweden)

    Alexander Domnich

    Full Text Available Web queries are now widely used for modeling, nowcasting and forecasting influenza-like illness (ILI. However, given that ILI attack rates vary significantly across ages, in terms of both magnitude and timing, little is known about whether the association between ILI morbidity and ILI-related queries is comparable across different age-groups. The present study aimed to investigate features of the association between ILI morbidity and ILI-related query volume from the perspective of age.Since Google Flu Trends is unavailable in Italy, Google Trends was used to identify entry terms that correlated highly with official ILI surveillance data. All-age and age-class-specific modeling was performed by means of linear models with generalized least-square estimation. Hold-out validation was used to quantify prediction accuracy. For purposes of comparison, predictions generated by exponential smoothing were computed.Five search terms showed high correlation coefficients of > .6. In comparison with exponential smoothing, the all-age query-based model correctly predicted the peak time and yielded a higher correlation coefficient with observed ILI morbidity (.978 vs. .929. However, query-based prediction of ILI morbidity was associated with a greater error. Age-class-specific query-based models varied significantly in terms of prediction accuracy. In the 0-4 and 25-44-year age-groups, these did well and outperformed exponential smoothing predictions; in the 15-24 and ≥ 65-year age-classes, however, the query-based models were inaccurate and highly overestimated peak height. In all but one age-class, peak timing predicted by the query-based models coincided with observed timing.The accuracy of web query-based models in predicting ILI morbidity rates could differ among ages. Greater age-specific detail may be useful in flu query-based studies in order to account for age-specific features of the epidemiology of ILI.

  9. Resource Selection for Federated Search on the Web

    NARCIS (Netherlands)

    Nguyen, Dong-Phuong; Demeester, Thomas; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    A publicly available dataset for federated search reflecting a real web environment has long been bsent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web

  10. Overview of the TREC 2013 Federated Web Search Track

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb

  11. Overview of the TREC 2014 Federated Web Search Track

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging challenges are again included in

  12. FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search

    NARCIS (Netherlands)

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Zhou, Ke; Nguyen, Dong-Phuong; Hiemstra, Djoerd

    This paper presents 'FedWeb Greatest Hits', a large new test collection for research in web information retrieval. As a combination and extension of the datasets used in the TREC Federated Web Search Track, this collection opens up new research possibilities on federated web search challenges, as

  13. Overview of the TREC 2014 Federated Web Search Track

    OpenAIRE

    Demeester, Thomas; Trieschnigg, Rudolf Berend; Nguyen, Dong-Phuong; Zhou, Ke; Hiemstra, Djoerd

    2014-01-01

    The TREC Federated Web Search track facilitates research in topics related to federated web search, by providing a large realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging challenges are again included in FedWeb 2014, and we additionally introduced the task of vertical selection. Other new aspects are the required link between the Resource Selection and Results Merging, and the importance of diversi...

  14. Overview of the TREC 2013 federated web search track

    OpenAIRE

    Demeester, Thomas; Trieschnigg, D; Nguyen, D; Hiemstra, D

    2013-01-01

    The TREC Federated Web Search track is intended to promote research related to federated search in a realistic web setting, and hereto provides a large data collection gathered from a series of online search engines. This overview paper discusses the results of the first edition of the track, FedWeb 2013. The focus was on basic challenges in federated search: (1) resource selection, and (2) results merging. After an overview of the provided data collection and the relevance judgments for the ...

  15. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.

    Directory of Open Access Journals (Sweden)

    Asa Thibodeau

    2016-06-01

    Full Text Available Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1 building and visualizing chromatin interaction networks, 2 annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3 querying network components based on gene name or chromosome location, and 4 utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions.QuIN's web server is available at http://quin.jax.org QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database and the source code is available under the GPLV3 license available on GitHub: https://github.com/UcarLab/QuIN/.

  16. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.

    Science.gov (United States)

    Thibodeau, Asa; Márquez, Eladio J; Luo, Oscar; Ruan, Yijun; Menghi, Francesca; Shin, Dong-Guk; Stitzel, Michael L; Vera-Licona, Paola; Ucar, Duygu

    2016-06-01

    Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. QuIN's web server is available at http://quin.jax.org QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database and the source code is available under the GPLV3 license available on GitHub: https://github.com/UcarLab/QuIN/.

  17. A Web 2.0 Application for Executing Queries and Services on Climatic Data

    Science.gov (United States)

    Abad-Mota, S.; Ruckhaus, E.; Garboza, A.; Tepedino, G.

    2007-12-01

    aggregation, hourly, daily, monthly, so that they can be provided to the user at the desired level. This means that additional caution has to be exercised in query answering, in order to distinguish between primary and derived data. On the other hand, a Web 2.0 application is being designed to provide a front-end to the repository. This design focuses on two important aspects: the use of metadata structures, and the definition of collaborative Web 2.0 features that can be integrated to a project of this nature. Metadata descriptors include for a set of measurements, its quality, granularity and other dimension information. With these descriptors it is possible to establish relationships between different sets of measurements and provide scientists with efficient searching mechanisms that determine the related sets of measurements that contribute to a query answer. Unlike traditional applications for climatic data, our approach not only satisfies requirements of researchers specialized in this domain, but also those of anyone interested in this area; one of the objectives is to build an informal knowledge base that can be improved and consolidated with the usage of the system.

  18. Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of Cerebrotendinous xanthomatosis.

    Science.gov (United States)

    Taboada, María; Martínez, Diego; Pilo, Belén; Jiménez-Escrig, Adriano; Robinson, Peter N; Sobrido, María J

    2012-07-31

    Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction. Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Ontology Web Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain named cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SWQRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantic of the framework adapts the standard logical model of an open world assumption. This work demonstrates how semantic web technologies can be used to support flexible representation and computational inference mechanisms

  19. The iMars WebGIS - Spatio-Temporal Data Queries and Single Image Map Web Services

    Science.gov (United States)

    Walter, Sebastian; Steikert, Ralf; Schreiner, Bjoern; Muller, Jan-Peter; van Gasselt, Stephan; Sidiropoulos, Panagiotis; Lanz-Kroechert, Julia

    2017-04-01

    Introduction: Web-based planetary image dissemination platforms usually show outline coverages of the data and offer querying for metadata as well as preview and download, e.g. the HRSC Mapserver (Walter & van Gasselt, 2014). Here we introduce a new approach for a system dedicated to change detection by simultanous visualisation of single-image time series in a multi-temporal context. While the usual form of presenting multi-orbit datasets is the merge of the data into a larger mosaic, we want to stay with the single image as an important snapshot of the planetary surface at a specific time. In the context of the EU FP-7 iMars project we process and ingest vast amounts of automatically co-registered (ACRO) images. The base of the co-registration are the high precision HRSC multi-orbit quadrangle image mosaics, which are based on bundle-block-adjusted multi-orbit HRSC DTMs. Additionally we make use of the existing bundle-adjusted HRSC single images available at the PDS archives. A prototype demonstrating the presented features is available at http://imars.planet.fu-berlin.de. Multi-temporal database: In order to locate multiple coverage of images and select images based on spatio-temporal queries, we converge available coverage catalogs for various NASA imaging missions into a relational database management system with geometry support. We harvest available metadata entries during our processing pipeline using the Integrated Software for Imagers and Spectrometers (ISIS) software. Currently, this database contains image outlines from the MGS/MOC, MRO/CTX and the MO/THEMIS instruments with imaging dates ranging from 1996 to the present. For the MEx/HRSC data, we already maintain a database which we automatically update with custom software based on the VICAR environment. Web Map Service with time support: The MapServer software is connected to the database and provides Web Map Services (WMS) with time support based on the START_TIME image attribute. It allows temporal

  20. News trends and web search query of HIV/AIDS in Hong Kong

    Science.gov (United States)

    Chiu, Alice P. Y.; Lin, Qianying

    2017-01-01

    Background The HIV epidemic in Hong Kong has worsened in recent years, with major contributions from high-risk subgroup of men who have sex with men (MSM). Internet use is prevalent among the majority of the local population, where they sought health information online. This study examines the impacts of HIV/AIDS and MSM news coverage on web search query in Hong Kong. Methods Relevant news coverage about HIV/AIDS and MSM from January 1st, 2004 to December 31st, 2014 was obtained from the WiseNews databse. News trends were created by computing the number of relevant articles by type, topic, place of origin and sub-populations. We then obtained relevant search volumes from Google and analysed causality between news trends and Google Trends using Granger Causality test and orthogonal impulse function. Results We found that editorial news has an impact on “HIV” Google searches on HIV, with the search term popularity peaking at an average of two weeks after the news are published. Similarly, editorial news has an impact on the frequency of “AIDS” searches two weeks after. MSM-related news trends have a more fluctuating impact on “MSM” Google searches, although the time lag varies anywhere from one week later to ten weeks later. Conclusions This infodemiological study shows that there is a positive impact of news trends on the online search behavior of HIV/AIDS or MSM-related issues for up to ten weeks after. Health promotional professionals could make use of this brief time window to tailor the timing of HIV awareness campaigns and public health interventions to maximise its reach and effectiveness. PMID:28922376

  1. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases.

    Science.gov (United States)

    Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel

    2013-04-15

    In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.

  2. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases

    Science.gov (United States)

    2013-01-01

    Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic. PMID:23586394

  3. UMass at TREC WEB 2014: Entity Query Feature Expansion using Knowledge Base Links

    Science.gov (United States)

    2014-11-01

    task on the category A subset and demonstrate the benefit of entity-centric approaches even for non-entity queries like “dark chocolate health benefits ...category A subset and demonstrate the benefit of entity-centric approaches even for non-entity queries like ???dark chocolate health benefits ???. 15...rogers 289 benefits of yoga Model MAP ERR@20 NDCG@20 SDM 4.18 9.15 12.61 WikiRM1 4.00 9.31 12.80 SDM-RM3 3.53 7.61 11.00 EQFE 4.67 10.00 14.61 (a) Results

  4. Robust Query Processing for Personalized Information Access on the Semantic Web

    DEFF Research Database (Denmark)

    Dolog, Peter; Stuckenschmidt, Heiner; Wache, Holger

    and user preferences. We describe a framework for information access that combines query refinement and relaxation in order to provide robust, personalized access to heterogeneous RDF data as well as an implementation in terms of rewriting rules and explain its application in the context of e-learning...

  5. PatternQuery: web application for fast detection of biomacromolecular structural patterns in the entire Protein Data Bank.

    Science.gov (United States)

    Sehnal, David; Pravda, Lukáš; Svobodová Vařeková, Radka; Ionescu, Crina-Maria; Koča, Jaroslav

    2015-07-01

    Well defined biomacromolecular patterns such as binding sites, catalytic sites, specific protein or nucleic acid sequences, etc. precisely modulate many important biological phenomena. We introduce PatternQuery, a web-based application designed for detection and fast extraction of such patterns. The application uses a unique query language with Python-like syntax to define the patterns that will be extracted from datasets provided by the user, or from the entire Protein Data Bank (PDB). Moreover, the database-wide search can be restricted using a variety of criteria, such as PDB ID, resolution, and organism of origin, to provide only relevant data. The extraction generally takes a few seconds for several hundreds of entries, up to approximately one hour for the whole PDB. The detected patterns are made available for download to enable further processing, as well as presented in a clear tabular and graphical form directly in the browser. The unique design of the language and the provided service could pave the way towards novel PDB-wide analyses, which were either difficult or unfeasible in the past. The application is available free of charge at http://ncbr.muni.cz/PatternQuery. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Exploring U.S Cropland - A Web Service based Cropland Data Layer Visualization, Dissemination and Querying System (Invited)

    Science.gov (United States)

    Yang, Z.; Han, W.; di, L.

    2010-12-01

    The National Agricultural Statistics Service (NASS) of the USDA produces the Cropland Data Layer (CDL) product, which is a raster-formatted, geo-referenced, U.S. crop specific land cover classification. These digital data layers are widely used for a variety of applications by universities, research institutions, government agencies, and private industry in climate change studies, environmental ecosystem studies, bioenergy production & transportation planning, environmental health research and agricultural production decision making. The CDL is also used internally by NASS for crop acreage and yield estimation. Like most geospatial data products, the CDL product is only available by CD/DVD delivery or online bulk file downloading via the National Research Conservation Research (NRCS) Geospatial Data Gateway (external users) or in a printed paper map format. There is no online geospatial information access and dissemination, no crop visualization & browsing, no geospatial query capability, nor online analytics. To facilitate the application of this data layer and to help disseminating the data, a web-service based CDL interactive map visualization, dissemination, querying system is proposed. It uses Web service based service oriented architecture, adopts open standard geospatial information science technology and OGC specifications and standards, and re-uses functions/algorithms from GeoBrain Technology (George Mason University developed). This system provides capabilities of on-line geospatial crop information access, query and on-line analytics via interactive maps. It disseminates all data to the decision makers and users via real time retrieval, processing and publishing over the web through standards-based geospatial web services. A CDL region of interest can also be exported directly to Google Earth for mashup or downloaded for use with other desktop application. This web service based system greatly improves equal-accessibility, interoperability, usability

  7. Research on Web Search Behavior: How Online Query Data Inform Social Psychology.

    Science.gov (United States)

    Lai, Kaisheng; Lee, Yan Xin; Chen, Hao; Yu, Rongjun

    2017-10-01

    The widespread use of web searches in daily life has allowed researchers to study people's online social and psychological behavior. Using web search data has advantages in terms of data objectivity, ecological validity, temporal resolution, and unique application value. This review integrates existing studies on web search data that have explored topics including sexual behavior, suicidal behavior, mental health, social prejudice, social inequality, public responses to policies, and other psychosocial issues. These studies are categorized as descriptive, correlational, inferential, predictive, and policy evaluation research. The integration of theory-based hypothesis testing in future web search research will result in even stronger contributions to social psychology.

  8. Categorical and Specificity Differences between User-Supplied Tags and Search Query Terms for Images. An Analysis of "Flickr" Tags and Web Image Search Queries

    Science.gov (United States)

    Chung, EunKyung; Yoon, JungWon

    2009-01-01

    Introduction: The purpose of this study is to compare characteristics and features of user supplied tags and search query terms for images on the "Flickr" Website in terms of categories of pictorial meanings and level of term specificity. Method: This study focuses on comparisons between tags and search queries using Shatford's categorization…

  9. Learning semantic query suggestions

    NARCIS (Netherlands)

    Meij, E.; Bron, M.; Hollink, L.; Huurnink, B.; de Rijke, M.

    2009-01-01

    An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query and most popular search engines provide

  10. NoSQL: collection document and cloud by using a dynamic web query form

    Science.gov (United States)

    Abdalla, Hemn B.; Lin, Jinzhao; Li, Guoquan

    2015-07-01

    Mongo-DB (from "humongous") is an open-source document database and the leading NoSQL database. A NoSQL (Not Only SQL, next generation databases, being non-relational, deal, open-source and horizontally scalable) presenting a mechanism for storage and retrieval of documents. Previously, we stored and retrieved the data using the SQL queries. Here, we use the MonogoDB that means we are not utilizing the MySQL and SQL queries. Directly importing the documents into our Drives, retrieving the documents on that drive by not applying the SQL queries, using the IO BufferReader and Writer, BufferReader for importing our type of document files to my folder (Drive). For retrieving the document files, the usage is BufferWriter from the particular folder (or) Drive. In this sense, providing the security for those storing files for what purpose means if we store the documents in our local folder means all or views that file and modified that file. So preventing that file, we are furnishing the security. The original document files will be changed to another format like in this paper; Binary format is used. Our documents will be converting to the binary format after that direct storing in one of our folder, that time the storage space will provide the private key for accessing that file. Wherever any user tries to discover the Document files means that file data are in the binary format, the document's file owner simply views that original format using that personal key from receive the secret key from the cloud.

  11. 基于HTML5+jQuery Mobile的移动Web应用开发研究%Study of mobile web application development based on HTML5 and jQuery Mobile

    Institute of Scientific and Technical Information of China (English)

    覃凤萍

    2015-01-01

    With the rapidly growing popularity of smart devices such as iphone and Android,mobile web technology has gradually become a new hot spot of concern,traditional site will be transferred to the mobile terminal due to market demand . Using jQuery Mobile and HTML5 to do mobile web application development, with the development of simple, short release cycle, cross-platform, cross-platform advantages . In this paper, jQuery Mobile and HTML5 mobile web application development made a presentation and analysis.%随着iphone、Android等智能设备的迅速普及,移动Web技术逐渐成为关注的新热点,传统信息类和电子商务网站因市场需求向移动终端转移。使用jQuery Mobile和HTML5做移动Web应用开发,具有开发简单,发布周期短、跨平台跨设备的优点。文章对jQuery Mobile和HTML5的移动Web应用开发做了介绍和分析。

  12. Snippet-based relevance predictions for federated web search

    NARCIS (Netherlands)

    Demeester, Thomas; Nguyen, Dong-Phuong; Trieschnigg, Rudolf Berend; Develder, Chris; Hiemstra, Djoerd

    How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and

  13. PhyloExplorer: a web server to validate, explore and query phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Auberval Nicolas

    2009-05-01

    Full Text Available Abstract Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: http://www.ncbi.orthomam.univ-montp2.fr/phyloexplorer/ and the source code can be downloaded from: http://code.google.com/p/taxomanie/.

  14. An informatics supported web-based data annotation and query tool to expedite translational research for head and neck malignancies

    International Nuclear Information System (INIS)

    Amin, Waqas; Kang, Hyunseok P; Egloff, Ann Marie; Singh, Harpreet; Trent, Kerry; Ridge-Hetrick, Jennifer; Seethala, Raja R; Grandis, Jennifer; Parwani, Anil V

    2009-01-01

    The Specialized Program of Research Excellence (SPORE) in Head and Neck Cancer neoplasm virtual biorepository is a bioinformatics-supported system to incorporate data from various clinical, pathological, and molecular systems into a single architecture based on a set of common data elements (CDEs) that provides semantic and syntactic interoperability of data sets. The various components of this annotation tool include the Development of Common Data Elements (CDEs) that are derived from College of American Pathologists (CAP) Checklist and North American Association of Central Cancer Registries (NAACR) standards. The Data Entry Tool is a portable and flexible Oracle-based data entry device, which is an easily mastered web-based tool. The Data Query Tool helps investigators and researchers to search de-identified information within the warehouse/resource through a 'point and click' interface, thus enabling only the selected data elements to be essentially copied into a data mart using a multi dimensional model from the warehouse's relational structure. The SPORE Head and Neck Neoplasm Database contains multimodal datasets that are accessible to investigators via an easy to use query tool. The database currently holds 6553 cases and 10607 tumor accessions. Among these, there are 965 metastatic, 4227 primary, 1369 recurrent, and 483 new primary cases. The data disclosure is strictly regulated by user's authorization. The SPORE Head and Neck Neoplasm Virtual Biorepository is a robust translational biomedical informatics tool that can facilitate basic science, clinical, and translational research. The Data Query Tool acts as a central source providing a mechanism for researchers to efficiently find clinically annotated datasets and biospecimens that are relevant to their research areas. The tool protects patient privacy by revealing only de-identified data in accordance with regulations and approvals of the IRB and scientific review committee

  15. Time-Series Adaptive Estimation of Vaccination Uptake Using Web Search Queries

    DEFF Research Database (Denmark)

    Dalum Hansen, Niels; Mølbak, Kåre; Cox, Ingemar J.

    2017-01-01

    Estimating vaccination uptake is an integral part of ensuring public health. It was recently shown that vaccination uptake can be estimated automatically from web data, instead of slowly collected clinical records or population surveys [2]. All prior work in this area assumes that features of vac...

  16. An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs.

    Directory of Open Access Journals (Sweden)

    Graham Cormode

    Full Text Available Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines, computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a, LSH methods and evaluate four variants in a distributed computing environment (specifically, Hadoop. We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate our variants of LSH achieve the robust performance with better recall compared with "vanilla" LSH, even when using the same amount of space.

  17. WebEQ: a web-GIS System to collect, display and query data for the management of the earthquake emergency in Central Italy

    Science.gov (United States)

    Carbone, Gianluca; Cosentino, Giuseppe; Pennica, Francesco; Moscatelli, Massimiliano; Stigliano, Francesco

    2017-04-01

    After the strong earthquakes that hit central Italy in recent months, the Center for Seismic Microzonation and its applications (CentroMS) was commissioned by the Italian Department of Civil Protection to conduct the study of seismic microzonation of the territories affected by the earthquake of August 24, 2016. As part of the activities of microzonation, IGAG CNR has created WebEQ, a management tool of the data that have been acquired by all participants (i.e., more than twenty research institutes and university departments). The data collection was organized and divided into sub-areas, assigned to working groups with multidisciplinary expertise in geology, geophysics and engineering. WebEQ is a web-GIS System that helps all the subjects involved in the data collection activities, through tools aimed at data uploading and validation, and with a simple GIS interface to display, query and download geographic data. WebEQ is contributing to the creation of a large database containing geographical data, both vector and raster, from various sources and types: - Regional Technical Map em Geological and geomorphological maps em Data location maps em Maps of microzones homogeneous in seismic perspective and seismic microzonation maps em National strong motion network location. Data loading is done through simple input masks that ensure consistency with the database structure, avoiding possible errors and helping users to interact with the map through user-friendly tools. All the data are thematized through standardized symbologies and colors (Gruppo di lavoro MS 2008), in order to allow the easy interpretation by all users. The data download tools allow data exchange between working groups and the scientific community to benefit from the activities. The seismic microzonation activities are still ongoing. WebEQ is enabling easy management of large amounts of data and will form a basis for the development of tools for the management of the upcoming seismic emergencies.

  18. User perspectives on query difficulty

    DEFF Research Database (Denmark)

    Lioma, Christina; Larsen, Birger; Schütze, Hinrich

    2011-01-01

    be difficult for the system to address? (2) Are users aware of specific features in their query (e.g., domain-specificity, vagueness) that may render their query difficult for an IR system to address? A study of 420 queries from a Web search engine query log that are pre-categorised as easy, medium, hard...

  19. Federated queries of clinical data repositories: Scaling to a national network.

    Science.gov (United States)

    Weber, Griffin M

    2015-06-01

    Federated networks of clinical research data repositories are rapidly growing in size from a handful of sites to true national networks with more than 100 hospitals. This study creates a conceptual framework for predicting how various properties of these systems will scale as they continue to expand. Starting with actual data from Harvard's four-site Shared Health Research Information Network (SHRINE), the framework is used to imagine a future 4000 site network, representing the majority of hospitals in the United States. From this it becomes clear that several common assumptions of small networks fail to scale to a national level, such as all sites being online at all times or containing data from the same date range. On the other hand, a large network enables researchers to select subsets of sites that are most appropriate for particular research questions. Developers of federated clinical data networks should be aware of how the properties of these networks change at different scales and design their software accordingly. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. CWI and TU Delft at TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks

    NARCIS (Netherlands)

    A. Bellogín Kouki (Alejandro); G.G. Gebremeskel (Gebre); J. He (Jiyin); J.J.P. Lin (Jimmy); A. Said (Alan); T. Samar (Thaer); A.P. de Vries (Arjen); J.B.P. Vuurens (Jeroen)

    2014-01-01

    htmlabstractThis paper provides an overview of the work done at the Centrum Wiskunde & Informatica (CWI) and Delft University of Technology (TU Delft) for different tracks of TREC 2013. We participated in the Contextual Suggestion Track, the Federated Web Search Track, the Knowledge Base

  1. Spatial Keyword Querying

    DEFF Research Database (Denmark)

    Cao, Xin; Chen, Lisi; Cong, Gao

    2012-01-01

    The web is increasingly being used by mobile users. In addition, it is increasingly becoming possible to accurately geo-position mobile users and web content. This development gives prominence to spatial web data management. Specifically, a spatial keyword query takes a user location and user-sup...... different kinds of functionality as well as the ideas underlying their definition....

  2. Query recommendation for children

    NARCIS (Netherlands)

    Duarte Torres, Sergio; Hiemstra, Djoerd; Weber, Ingmar; Serdyukov, Pavel

    2012-01-01

    One of the biggest problems that children experience while searching the web occurs during the query formulation process. Children have been found to struggle formulating queries based on keywords given their limited vocabulary and their difficulty to choose the right keywords. In this work we

  3. Collective spatial keyword querying

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2011-01-01

    With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the quer......With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However......, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group collectively satisfy a query. We define the problem of retrieving a group of spatial web objects such that the group's keywords cover the query......'s keywords and such that objects are nearest to the query location and have the lowest inter-object distances. Specifically, we study two variants of this problem, both of which are NP-complete. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems. We...

  4. Federated Search and the Library Web Site: A Study of Association of Research Libraries Member Web Sites

    Science.gov (United States)

    Williams, Sarah C.

    2010-01-01

    The purpose of this study was to investigate how federated search engines are incorporated into the Web sites of libraries in the Association of Research Libraries. In 2009, information was gathered for each library in the Association of Research Libraries with a federated search engine. This included the name of the federated search service and…

  5. Stakeholder Web-based Interrogable Federated Toolkit (SWIFT), Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The chief innovation is the development of a Predictive Query Language that populates databases with future information provided from aviation models, along with...

  6. External phenome analysis enables a rational federated query strategy to detect changing rates of treatment-related complications associated with multiple myeloma.

    Science.gov (United States)

    Warner, Jeremy L; Alterovitz, Gil; Bodio, Kelly; Joyce, Robin M

    2013-01-01

    Electronic health records (EHRs) are increasingly useful for health services research. For relatively uncommon conditions, such as multiple myeloma (MM) and its treatment-related complications, a combination of multiple EHR sources is essential for such research. The Shared Health Research Information Network (SHRINE) enables queries for aggregate results across participating institutions. Development of a rational search strategy in SHRINE may be augmented through analysis of pre-existing databases. We developed a SHRINE query for likely non-infectious treatment-related complications of MM, based upon an analysis of the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database. Using this query strategy, we found that the rate of likely treatment-related complications significantly increased from 2001 to 2007, by an average of 6% a year (p=0.01), across the participating SHRINE institutions. This finding is in keeping with increasingly aggressive strategies in the treatment of MM. This proof of concept demonstrates that a staged approach to federated queries, using external EHR data, can yield potentially clinically meaningful results.

  7. Query deforestation

    OpenAIRE

    Grust, Torsten; Scholl, Marc H.

    1998-01-01

    The construction of a declarative query engine for a DBMS includes the challenge of compiling algebraic queries into efficient execution plans that can be run on top of the persistent storage. This work pursues the goal of employing foldr-build deforestation for the derivation of efficient streaming programs - programs that do not allocate intermediate data structures to perform their task - from algebraic (combinator) query plans. The query engine is based on the insertion representation of ...

  8. Project Lefty: More Bang for the Search Query

    Science.gov (United States)

    Varnum, Ken

    2010-01-01

    This article describes the Project Lefty, a search system that, at a minimum, adds a layer on top of traditional federated search tools that will make the wait for results more worthwhile for researchers. At best, Project Lefty improves search queries and relevance rankings for web-scale discovery tools to make the results themselves more relevant…

  9. AthMethPre: a web server for the prediction and query of mRNA m6A sites in Arabidopsis thaliana.

    Science.gov (United States)

    Xiang, Shunian; Yan, Zhangming; Liu, Ke; Zhang, Yaou; Sun, Zhirong

    2016-10-18

    N 6 -Methyladenosine (m 6 A) is the most prevalent and abundant modification in mRNA that has been linked to many key biological processes. High-throughput experiments have generated m 6 A-peaks across the transcriptome of A. thaliana, but the specific methylated sites were not assigned, which impedes the understanding of m 6 A functions in plants. Therefore, computational prediction of mRNA m 6 A sites becomes emergently important. Here, we present a method to predict the m 6 A sites for A. thaliana mRNA sequence(s). To predict the m 6 A sites of an mRNA sequence, we employed the support vector machine to build a classifier using the features of the positional flanking nucleotide sequence and position-independent k-mer nucleotide spectrum. Our method achieved good performance and was applied to a web server to provide service for the prediction of A. thaliana m 6 A sites. The server also provides a comprehensive database of predicted transcriptome-wide m 6 A sites and curated m 6 A-seq peaks from the literature for query and visualization. The AthMethPre web server is the first web server that provides a user-friendly tool for the prediction and query of A. thaliana mRNA m 6 A sites, which is freely accessible for public use at .

  10. jQuery cookbook

    CERN Document Server

    2010-01-01

    jQuery simplifies building rich, interactive web frontends. Getting started with this JavaScript library is easy, but it can take years to fully realize its breadth and depth; this cookbook shortens the learning curve considerably. With these recipes, you'll learn patterns and practices from 19 leading developers who use jQuery for everything from integrating simple components into websites and applications to developing complex, high-performance user interfaces. Ideal for newcomers and JavaScript veterans alike, jQuery Cookbook starts with the basics and then moves to practical use cases w

  11. Learning jQuery

    CERN Document Server

    Chaffer, Jonathan

    2013-01-01

    Step through each of the core concepts of the jQuery library, building an overall picture of its capabilities. Once you have thoroughly covered the basics, the book returns to each concept to cover more advanced examples and techniques.This book is for web designers who want to create interactive elements for their designs, and for developers who want to create the best user interface for their web applications. Basic JavaScript programming and knowledge of HTML and CSS is required. No knowledge of jQuery is assumed, nor is experience with any other JavaScript libraries.

  12. EquiX-A Search and Query Language for XML.

    Science.gov (United States)

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  13. Superfund Query

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Superfund Query allows users to retrieve data from the Comprehensive Environmental Response, Compensation, and Liability Information System (CERCLIS) database.

  14. jQuery Mobile

    CERN Document Server

    Reid, Jon

    2011-01-01

    Native apps have distinct advantages, but the future belongs to mobile web apps that function on a broad range of smartphones and tablets. Get started with jQuery Mobile, the touch-optimized framework for creating apps that look and behave consistently across many devices. This concise book provides HTML5, CSS3, and JavaScript code examples, screen shots, and step-by-step guidance to help you build a complete working app with jQuery Mobile. If you're already familiar with the jQuery JavaScript library, you can use your existing skills to build cross-platform mobile web apps right now. This b

  15. 75 FR 57086 - Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site

    Science.gov (United States)

    2010-09-17

    ... OFFICE OF PERSONNEL MANAGEMENT Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site AGENCY: Office of Personnel Management. ACTION: 30-Day Notice and... National Science Foundation in accordance with [[Page 57087

  16. Approximating terminological queries

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Van Harmelen, Frank

    2002-01-01

    Current proposals for languages to encode terminological knowledge in intelligent systems support logical reasoning for answering user queries about objects and classes. An application of these languages on the World Wide Web, however, is hampered by the limitations of logical reasoning in terms

  17. Query responses

    Directory of Open Access Journals (Sweden)

    Paweł Łupkowski

    2017-05-01

    Full Text Available In this article we consider the phenomenon of answering a query with a query. Although such answers are common, no large scale, corpus-based characterization exists, with the exception of clarification requests. After briefly reviewing different theoretical approaches on this subject, we present a corpus study of query responses in the British National Corpus and develop a taxonomy for query responses. We point at a variety of response categories that have not been formalized in previous dialogue work, particularly those relevant to adversarial interaction. We show that different response categories have significantly different rates of subsequent answer provision. We provide a formal analysis of the response categories in the framework of KoS.

  18. ExpTreeDB: web-based query and visualization of manually annotated gene expression profiling experiments of human and mouse from GEO.

    Science.gov (United States)

    Ni, Ming; Ye, Fuqiang; Zhu, Juanjuan; Li, Zongwei; Yang, Shuai; Yang, Bite; Han, Lu; Wu, Yongge; Chen, Ying; Li, Fei; Wang, Shengqi; Bo, Xiaochen

    2014-12-01

    Numerous public microarray datasets are valuable resources for the scientific communities. Several online tools have made great steps to use these data by querying related datasets with users' own gene signatures or expression profiles. However, dataset annotation and result exhibition still need to be improved. ExpTreeDB is a database that allows for queries on human and mouse microarray experiments from Gene Expression Omnibus with gene signatures or profiles. Compared with similar applications, ExpTreeDB pays more attention to dataset annotations and result visualization. We introduced a multiple-level annotation system to depict and organize original experiments. For example, a tamoxifen-treated cell line experiment is hierarchically annotated as 'agent→drug→estrogen receptor antagonist→tamoxifen'. Consequently, retrieved results are exhibited by an interactive tree-structured graphics, which provide an overview for related experiments and might enlighten users on key items of interest. The database is freely available at http://biotech.bmi.ac.cn/ExpTreeDB. Web site is implemented in Perl, PHP, R, MySQL and Apache. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  19. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 10th International Conference on Flexible Query Answering Systems, FQAS 2013, held in Granada, Spain, in September 2013. The 59 full papers included in this volume were carefully reviewed and selected from numerous submissions. The papers...... are organized in a general session train and a parallel special session track. The general session train covers the following topics: querying-answering systems; semantic technology; patterns and classification; personalization and recommender systems; searching and ranking; and Web and human...

  20. Enhancing Recall in Semantic Querying

    DEFF Research Database (Denmark)

    Rouces, Jacobo

    2013-01-01

    lexically and structurally different, which we will introduce in the next section. As RDF graphs from different sources are expected to be linked, the modeling heterogeneities will make the federated graph become sparser and inconsistent. This is detrimental to the recall of SPARQL queries, as the query...

  1. SPARK: Adapting Keyword Query to Semantic Search

    Science.gov (United States)

    Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong

    Semantic search promises to provide more accurate result than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.

  2. SCRY: Enabling quantitative reasoning in SPARQL queries

    NARCIS (Netherlands)

    Meroño-Peñuela, A.; Stringer, Bas; Loizou, Antonis; Abeln, Sanne; Heringa, Jaap

    2015-01-01

    The inability to include quantitative reasoning in SPARQL queries slows down the application of Semantic Web technology in the life sciences. SCRY, our SPARQL compatible service layer, improves this by executing services at query time and making their outputs query-accessible, generating RDF data on

  3. Automatically Preparing Safe SQL Queries

    Science.gov (United States)

    Bisht, Prithvi; Sistla, A. Prasad; Venkatakrishnan, V. N.

    We present the first sound program source transformation approach for automatically transforming the code of a legacy web application to employ PREPARE statements in place of unsafe SQL queries. Our approach therefore opens the way for eradicating the SQL injection threat vector from legacy web applications.

  4. QUERY SUPPORT FOR GMZ

    Directory of Open Access Journals (Sweden)

    A. Khandelwal

    2017-07-01

    Full Text Available Generic text-based compression models are simple and fast but there are two issues that needs to be addressed. They cannot leverage the structure that exists in data to achieve better compression and there is an unnecessary decompression step before the user can actually use the data. To address these issues, we came up with GMZ, a lossless compression model aimed at achieving high compression ratios. The decision to design GMZ (Khandelwal and Rajan, 2017 exclusively for GML's Simple Features Profile (SFP seems fair because of the high use of SFP in WFS and that it facilitates high optimisation of the compression model. This is an extension of our work on GMZ. In a typical server-client model such as Web Feature Service, the server is the primary creator and provider of GML, and therefore, requires compression and query capabilities. On the other hand, the client is the primary consumer of GML, and therefore, requires decompression and visualisation capabilities. In the first part of our work, we demonstrated compression using a python script that can be plugged in a server architecture, and decompression and visualisation in a web browser using a Firefox addon. The focus of this work is to develop the already existing tools to provide query capability to server. Our model provides the ability to decompress individual features in isolation, which is an essential requirement for realising query in compressed state. We con - struct an R-Tree index for spatial data and a custom index for non-spatial data and store these in a separate index file to prevent alter - ing the compression model. This facilitates independent use of compressed GMZ file where index can be constructed when required. The focus of this work is the bounding-box or range query commonly used in webGIS with provision for other spatial and non-spatial queries. The decrement in compression ratios due to the new index file is in the range of 1–3 percent which is trivial considering

  5. The CMS DBS query language

    International Nuclear Information System (INIS)

    Kuznetsov, Valentin; Riley, Daniel; Afaq, Anzar; Sekhri, Vijay; Guo Yuyi; Lueking, Lee

    2010-01-01

    The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to underlying database. We will describe the design of the query system, provide details of the language components and overview of how this component fits into the overall data discovery system architecture.

  6. Efficacy of Web-Based Instruction to Provide Training on Federal Motor Carrier Safety Regulations

    Science.gov (United States)

    2011-05-01

    This report presents an evaluation of the current state-of-the-art Web-based instruction (WBI), reviews the current computer platforms of potential users of WBI, reviews the current status of WBI applications for Federal Motor Carrier Safety Administ...

  7. 75 FR 20400 - Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site

    Science.gov (United States)

    2010-04-19

    ... OFFICE OF PERSONNEL MANAGEMENT Submission for Review: Federal Cyber Service: Scholarship for Service (SFS) Registration Web Site AGENCY: U.S. Office of Personnel Management. ACTION: 60-Day Notice and... applicable supporting documentation, may be obtained by contacting the San Antonio Services Branch, Office of...

  8. Querying XML Data with SPARQL

    Science.gov (United States)

    Bikakis, Nikos; Gioldasis, Nektarios; Tsinaraki, Chrisa; Christodoulakis, Stavros

    SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries which are used to access the XML databases. We present the algorithms and the implementation of SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.

  9. jQuery Pocket Reference

    CERN Document Server

    Flanagan, David

    2010-01-01

    "As someone who uses jQuery on a regular basis, it was surprising to discover how much of the library I'm not using. This book is indispensable for anyone who is serious about using jQuery for non-trivial applications."-- Raffaele Cecco, longtime developer of video games, including Cybernoid, Exolon, and Stormlord jQuery is the "write less, do more" JavaScript library. Its powerful features and ease of use have made it the most popular client-side JavaScript framework for the Web. This book is jQuery's trusty companion: the definitive "read less, learn more" guide to the library. jQuery P

  10. Instant jQuery selectors

    CERN Document Server

    De Rosa, Aurelio

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. Instant jQuery Selectors follows a simple how-to format with recipes aimed at making you well versed with the wide range of selectors that jQuery has to offer through a myriad of examples.Instant jQuery Selectors is for web developers who want to delve into jQuery from its very starting point: selectors. Even if you're already familiar with the framework and its selectors, you could find several tips and tricks that you aren't aware of, especially about performance and how jQuery ac

  11. iReceptor: A platform for querying and analyzing antibody/B-cell and T-cell receptor repertoire data across federated repositories.

    Science.gov (United States)

    Corrie, Brian D; Marthandan, Nishanth; Zimonja, Bojan; Jaglale, Jerome; Zhou, Yang; Barr, Emily; Knoetze, Nicole; Breden, Frances M W; Christley, Scott; Scott, Jamie K; Cowell, Lindsay G; Breden, Felix

    2018-07-01

    Next-generation sequencing allows the characterization of the adaptive immune receptor repertoire (AIRR) in exquisite detail. These large-scale AIRR-seq data sets have rapidly become critical to vaccine development, understanding the immune response in autoimmune and infectious disease, and monitoring novel therapeutics against cancer. However, at present there is no easy way to compare these AIRR-seq data sets across studies and institutions. The ability to combine and compare information for different disease conditions will greatly enhance the value of AIRR-seq data for improving biomedical research and patient care. The iReceptor Data Integration Platform (gateway.ireceptor.org) provides one implementation of the AIRR Data Commons envisioned by the AIRR Community (airr-community.org), an initiative that is developing protocols to facilitate sharing and comparing AIRR-seq data. The iReceptor Scientific Gateway links distributed (federated) AIRR-seq repositories, allowing sequence searches or metadata queries across multiple studies at multiple institutions, returning sets of sequences fulfilling specific criteria. We present a review of the development of iReceptor, and how it fits in with the general trend toward sharing genomic and health data, and the development of standards for describing and reporting AIRR-seq data. Researchers interested in integrating their repositories of AIRR-seq data into the iReceptor Platform are invited to contact support@ireceptor.org. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. A Framework for WWW Query Processing

    Science.gov (United States)

    Wu, Binghui Helen; Wharton, Stephen (Technical Monitor)

    2000-01-01

    Query processing is the most common operation in a DBMS. Sophisticated query processing has been mainly targeted at a single enterprise environment providing centralized control over data and metadata. Submitting queries by anonymous users on the web is different in such a way that load balancing or DBMS' accessing control becomes the key issue. This paper provides a solution by introducing a framework for WWW query processing. The success of this framework lies in the utilization of query optimization techniques and the ontological approach. This methodology has proved to be cost effective at the NASA Goddard Space Flight Center Distributed Active Archive Center (GDAAC).

  13. jQuery For Dummies

    CERN Document Server

    Beighley, Lynn

    2010-01-01

    Learn how jQuery can make your Web page or blog stand out from the crowd!. jQuery is free, open source software that allows you to extend and customize Joomla!, Drupal, AJAX, and WordPress via plug-ins. Assuming no previous programming experience, Lynn Beighley takes you through the basics of jQuery from the very start. You'll discover how the jQuery library separates itself from other JavaScript libraries through its ease of use, compactness, and friendliness if you're a beginner programmer. Written in the easy-to-understand style of the For Dummies brand, this book demonstrates how you can a

  14. Flexible Query Answering Systems 2006

    DEFF Research Database (Denmark)

    -computer interaction. The overall theme of the FQAS conferences is innovative query systems aimed at providing easy, flexible, and intuitive access to information. Such systems are intended to facilitate retrieval from information repositories such as databases, libraries, and the World-Wide Web. These repositories......This volume constitutes the proceedings of the Seventh International Conference on Flexible Query Answering Systems, FQAS 2006, held in Milan, Italy, on June 7--10, 2006. FQAS is the premier conference for researchers and practitioners concerned with the vital task of providing easy, flexible...... are typically equipped with standard query systems which are often inadequate, and the focus of FQAS is the development of query systems that are more expressive, informative, cooperative, and productive. These proceedings contain contributions from invited speakers and 53 original papers out of about 100...

  15. Hacia una web social libre y federada: el caso de Lorea Towards a free and federated social web: the case of Lorea

    Directory of Open Access Journals (Sweden)

    Alex Haché

    2012-06-01

    Full Text Available Este artículo aborda las limitaciones de la web 2.0, las claves para superarlas y las recientes iniciativas de web social libre y federada. De entre ellas nos centraremos en el caso de Lorea desde nuestra experiencia directa con ella. This paper tackles the limitations of web 2.0, the keys to overcome them and the recent efforts towards a free and federated social web. Between these we will focus on the case of Lorea from our direct experience with it.

  16. Genetic algorithms for RDF chain query optimization

    NARCIS (Netherlands)

    Hogenboom, A.C.; Milea, D.V.; Frasincar, F.; Kaymak, U.; Calders, T.; Tuyls, K.; Pechenizkiy, M.

    2009-01-01

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are required for efficient real-time querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL

  17. Towards Verbalizing SPARQL Queries in Arabic

    Directory of Open Access Journals (Sweden)

    I. Al Agha

    2016-04-01

    Full Text Available With the wide spread of Open Linked Data and Semantic Web technologies, a larger amount of data has been published on the Web in the RDF and OWL formats. This data can be queried using SPARQL, the Semantic Web Query Language. SPARQL cannot be understood by ordinary users and is not directly accessible to humans, and thus they will not be able to check whether the retrieved answers truly correspond to the intended information need. Driven by this challenge, natural language generation from SPARQL data has recently attracted a considerable attention. However, most existing solutions to verbalize SPARQL in natural language focused on English and Latin-based languages. Little effort has been made on the Arabic language which has different characteristics and morphology. This work aims to particularly help Arab users to perceive SPARQL queries on the Semantic Web by translating SPARQL to Arabic. It proposes an approach that gets a SPARQL query as an input and generates a query expressed in Arabic as an output. The translation process combines both morpho-syntactic analysis and language dependencies to generate a legible and understandable Arabic query. The approach was preliminary assessed with a sample query set, and results indicated that 75% of the queries were correctly translated into Arabic.

  18. Incremental Query Rewriting with Resolution

    Science.gov (United States)

    Riazanov, Alexandre; Aragão, Marcelo A. T.

    We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.

  19. Head First jQuery

    CERN Document Server

    Benedetti, Ryan

    2011-01-01

    Want to add more interactivity and polish to your websites? Discover how jQuery can help you build complex scripting functionality in just a few lines of code. With Head First jQuery, you'll quickly get up to speed on this amazing JavaScript library by learning how to navigate HTML documents while handling events, effects, callbacks, and animations. By the time you've completed the book, you'll be incorporating Ajax apps, working seamlessly with HTML and CSS, and handling data with PHP, MySQL and JSON. If you want to learn-and understand-how to create interactive web pages, unobtrusive scrip

  20. From Questions to Queries

    Directory of Open Access Journals (Sweden)

    M. Drlík

    2007-12-01

    Full Text Available The extension of (Internet databases forceseveryone to become more familiar with techniques of datastorage and retrieval because users’ success often dependson their ability to pose right questions and to be able tointerpret their answers. University programs pay moreattention to developing database programming skills than todata exploitation skills. To educate our students to become“database users”, the authors intensively exploit supportivetools simplifying the production of database elements astables, queries, forms, reports, web pages, and macros.Videosequences demonstrating “standard operations” forcompleting them have been prepared to enhance out-ofclassroomlearning. The use of SQL and other professionaltools is reduced to the cases when the wizards are unable togenerate the intended construct.

  1. Approximate dictionary queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Gasieniec, Leszek

    1996-01-01

    Given a set of n binary strings of length m each. We consider the problem of answering d-queries. Given a binary query string of length m, a d-query is to report if there exists a string in the set within Hamming distance d of . We present a data structure of size O(nm) supporting 1-queries in ti...

  2. Visual Querying in Chemical Databases using SMARTS Patterns

    OpenAIRE

    Šípek, Vojtěch

    2014-01-01

    The purpose of this thesis is to create framework for visual querying in chemical databases which will be implemented as a web application. By using graphical editor, which is a part of client side, the user creates queries which are translated into chemical query language SMARTS. This query is parsed on the application server which is connected to the chemical database. This framework also contains tooling for creating the database and index structure above it. 1

  3. Query Classification and Study of University Students' Search Trends

    Science.gov (United States)

    Maabreh, Majdi A.; Al-Kabi, Mohammed N.; Alsmadi, Izzat M.

    2012-01-01

    Purpose: This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet. Design/methodology/approach: The web log files were collected from one of the higher…

  4. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2017-01-01

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthe- sis, where the learning agent constructs artificial queries from scratch in order

  5. Web Page Recommendation Using Web Mining

    OpenAIRE

    Modraj Bhavsar; Mrs. P. M. Chavan

    2014-01-01

    On World Wide Web various kind of content are generated in huge amount, so to give relevant result to user web recommendation become important part of web application. On web different kind of web recommendation are made available to user every day that includes Image, Video, Audio, query suggestion and web page. In this paper we are aiming at providing framework for web page recommendation. 1) First we describe the basics of web mining, types of web mining. 2) Details of each...

  6. Recommending Multidimensional Queries

    Science.gov (United States)

    Giacometti, Arnaud; Marcel, Patrick; Negre, Elsa

    Interactive analysis of datacube, in which a user navigates a cube by launching a sequence of queries is often tedious since the user may have no idea of what the forthcoming query should be in his current analysis. To better support this process we propose in this paper to apply a Collaborative Work approach that leverages former explorations of the cube to recommend OLAP queries. The system that we have developed adapts Approximate String Matching, a technique popular in Information Retrieval, to match the current analysis with the former explorations and help suggesting a query to the user. Our approach has been implemented with the open source Mondrian OLAP server to recommend MDX queries and we have carried out some preliminary experiments that show its efficiency for generating effective query recommendations.

  7. Unemployment Insurance Query (UIQ)

    Data.gov (United States)

    Social Security Administration — The Unemployment Insurance Query (UIQ) provides State Unemployment Insurance agencies real-time online access to SSA data. This includes SSN verification and Title...

  8. Digital Investigations of AN Archaeological Smart Point Cloud: a Real Time Web-Based Platform to Manage the Visualisation of Semantical Queries

    Science.gov (United States)

    Poux, F.; Neuville, R.; Hallot, P.; Van Wersch, L.; Luczfalvy Jancsó, A.; Billen, R.

    2017-05-01

    While virtual copies of the real world tend to be created faster than ever through point clouds and derivatives, their working proficiency by all professionals' demands adapted tools to facilitate knowledge dissemination. Digital investigations are changing the way cultural heritage researchers, archaeologists, and curators work and collaborate to progressively aggregate expertise through one common platform. In this paper, we present a web application in a WebGL framework accessible on any HTML5-compatible browser. It allows real time point cloud exploration of the mosaics in the Oratory of Germigny-des-Prés, and emphasises the ease of use as well as performances. Our reasoning engine is constructed over a semantically rich point cloud data structure, where metadata has been injected a priori. We developed a tool that directly allows semantic extraction and visualisation of pertinent information for the end users. It leads to efficient communication between actors by proposing optimal 3D viewpoints as a basis on which interactions can grow.

  9. Keeping Dublin Core Simple: Cross-Domain Discovery or Resource Description?; First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging E-Book Environment; Interoperability: Digital Rights Management and the Emerging EBook Environment; Searching the Deep Web: Direct Query Engine Applications at the Department of Energy.

    Science.gov (United States)

    Lagoze, Carl; Neylon, Eamonn; Mooney, Stephen; Warnick, Walter L.; Scott, R. L.; Spence, Karen J.; Johnson, Lorrie A.; Allen, Valerie S.; Lederman, Abe

    2001-01-01

    Includes four articles that discuss Dublin Core metadata, digital rights management and electronic books, including interoperability; and directed query engines, a type of search engine designed to access resources on the deep Web that is being used at the Department of Energy. (LRW)

  10. Joint Top-K Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Cong, Gao

    2012-01-01

    Web users and content are increasingly being geopositioned, and increased focus is being given to serving local content in response to web queries. This development calls for spatial keyword queries that take into account both the locations and textual descriptions of content. We study the effici......Web users and content are increasingly being geopositioned, and increased focus is being given to serving local content in response to web queries. This development calls for spatial keyword queries that take into account both the locations and textual descriptions of content. We study...... the efficient, joint processing of multiple top-k spatial keyword queries. Such joint processing is attractive during high query loads and also occurs when multiple queries are used to obfuscate a user's true query. We propose a novel algorithm and index structure for the joint processing of top-k spatial...... keyword queries. Empirical studies show that the proposed solution is efficient on real data sets. We also offer analytical studies on synthetic data sets to demonstrate the efficiency of the proposed solution. Index Terms IEEE Terms Electronic mail , Google , Indexes , Joints , Mobile communication...

  11. SPARQL Assist language-neutral query composer

    Science.gov (United States)

    2012-01-01

    Background SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. Results We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. Conclusions To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources. PMID:22373327

  12. SPARQL assist language-neutral query composer.

    Science.gov (United States)

    McCarthy, Luke; Vandervalk, Ben; Wilkinson, Mark

    2012-01-25

    SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources.

  13. Enabling Semantic Queries Against the Spatial Database

    Directory of Open Access Journals (Sweden)

    PENG, X.

    2012-02-01

    Full Text Available The spatial database based upon the object-relational database management system (ORDBMS has the merits of a clear data model, good operability and high query efficiency. That is why it has been widely used in spatial data organization and management. However, it cannot express the semantic relationships among geospatial objects, making the query results difficult to meet the user's requirement well. Therefore, this paper represents an attempt to combine the Semantic Web technology with the spatial database so as to make up for the traditional database's disadvantages. In this way, on the one hand, users can take advantages of ORDBMS to store and manage spatial data; on the other hand, if the spatial database is released in the form of Semantic Web, the users could describe a query more concisely with the cognitive pattern which is similar to that of daily life. As a consequence, this methodology enables the benefits of both Semantic Web and the object-relational database (ORDB available. The paper discusses systematically the semantic enriched spatial database's architecture, key technologies and implementation. Subsequently, we demonstrate the function of spatial semantic queries via a practical prototype system. The query results indicate that the method used in this study is feasible.

  14. Pro PHP and jQuery

    CERN Document Server

    Lengstorf, Jason

    2010-01-01

    This book is for intermediate programmers interested in building AJAX web applications using jQuery and PHP. Along with teaching some advanced PHP techniques, it will teach you how to take your dynamic applications to the next level by adding a JavaScript layer with jQuery. * Learn to utilize built-in PHP functions to build calendar tools.* Learn how jQuery can be used for AJAX, animation, client-side validation, and more.What you'll learn* Use PHP to build a calendar application that allows users to post, view, edit, and delete events.* Use jQuery to allow the calendar app to be viewed and ed

  15. Optimizing Temporal Queries

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2003-01-01

    Recent research in the area of temporal databases has proposed a number of query languages that vary in their expressive power and the semantics they provide to users. These query languages represent a spectrum of solutions to the tension between clean semantics and efficient evaluation. Often, t...

  16. Mastering jQuery

    CERN Document Server

    Libby, Alex

    2015-01-01

    If you are a developer who is already familiar with using jQuery and wants to push your skill set further, then this book is for you. The book assumes an intermediate knowledge level of jQuery, JavaScript, HTML5, and CSS.

  17. Range-clustering queries

    NARCIS (Netherlands)

    Abrahamsen, M.; de Berg, M.T.; Buchin, K.A.; Mehr, M.; Mehrabi, A.D.

    2017-01-01

    In a geometric k -clustering problem the goal is to partition a set of points in R d into k subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set S : given a query box Q and an integer k>2 , compute

  18. jQuery 2.0 animation techniques beginner's guide

    CERN Document Server

    Culpepper, Adam

    2013-01-01

    This book is a guide to help you create attractive web page animations using jQuery. Written in a friendly and engaging approach this book is designed to be placed alongside your computer as a mentor.If you are a web designer or a frontend developer or if you want to learn how to animate the user interface of your web applications with jQuery, this book is for you. Experience with jQuery or Javascript would be helpful but solid knowledge base of HTML and CSS is assumed.

  19. Querying Workflow Logs

    Directory of Open Access Journals (Sweden)

    Yan Tang

    2018-01-01

    Full Text Available A business process or workflow is an assembly of tasks that accomplishes a business goal. Business process management is the study of the design, configuration/implementation, enactment and monitoring, analysis, and re-design of workflows. The traditional methodology for the re-design and improvement of workflows relies on the well-known sequence of extract, transform, and load (ETL, data/process warehousing, and online analytical processing (OLAP tools. In this paper, we study the ad hoc queryiny of process enactments for (data-centric business processes, bypassing the traditional methodology for more flexibility in querying. We develop an algebraic query language based on “incident patterns” with four operators inspired from Business Process Model and Notation (BPMN representation, allowing the user to formulate ad hoc queries directly over workflow logs. A formal semantics of this query language, a preliminary query evaluation algorithm, and a group of elementary properties of the operators are provided.

  20. Indexing for summary queries

    DEFF Research Database (Denmark)

    Yi, Ke; Wang, Lu; Wei, Zhewei

    2014-01-01

    ), of a particular attribute of these records. Aggregation queries are especially useful in business intelligence and data analysis applications where users are interested not in the actual records, but some statistics of them. They can also be executed much more efficiently than reporting queries, by embedding...... returned by reporting queries. In this article, we design indexing techniques that allow for extracting a statistical summary of all the records in the query. The summaries we support include frequent items, quantiles, and various sketches, all of which are of central importance in massive data analysis....... Our indexes require linear space and extract a summary with the optimal or near-optimal query cost. We illustrate the efficiency and usefulness of our designs through extensive experiments and a system demonstration....

  1. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    . In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine as well as the query evaluation techniques. Performance experiments with a prototypical implementation suggest...

  2. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2009-01-01

    Full Text Available Abstract We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the "query relevance." Combining these two ideas, we can alleviate bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29% was improved to 60.38%, which was 1.13 point better than the result of the conventional unsupervised adaptation method (59.25%.

  3. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Akinori Ito

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the “query relevance.” Combining these two ideas, we can alleviate bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29% was improved to 60.38%, which was 1.13 point better than the result of the conventional unsupervised adaptation method (59.25%.

  4. Memory aware query scheduling in a database cluster

    NARCIS (Netherlands)

    F. Waas; M.L. Kersten (Martin)

    2000-01-01

    textabstractQuery throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications:

  5. Ontology Based Queries - Investigating a Natural Language Interface

    NARCIS (Netherlands)

    van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.

    2010-01-01

    In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.

  6. Knowledge Query Language (KQL)

    Science.gov (United States)

    2016-02-12

    described as a sparse, distributed multidimensional sorted map. Unlike a relational database , BigTable has no multicolumn primary keys or constraints. The...in query languages such as SQL. Figure 3. Address expression-based querying. Each circled step in Figure 3 is described below. Datastore/ Database ...implementation we describe in later sections stores the instance of registry ontology in JSON files. 7 Throughout the rest of this report, we use the

  7. Bioqueries: a collaborative environment to create, explore and share SPARQL queries in Life Sciences

    OpenAIRE

    García-Godoy, Maria Jesús; López-Camacho, Esteban; Navas-Delgado, Ismael; Aldana-Montes, Jose Francisco

    2016-01-01

    Bioqueries provides a collaborative environment to create, explore, execute, clone and share SPARQL queries (including Federated Queries). Federated SPARQL queries can retrieve information from more than one data source. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

  8. Learning from the History of Distributed Query Processing

    DEFF Research Database (Denmark)

    Betz, Heiko; Gropengießer, Francis; Hose, Katja

    2012-01-01

    The vision of the Semantic Web has triggered the development of various new applications and opened up new directions in research. Recently, much effort has been put into the development of techniques for query processing over Linked Data. Being based upon techniques originally developed...... for distributed and federated databases, some of them inherit the same or similar problems. Thus, the goal of this paper is to point out pitfalls that the previous generation of researchers has already encountered and to introduce the Linked Data as a Service as an idea that has the potential to solve the problem...... in some scenarios. Hence, this paper discusses nine theses about Linked Data processing and sketches a research agenda for future endeavors in the area of Linked Data processing....

  9. Web Usage Mining Analysis of Federated Search Tools for Egyptian Scholars

    Science.gov (United States)

    Mohamed, Khaled A.; Hassan, Ahmed

    2008-01-01

    Purpose: This paper aims to examine the behaviour of the Egyptian scholars while accessing electronic resources through two federated search tools. The main purpose of this article is to provide guidance for federated search tool technicians and support teams about user issues, including the need for training. Design/methodology/approach: Log…

  10. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    On-Line Analytical Processing (OLAP) is a data analysis approach to support decision-making. On top of that, Exploratory OLAP is a novel initiative for the convergence of OLAP and the Semantic Web (SW) that enables the use of OLAP techniques on SW data. Moreover, OLAP approaches exploit different......, sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply...

  11. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    . In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized...

  12. Code query by example

    Science.gov (United States)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  13. jQuery Mobile Up and Running

    CERN Document Server

    Firtman, Maximiliano

    2012-01-01

    Would you like to build one mobile web application that works on iPad and Kindle Fire as well as iPhone and Android smartphones? This introductory guide to jQuery Mobile shows you how. Through a series of hands-on exercises, you'll learn the best ways to use this framework's many interface components to build customizable, multiplatform apps. You don't need any programming skills or previous experience with jQuery to get started. By the time you finish this book, you'll know how to create responsive, Ajax-based interfaces that work on a variety of smartphones and tablets, using jQuery Mobile

  14. jQuery for designers beginner's guide

    CERN Document Server

    MacLees, Natalie

    2014-01-01

    A step-by-step guide that spices up your web pages and designs them in the way you want using the most widely used JavaScript library, jQuery. The beginner-friendly and easy-to-understand approach of the book will help get to grips with jQuery in no time. If you know the fundamentals of HTML and CSS, and want to extend your knowledge by learning to use JavaScript, then this is just the book for you. jQuery makes JavaScript straightforward and approachable - you'll be surprised at how easy it can be to add animations and special effects to your beautifully designed pages.

  15. Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Chen, Lisi; Jensen, Christian S.; Wu, Dingming

    2013-01-01

    Geo-textual indices play an important role in spatial keyword query- ing. The existing geo-textual indices have not been compared sys- tematically under the same experimental framework. This makes it difficult to determine which indexing technique best supports specific functionality. We provide...... an all-around survey of 12 state- of-the-art geo-textual indices. We propose a benchmark that en- ables the comparison of the spatial keyword query performance. We also report on the findings obtained when applying the bench- mark to the indices, thus uncovering new insights that may guide index...

  16. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    Science.gov (United States)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.

  17. Adaptive and Optimized RDF Query Interface for Distributed WFS Data

    Directory of Open Access Journals (Sweden)

    Tian Zhao

    2017-04-01

    Full Text Available Web Feature Service (WFS is a protocol for accessing geospatial data stores such as databases and Shapefiles over the Web. However, WFS does not provide direct access to data distributed in multiple servers. In addition, WFS features extracted from their original sources are not convenient for user access due to the lack of connection to high-level concepts. Users are facing the choices of either querying each WFS server first and then integrating the results, or converting the data from all WFS servers to a more expressive format such as RDF (Resource Description Framework and then querying the integrated data. The first choice requires additional programming while the second choice is not practical for large or frequently updated datasets. The new contribution of this paper is that we propose a novel adaptive and optimized RDF query interface to overcome the aforementioned limitation. Specifically, in this paper, we propose a novel algorithm to query and synthesize distributed WFS data through an RDF query interface, where users can specify data requests to multiple WFS servers using a single RDF query. Users can also define a simple configuration to associate WFS feature types, attributes, and values with RDF classes, properties, and values so that user queries can be written using a more uniform and informative vocabulary. The algorithm translates each RDF query written in SPARQL-like syntax to multiple WFS GetFeature requests, and then converts and integrates the multiple WFS results to get the answers to the original query. The generated GetFeature requests are sent asynchronously and simultaneously to WFS servers to take advantage of the server parallelism. The results of each GetFeature request are cached to improve query response time for subsequent queries that involve one or more of the cached requests. A JavaScript-based prototype is implemented and experimental results show that the query response time can be greatly reduced through

  18. Manchester visual query language

    Science.gov (United States)

    Oakley, John P.; Davis, Darryl N.; Shann, Richard T.

    1993-04-01

    We report a database language for visual retrieval which allows queries on image feature information which has been computed and stored along with images. The language is novel in that it provides facilities for dealing with feature data which has actually been obtained from image analysis. Each line in the Manchester Visual Query Language (MVQL) takes a set of objects as input and produces another, usually smaller, set as output. The MVQL constructs are mainly based on proven operators from the field of digital image analysis. An example is the Hough-group operator which takes as input a specification for the objects to be grouped, a specification for the relevant Hough space, and a definition of the voting rule. The output is a ranked list of high scoring bins. The query could be directed towards one particular image or an entire image database, in the latter case the bins in the output list would in general be associated with different images. We have implemented MVQL in two layers. The command interpreter is a Lisp program which maps each MVQL line to a sequence of commands which are used to control a specialized database engine. The latter is a hybrid graph/relational system which provides low-level support for inheritance and schema evolution. In the paper we outline the language and provide examples of useful queries. We also describe our solution to the engineering problems associated with the implementation of MVQL.

  19. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 12th International Conference on Flexible Query Answering Systems, FQAS 2017, held in London, UK, in June 2017. The 21 full papers presented in this book together with 4 short papers were carefully reviewed and selected from 43 submissions...

  20. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim Mansour

    2017-05-07

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthe- sis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas, such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has, generally, focused on finite concept classes or toy problems, with a limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method while the second algorithm is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and to synthesize queries, afterward, via spectral decomposition. Moreover, I also describe how these algorithms can be extended to other settings as well, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distributions provides reliable predictions with a high probability. This algorithm is, then, justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this

  1. A Dynamic Extension of ATLAS Run Query Service

    CERN Document Server

    Buliga, Alexandru

    2015-01-01

    The ATLAS RunQuery is a primarily web-based service for the ATLAS community to access meta information about the data taking in a concise format. In order to provide a better user experience, the service was moved to use a new technology, involving concepts such as: Web Sockets, on demand data, client-side scripting, memory caching and parallelizing execution.

  2. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit

  3. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    In today’s OLAP systems, integrating fast changing data, e.g., stock quotes, physically into a cube is complex and time-consuming. The widespread use of XML makes it very possible that this data is available in XML format on the WWW; thus, making XML data logically federated with OLAP systems...... is desirable. This report presents a complete foundation for such OLAP-XML federations. This includes a prototypical query engine, a simplified query semantics based on previous work, and a complete physical algebra which enables precise modeling of the execution tasks of an OLAP-XML query. Effective algebra...

  4. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    is motivated by an obvious need for users to survey huge volumes of objects in query answers. An ontology formalism and a special notion of-instantiated ontology" are introduced. The latter is a structure reflecting the content in the document collection in that; it is a restriction of a general world......We present here ail approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach...... knowledge ontology to the concepts instantiated in the collection. The notion of ontology-based similarity is briefly described, language constructs for direct navigation and retrieval of concepts in the ontology are discussed and approaches to conceptual summarization are presented....

  5. Usare WebDewey

    OpenAIRE

    Baldi, Paolo

    2016-01-01

    This presentation shows how to use the WebDewey tool. Features of WebDewey. Italian WebDewey compared with American WebDewey. Querying Italian WebDewey. Italian WebDewey and MARC21. Italian WebDewey and UNIMARC. Numbers, captions, "equivalente verbale": Dewey decimal classification in Italian catalogues. Italian WebDewey and Nuovo soggettario. Italian WebDewey and LCSH. Italian WebDewey compared with printed version of Italian Dewey Classification (22. edition): advantages and disadvantages o...

  6. Field experience with the FAA's Web-based medical certification system "AMCS/DIWS". Federal Aviation Administration.

    Science.gov (United States)

    Angelici, Arnold A; Mohler, Stanley R

    2002-04-01

    The October 1, 1999, introduction in the U.S. of a Web-based medical certification process for civil aircrew opened a new era within civil aviation. The Federal Aviation Administration's (FAA) Aeromedical Certification System/Document Imaging Workflow System (AMCS/DIWS) has imposed certain new requirements on the designated Aviation Medical Examiners (AMEs), including the use of Internet systems and procedures. A number of AMEs elected to discontinue their work as the classic medical certification processes were replaced. The authors document their personal experience with respect to the new system, and cite the overall advantages that modernized medical certification procedures bring. These advantages include far fewer "mistakes of omission" by AMEs, more timely receipt by the FAA of aircrew certification data, and a developing master aircrew database for analytic studies.

  7. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung; Widom, Jennifer

    2013-01-01

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model

  8. Mastering jQuery mobile

    CERN Document Server

    Lambert, Chip

    2015-01-01

    You've started down the path of jQuery Mobile, now begin mastering some of jQuery Mobile's higher level topics. Go beyond jQuery Mobile's documentation and master one of the hottest mobile technologies out there. Previous JavaScript and PHP experience can help you get the most out of this book.

  9. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung

    2013-08-26

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco\\'s cost-based query optimizer, building on Deco\\'s data model, query language, and query execution engine presented earlier. Deco\\'s objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco\\'s query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco\\'s query optimizer include a cost model distinguishing between "free" existing data versus paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco\\'s query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

  10. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2004-01-01

    is desirable. In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics,a physical query algebra and a robust OLAP-XML query engine.Performance experiments with a prototypical implementation suggest that the performance for OLAP...

  11. Query Log Analysis of an Electronic Health Record Search Engine

    Science.gov (United States)

    Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.

    2011-01-01

    We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150

  12. Advanced SPARQL querying in small molecule databases.

    Science.gov (United States)

    Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří

    2016-01-01

    In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.

  13. Enabling Incremental Query Re-Optimization.

    Science.gov (United States)

    Liu, Mengmeng; Ives, Zachary G; Loo, Boon Thau

    2016-01-01

    As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs , and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries ; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations.

  14. RESPEITO À CIDADANIA: PROVENDO ACESSIBILIDADE WEB NA UNIVERSIDADE FEDERAL DE SERGIPE (UFS

    Directory of Open Access Journals (Sweden)

    Quelita Araújo Diniz da Silva

    2012-03-01

    Full Text Available O acesso à informação facilitado pelas Tecnologias de Informação e Comunicação (TIC é um dos principais elementos da sociedade do conhecimento, sendo visto como possibilidade de inclusão social. Entretanto, alguns indivíduos possuem necessidades, permanentes ou temporárias, que os limitam no acesso às informações disponibilizadas, mantendo-os à periferia da sociedade. Este trabalho tem como objetivo discutir a acessibilidade na Web, discutindo os principais padrões, as diretrizes e técnicas para o projeto de webapps acessíveis. Como também, expor o conceito de tecnologias assistivas e como elas adequam às funcionalidades computacionais ao internauta com necessidades especiais auxiliando-o na navegação. Por fim, descrever um estudo de caso realizado no site da POSGRAP – UFS objetivando viabilizar à comunidade Sergipana à inclusão provendo acessibilidade aos produtos e serviços da UFS.

  15. Instant Cassandra query language

    CERN Document Server

    Singh, Amresh

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. It's an Instant Starter guide.Instant Cassandra Query Language is great for those who are working with Cassandra databases and who want to either learn CQL to check data from the console or build serious applications using CQL. If you're looking for something that helps you get started with CQL in record time and you hate the idea of learning a new language syntax, then this book is for you.

  16. Processing Incomplete Query Specifications in a Context-Dependent Reasoning Framework

    Directory of Open Access Journals (Sweden)

    Neli P. Zlatareva

    2013-04-01

    Full Text Available Search is the most prominent web service, which is about to change dramatically with the transition to the Semantic Web. Semantic Web applications are expected to deal with complex conjunctive queries, and not always such queries can be completely and precisely defined. Current Semantic Web reasoners built upon Description Logics have limited processing power in such environments. We discuss some of their limitations, and show how an alternative logical framework utilizing context-dependent rules can be extended to handle incomplete or imprecise query specifications.

  17. Regular paths in SparQL: querying the NCI Thesaurus.

    Science.gov (United States)

    Detwiler, Landon T; Suciu, Dan; Brinkley, James F

    2008-11-06

    OWL, the Web Ontology Language, provides syntax and semantics for representing knowledge for the semantic web. Many of the constructs of OWL have a basis in the field of description logics. While the formal underpinnings of description logics have lead to a highly computable language, it has come at a cognitive cost. OWL ontologies are often unintuitive to readers lacking a strong logic background. In this work we describe GLEEN, a regular path expression library, which extends the RDF query language SparQL to support complex path expressions over OWL and other RDF-based ontologies. We illustrate the utility of GLEEN by showing how it can be used in a query-based approach to defining simpler, more intuitive views of OWL ontologies. In particular we show how relatively simple GLEEN-enhanced SparQL queries can create views of the OWL version of the NCI Thesaurus that match the views generated by the web-based NCI browser.

  18. The Impact of the Introduction of Web Information Systems (WIS) on Information Policies: An Analysis of the Canadian Federal Government Policies Related to WIS.

    Science.gov (United States)

    Dufour, Christine; Bergeron, Pierette

    2002-01-01

    Presents results of an analysis of the Canadian federal government information policies that govern its Web information systems (WIS) that was conducted to better understand how the government has adapted its information policies to the WIS. Discusses results that indicate new policies have been crafted to take into account the WIS context.…

  19. Elements of a Spatial Web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2010-01-01

    Driven by factors such as the increasingly mobile use of the web and the proliferation of geo-positioning technologies, the web is rapidly acquiring a spatial aspect. Specifically, content and users are being geo-tagged, and services are being developed that exploit these tags. The research...... community is hard at work inventing means of efficiently supporting new spatial query functionality. Points of interest with a web presence, called spatial web objects, have a location as well as a textual description. Spatio-textual queries return such objects that are near a location argument...... and are relevant to a text argument. An important element in enabling such queries is to be able to rank spatial web objects. Another is to be able to determine the relevance of an object to a query. Yet another is to enable the efficient processing of such queries. The talk covers recent results on spatial web...

  20. Federated Giovanni: A Distributed Web Service for Analysis and Visualization of Remote Sensing Data

    Science.gov (United States)

    Lynnes, Chris

    2014-01-01

    The Geospatial Interactive Online Visualization and Analysis Interface (Giovanni) is a popular tool for users of the Goddard Earth Sciences Data and Information Services Center (GES DISC) and has been in use for over a decade. It provides a wide variety of algorithms and visualizations to explore large remote sensing datasets without having to download the data and without having to write readers and visualizers for it. Giovanni is now being extended to enable its capabilities at other data centers within the Earth Observing System Data and Information System (EOSDIS). This Federated Giovanni will allow four other data centers to add and maintain their data within Giovanni on behalf of their user community. Those data centers are the Physical Oceanography Distributed Active Archive Center (PO.DAAC), MODIS Adaptive Processing System (MODAPS), Ocean Biology Processing Group (OBPG), and Land Processes Distributed Active Archive Center (LP DAAC). Three tiers are supported: Tier 1 (GES DISC-hosted) gives the remote data center a data management interface to add and maintain data, which are provided through the Giovanni instance at the GES DISC. Tier 2 packages Giovanni up as a virtual machine for distribution to and deployment by the other data centers. Data variables are shared among data centers by sharing documents from the Solr database that underpins Giovanni's data management capabilities. However, each data center maintains their own instance of Giovanni, exposing the variables of most interest to their user community. Tier 3 is a Shared Source model, in which the data centers cooperate to extend the infrastructure by contributing source code.

  1. Moving Spatial Keyword Queries

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Jensen, Christian S.

    2013-01-01

    propose two algorithms for computing safe zones that guarantee correct results at any time and that aim to optimize the server-side computation as well as the communication between the server and the client. We exploit tight and conservative approximations of safe zones and aggressive computational space...... text data. State-of-the-art solutions for moving queries employ safe zones that guarantee the validity of reported results as long as the user remains within the safe zone associated with a result. However, existing safe-zone methods focus solely on spatial locations and ignore text relevancy. We...... pruning. We present techniques that aim to compute the next safe zone efficiently, and we present two types of conservative safe zones that aim to reduce the communication cost. Empirical studies with real data suggest that the proposals are efficient. To understand the effectiveness of the proposed safe...

  2. Ranking Queries on Uncertain Data

    CERN Document Server

    Hua, Ming

    2011-01-01

    Uncertain data is inherent in many important applications, such as environmental surveillance, market analysis, and quantitative economics research. Due to the importance of those applications and rapidly increasing amounts of uncertain data collected and accumulated, analyzing large collections of uncertain data has become an important task. Ranking queries (also known as top-k queries) are often natural and useful in analyzing uncertain data. Ranking Queries on Uncertain Data discusses the motivations/applications, challenging problems, the fundamental principles, and the evaluation algorith

  3. Research Issues in Mobile Querying

    DEFF Research Database (Denmark)

    Breunig, M.; Jensen, Christian Søndergaard; Klein, M.

    2004-01-01

    This document reports on key aspects of the discussions conducted within the working group. In particular, the document aims to offer a structured and somewhat digested summary of the group's discussions. The document first offers concepts that enable characterization of "mobile queries" as well...... as the types of systems that enable such queries. It explores the notion of context in mobile queries. The document ends with a few observations, mainly regarding challenges....

  4. Optimizing queries in distributed systems

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2006-01-01

    Full Text Available This research presents the main elements of query optimizations in distributed systems. First, data architecture according with system level architecture in a distributed environment is presented. Then the architecture of a distributed database management system (DDBMS is described on conceptual level followed by the presentation of the distributed query execution steps on these information systems. The research ends with presentation of some aspects of distributed database query optimization and strategies used for that.

  5. The role of economics in the QUERI program: QUERI Series.

    Science.gov (United States)

    Smith, Mark W; Barnett, Paul G

    2008-04-22

    The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  6. Smart query answering for marine sensor data.

    Science.gov (United States)

    Shahriar, Md Sumon; de Souza, Paulo; Timms, Greg

    2011-01-01

    We review existing query answering systems for sensor data. We then propose an extended query answering approach termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query benefits in building data and information systems for marine sensor networks.

  7. Smart Query Answering for Marine Sensor Data

    Directory of Open Access Journals (Sweden)

    Paulo de Souza

    2011-03-01

    Full Text Available We review existing query answering systems for sensor data. We then propose an extended query answering approach termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query benefits in building data and information systems for marine sensor networks.

  8. Airport Status Web Service

    Data.gov (United States)

    Department of Transportation — A web service that allows end-users the ability to query the current known delays in the National Airspace System as well as the current weather from NOAA by airport...

  9. jQuery UI cookbook

    CERN Document Server

    Boduch, Adam

    2013-01-01

    Filled with a practical collection of recipes, jQuery UI Cookbook is full of clear, step-by-step instructions that will help you harness the powerful UI framework in jQuery. Depending on your needs, you can dip in and out of the Cookbook and its recipes, or follow the book from start to finish.If you are a jQuery UI developer looking to improve your existing applications, extract ideas for your new application, or to better understand the overall widget architecture, then jQuery UI Cookbook is a must-have for you. The reader should at least have a rudimentary understanding of what jQuery UI is

  10. Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

    DEFF Research Database (Denmark)

    Jakobsen, Kim Ahlstrøm; Andersen, Alex B.; Hose, Katja

    2015-01-01

    data warehouses and data cubes. Today, external data sources are essential for analytics and, as the Semantic Web gains popularity, more and more external sources are available in native RDF. With the recent SPARQL 1.1 standard, performing analytical queries over RDF data sources has finally become...

  11. In-context query reformulation for failing SPARQL queries

    Science.gov (United States)

    Viswanathan, Amar; Michaelis, James R.; Cassidy, Taylor; de Mel, Geeth; Hendler, James

    2017-05-01

    Knowledge bases for decision support systems are growing increasingly complex, through continued advances in data ingest and management approaches. However, humans do not possess the cognitive capabilities to retain a bird's-eyeview of such knowledge bases, and may end up issuing unsatisfiable queries to such systems. This work focuses on the implementation of a query reformulation approach for graph-based knowledge bases, specifically designed to support the Resource Description Framework (RDF). The reformulation approach presented is instance-and schema-aware. Thus, in contrast to relaxation techniques found in the state-of-the-art, the presented approach produces in-context query reformulation.

  12. Geospatial semantic web

    CERN Document Server

    Zhang, Chuanrong; Li, Weidong

    2015-01-01

    This book covers key issues related to Geospatial Semantic Web, including geospatial web services for spatial data interoperability; geospatial ontology for semantic interoperability; ontology creation, sharing, and integration; querying knowledge and information from heterogeneous data source; interfaces for Geospatial Semantic Web, VGI (Volunteered Geographic Information) and Geospatial Semantic Web; challenges of Geospatial Semantic Web; and development of Geospatial Semantic Web applications. This book also describes state-of-the-art technologies that attempt to solve these problems such as WFS, WMS, RDF, OWL, and GeoSPARQL, and demonstrates how to use the Geospatial Semantic Web technologies to solve practical real-world problems such as spatial data interoperability.

  13. Top-k aggregation queries in large-scale distributed systems

    OpenAIRE

    Michel, Sebastian

    2007-01-01

    Distributed top-k query processing has recently become an essential functionality in a large number of emerging application classes like Internet traffic monitoring and Peer-to-Peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers. More precisely, in this thesis, we make the following distributions: We present the fa...

  14. Harvesting All Matching Information To A Given Query From a Deep Website

    NARCIS (Netherlands)

    Khelghati, Mohammadreza; Hiemstra, Djoerd; van Keulen, Maurice; Armano, Giuliano; Bozzon, Alessandro; Giuliani, Alessandro

    In this paper, the goal is harvesting all documents matching a given (entity) query from a deep web source. The objective is to retrieve all information about for instance "Denzel Washington", "Iran Nuclear Deal", or "FC Barcelona" from data hidden behind web forms. Policies of web search engines

  15. The role of economics in the QUERI program: QUERI Series

    Directory of Open Access Journals (Sweden)

    Smith Mark W

    2008-04-01

    Full Text Available Abstract Background The United States (U.S. Department of Veterans Affairs (VA Quality Enhancement Research Initiative (QUERI has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses. Conclusion Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  16. Multi-Dimensional Path Queries

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments......We present the path-relationship model that supports multi-dimensional data modeling and querying. A path-relationship database is composed of sets of paths and sets of relationships. A path is a sequence of related elements (atoms, paths, and sets of paths). A relationship is a binary path...

  17. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between...... the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  18. Maximum Spanning Tree Model on Personalized Web Based Collaborative Learning in Web 3.0

    OpenAIRE

    Padma, S.; Seshasaayee, Ananthi

    2012-01-01

    Web 3.0 is an evolving extension of the current web environme bnt. Information in web 3.0 can be collaborated and communicated when queried. Web 3.0 architecture provides an excellent learning experience to the students. Web 3.0 is 3D, media centric and semantic. Web based learning has been on high in recent days. Web 3.0 has intelligent agents as tutors to collect and disseminate the answers to the queries by the students. Completely Interactive learner's query determine the customization of...

  19. Using Query Languages and Mobile Code to Reduce Service Invocation Costs

    National Research Council Canada - National Science Library

    Szymanski, R; Palmer, N; Chase, T

    2006-01-01

    .... As a concrete example, this paper describes work being performed by CERDEC C2D (Communications Electronics Research, Development, and Engineering Center - Command and Control Directorate) at Fort Monmouth, NJ to expose a military mission plan as a web service through the use of simple SQL-like (Structured Query Language) statements optimized for mission data query.

  20. Robust Optimization of Database Queries

    Indian Academy of Sciences (India)

    JAYANT

    2011-07-06

    Jul 6, 2011 ... Based on first-order logic. ○ Edgar ... Cost-based Query Optimizer s choice of execution plan ... Determines the values of goods shipped between nations in a time period select ..... Born: 1881 Elected: 1934 Section: Medicine.

  1. Schedule Sales Query Raw Data

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  2. Labeling RDF Graphs for Linear Time and Space Querying

    Science.gov (United States)

    Furche, Tim; Weinzierl, Antonius; Bry, François

    Indices and data structures for web querying have mostly considered tree shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML) data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph datawith focus on support for efficient adjacency and reachability queries. For XML, labeling schemes are an important part of the widespread adoption of XML, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes, the constant time (and per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.

  3. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    In today’s OLAP systems, integrating fast changing data physically into a cube is complex and time-consuming. Our solution, the “OLAP-XML Federation System,” makes it possible to reference the fast changing data in XML format in OLAP queries without physical integration. In this paper, we introduce...

  4. Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea.

    Science.gov (United States)

    Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

    2016-07-04

    As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; Psearch queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data.

  5. Traitor: associating concepts using the world wide web

    NARCIS (Netherlands)

    Drijfhout, Wanno; Oliver, J.; Oliver, Jundt; Wevers, L.; Hiemstra, Djoerd

    We use Common Crawl's 25TB data set of web pages to construct a database of associated concepts using Hadoop. The database can be queried through a web application with two query interfaces. A textual interface allows searching for similarities and differences between multiple concepts using a query

  6. Dynamic Planar Range Maxima Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Tsakalidis, Konstantinos

    2011-01-01

    We consider the dynamic two-dimensional maxima query problem. Let P be a set of n points in the plane. A point is maximal if it is not dominated by any other point in P. We describe two data structures that support the reporting of the t maximal points that dominate a given query point, and allow...... for insertions and deletions of points in P. In the pointer machine model we present a linear space data structure with O(logn + t) worst case query time and O(logn) worst case update time. This is the first dynamic data structure for the planar maxima dominance query problem that achieves these bounds...... are integers in the range U = {0, …,2 w  − 1 }. We present a linear space data structure that supports 3-sided range maxima queries in O(logn/loglogn+t) worst case time and updates in O(logn/loglogn) worst case time. These are the first sublogarithmic worst case bounds for all operations in the RAM model....

  7. Data management on the spatial web

    DEFF Research Database (Denmark)

    Jensen, Christian S.

    2012-01-01

    Due in part to the increasing mobile use of the web and the proliferation of geo-positioning, the web is fast acquiring a significant spatial aspect. Content and users are being augmented with locations that are used increasingly by location-based services. Studies suggest that each week, several...... billion web queries are issued that have local intent and target spatial web objects. These are points of interest with a web presence, and they thus have locations as well as textual descriptions. This development has given prominence to spatial web data management, an area ripe with new and exciting...... opportunities and challenges. The research community has embarked on inventing and supporting new query functionality for the spatial web. Different kinds of spatial web queries return objects that are near a location argument and are relevant to a text argument. To support such queries, it is important...

  8. A web-based approach to data imputation

    KAUST Repository

    Li, Zhixu; Sharaf, Mohamed Abdel Fattah; Sitbon, Laurianne; Sadiq, Shazia Wasim; Indulska, Marta; Zhou, Xiaofang

    2013-01-01

    principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme

  9. Protecting count queries in study design.

    Science.gov (United States)

    Vinterbo, Staal A; Sarwate, Anand D; Boxwala, Aziz A

    2012-01-01

    Today's clinical research institutions provide tools for researchers to query their data warehouses for counts of patients. To protect patient privacy, counts are perturbed before reporting; this compromises their utility for increased privacy. The goal of this study is to extend current query answer systems to guarantee a quantifiable level of privacy and allow users to tailor perturbations to maximize the usefulness according to their needs. A perturbation mechanism was designed in which users are given options with respect to scale and direction of the perturbation. The mechanism translates the true count, user preferences, and a privacy level within administrator-specified bounds into a probability distribution from which the perturbed count is drawn. Users can significantly impact the scale and direction of the count perturbation and can receive more accurate final cohort estimates. Strong and semantically meaningful differential privacy is guaranteed, providing for a unified privacy accounting system that can support role-based trust levels. This study provides an open source web-enabled tool to investigate visually and numerically the interaction between system parameters, including required privacy level and user preference settings. Quantifying privacy allows system administrators to provide users with a privacy budget and to monitor its expenditure, enabling users to control the inevitable loss of utility. While current measures of privacy are conservative, this system can take advantage of future advances in privacy measurement. The system provides new ways of trading off privacy and utility that are not provided in current study design systems.

  10. Man vs. Machine: Differences in SPARQL Queries

    NARCIS (Netherlands)

    Rietveld, L.; Hoekstra, R.

    2014-01-01

    Server-side SPARQL query logs have been a topic of study for some time now. The USEWOD collection of query logs is currently the primary source of information for researchers. A recurring problem is that these logs leave application queries and queries created by humans indistinguishable. In this

  11. Fingerprinting Keywords in Search Queries over Tor

    Directory of Open Access Journals (Sweden)

    Oh Se Eun

    2017-10-01

    Full Text Available Search engine queries contain a great deal of private and potentially compromising information about users. One technique to prevent search engines from identifying the source of a query, and Internet service providers (ISPs from identifying the contents of queries is to query the search engine over an anonymous network such as Tor.

  12. Querying Large Physics Data Sets Over an Information Grid

    CERN Document Server

    Baker, N; Kovács, Z; Le Goff, J M; McClatchey, R

    2001-01-01

    Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects such as the role of 'middleware' for Grids are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just Computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially...

  13. Querying Natural Logic Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker

    2017-01-01

    This paper describes the principles of a system applying natural logic as a knowledge base language. Natural logics are regimented fragments of natural language employing high level inference rules. We advocate the use of natural logic for knowledge bases dealing with querying of classes...... in ontologies and class-relationships such as are common in life-science descriptions. The paper adopts a version of natural logic with recursive restrictive clauses such as relative clauses and adnominal prepositional phrases. It includes passive as well as active voice sentences. We outline a prototype...... for partial translation of natural language into natural logic, featuring further querying and conceptual path finding in natural logic knowledge bases....

  14. Persistent Identifiers for Improved Accessibility for Linked Data Querying

    Science.gov (United States)

    Shepherd, A.; Chandler, C. L.; Arko, R. A.; Fils, D.; Jones, M. B.; Krisnadhi, A.; Mecum, B.

    2016-12-01

    The adoption of linked open data principles within the geosciences has increased the amount of accessible information available on the Web. However, this data is difficult to consume for those who are unfamiliar with Semantic Web technologies such as Web Ontology Language (OWL), Resource Description Framework (RDF) and SPARQL - the RDF query language. Consumers would need to understand the structure of the data and how to efficiently query it. Furthermore, understanding how to query doesn't solve problems of poor precision and recall in search results. For consumers unfamiliar with the data, full-text searches are most accessible, but not ideal as they arrest the advantages of data disambiguation and co-reference resolution efforts. Conversely, URI searches across linked data can deliver improved search results, but knowledge of these exact URIs may remain difficult to obtain. The increased adoption of Persistent Identifiers (PIDs) can lead to improved linked data querying by a wide variety of consumers. Because PIDs resolve to a single entity, they are an excellent data point for disambiguating content. At the same time, PIDs are more accessible and prominent than a single data provider's linked data URI. When present in linked open datasets, PIDs provide balance between the technical and social hurdles of linked data querying as evidenced by the NSF EarthCube GeoLink project. The GeoLink project, funded by NSF's EarthCube initiative, have brought together data repositories include content from field expeditions, laboratory analyses, journal publications, conference presentations, theses/reports, and funding awards that span scientific studies from marine geology to marine ecosystems and biogeochemistry to paleoclimatology.

  15. Fuzzy Querying: Issues and Perspectives..

    Czech Academy of Sciences Publication Activity Database

    Kacprzyk, J.; Pasi, G.; Vojtáš, Peter; Zadrozny, S.

    2000-01-01

    Roč. 36, č. 6 (2000), s. 605-616 ISSN 0023-5954 Institutional research plan: AV0Z1030915 Keywords : flexible querying * information retrieval * fuzzy databases Subject RIV: BA - General Mathematics http://dml.cz/handle/10338.dmlcz/135376

  16. Querying Large Biological Network Datasets

    Science.gov (United States)

    Gulsoy, Gunhan

    2013-01-01

    New experimental methods has resulted in increasing amount of genetic interaction data to be generated every day. Biological networks are used to store genetic interaction data gathered. Increasing amount of data available requires fast large scale analysis methods. Therefore, we address the problem of querying large biological network datasets.…

  17. The EarthServer Federation: State, Role, and Contribution to GEOSS

    Science.gov (United States)

    Merticariu, Vlad; Baumann, Peter

    2016-04-01

    The intercontinental EarthServer initiative has established a European datacube platform with proven scalability: known databases exceed 100 TB, and single queries have been split across more than 1,000 cloud nodes. Its service interface being rigorously based on the OGC "Big Geo Data" standards, Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS), a series of clients can dock into the services, ranging from open-source OpenLayers and QGIS over open-source NASA WorldWind to proprietary ESRI ArcGIS. Datacube fusion in a "mix and match" style is supported by the platform technolgy, the rasdaman Array Database System, which transparently federates queries so that users simply approach any node of the federation to access any data item, internally optimized for minimal data transfer. Notably, rasdaman is part of GEOSS GCI. NASA is contributing its Web WorldWind virtual globe for user-friendly data extraction, navigation, and analysis. Integrated datacube / metadata queries are contributed by CITE. Current federation members include ESA (managed by MEEO sr.l.), Plymouth Marine Laboratory (PML), the European Centre for Medium-Range Weather Forecast (ECMWF), Australia's National Computational Infrastructure, and Jacobs University (adding in Planetary Science). Further data centers have expressed interest in joining. We present the EarthServer approach, discuss its underlying technology, and illustrate the contribution this datacube platform can make to GEOSS.

  18. Query optimization for graph analytics on linked data using SPARQL

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Seokyong [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lim, Seung -Hwan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sukumar, Sreenivas R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Vatsavai, Ranga Raju [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.

  19. WebVis: a hierarchical web homepage visualizer

    Science.gov (United States)

    Renteria, Jose C.; Lodha, Suresh K.

    2000-02-01

    WebVis, the Hierarchical Web Home Page Visualizer, is a tool for managing home web pages. The user can access this tool via the WWW and obtain a hierarchical visualization of one's home web pages. WebVis is a real time interactive tool that supports many different queries on the statistics of internal files such as sizes, age, and type. In addition, statistics on embedded information such as VRML files, Java applets, images and sound files can be extracted and queried. Results of these queries are visualized using color, shape and size of different nodes of the hierarchy. The visualization assists the user in a variety of task, such as quickly finding outdated information or locate large files. WebVIs is one solution to the growing web space maintenance problem. Implementation of WebVis is realized with Perl and Java. Perl pattern matching and file handling routines are used to collect and process web space linkage information and web document information. Java utilizes the collected information to produce visualization of the web space. Java also provides WebVis with real time interactivity, while running off the WWW. Some WebVis examples of home web page visualization are presented.

  20. Sexual information seeking on web search engines.

    Science.gov (United States)

    Spink, Amanda; Koricich, Andrew; Jansen, B J; Cole, Charles

    2004-02-01

    Sexual information seeking is an important element within human information behavior. Seeking sexually related information on the Internet takes many forms and channels, including chat rooms discussions, accessing Websites or searching Web search engines for sexual materials. The study of sexual Web queries provides insight into sexually-related information-seeking behavior, of value to Web users and providers alike. We qualitatively analyzed queries from logs of 1,025,910 Alta Vista and AlltheWeb.com Web user queries from 2001. We compared the differences in sexually-related Web searching between Alta Vista and AlltheWeb.com users. Differences were found in session duration, query outcomes, and search term choices. Implications of the findings for sexual information seeking are discussed.

  1. Extending OLAP Querying to External Object

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Shoshani, Arie; Gu, Junmin

    On-Line Analytical Processing (OLAP) systems based on a dimensional view of data have found widespread use in business applications and are being used increasingly in non-standard applications. These systems provide good performance and ease-of-use. However, the complex structures and relationships...... inherent in data in nonstandard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, multi-model federated system...... that enables OLAP users to exploit simultaneously the features of OLAP and object systems. The system allows data to be handled using the most appropriate data model and technology: OLAP systems for dimensional data and object database systems for more complex, general data. Additionally, physical data...

  2. Query Processing in Ontology-Based Peer-to-Peer Systems

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Harmelen, Frank Van; Giunchiglia, Fausto

    2005-01-01

    The unstructured, heterogeneous and dynamic nature of the Web poses a new challenge to query-answering over multiple data sources. The so-called Semantic Web aims at providing more and semantically richer structures in terms of ontologies and meta-data. A problem that remains is the combined use of

  3. jQuery 2.0 development cookbook

    CERN Document Server

    Revill, Leon

    2014-01-01

    Taking a recipe-based approach, this book presents numerous practical examples that you can use directly in your applications. The book covers the essential issues you will face while developing your web applications and gives you solutions to them. The recipes in this book are written in a manner that rapidly takes you from beginner to expert level.This book is for web developers of all skill levels. Although some knowledge of JavaScript, HTML, and CSS is required, this Cookbook will teach jQuery newcomers all the basics required to move on to the more complex examples of this book, which wil

  4. JavaScript & jQuery The Missing Manual

    CERN Document Server

    McFarland, David

    2011-01-01

    JavaScript lets you supercharge your HTML with animation, interactivity, and visual effects-but many web designers find the language hard to learn. This jargon-free guide covers JavaScript basics and shows you how to save time and effort with the jQuery library of prewritten JavaScript code. You'll soon be building web pages that feel and act like desktop programs, without having to do much programming. The important stuff you need to know: Make your pages interactive. Create JavaScript events that react to visitor actions.Use animations and effects. Build drop-down navigation menus, pop-ups

  5. Optimizing Temporal Queries: Efficient Handling of Duplicates

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2001-01-01

    , these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the- art relational products. This paper presents an optimization technique that produces more efficient...... translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages....

  6. The NIF LinkOut broker: a web resource to facilitate federated data integration using NCBI identifiers.

    Science.gov (United States)

    Marenco, Luis; Ascoli, Giorgio A; Martone, Maryann E; Shepherd, Gordon M; Miller, Perry L

    2008-09-01

    This paper describes the NIF LinkOut Broker (NLB) that has been built as part of the Neuroscience Information Framework (NIF) project. The NLB is designed to coordinate the assembly of links to neuroscience information items (e.g., experimental data, knowledge bases, and software tools) that are (1) accessible via the Web, and (2) related to entries in the National Center for Biotechnology Information's (NCBI's) Entrez system. The NLB collects these links from each resource and passes them to the NCBI which incorporates them into its Entrez LinkOut service. In this way, an Entrez user looking at a specific Entrez entry can LinkOut directly to related neuroscience information. The information stored in the NLB can also be utilized in other ways. A second approach, which is operational on a pilot basis, is for the NLB Web server to create dynamically its own Web page of LinkOut links for each NCBI identifier in the NLB database. This approach can allow other resources (in addition to the NCBI Entrez) to LinkOut to related neuroscience information. The paper describes the current NLB system and discusses certain design issues that arose during its implementation.

  7. Does query expansion limit our learning? A comparison of social-based expansion to content-based expansion for medical queries on the internet.

    Science.gov (United States)

    Pentoney, Christopher; Harwell, Jeff; Leroy, Gondy

    2014-01-01

    Searching for medical information online is a common activity. While it has been shown that forming good queries is difficult, Google's query suggestion tool, a type of query expansion, aims to facilitate query formation. However, it is unknown how this expansion, which is based on what others searched for, affects the information gathering of the online community. To measure the impact of social-based query expansion, this study compared it with content-based expansion, i.e., what is really in the text. We used 138,906 medical queries from the AOL User Session Collection and expanded them using Google's Autocomplete method (social-based) and the content of the Google Web Corpus (content-based). We evaluated the specificity and ambiguity of the expansion terms for trigram queries. We also looked at the impact on the actual results using domain diversity and expansion edit distance. Results showed that the social-based method provided more precise expansion terms as well as terms that were less ambiguous. Expanded queries do not differ significantly in diversity when expanded using the social-based method (6.72 different domains returned in the first ten results, on average) vs. content-based method (6.73 different domains, on average).

  8. Beginning ASPNET Web Pages with WebMatrix

    CERN Document Server

    Brind, Mike

    2011-01-01

    Learn to build dynamic web sites with Microsoft WebMatrix Microsoft WebMatrix is designed to make developing dynamic ASP.NET web sites much easier. This complete Wrox guide shows you what it is, how it works, and how to get the best from it right away. It covers all the basic foundations and also introduces HTML, CSS, and Ajax using jQuery, giving beginning programmers a firm foundation for building dynamic web sites.Examines how WebMatrix is expected to become the new recommended entry-level tool for developing web sites using ASP.NETArms beginning programmers, students, and educators with al

  9. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    A new language is introduced for describing hypotheses about fluctuations of measurable properties in streams of timestamped data, and as prime example, we consider trends of emotions in the constantly flowing stream of Twitter messages. The language, called EmoEpisodes, has a precise semantics...... that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return...... instantiations for topics and emotions as well as time intervals that provide the largest deflections in this measurement. Experiments are performed on a selection of Twitter data to demonstrates the usefulness of the approach....

  10. Query containment in entity SQL

    OpenAIRE

    Rull Fort, Guillem; Bernstein, Philip A.; Garcia dos Santos, Ivo; Katsis, Yannis; Melnik, Sergey; Teniente López, Ernest

    2013-01-01

    We describe a software architecture we have developed for a constructive containment checker of Entity SQL queries defined over extended ER schemas expressed in Microsoft's Entity Data Model. Our application of interest is compilation of object-to-relational mappings for Microsoft's ADO.NET Entity Framework, which has been shipping since 2007. The supported language includes several features which have been individually addressed in the past but, to the best of our knowledge, they have not be...

  11. Ontology-Based Querying with Bio2RDF's Linked Open Data.

    Science.gov (United States)

    Callahan, Alison; Cruz-Toledo, José; Dumontier, Michel

    2013-04-15

    A key activity for life scientists in this post "-omics" age involves searching for and integrating biological data from a multitude of independent databases. However, our ability to find relevant data is hampered by non-standard web and database interfaces backed by an enormous variety of data formats. This heterogeneity presents an overwhelming barrier to the discovery and reuse of resources which have been developed at great public expense.To address this issue, the open-source Bio2RDF project promotes a simple convention to integrate diverse biological data using Semantic Web technologies. However, querying Bio2RDF remains difficult due to the lack of uniformity in the representation of Bio2RDF datasets. We describe an update to Bio2RDF that includes tighter integration across 19 new and updated RDF datasets. All available open-source scripts were first consolidated to a single GitHub repository and then redeveloped using a common API that generates normalized IRIs using a centralized dataset registry. We then mapped dataset specific types and relations to the Semanticscience Integrated Ontology (SIO) and demonstrate simplified federated queries across multiple Bio2RDF endpoints. This coordinated release marks an important milestone for the Bio2RDF open source linked data framework. Principally, it improves the quality of linked data in the Bio2RDF network and makes it easier to access or recreate the linked data locally. We hope to continue improving the Bio2RDF network of linked data by identifying priority databases and increasing the vocabulary coverage to additional dataset vocabularies beyond SIO.

  12. The Geodetic Seamless Archive Centers Service Layer: A System Architecture for Federating Geodesy Data Repositories

    Science.gov (United States)

    McWhirter, J.; Boler, F. M.; Bock, Y.; Jamason, P.; Squibb, M. B.; Noll, C. E.; Blewitt, G.; Kreemer, C. W.

    2010-12-01

    Three geodesy Archive Centers, Scripps Orbit and Permanent Array Center (SOPAC), NASA's Crustal Dynamics Data Information System (CDDIS) and UNAVCO are engaged in a joint effort to define and develop a common Web Service Application Programming Interface (API) for accessing geodetic data holdings. This effort is funded by the NASA ROSES ACCESS Program to modernize the original GPS Seamless Archive Centers (GSAC) technology which was developed in the 1990s. A new web service interface, the GSAC-WS, is being developed to provide uniform and expanded mechanisms through which users can access our data repositories. In total, our respective archives hold tens of millions of files and contain a rich collection of site/station metadata. Though we serve similar user communities, we currently provide a range of different access methods, query services and metadata formats. This leads to a lack of consistency in the userís experience and a duplication of engineering efforts. The GSAC-WS API and its reference implementation in an underlying Java-based GSAC Service Layer (GSL) supports metadata and data queries into site/station oriented data archives. The general nature of this API makes it applicable to a broad range of data systems. The overall goals of this project include providing consistent and rich query interfaces for end users and client programs, the development of enabling technology to facilitate third party repositories in developing these web service capabilities and to enable the ability to perform data queries across a collection of federated GSAC-WS enabled repositories. A fundamental challenge faced in this project is to provide a common suite of query services across a heterogeneous collection of data yet enabling each repository to expose their specific metadata holdings. To address this challenge we are developing a "capabilities" based service where a repository can describe its specific query and metadata capabilities. Furthermore, the architecture of

  13. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim; Mansour, Essam; Ouzzani, Mourad; Aboulnaga, Ashraf; Kalnis, Panos

    2017-01-01

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query

  14. a Novel Approach of Indexing and Retrieving Spatial Polygons for Efficient Spatial Region Queries

    Science.gov (United States)

    Zhao, J. H.; Wang, X. Z.; Wang, F. Y.; Shen, Z. H.; Zhou, Y. C.; Wang, Y. L.

    2017-10-01

    Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get response in real time. Spatial indexes are usually used in this situation. In this paper, based on k-d tree, we introduce a distributed KD-Tree (DKD-Tree) suitbable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of Remote Sensing images metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region query, but also has good scalability, Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.

  15. A NOVEL APPROACH OF INDEXING AND RETRIEVING SPATIAL POLYGONS FOR EFFICIENT SPATIAL REGION QUERIES

    Directory of Open Access Journals (Sweden)

    J. H. Zhao

    2017-10-01

    Full Text Available Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get response in real time. Spatial indexes are usually used in this situation. In this paper, based on k-d tree, we introduce a distributed KD-Tree (DKD-Tree suitbable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of Remote Sensing images metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region query, but also has good scalability, Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.

  16. On tractable query evaluation for SPARQL

    OpenAIRE

    Mengel, Stefan; Skritek, Sebastian

    2017-01-01

    Despite much work within the last decade on foundational properties of SPARQL - the standard query language for RDF data - rather little is known about the exact limits of tractability for this language. In particular, this is the case for SPARQL queries that contain the OPTIONAL-operator, even though it is one of the most intensively studied features of SPARQL. The aim of our work is to provide a more thorough picture of tractable classes of SPARQL queries. In general, SPARQL query evaluatio...

  17. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    in road networks. Such queries may be of use in many services. Specifically, we present an easily implementable data model that serves well as a foundation for such queries. We also present the design of a prototype system that implements the queries based on the data model. The algorithm used...

  18. Advanced Query Formulation in Deductive Databases.

    Science.gov (United States)

    Niemi, Timo; Jarvelin, Kalervo

    1992-01-01

    Discusses deductive databases and database management systems (DBMS) and introduces a framework for advanced query formulation for end users. Recursive processing is described, a sample extensional database is presented, query types are explained, and criteria for advanced query formulation from the end user's viewpoint are examined. (31…

  19. On the formulation of performant sparql queries

    NARCIS (Netherlands)

    Loizou, A.; Angles, R.; Groth, P.T.

    2014-01-01

    Abstract The combination of the flexibility of RDF and the expressiveness of SPARQL provides a powerful mechanism to model, integrate and query data. However, these properties also mean that it is nontrivial to write performant SPARQL queries. Indeed, it is quite easy to create queries that tax even

  20. How Good Are Query Optimizers, Really?

    NARCIS (Netherlands)

    Leis, Viktor; Gubichev, Andrey; Mirchev, Atanas; Boncz, Peter; Kemper, Alfons; Neumann, Thomas

    2016-01-01

    Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the

  1. Predecessor queries in dynamic integer sets

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting

    1997-01-01

    We consider the problem of maintaining a set of n integers in the range 0.2w–1 under the operations of insertion, deletion, predecessor queries, minimum queries and maximum queries on a unit cost RAM with word size w bits. Let f (n) be an arbitrary nondecreasing smooth function satisfying n...

  2. Mobile Information Access with Spoken Query Answering

    DEFF Research Database (Denmark)

    Brøndsted, Tom; Larsen, Henrik Legind; Larsen, Lars Bo

    2006-01-01

    window focused over the part which most likely contains an answer to the query. The two systems are integrated into a full spoken query answering system. The prototype can answer queries and questions within the chosen football (soccer) test domain, but the system has the flexibility for being ported...

  3. Truth Space Method for Caching Database Queries

    Directory of Open Access Journals (Sweden)

    S. V. Mosin

    2015-01-01

    Full Text Available We propose a new method of client-side data caching for relational databases with a central server and distant clients. Data are loaded into the client cache based on queries executed on the server. Every query has the corresponding DB table – the result of the query execution. These queries have a special form called "universal relational query" based on three fundamental Relational Algebra operations: selection, projection and natural join. We have to mention that such a form is the closest one to the natural language and the majority of database search queries can be expressed in this way. Besides, this form allows us to analyze query correctness by checking lossless join property. A subsequent query may be executed in a client’s local cache if we can determine that the query result is entirely contained in the cache. For this we compare truth spaces of the logical restrictions in a new user’s query and the results of the queries execution in the cache. Such a comparison can be performed analytically , without need in additional Database queries. This method may be used to define lacking data in the cache and execute the query on the server only for these data. To do this the analytical approach is also used, what distinguishes our paper from the existing technologies. We propose four theorems for testing the required conditions. The first and the third theorems conditions allow us to define the existence of required data in cache. The second and the fourth theorems state conditions to execute queries with cache only. The problem of cache data actualizations is not discussed in this paper. However, it can be solved by cataloging queries on the server and their serving by triggers in background mode. The article is published in the author’s wording.

  4. Integration, Provenance, and Temporal Queries for Large-Scale Knowledge Bases

    OpenAIRE

    Gao, Shi

    2016-01-01

    Knowledge bases that summarize web information in RDF triples deliver many benefits, including support for natural language question answering and powerful structured queries that extract encyclopedic knowledge via SPARQL. Large scale knowledge bases grow rapidly in terms of scale and significance, and undergo frequent changes in both schema and content. Two critical problems have thus emerged: (i) how to support temporal queries that explore the history of knowledge bases or flash-back to th...

  5. APLICANDO TRAÇOS DE ACESSIBILIDADE E USABILIDADE WEB MÓVEL NA UNIVERSIDADE FEDERAL DE SERGIPE: RESPEITO À CIDADANIA E À INCLUSÃO DIGITAL

    Directory of Open Access Journals (Sweden)

    Christiano Santos Santana

    2012-12-01

    Full Text Available O crescente acesso às redes móveis expõe a UFS a uma necessidade vertente na atualização de seus padrões web anteriormente estabelecidos, remetendo às necessidades de reconfiguração de padrões objetivando favorecer a inclusão social e digital à comunidade sergipana. Essa pesquisa procurou analisar os problemas e aspectos que envolvem as questões de Acessibilidade e Usabilidade da web móvel relacionados ao site da Pós-Graduação da Universidade Federal de Sergipe (POSGRAP-UFS. Considerando-se a importância do uso desse portal para o meio acadêmico e comunitário, criou-se uma versão móvel considerando-se as normas e padrões do W3C (World Wide Web Consortium. Os resultados concretizaram melhorias na interface, layout e arquitetura na nova versão quando comparado à versão Desktop, reiterando, então, o respeito à cidadania, a inclusão digital e social. Em síntese, a presente pesquisa procurou estabelecer a equidade tecnológica, servindo de subsídio para o desenvolvimento futuro de outros sub-portais da UFS e em resposta, facilitando o acesso para os usuários do sistema, reduzindo os possíveis obstáculos nas interações pré-existentes via dispositivos móveis.

  6. Federal Fleet Report

    Data.gov (United States)

    General Services Administration — Annual report of Federal agencies' motor vehicle fleet data collected in the Federal Automotive Statistical Tool (FAST), a web-based reporting tool cosponsored by...

  7. Digital Earth Watch (DEW): How Mobile Apps Are Paving The Way Towards A Federated Web-Services Architecture For Citizen Science

    Science.gov (United States)

    Carrera, F.; Schloss, A. L.; Guerin, S.; Beaudry, J.; Pickle, J.

    2011-12-01

    interacts with the UNH server via APIs (Application Programming Interfaces) that were created to allow bi-directional machine-to-machine interaction between the mobile device and the web site. Thus, the principal functions that a user can perform on the web site, such as finding post sites on a map and viewing and adding picture sets, are available on the smartphone. The development of the APIs makes it now possible not only to communicate with our own mobile app, but, more importantly, it opens the door for other computer systems to directly interact with our server. Our ongoing discussions with the National Phenology Network and Project Budburst, have highlighted the potential (and perhaps the need) for the creation of a distributed web-service architecture whereby each national program exposes its key functionalities not only to their own mobile phone apps, but also to other organizations, in a federated system of servers, all supporting citizen-based digital earth watch programs.

  8. Preventing SQL Injection through Automatic Query Sanitization with ASSIST

    Directory of Open Access Journals (Sweden)

    Raymond Mui

    2010-09-01

    Full Text Available Web applications are becoming an essential part of our everyday lives. Many of our activities are dependent on the functionality and security of these applications. As the scale of these applications grows, injection vulnerabilities such as SQL injection are major security challenges for developers today. This paper presents the technique of automatic query sanitization to automatically remove SQL injection vulnerabilities in code. In our technique, a combination of static analysis and program transformation are used to automatically instrument web applications with sanitization code. We have implemented this technique in a tool named ASSIST (Automatic and Static SQL Injection Sanitization Tool for protecting Java-based web applications. Our experimental evaluation showed that our technique is effective against SQL injection vulnerabilities and has a low overhead.

  9. Design and evaluation of a NoSQL database for storing and querying RDF data

    Directory of Open Access Journals (Sweden)

    Kanda Runapongsa Saikaew

    2014-12-01

    Full Text Available Currently the amount of web data has increased excessively. Its metadata is widely used in order to fully exploit web information resources. This causes the need for Semantic Web technology to quickly analyze such big data. Resource Description Framework (RDF is a standard for describing web resources. In this paper, we propose a method to exploit a NoSQL database, specifically MongoDB, to store and query RDF data. We choose MongoDB to represent a NoSQL database because it is one of the most popular high-performance NoSQL databases. We evaluate the proposed design and implementation by using the Berlin SPARQL Benchmark, which is one of the most widely accepted benchmarks for comparing the performance of RDF storage systems. We compare three database systems, which are Apache Jena TDB (native RDF store, MySQL (relational database, and our proposed system with MongoDB (NoSQL database. Based on the experimental results analysis, our proposed system outperforms other database systems for most queries when the data set size is small. However, for a larger data set, MongoDB performs well for queries with simple operators while MySQL offers an efficient solution for complex queries. The result of this work can provide some guideline for choosing an appropriate RDF database system and applying a NoSQL database in storing and querying RDF data.

  10. Towards Automatic Improvement of Patient Queries in Health Retrieval Systems

    Directory of Open Access Journals (Sweden)

    Nesrine KSENTINI

    2016-07-01

    Full Text Available With the adoption of health information technology for clinical health, e-health is becoming usual practice today. Users of this technology find it difficult to seek information relevant to their needs due to the increasing amount of the clinical and medical data on the web, and the lack of knowledge of medical jargon. In this regards, a method is described to improve user's needs by automatically adding new related terms to their queries which appear in the same context of the original query in order to improve final search results. This method is based on the assessment of semantic relationships defined by a proposed statistical method between a set of terms or keywords. Experiments were performed on CLEF-eHealth-2015 database and the obtained results show the effectiveness of our proposed method.

  11. Improving Web Page Retrieval using Search Context from Clicked Domain Names

    NARCIS (Netherlands)

    Li, R.

    Search context is a crucial factor that helps to understand a user’s information need in ad-hoc Web page retrieval. A query log of a search engine contains rich information on issued queries and their corresponding clicked Web pages. The clicked data implies its relevance to the query and can be

  12. Advanced Query and Data Mining Capabilities for MaROS

    Science.gov (United States)

    Wang, Paul; Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Hy, Franklin H.

    2013-01-01

    The Mars Relay Operational Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay network. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. As part of MaROS, the innovators have developed and implemented a feature set that operates on several levels of the software architecture. This new feature is an advanced querying capability through either the Web-based user interface, or through a back-end REST interface to access all of the data gathered from the network. This software is not meant to replace the REST interface, but to augment and expand the range of available data. The current REST interface provides specific data that is used by the MaROS Web application to display and visualize the information; however, the returned information from the REST interface has typically been pre-processed to return only a subset of the entire information within the repository, particularly only the information that is of interest to the GUI (graphical user interface). The new, advanced query and data mining capabilities allow users to retrieve the raw data and/or to perform their own data processing. The query language used to access the repository is a restricted subset of the structured query language (SQL) that can be built safely from the Web user interface, or entered as freeform SQL by a user. The results are returned in a CSV (Comma Separated Values) format for easy exporting to third party tools and applications that can be used for data mining or user-defined visualization and interpretation. This is the first time that a service is capable of providing access to all cross-project relay data from a single Web resource. Because MaROS contains the data for a variety of missions from the Mars network, which span both NASA and ESA, the software also establishes an access control list (ACL) on each data record

  13. Digging Deeper: The Deep Web.

    Science.gov (United States)

    Turner, Laura

    2001-01-01

    Focuses on the Deep Web, defined as Web content in searchable databases of the type that can be found only by direct query. Discusses the problems of indexing; inability to find information not indexed in the search engine's database; and metasearch engines. Describes 10 sites created to access online databases or directly search them. Lists ways…

  14. Distributed Deep Web Search

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien

    2013-01-01

    The World Wide Web contains billions of documents (and counting); hence, it is likely that some document will contain the answer or content you are searching for. While major search engines like Bing and Google often manage to return relevant results to your query, there are plenty of situations in

  15. COEUS: "semantic web in a box" for biomedical applications.

    Science.gov (United States)

    Lopes, Pedro; Oliveira, José Luís

    2012-12-17

    As the "omics" revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter's complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a "semantic web in a box" approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.

  16. Cost reduction for web-based data imputation

    KAUST Repository

    Li, Zhixu; Shang, Shuo; Xie, Qing; Zhang, Xiangliang

    2014-01-01

    Web-based Data Imputation enables the completion of incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity

  17. Sharing and executing linked data queries in a collaborative environment.

    Science.gov (United States)

    García Godoy, María Jesús; López-Camacho, Esteban; Navas-Delgado, Ismael; Aldana-Montes, José F

    2013-07-01

    Life Sciences have emerged as a key domain in the Linked Data community because of the diversity of data semantics and formats available through a great variety of databases and web technologies. Thus, it has been used as the perfect domain for applications in the web of data. Unfortunately, bioinformaticians are not exploiting the full potential of this already available technology, and experts in Life Sciences have real problems to discover, understand and devise how to take advantage of these interlinked (integrated) data. In this article, we present Bioqueries, a wiki-based portal that is aimed at community building around biological Linked Data. This tool has been designed to aid bioinformaticians in developing SPARQL queries to access biological databases exposed as Linked Data, and also to help biologists gain a deeper insight into the potential use of this technology. This public space offers several services and a collaborative infrastructure to stimulate the consumption of biological Linked Data and, therefore, contribute to implementing the benefits of the web of data in this domain. Bioqueries currently contains 215 query entries grouped by database and theme, 230 registered users and 44 end points that contain biological Resource Description Framework information. The Bioqueries portal is freely accessible at http://bioqueries.uma.es. Supplementary data are available at Bioinformatics online.

  18. Query Migration from Object Oriented World to Semantic World

    Directory of Open Access Journals (Sweden)

    Nassima Soussi

    2016-06-01

    Full Text Available In the last decades, object-oriented approach was able to take a large share of databases market aiming to design and implement structured and reusable software through the composition of independent elements in order to have programs with a high performance. On the other hand, the mass of information stored in the web is increasing day after day with a vertiginous speed, exposing the currently web faced with the problem of creating a bridge so as to facilitate access to data between different applications and systems as well as to look for relevant and exact information wished by users. In addition, all existing approach of rewriting object oriented languages to SPARQL language rely on models transformation process to guarantee this mapping. All the previous raisons has prompted us to write this paper in order to bridge an important gap between these two heterogeneous worlds (object oriented and semantic web world by proposing the first provably semantics preserving OQLto-SPARQL translation algorithm for each element of OQL Query (SELECT clause, FROM clause, FILTER constraint, implicit/ explicit join and union/intersection SELECT queries.

  19. Optimal Planar Orthogonal Skyline Counting Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Larsen, Kasper Green

    2014-01-01

    counting queries, i.e. given a query rectangle R to report the size of the skyline of P\\cap R. We present a data structure for storing n points with integer coordinates having query time O(lg n/lglg n) and space usage O(n). The model of computation is a unit cost RAM with logarithmic word size. We prove...

  20. Adding query privacy to robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2012-01-01

    intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... privacy over robust DHTs. Finally, we compare the performance of our privacy-preserving protocols with their more privacy-invasive counterparts. We observe that there is no increase in the message complexity...

  1. jQuery Tools UI Library

    CERN Document Server

    Libby, Alex

    2012-01-01

    A practical tutorial with powerful yet simple projects that are quick to implement. This book is aimed at developers who have prior jQuery knowledge, but may not have any prior experience with jQuery Tools. It is possible that they may have started with the basics of jQuery Tools, but want to learn more about how it can be used, as well as get ideas for future projects.

  2. Secure Skyline Queries on Cloud Platform.

    Science.gov (United States)

    Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian

    2017-04-01

    Outsourcing data and computation to cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can be also used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.

  3. A structural query system for Han characters

    DEFF Research Database (Denmark)

    Skala, Matthew

    2016-01-01

    The IDSgrep structural query system for Han character dictionaries is presented. This dictionary search system represents the spatial structure of Han characters using Extended Ideographic Description Sequences (EIDSes), a data model and syntax based on the Unicode IDS concept. It includes a query...... language for EIDS databases, with a freely available implementation and format translation from popular third-party IDS and XML character databases. The system is designed to suit the needs of font developers and foreign language learners. The search algorithm includes a bit vector index inspired by Bloom...... filters to support faster query operations. Experimental results are presented, evaluating the effect of the indexing on query performance....

  4. Adding Query Privacy to Robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2011-01-01

    intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... of obtaining query privacy over robust DHTs. Finally, we compare the performance of our privacy-preserving protocols with their more privacy-invasive counterparts. We observe that there is no increase in the message complexity and only a small overhead in the computational complexity....

  5. Retrieving top-k prestige-based relevant spatial web objects

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2010-01-01

    The location-aware keyword query returns ranked objects that are near a query location and that have textual descriptions that match query keywords. This query occurs inherently in many types of mobile and traditional web services and applications, e.g., Yellow Pages and Maps services. Previous...... of prestige-based relevance to capture both the textual relevance of an object to a query and the effects of nearby objects. Based on this, a new type of query, the Location-aware top-k Prestige-based Text retrieval (LkPT) query, is proposed that retrieves the top-k spatial web objects ranked according...... to both prestige-based relevance and location proximity. We propose two algorithms that compute LkPT queries. Empirical studies with real-world spatial data demonstrate that LkPT queries are more effective in retrieving web objects than a previous approach that does not consider the effects of nearby...

  6. Automatically exposing OpenLifeData via SADI semantic Web Services.

    Science.gov (United States)

    González, Alejandro Rodríguez; Callahan, Alison; Cruz-Toledo, José; Garcia, Adrian; Egaña Aranguren, Mikel; Dumontier, Michel; Wilkinson, Mark D

    2014-01-01

    Two distinct trends are emerging with respect to how data is shared, collected, and analyzed within the bioinformatics community. First, Linked Data, exposed as SPARQL endpoints, promises to make data easier to collect and integrate by moving towards the harmonization of data syntax, descriptive vocabularies, and identifiers, as well as providing a standardized mechanism for data access. Second, Web Services, often linked together into workflows, normalize data access and create transparent, reproducible scientific methodologies that can, in principle, be re-used and customized to suit new scientific questions. Constructing queries that traverse semantically-rich Linked Data requires substantial expertise, yet traditional RESTful or SOAP Web Services cannot adequately describe the content of a SPARQL endpoint. We propose that content-driven Semantic Web Services can enable facile discovery of Linked Data, independent of their location. We use a well-curated Linked Dataset - OpenLifeData - and utilize its descriptive metadata to automatically configure a series of more than 22,000 Semantic Web Services that expose all of its content via the SADI set of design principles. The OpenLifeData SADI services are discoverable via queries to the SHARE registry and easy to integrate into new or existing bioinformatics workflows and analytical pipelines. We demonstrate the utility of this system through comparison of Web Service-mediated data access with traditional SPARQL, and note that this approach not only simplifies data retrieval, but simultaneously provides protection against resource-intensive queries. We show, through a variety of different clients and examples of varying complexity, that data from the myriad OpenLifeData can be recovered without any need for prior-knowledge of the content or structure of the SPARQL endpoints. We also demonstrate that, via clients such as SHARE, the complexity of federated SPARQL queries is dramatically reduced.

  7. Supercharged JavaScript Graphics with HTML5 canvas, jQuery, and More

    CERN Document Server

    Cecco, Raffaele

    2011-01-01

    With HTML5 and improved web browser support, JavaScript has become the tool of choice for creating high-performance web graphics. This faced-paced book shows you how to use JavaScript, jQuery, DHTML, and HTML5's Canvas element to create rich web applications for computers and mobile devices. By following real-world examples, experienced web developers learn fun and useful approaches to arcade games, DHTML effects, business dashboards, and other applications. This book serves complex subjects in easily digestible pieces, and each topic acts as a foundation for the next. Tackle JavaScript opti

  8. Web Services and Other Enhancements at the Northern California Earthquake Data Center

    Science.gov (United States)

    Neuhauser, D. S.; Zuzlewski, S.; Allen, R. M.

    2012-12-01

    The Northern California Earthquake Data Center (NCEDC) provides data archive and distribution services for seismological and geophysical data sets that encompass northern California. The NCEDC is enhancing its ability to deliver rapid information through Web Services. NCEDC Web Services use well-established web server and client protocols and REST software architecture to allow users to easily make queries using web browsers or simple program interfaces and to receive the requested data in real-time rather than through batch or email-based requests. Data are returned to the user in the appropriate format such as XML, RESP, or MiniSEED depending on the service, and are compatible with the equivalent IRIS DMC web services. The NCEDC is currently providing the following Web Services: (1) Station inventory and channel response information delivered in StationXML format, (2) Channel response information delivered in RESP format, (3) Time series availability delivered in text and XML formats, (4) Single channel and bulk data request delivered in MiniSEED format. The NCEDC is also developing a rich Earthquake Catalog Web Service to allow users to query earthquake catalogs based on selection parameters such as time, location or geographic region, magnitude, depth, azimuthal gap, and rms. It will return (in QuakeML format) user-specified results that can include simple earthquake parameters, as well as observations such as phase arrivals, codas, amplitudes, and computed parameters such as first motion mechanisms, moment tensors, and rupture length. The NCEDC will work with both IRIS and the International Federation of Digital Seismograph Networks (FDSN) to define a uniform set of web service specifications that can be implemented by multiple data centers to provide users with a common data interface across data centers. The NCEDC now hosts earthquake catalogs and waveforms from the US Department of Energy (DOE) Enhanced Geothermal Systems (EGS) monitoring networks. These

  9. Indexing, Query Processing, and Clustering of Spatio-Temporal Text Objects

    DEFF Research Database (Denmark)

    Skovsgaard, Anders

    With the increasing mobile use of the web from geo-positioned devices, the Internet is increasingly acquiring a spatial aspect, with still more types of content being geo-tagged. As a result of this development, a wide range of location-aware queries and applications have emerged. The large amounts...... of data available coupled with the increasing number of location-aware queries calls for efficient indexing and query processing techniques. This dissertation investigates how to manage geo-tagged text content to support these workloads in three specific areas: (i) grouping of spatio-textual objects, (ii......, the grouping of spatio-textual objects is done without considering query locations, and a clustering approach is proposed that takes into account both the spatial and textual attributes of the objects. The technique expands clusters based on a proposed quality function that enables clusters of arbitrary shape...

  10. Searchable Data Vault: Encrypted Queries in Secure Distributed Cloud Storage

    Directory of Open Access Journals (Sweden)

    Geong Sen Poh

    2017-05-01

    Full Text Available Cloud storage services allow users to efficiently outsource their documents anytime and anywhere. Such convenience, however, leads to privacy concerns. While storage providers may not read users’ documents, attackers may possibly gain access by exploiting vulnerabilities in the storage system. Documents may also be leaked by curious administrators. A simple solution is for the user to encrypt all documents before submitting them. This method, however, makes it impossible to efficiently search for documents as they are all encrypted. To resolve this problem, we propose a multi-server searchable symmetric encryption (SSE scheme and construct a system called the searchable data vault (SDV. A unique feature of the scheme is that it allows an encrypted document to be divided into blocks and distributed to different storage servers so that no single storage provider has a complete document. By incorporating the scheme, the SDV protects the privacy of documents while allowing for efficient private queries. It utilizes a web interface and a controller that manages user credentials, query indexes and submission of encrypted documents to cloud storage services. It is also the first system that enables a user to simultaneously outsource and privately query documents from a few cloud storage services. Our preliminary performance evaluation shows that this feature introduces acceptable computation overheads when compared to submitting documents directly to a cloud storage service.

  11. How Do Children Reformulate Their Search Queries?

    Science.gov (United States)

    Rutter, Sophie; Ford, Nigel; Clough, Paul

    2015-01-01

    Introduction: This paper investigates techniques used by children in year 4 (age eight to nine) of a UK primary school to reformulate their queries, and how they use information retrieval systems to support query reformulation. Method: An in-depth study analysing the interactions of twelve children carrying out search tasks in a primary school…

  12. The Data Cyclotron query processing scheme

    NARCIS (Netherlands)

    Goncalves, R.; Kersten, M.

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  13. The Data Cyclotron query processing scheme.

    NARCIS (Netherlands)

    R.A. Goncalves (Romulo); M.L. Kersten (Martin)

    2011-01-01

    htmlabstractA grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  14. Querying and Mining Strings Made Easy

    KAUST Repository

    Sahli, Majed

    2017-10-13

    With the advent of large string datasets in several scientific and business applications, there is a growing need to perform ad-hoc analysis on strings. Currently, strings are stored, managed, and queried using procedural codes. This limits users to certain operations supported by existing procedural applications and requires manual query planning with limited tuning opportunities. This paper presents StarQL, a generic and declarative query language for strings. StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization. String analytic queries are too intricate to be solved on one machine. Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures. Our evaluation shows that StarQL is able to express workloads of application-specific tools, such as BLAST and KAT in bioinformatics, and to mine Wikipedia text for interesting patterns using declarative queries. Furthermore, the StarQL query optimizer shows an order of magnitude reduction in query execution time.

  15. Exploiting External Collections for Query Expansion

    NARCIS (Netherlands)

    Weerkamp, W.; Balog, K.; de Rijke, M.

    2012-01-01

    A persisting challenge in the field of information retrieval is the vocabulary mismatch between a user’s information need and the relevant documents. One way of addressing this issue is to apply query modeling: to add terms to the original query and reweigh the terms. In social media, where

  16. A semantic perspective on query log analysis

    NARCIS (Netherlands)

    Hofmann, K.; de Rijke, M.; Huurnink, B.; Meij, E.

    2009-01-01

    We present our views on the CLEF log file analysis task. We argue for a task definition that focuses on the semantic enrichment of query logs. In addition, we discuss how additional information about the context in which queries are being made could further our understanding of users’ information

  17. A general approach to query flattening

    NARCIS (Netherlands)

    van Ruth, J.

    The translation of queries from complex data models to simpler data models is a recurring theme in the construction of efficient data management systems. In this paper we propose a general framework to guide the translation from data models with nested types to a flat relational model (query

  18. A Multi-Query Optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    textabstractDatabase systems allow for concurrent use of several applications (and query interfaces). Each application generates an ``optimal'' plan---a sequence of low-level database operators---for accessing the database. The queries posed by users through the same application can be optimized

  19. A multi-query optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    textabstractDatabase systems allow for concurrent use of several applications (and query interfaces). Each application generates an ``optimal'' plan---a sequence of low-level database operators---for accessing the database. The queries posed by users through the same application can be optimized

  20. Path Minima Queries in Dynamic Weighted Trees

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao

    2011-01-01

    In the path minima problem on a tree, each edge is assigned a weight and a query asks for the edge with minimum weight on a path between two nodes. For the dynamic version of the problem, where the edge weights can be updated, we give data structures that achieve optimal query time\\todo{what about...

  1. Querying Business Process Models with VMQL

    DEFF Research Database (Denmark)

    Störrle, Harald; Acretoaie, Vlad

    2013-01-01

    The Visual Model Query Language (VMQL) has been invented with the objectives (1) to make it easier for modelers to query models effectively, and (2) to be universally applicable to all modeling languages. In previous work, we have applied VMQL to UML, and validated the first of these two claims. ...

  2. Efficient Approximate OLAP Querying Over Time Series

    DEFF Research Database (Denmark)

    Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang

    2016-01-01

    The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP...... queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume...... of data grows. This is a particular problem when querying time series data, which generally contains multiple measures recorded at fine time granularities. Usually, this issue is addressed either by scaling up hardware or by employing workload based query optimization techniques. However, these solutions...

  3. PiCO QL: A software library for runtime interactive queries on program data

    Science.gov (United States)

    Fragkoulis, Marios; Spinellis, Diomidis; Louridas, Panos

    PiCO QL is an open source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system's or application's data structures, which is queryable through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. PiCO QL makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source code analyser.

  4. PiCO QL: A software library for runtime interactive queries on program data

    Directory of Open Access Journals (Sweden)

    Marios Fragkoulis

    2016-01-01

    Full Text Available Pico ql is an open source c/c++ software whose scientific scope is real-time interactive analysis of in-memory data through sql queries. It exposes a relational view of a system’s or application’s data structures, which is queryable through sql. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. pico ql makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of pico ql include the Linux kernel, the Valgrind instrumentation framework, a gis application, a virtual real-time observatory of stellar objects, and a source code analyser.

  5. Medical Information Retrieval Enhanced with User's Query Expanded with Tag-Neighbors

    DEFF Research Database (Denmark)

    Durao, Frederico; Bayyapu, Karunakar Reddy; Xu, Guandong

    2013-01-01

    Under-specified queries often lead to undesirable search results that do not contain the information needed. This problem gets worse when it comes to medical information, a natural human demand everywhere. Existing search engines on the Web often are unable to handle medical search well because...

  6. Error Analysis of Ia Supernova and Query on Cosmic Dark Energy

    Indian Academy of Sciences (India)

    2016-01-27

    Jan 27, 2016 ... Error Analysis of Ia Supernova and Query on Cosmic Dark Energy. Qiuhe Peng Yiming Hu Kun ... https://www.ias.ac.in/article/fulltext/joaa/035/03/0253-0256 ... Articles are also visible in Web of Science immediately. All these ...

  7. Deep iCrawl: An Intelligent Vision-Based Deep Web Crawler

    OpenAIRE

    R.Anita; V.Ganga Bharani; N.Nityanandam; Pradeep Kumar Sahoo

    2011-01-01

    The explosive growth of World Wide Web has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scene. Deep web pages are created dynamically as a result of queries posed to specific web databases. The structure of the deep web pages makes it impossible for traditional web crawlers to access deep web contents. This paper, Deep iCrawl, gives a novel and vision-based app...

  8. Using search query surveillance to monitor tax avoidance and smoking cessation following the United States' 2009 "SCHIP" cigarette tax increase.

    Science.gov (United States)

    Ayers, John W; Ribisl, Kurt; Brownstein, John S

    2011-03-16

    Smokers can use the web to continue or quit their habit. Online vendors sell reduced or tax-free cigarettes lowering smoking costs, while health advocates use the web to promote cessation. We examined how smokers' tax avoidance and smoking cessation Internet search queries were motivated by the United States' (US) 2009 State Children's Health Insurance Program (SCHIP) federal cigarette excise tax increase and two other state specific tax increases. Google keyword searches among residents in a taxed geography (US or US state) were compared to an untaxed geography (Canada) for two years around each tax increase. Search data were normalized to a relative search volume (RSV) scale, where the highest search proportion was labeled 100 with lesser proportions scaled by how they relatively compared to the highest proportion. Changes in RSV were estimated by comparing means during and after the tax increase to means before the tax increase, across taxed and untaxed geographies. The SCHIP tax was associated with an 11.8% (95% confidence interval [95%CI], 5.7 to 17.9; ptax levels in Canada during the months after the tax. Tax avoidance searches increased 27.9% (95%CI, 15.9 to 39.9; ptax compared to Canada, respectively, suggesting avoidance is the more pronounced and durable response. Trends were similar for state-specific tax increases but suggest strong interactive processes across taxes. When the SCHIP tax followed Florida's tax, versus not, it promoted more cessation and avoidance searches. Efforts to combat tax avoidance and increase cessation may be enhanced by using interventions targeted and tailored to smokers' searches. Search query surveillance is a valuable real-time, free and public method, that may be generalized to other behavioral, biological, informational or psychological outcomes manifested online.

  9. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  10. LexGrid: a framework for representing, storing, and querying biomedical terminologies from simple to sublime.

    Science.gov (United States)

    Pathak, Jyotishman; Solbrig, Harold R; Buntrock, James D; Johnson, Thomas M; Chute, Christopher G

    2009-01-01

    Many biomedical terminologies, classifications, and ontological resources such as the NCI Thesaurus (NCIT), International Classification of Diseases (ICD), Systematized Nomenclature of Medicine (SNOMED), Current Procedural Terminology (CPT), and Gene Ontology (GO) have been developed and used to build a variety of IT applications in biology, biomedicine, and health care settings. However, virtually all these resources involve incompatible formats, are based on different modeling languages, and lack appropriate tooling and programming interfaces (APIs) that hinder their wide-scale adoption and usage in a variety of application contexts. The Lexical Grid (LexGrid) project introduced in this paper is an ongoing community-driven initiative, coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, designed to bridge this gap using a common terminology model called the LexGrid model. The key aspect of the model is to accommodate multiple vocabulary and ontology distribution formats and support of multiple data stores for federated vocabulary distribution. The model provides a foundation for building consistent and standardized APIs to access multiple vocabularies that support lexical search queries, hierarchy navigation, and a rich set of features such as recursive subsumption (e.g., get all the children of the concept penicillin). Existing LexGrid implementations include the LexBIG API as well as a reference implementation of the HL7 Common Terminology Services (CTS) specification providing programmatic access via Java, Web, and Grid services.

  11. Relative aggregation operator in database fuzzy querying

    Directory of Open Access Journals (Sweden)

    Luminita DUMITRIU

    2005-12-01

    Full Text Available Fuzzy selection criteria querying relational databases include vague terms; they usually refer linguistic values form the attribute linguistic domains, defined as fuzzy sets. Generally, when a vague query is processed, the definitions of vague terms must already exist in a knowledge base. But there are also cases when vague terms must be dynamically defined, when a particular operation is used to aggregate simple criteria in a complex selection. The paper presents a new aggregation operator and the corresponding algorithm to evaluate the fuzzy query.

  12. Experimental quantum private queries with linear optics

    International Nuclear Information System (INIS)

    De Martini, Francesco; Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo; Nagali, Eleonora; Sansoni, Linda; Sciarrino, Fabio

    2009-01-01

    The quantum private query is a quantum cryptographic protocol to recover information from a database, preserving both user and data privacy: the user can test whether someone has retained information on which query was asked and the database provider can test the amount of information released. Here we discuss a variant of the quantum private query algorithm that admits a simple linear optical implementation: it employs the photon's momentum (or time slot) as address qubits and its polarization as bus qubit. A proof-of-principle experimental realization is implemented.

  13. Instant MDX queries for SQL Server 2012

    CERN Document Server

    Emond, Nicholas

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. This short, focused guide is a great way to get stated with writing MDX queries. New developers can use this book as a reference for how to use functions and the syntax of a query as well as how to use Calculated Members and Named Sets.This book is great for new developers who want to learn the MDX query language from scratch and install SQL Server 2012 with Analysis Services

  14. Personalizing Web Search based on User Profile

    OpenAIRE

    Utage, Sharyu; Ahire, Vijaya

    2016-01-01

    Web Search engine is most widely used for information retrieval from World Wide Web. These Web Search engines help user to find most useful information. When different users Searches for same information, search engine provide same result without understanding who is submitted that query. Personalized web search it is search technique for proving useful result. This paper models preference of users as hierarchical user profiles. a framework is proposed called UPS. It generalizes profile and m...

  15. SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases.

    Science.gov (United States)

    Chiba, Hirokazu; Uchiyama, Ikuo

    2017-02-08

    Toward improved interoperability of distributed biological databases, an increasing number of datasets have been published in the standardized Resource Description Framework (RDF). Although the powerful SPARQL Protocol and RDF Query Language (SPARQL) provides a basis for exploiting RDF databases, writing SPARQL code is burdensome for users including bioinformaticians. Thus, an easy-to-use interface is necessary. We developed SPANG, a SPARQL client that has unique features for querying RDF datasets. SPANG dynamically generates typical SPARQL queries according to specified arguments. It can also call SPARQL template libraries constructed in a local system or published on the Web. Further, it enables combinatorial execution of multiple queries, each with a distinct target database. These features facilitate easy and effective access to RDF datasets and integrative analysis of distributed data. SPANG helps users to exploit RDF datasets by generation and reuse of SPARQL queries through a simple interface. This client will enhance integrative exploitation of biological RDF datasets distributed across the Web. This software package is freely available at http://purl.org/net/spang .

  16. Vectorization vs. compilation in query execution

    NARCIS (Netherlands)

    J. Sompolski (Juliusz); M. Zukowski (Marcin); P.A. Boncz (Peter)

    2011-01-01

    textabstractCompiling database queries into executable (sub-) programs provides substantial benefits comparing to traditional interpreted execution. Many of these benefits, such as reduced interpretation overhead, better instruction code locality, and providing opportunities to use SIMD

  17. Algebraic Optimization of Recursive Database Queries

    DEFF Research Database (Denmark)

    Hansen, Michael Reichhardt

    1988-01-01

    Queries are expressed by relational algebra expressions including a fixpoint operation. A condition is presented under which a natural join commutes with a fixpoint operation. This condition is a simple check of attribute sets of sub-expressions of the query. The work may be considered a generali......Queries are expressed by relational algebra expressions including a fixpoint operation. A condition is presented under which a natural join commutes with a fixpoint operation. This condition is a simple check of attribute sets of sub-expressions of the query. The work may be considered...... a generalization of Aho and Ullman, (1979). The result is interpreted in function free logic database terms as a transformation of the recursively defined predicate involving: (a) elimination of an argument, and (b) propagation of selections (instantiations) to the extensionally defined predicates. A collection...

  18. Clean Air Markets - Allowances Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Allowances Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://camddataandmaps.epa.gov/gdm/index.cfm. The Allowances...

  19. Clean Air Markets - Compliance Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Compliance Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://ampd.epa.gov/ampd/. The Compliance module provides...

  20. Schedule Sales Query Report Generation System

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  1. A Query System Implementation Case Study.

    Science.gov (United States)

    Hiser, Judith N.; Neil, M. Elizabeth

    1985-01-01

    The Department of Administrative Programming Services of Clemson University investigated products available in user-friendly retrieval systems. The test of INTELLECT, a natural language query system written by Artifical Intelligence Corporation, is described. (Author/MLW)

  2. Path-based Queries on Trajectory Data

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Pelekis, Nikos; Theodoridis, Yannis

    2014-01-01

    In traffic research, management, and planning a number of path-based analyses are heavily used, e.g., for computing turn-times, evaluating green waves, or studying traffic flow. These analyses require retrieving the trajectories that follow the full path being analyzed. Existing path queries cannot...... sufficiently support such path-based analyses because they retrieve all trajectories that touch any edge in the path. In this paper, we define and formalize the strict path query. This is a novel query type tailored to support path-based analysis, where trajectories must follow all edges in the path...... a specific path by only retrieving data from the first and last edge in the path. To correctly answer strict path queries existing network-constrained trajectory indexes must retrieve data from all edges in the path. An extensive performance study of NETTRA using a very large real-world trajectory data set...

  3. Querying temporal databases via OWL 2 QL

    CSIR Research Space (South Africa)

    Klarman, S

    2014-06-01

    Full Text Available SQL:2011, the most recently adopted version of the SQL query language, has unprecedentedly standardized the representation of temporal data in relational databases. Following the successful paradigm of ontology-based data access, we develop a...

  4. Evaluating Trajectory Queries over Imprecise Location Data

    DEFF Research Database (Denmark)

    Xie, Scott, Xike; Cheng, Reynold; Yiu, Man Lung

    2012-01-01

    Trajectory queries, which retrieve nearby objects for every point of a given route, can be used to identify alerts of potential threats along a vessel route, or monitor the adjacent rescuers to a travel path. However, the locations of these objects (e.g., threats, succours) may not be precisely...... obtained due to hardware limitations of measuring devices, as well as the constantly-changing nature of the external environment. Ignoring data uncertainty can render low query quality, and cause undesirable consequences such as missing alerts of threats and poor response time in rescue operations. Also......, the query is quite time-consuming, since all the points on the trajectory are considered. In this paper, we study how to efficiently evaluate trajectory queries over imprecise location data, by proposing a new concept called the u-bisector. In general, the u-bisector is an extension of bisector to handle...

  5. Intelligent Agent Based Semantic Web in Cloud Computing Environment

    OpenAIRE

    Mukhopadhyay, Debajyoti; Sharma, Manoj; Joshi, Gajanan; Pagare, Trupti; Palwe, Adarsha

    2013-01-01

    Considering today's web scenario, there is a need of effective and meaningful search over the web which is provided by Semantic Web. Existing search engines are keyword based. They are vulnerable in answering intelligent queries from the user due to the dependence of their results on information available in web pages. While semantic search engines provides efficient and relevant results as the semantic web is an extension of the current web in which information is given well defined meaning....

  6. Menangkal Serangan SQL Injection Dengan Parameterized Query

    Directory of Open Access Journals (Sweden)

    Yulianingsih Yulianingsih

    2016-06-01

    Full Text Available Semakin meningkat pertumbuhan layanan informasi maka semakin tinggi pula tingkat kerentanan keamanan dari suatu sumber informasi. Melalui tulisan ini disajikan penelitian yang dilakukan secara eksperimen yang membahas tentang kejahatan penyerangan database secara SQL Injection. Penyerangan dilakukan melalui halaman autentikasi dikarenakan halaman ini merupakan pintu pertama akses yang seharusnya memiliki pertahanan yang cukup. Kemudian dilakukan eksperimen terhadap metode Parameterized Query untuk mendapatkan solusi terhadap permasalahan tersebut.   Kata kunci— Layanan Informasi, Serangan, eksperimen, SQL Injection, Parameterized Query.

  7. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen

    2015-08-01

    Distributed RDF systems partition data across multiple computer nodes. Partitioning is typically based on heuristics that minimize inter-node communication and it is performed in an initial, data pre-processing phase. Therefore, the resulting partitions are static and do not adapt to changes in the query workload; as a result, existing systems are unable to consistently avoid communication for queries that are not favored by the initial data partitioning. Furthermore, for very large RDF knowledge bases, the partitioning phase becomes prohibitively expensive, leading to high startup costs. In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems become online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-seconds. In this demonstration, audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.

  8. Dreamweaver CS6 HTML5, CSS3, responsive design, and jQuery

    CERN Document Server

    Karlins, David

    2013-01-01

    This book combines accessible, clear, engaging, and candid reference material, advice, and shortcuts with substantial stepbystep instructions for creating a wide range of HTML5 and CSS3 designs and page content in Dreamweaver.This book is geared towards experienced Dreamweaver web designers migrating to HTML5 and jQuery. It also targets web designers new to Dreamweaver who want to jump with two feet into the most current web design tools and features. While focused primarily on Dreamweaver CS5.5, the book includes content of value to readers using older versions of Dreamweaver with directions

  9. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen; Abdelaziz, Ibrahim; Kalnis, Panos; Mamoulis, Nikos

    2015-01-01

    In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems become online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-seconds. In this demonstration, audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.

  10. AMADA: Web Data Repositories in the Amazon Cloud

    OpenAIRE

    Aranda-Andújar , Andrés; Bugiotti , Francesca; Camacho-Rodríguez , Jesús; Colazzo , Dario; Goasdoué , François; Kaoudi , Zoi; Manolescu , Ioana

    2012-01-01

    International audience; We present AMADA, a platform for storing Web data (in particular, XML documents and RDF graphs) based on the Amazon Web Services (AWS) cloud infrastructure. AMADA operates in a Software as a Service (SaaS) approach, allowing users to upload, index, store, and query large volumes of Web data. The demonstration shows (i) the step-by-step procedure for building and exploiting the warehouse (storing, indexing, querying) and (ii) the monitoring tools enabling one to control...

  11. Stratification-Based Outlier Detection over the Deep Web

    OpenAIRE

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S.; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribu...

  12. A Query Cache Tool for Optimizing Repeatable and Parallel OLAP Queries

    Science.gov (United States)

    Santos, Ricardo Jorge; Bernardino, Jorge

    On-line analytical processing against data warehouse databases is a common form of getting decision making information for almost every business field. Decision support information oftenly concerns periodic values based on regular attributes, such as sales amounts, percentages, most transactioned items, etc. This means that many similar OLAP instructions are periodically repeated, and simultaneously, between the several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries simultaneously executed by different users, putting the most recent ones on hold until the first finish and returns the results for all of them. The stored query results are held until they are considered outdated, then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.

  13. Web Caching

    Indian Academy of Sciences (India)

    leveraged through Web caching technology. Specifically, Web caching becomes an ... Web routing can improve the overall performance of the Internet. Web caching is similar to memory system caching - a Web cache stores Web resources in ...

  14. Mining the human phenome using semantic web technologies: a case study for Type 2 Diabetes.

    Science.gov (United States)

    Pathak, Jyotishman; Kiefer, Richard C; Bielinski, Suzette J; Chute, Christopher G

    2012-01-01

    The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped with Type 2 Diabetes for discovering gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.

  15. Web-based resources for critical care education.

    Science.gov (United States)

    Kleinpell, Ruth; Ely, E Wesley; Williams, Ged; Liolios, Antonios; Ward, Nicholas; Tisherman, Samuel A

    2011-03-01

    To identify, catalog, and critically evaluate Web-based resources for critical care education. A multilevel search strategy was utilized. Literature searches were conducted (from 1996 to September 30, 2010) using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature with the terms "Web-based learning," "computer-assisted instruction," "e-learning," "critical care," "tutorials," "continuing education," "virtual learning," and "Web-based education." The Web sites of relevant critical care organizations (American College of Chest Physicians, American Society of Anesthesiologists, American Thoracic Society, European Society of Intensive Care Medicine, Society of Critical Care Medicine, World Federation of Societies of Intensive and Critical Care Medicine, American Association of Critical Care Nurses, and World Federation of Critical Care Nurses) were reviewed for the availability of e-learning resources. Finally, Internet searches and e-mail queries to critical care medicine fellowship program directors and members of national and international acute/critical care listserves were conducted to 1) identify the use of and 2) review and critique Web-based resources for critical care education. To ensure credibility of Web site information, Web sites were reviewed by three independent reviewers on the basis of the criteria of authority, objectivity, authenticity, accuracy, timeliness, relevance, and efficiency in conjunction with suggested formats for evaluating Web sites in the medical literature. Literature searches using OVID-MEDLINE, PubMed, and the Cumulative Index to Nursing and Allied Health Literature resulted in >250 citations. Those pertinent to critical care provide examples of the integration of e-learning techniques, the development of specific resources, reports of the use of types of e-learning, including interactive tutorials, case studies, and simulation, and reports of student or learner satisfaction, among other general

  16. Extending Climate Analytics-As to the Earth System Grid Federation

    Science.gov (United States)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; McInerney, M.; Nadeau, D.; Li, J.; Strong, S.; Thompson, J. H.

    2015-12-01

    We are building three extensions to prior-funded work on climate analytics-as-a-service that will benefit the Earth System Grid Federation (ESGF) as it addresses the Big Data challenges of future climate research: (1) We are creating a cloud-based, high-performance Virtual Real-Time Analytics Testbed supporting a select set of climate variables from six major reanalysis data sets. This near real-time capability will enable advanced technologies like the Cloudera Impala-based Structured Query Language (SQL) query capabilities and Hadoop-based MapReduce analytics over native NetCDF files while providing a platform for community experimentation with emerging analytic technologies. (2) We are building a full-featured Reanalysis Ensemble Service comprising monthly means data from six reanalysis data sets. The service will provide a basic set of commonly used operations over the reanalysis collections. The operations will be made accessible through NASA's climate data analytics Web services and our client-side Climate Data Services (CDS) API. (3) We are establishing an Open Geospatial Consortium (OGC) WPS-compliant Web service interface to our climate data analytics service that will enable greater interoperability with next-generation ESGF capabilities. The CDS API will be extended to accommodate the new WPS Web service endpoints as well as ESGF's Web service endpoints. These activities address some of the most important technical challenges for server-side analytics and support the research community's requirements for improved interoperability and improved access to reanalysis data.

  17. Heuristic query optimization for query multiple table and multiple clausa on mobile finance application

    Science.gov (United States)

    Indrayana, I. N. E.; P, N. M. Wirasyanti D.; Sudiartha, I. KG

    2018-01-01

    Mobile application allow many users to access data from the application without being limited to space, space and time. Over time the data population of this application will increase. Data access time will cause problems if the data record has reached tens of thousands to millions of records.The objective of this research is to maintain the performance of data execution for large data records. One effort to maintain data access time performance is to apply query optimization method. The optimization used in this research is query heuristic optimization method. The built application is a mobile-based financial application using MySQL database with stored procedure therein. This application is used by more than one business entity in one database, thus enabling rapid data growth. In this stored procedure there is an optimized query using heuristic method. Query optimization is performed on a “Select” query that involves more than one table with multiple clausa. Evaluation is done by calculating the average access time using optimized and unoptimized queries. Access time calculation is also performed on the increase of population data in the database. The evaluation results shown the time of data execution with query heuristic optimization relatively faster than data execution time without using query optimization.

  18. Integrating Data Warehouses with Web Data

    DEFF Research Database (Denmark)

    Perez, Juan Manuel; Berlanga, Rafael; Aramburu, Maria Jose

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query and retrieve web data, and their application to data warehouses. The paper addresses the problem of integrating...

  19. Multitasking Web Searching and Implications for Design.

    Science.gov (United States)

    Ozmutlu, Seda; Ozmutlu, H. C.; Spink, Amanda

    2003-01-01

    Findings from a study of users' multitasking searches on Web search engines include: multitasking searches are a noticeable user behavior; multitasking search sessions are longer than regular search sessions in terms of queries per session and duration; both Excite and AlltheWeb.com users search for about three topics per multitasking session and…

  20. XQOWL: An Extension of XQuery for OWL Querying and Reasoning

    Directory of Open Access Journals (Sweden)

    Jesús M. Almendros-Jiménez

    2015-01-01

    Full Text Available One of the main aims of the so-called Web of Data is to be able to handle heterogeneous resources where data can be expressed in either XML or RDF. The design of programming languages able to handle both XML and RDF data is a key target in this context. In this paper we present a framework called XQOWL that makes possible to handle XML and RDF/OWL data with XQuery. XQOWL can be considered as an extension of the XQuery language that connects XQuery with SPARQL and OWL reasoners. XQOWL embeds SPARQL queries (via Jena SPARQL engine in XQuery and enables to make calls to OWL reasoners (HermiT, Pellet and FaCT++ from XQuery. It permits to combine queries against XML and RDF/OWL resources as well as to reason with RDF/OWL data. Therefore input data can be either XML or RDF/OWL and output data can be formatted in XML (also using RDF/OWL XML serialization.

  1. KnockoutJS web development

    CERN Document Server

    Farrar, John

    2015-01-01

    This book is for web developers and designers who work with HTML and JavaScript to help them manage data and interactivity with data using KnockoutJS. Knowledge about jQuery will be useful but is not necessary.

  2. QuerySpaces on Hadoop for the ATLAS EventIndex

    CERN Document Server

    Hrivnac, Julius; The ATLAS collaboration; Cranshaw, Jack; Favareto, Andrea; Prokoshin, Fedor; Glasman, Claudia; Toebbicke, Rainer

    2015-01-01

    A Hadoop-based implementation of the adaptive query engine serving as the back-end for the ATLAS EventIndex. The QuerySpaces implementation handles both original data and search results providing fast and efficient mechanisms for new user queries using already accumulated knowledge for optimization. Detailed descriptions and statistics about user requests are collected in HBase tables and HDFS files. Requests are associated to their results and a graph of relations between them is created to be used to find the most efficient way of providing answers to new requests The environment is completely transparent to users and is accessible over several command-line interfaces, a Web Service and a programming API.

  3. A SQL-Database Based Meta-CASE System and its Query Subsystem

    Science.gov (United States)

    Eessaar, Erki; Sgirka, Rünno

    Meta-CASE systems simplify the creation of CASE (Computer Aided System Engineering) systems. In this paper, we present a meta-CASE system that provides a web-based user interface and uses an object-relational database system (ORDBMS) as its basis. The use of ORDBMSs allows us to integrate different parts of the system and simplify the creation of meta-CASE and CASE systems. ORDBMSs provide powerful query mechanism. The proposed system allows developers to use queries to evaluate and gradually improve artifacts and calculate values of software measures. We illustrate the use of the systems by using SimpleM modeling language and discuss the use of SQL in the context of queries about artifacts. We have created a prototype of the meta-CASE system by using PostgreSQL™ ORDBMS and PHP scripting language.

  4. Infodemiology of status epilepticus: A systematic validation of the Google Trends-based search queries.

    Science.gov (United States)

    Bragazzi, Nicola Luigi; Bacigaluppi, Susanna; Robba, Chiara; Nardone, Raffaele; Trinka, Eugen; Brigo, Francesco

    2016-02-01

    People increasingly use Google looking for health-related information. We previously demonstrated that in English-speaking countries most people use this search engine to obtain information on status epilepticus (SE) definition, types/subtypes, and treatment. Now, we aimed at providing a quantitative analysis of SE-related web queries. This analysis represents an advancement, with respect to what was already previously discussed, in that the Google Trends (GT) algorithm has been further refined and correlational analyses have been carried out to validate the GT-based query volumes. Google Trends-based SE-related query volumes were well correlated with information concerning causes and pharmacological and nonpharmacological treatments. Google Trends can provide both researchers and clinicians with data on realities and contexts that are generally overlooked and underexplored by classic epidemiology. In this way, GT can foster new epidemiological studies in the field and can complement traditional epidemiological tools. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. EFFECTIVELY SEARCHING SPECIMEN AND OBSERVATION DATA WITH TOQE, THE THESAURUS OPTIMIZED QUERY EXPANDER

    Directory of Open Access Journals (Sweden)

    Anton Güntsch

    2009-09-01

    Full Text Available Today’s specimen and observation data portals lack a flexible mechanism, able to link up thesaurus-enabled data sources such as taxonomic checklist databases and expand user queries to related terms, significantly enhancing result sets. The TOQE system (Thesaurus Optimized Query Expander is a REST-like XML web-service implemented in Python and designed for this purpose. Acting as an interface between portals and thesauri, TOQE allows the implementation of specialized portal systems with a set of thesauri supporting its specific focus. It is both easy to use for portal programmers and easy to configure for thesaurus database holders who want to expose their system as a service for query expansions. Currently, TOQE is used in four specimen and observation data portals. The documentation is available from http://search.biocase.org/toqe/.

  6. The RCSB Protein Data Bank: redesigned web site and web services.

    Science.gov (United States)

    Rose, Peter W; Beran, Bojan; Bi, Chunxiao; Bluhm, Wolfgang F; Dimitropoulos, Dimitris; Goodsell, David S; Prlic, Andreas; Quesada, Martha; Quinn, Gregory B; Westbrook, John D; Young, Jasmine; Yukich, Benjamin; Zardecki, Christine; Berman, Helen M; Bourne, Philip E

    2011-01-01

    The RCSB Protein Data Bank (RCSB PDB) web site (http://www.pdb.org) has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.

  7. Implementation of Quantum Private Queries Using Nuclear Magnetic Resonance

    International Nuclear Information System (INIS)

    Wang Chuan; Hao Liang; Zhao Lian-Jie

    2011-01-01

    We present a modified protocol for the realization of a quantum private query process on a classical database. Using one-qubit query and CNOT operation, the query process can be realized in a two-mode database. In the query process, the data privacy is preserved as the sender would not reveal any information about the database besides her query information, and the database provider cannot retain any information about the query. We implement the quantum private query protocol in a nuclear magnetic resonance system. The density matrix of the memory registers are constructed. (general)

  8. Web Publishing Schedule

    Science.gov (United States)

    Section 207(f)(2) of the E-Gov Act requires federal agencies to develop an inventory and establish a schedule of information to be published on their Web sites, make those schedules available for public comment. To post the schedules on the web site.

  9. Templates and Queries in Contextual Hypermedia

    DEFF Research Database (Denmark)

    Anderson, Kenneth Mark; Hansen, Frank Allan; Bouvin, Niels Olof

    2006-01-01

    discuss a framework, HyConSC, that implements this model and describe how it can be used to build new contextual hypermedia systems. Our framework aids the developer in the iterative development of contextual queries (via a dynamic query browser) and offers support for con-text matching, a key feature...... of contextual hypermedia. We have tested the framework with data and sensors taken from the HyCon contextual hypermedia system and are now migrating HyCon to this new framework....

  10. SPARQL Query Re-writing Using Partonomy Based Transformation Rules

    Science.gov (United States)

    Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.

    Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontology's containing spatial information, the precise relationships between spatial entities has to be specified in the basic graph pattern of SPARQL query which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and knowledge base. Our experiments were performed on completely third party datasets and queries. Evaluations were performed on Geonames dataset using questions from National Geographic Bee serialized into SPARQL and British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.

  11. Answering SPARQL queries modulo RDF Schema with paths

    OpenAIRE

    Alkhateeb, Faisal; Euzenat, Jérôme

    2013-01-01

    alkhateeb2013a; SPARQL is the standard query language for RDF graphs. In its strict instantiation, it only offers querying according to the RDF semantics and would thus ignore the semantics of data expressed with respect to (RDF) schemas or (OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or considering external ontologies. We introduce a general framework which allows for expressing query ans...

  12. New Path Based Index Structure for Processing CAS Queries over XML Database

    Directory of Open Access Journals (Sweden)

    Krishna Asawa

    2017-01-01

    Full Text Available Querying nested data has become one of the most challenging issues for retrieving desired information from the Web. Today diverse applications generate a tremendous amount of data in different formats. These data and information exchanged on the Web are commonly expressed as nested representation such as XML, JSON, etc. Unlike the traditional database system, they don't have a rigid schema. In general, the nested data is managed by storing data and its structures separately which significantly reduces the performance of data retrieving. Ensuring efficiency of processing queries which locates the exact positions of the elements has become a big challenging issue. There are different indexing structures which have been proposed in the literature to improve the performance of the query processing on the nested structure. Most of the past researches on nested structure concentrate on the structure alone. This paper proposes new index structure which combines siblings of the terminal nodes as one path which efficiently processes twig queries with less number of lookups and joins. The proposed approach is compared with some of the existing approaches. The results also show that they are processed with better performance compared to the existing ones.

  13. Evolutionary Algorithms for Boolean Queries Optimization

    Czech Academy of Sciences Publication Activity Database

    Húsek, Dušan; Snášel, Václav; Neruda, Roman; Owais, S.S.J.; Krömer, P.

    2006-01-01

    Roč. 3, č. 1 (2006), s. 15-20 ISSN 1790-0832 R&D Projects: GA AV ČR 1ET100300414 Institutional research plan: CEZ:AV0Z10300504 Keywords : evolutionary algorithms * genetic algorithms * information retrieval * Boolean query Subject RIV: BA - General Mathematics

  14. Boolean Queries Optimization by Genetic Algorithms

    Czech Academy of Sciences Publication Activity Database

    Húsek, Dušan; Owais, S.S.J.; Krömer, P.; Snášel, Václav

    2005-01-01

    Roč. 15, - (2005), s. 395-409 ISSN 1210-0552 R&D Projects: GA AV ČR 1ET100300414 Institutional research plan: CEZ:AV0Z10300504 Keywords : evolutionary algorithms * genetic algorithms * genetic programming * information retrieval * Boolean query Subject RIV: BB - Applied Statistics, Operational Research

  15. External query expansion in the blogosphere

    NARCIS (Netherlands)

    Weerkamp, W.; de Rijke, M.; Voorhees, E.M.; Buckland, L.P.

    2009-01-01

    We describe the participation of the University of Amsterdam’s ILPS group in the blog track at TREC 2008. We mainly explored different ways of using external corpora to expand the original query. In the blog post retrieval task we did not succeed in improving over a simple baseline (equal weights

  16. C-SPARQL : SPARQL for continuous querying

    OpenAIRE

    Barbieri, Davide Francesco; Braga, Daniele; Ceri, Stefano; Valle, Emanuele Della; Grossniklaus, Michael

    2009-01-01

    C-SPARQL is an extension of SPARQL to support continuous queries, registered and continuously executed over RDF data streams, considering windows of such streams. Supporting streams in RDF format guarantees interoperability and opens up important applications, in which reasoners can deal with knowledge that evolves over time. We present C-SPARQL by means of examples in Urban Computing.

  17. The data cyclotron query processing scheme

    NARCIS (Netherlands)

    R.A. Goncalves (Romulo); M.L. Kersten (Martin)

    2010-01-01

    htmlabstractDistributed database systems exploit static workload characteristics to steer data fragmentation and data allocation schemes. However, the grand challenge of distributed query processing is to come up with a self-organizing architecture, which exploits all resources to manage the hot

  18. XAL: An algebra for XML query optimization

    NARCIS (Netherlands)

    Frasincar, F.; Houben, G.J.P.M.; Pau, C.D.; Zhou, Xiaofang

    2002-01-01

    This paper proposes XAL, an XML ALgebra. Its novelty is based on the simplicity of its data model and its well-defined logical operators, which makes it suitable for composability, optimizability, and semantics definition of a query language for XML data. At the heart of the algebra resides the

  19. Combining the power of searching and querying

    NARCIS (Netherlands)

    Cohen, S.; Kanza, Y.; Kogan, Y.A.; Nutt, W.; Sagiv, Y.; Serebrenik, A.; Etzion, O.; Scheuermann, P.

    2000-01-01

    EquiX is a search language for XML that combines the power of querying with the simplicity of searching. Requirements for search languages are discussed and it is shown that EquiX meets the necessary criteria. Both a graphical abstract syntax and a formal concrete syntax are presented for EquiX

  20. Beginning SQL queries from novice to professional

    CERN Document Server

    Churcher, Clare

    2016-01-01

    Anyone who does any work at all with databases needs to know something of SQL. This is a friendly and easy-to-read guide to writing queries with the all-important - in the database world - SQL language. The author writes with exceptional clarity.

  1. Flattening Queries over Nested Data Types

    NARCIS (Netherlands)

    van Ruth, J.

    2006-01-01

    The theory developed in this thesis provides a method to improve the efficiency of querying nested data. The roots of this research lie in the tension between data model expressiveness and performance. Obviously, more expressive data models are more convenient for application programmers. For many

  2. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit; Harrison, Rob; Pawar, Ankita; Birkner, Rü diger; Canini, Marco; Feamster, Nick; Rexford, Jennifer; Willinger, Walter

    2017-01-01

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes the advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  3. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit

    2017-05-02

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes the advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform iteratively refines the query to efficiently capture only the traffic that pertains to the operator\\'s query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  4. CUFID-query: accurate network querying through random walk based network flow estimation.

    Science.gov (United States)

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2017-12-28

    Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive

  5. A Framework for Transparently Accessing Deep Web Sources

    Science.gov (United States)

    Dragut, Eduard Constantin

    2010-01-01

    An increasing number of Web sites expose their content via query interfaces, many of them offering the same type of products/services (e.g., flight tickets, car rental/purchasing). They constitute the so-called "Deep Web". Accessing the content on the Deep Web has been a long-standing challenge for the database community. For a user interested in…

  6. PERANGKAT BANTU UNTUK OPTIMASI QUERY PADA ORACLE DENGAN RESTRUKTURISASI SQL

    Directory of Open Access Journals (Sweden)

    Darlis Heru Murti

    2006-07-01

    Full Text Available Query merupakan bagian dari bahasa pemrograman SQL (Structured Query Language yang berfungsi untuk mengambil data (read dalam DBMS (Database Management System, termasuk Oracle [3]. Pada Oracle, ada tiga tahap proses yang dilakukan dalam pengeksekusian query, yaitu Parsing, Execute dan Fetch. Sebelum proses execute dijalankan, Oracle terlebih dahulu membuat execution plan yang akan menjadi skenario dalam proses excute.Dalam proses pengeksekusian query, terdapat faktor-faktor yang mempengaruhi kinerja query, di antaranya access path (cara pengambilan data dari sebuah tabel dan operasi join (cara menggabungkan data dari dua tabel. Untuk mendapatkan query dengan kinerja optimal, maka diperlukan pertimbangan-pertimbangan dalam menyikapi faktor-faktor tersebut.  Optimasi query merupakan suatu cara untuk mendapatkan query dengan kinerja seoptimal mungkin, terutama dilihat dari sudut pandang waktu. Ada banyak metode untuk mengoptimasi query, tapi pada Penelitian ini, penulis membuat sebuah aplikasi untuk mengoptimasi query dengan metode restrukturisasi SQL statement. Pada metode ini, objek yang dianalisa adalah struktur klausa yang membangun sebuah query. Aplikasi ini memiliki satu input dan lima jenis output. Input dari aplikasi ini adalah sebuah query sedangkan kelima jenis output aplikasi ini adalah berupa query hasil optimasi, saran perbaikan, saran pembuatan indeks baru, execution plan dan data statistik. Cara kerja aplikasi ini dibagi menjadi empat tahap yaitu mengurai query menjadi sub query, mengurai query per-klausa, menentukan access path dan operasi join dan restrukturisasi query.Dari serangkaian ujicoba yang dilakukan penulis, aplikasi telah dapat berjalan sesuai dengan tujuan pembuatan Penelitian ini, yaitu mendapatkan query dengan kinerja optimal.Kata Kunci : Query, SQL, DBMS, Oracle, Parsing, Execute, Fetch, Execution Plan, Access Path, Operasi Join, Restrukturisasi SQL statement.

  7. Query-by-example surgical activity detection.

    Science.gov (United States)

    Gao, Yixin; Vedula, S Swaroop; Lee, Gyusung I; Lee, Mija R; Khudanpur, Sanjeev; Hager, Gregory D

    2016-06-01

    Easy acquisition of surgical data opens many opportunities to automate skill evaluation and teaching. Current technology to search tool motion data for surgical activity segments of interest is limited by the need for manual pre-processing, which can be prohibitive at scale. We developed a content-based information retrieval method, query-by-example (QBE), to automatically detect activity segments within surgical data recordings of long duration that match a query. The example segment of interest (query) and the surgical data recording (target trial) are time series of kinematics. Our approach includes an unsupervised feature learning module using a stacked denoising autoencoder (SDAE), two scoring modules based on asymmetric subsequence dynamic time warping (AS-DTW) and template matching, respectively, and a detection module. A distance matrix of the query against the trial is computed using the SDAE features, followed by AS-DTW combined with template scoring, to generate a ranked list of candidate subsequences (substrings). To evaluate the quality of the ranked list against the ground-truth, thresholding conventional DTW distances and bipartite matching are applied. We computed the recall, precision, F1-score, and a Jaccard index-based score on three experimental setups. We evaluated our QBE method using a suture throw maneuver as the query, on two tool motion datasets (JIGSAWS and MISTIC-SL) captured in a training laboratory. We observed a recall of 93, 90 and 87 % and a precision of 93, 91, and 88 % with same surgeon same trial (SSST), same surgeon different trial (SSDT) and different surgeon (DS) experiment setups on JIGSAWS, and a recall of 87, 81 and 75 % and a precision of 72, 61, and 53 % with SSST, SSDT and DS experiment setups on MISTIC-SL, respectively. We developed a novel, content-based information retrieval method to automatically detect multiple instances of an activity within long surgical recordings. Our method demonstrated adequate recall

  8. SCALEUS: Semantic Web Services Integration for Biomedical Applications.

    Science.gov (United States)

    Sernadela, Pedro; González-Castro, Lorena; Oliveira, José Luís

    2017-04-01

    In recent years, we have witnessed an explosion of biological data resulting largely from the demands of life science research. The vast majority of these data are freely available via diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This type of approach has achieved great results in the last few years, but proved to be unfeasible when information needs to be combined or shared among different and scattered sources. During recent years, many of these data distribution challenges have been solved with the adoption of semantic web. Despite the evident benefits of this technology, its adoption introduced new challenges related with the migration process, from existent systems to the semantic level. To facilitate this transition, we have developed Scaleus, a semantic web migration tool that can be deployed on top of traditional systems in order to bring knowledge, inference rules, and query federation to the existent data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers in the creation process of new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/ .

  9. Efficient evaluation of shortest travel-time path queries through spatial mashups

    KAUST Repository

    Zhang, Detian; Chow, Chi-Yin; Liu, An; Zhang, Xiangliang; Ding, Qingzhu; Li, Qing

    2017-01-01

    In the real world, the route/path with the shortest travel time in a road network is more meaningful than that with the shortest network distance for location-based services (LBS). However, not every LBS provider has adequate resources to compute/estimate travel time for routes by themselves. A cost-effective way for LBS providers to estimate travel time for routes is to issue external route requests to Web mapping services (e.g., Google Maps, Bing Maps, and MapQuest Maps). Due to the high cost of processing such external route requests and the usage limits of Web mapping services, we take the advantage of direction sharing, parallel requesting and waypoints supported by Web mapping services to reduce the number of external route requests and the query response time for shortest travel-time route queries in this paper. We first give the definition of sharing ability to reflect the possibility of sharing the direction information of a route with others, and find out the queries that their query routes are independent with each other for parallel processing. Then, we model the problem of selecting the optimal waypoints for an external route request as finding the longest simple path in a weighted complete digraph. As it is a MAX SNP-hard problem, we propose a greedy algorithm with performance guarantee to find the best set of waypoints in an external route request. We evaluate the performance of our approach using a real Web mapping service, a real road network, real and synthetic data sets. Experimental results show the efficiency, scalability, and applicability of our approach.

  10. Efficient evaluation of shortest travel-time path queries through spatial mashups

    KAUST Repository

    Zhang, Detian

    2017-01-07

    In the real world, the route/path with the shortest travel time in a road network is more meaningful than that with the shortest network distance for location-based services (LBS). However, not every LBS provider has adequate resources to compute/estimate travel time for routes by themselves. A cost-effective way for LBS providers to estimate travel time for routes is to issue external route requests to Web mapping services (e.g., Google Maps, Bing Maps, and MapQuest Maps). Due to the high cost of processing such external route requests and the usage limits of Web mapping services, we take the advantage of direction sharing, parallel requesting and waypoints supported by Web mapping services to reduce the number of external route requests and the query response time for shortest travel-time route queries in this paper. We first give the definition of sharing ability to reflect the possibility of sharing the direction information of a route with others, and find out the queries that their query routes are independent with each other for parallel processing. Then, we model the problem of selecting the optimal waypoints for an external route request as finding the longest simple path in a weighted complete digraph. As it is a MAX SNP-hard problem, we propose a greedy algorithm with performance guarantee to find the best set of waypoints in an external route request. We evaluate the performance of our approach using a real Web mapping service, a real road network, real and synthetic data sets. Experimental results show the efficiency, scalability, and applicability of our approach.

  11. Graph Mining Meets the Semantic Web

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Lim, Seung-Hwan [ORNL

    2015-01-01

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on 6 real world data sets and show graph mining algorithms (that have a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.

  12. SWORS: a system for the efficient retrieval of relevant spatial web objects

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2012-01-01

    Spatial web objects that possess both a geographical location and a textual description are gaining in prevalence. This gives prominence to spatial keyword queries that exploit both location and textual arguments. Such queries are used in many web services such as yellow pages and maps services....

  13. Ontology based heterogeneous materials database integration and semantic query

    Science.gov (United States)

    Zhao, Shuai; Qian, Quan

    2017-10-01

    Materials digital data, high throughput experiments and high throughput computations are regarded as three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data is very urgent, that has gradually become a hot topic of materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply in semantic level when adopting the conventional heterogeneous database integration approaches such as federal database or data warehouse. In this paper, a semantic integration method is proposed to create the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated to the ontology by means of relational algebra and the rooted graph. Based on integrated ontology, semantic query can be done using SPARQL. During the experiments, two world famous First Principle Computational databases, OQMD and Materials Project are used as the integration targets, which show the availability and effectiveness of our method.

  14. JavaScript and jQuery for data analysis and visualization

    CERN Document Server

    Raasch, Jon; Ogievetsky, Vadim; Lowery, Joseph

    2014-01-01

    Go beyond design concepts-build dynamic data visualizations using JavaScript JavaScript and jQuery for Data Analysis and Visualization goes beyond design concepts to show readers how to build dynamic, best-of-breed visualizations using JavaScript-the most popular language for web programming. The authors show data analysts, developers, and web designers how they can put the power and flexibility of modern JavaScript libraries to work to analyze data and then present it using best-of-breed visualizations. They also demonstrate the use of each technique with real-world use cases, showing how to

  15. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-01

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. PMID:27733503

  16. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration.

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-04

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. jQuery UI 1.10 the user interface library for jQuery

    CERN Document Server

    Libby, Alex

    2013-01-01

    This book consists of an easy-to-follow, example-based approach that leads you step-by-step through the implementation and customization of each library component.This book is for frontend designers and developers who need to learn how to use jQuery UI quickly. To get the most out of this book, you should have a good working knowledge of HTML, CSS, and JavaScript, and should ideally be comfortable using jQuery.

  18. Stratification-Based Outlier Detection over the Deep Web.

    Science.gov (United States)

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is to develop a new data mining method for outlier detection over deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed in this paper with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in deep web.

  19. Graphical modeling and query language for hospitals.

    Science.gov (United States)

    Barzdins, Janis; Barzdins, Juris; Rencis, Edgars; Sostaks, Agris

    2013-01-01

    So far there has been little evidence that implementation of the health information technologies (HIT) is leading to health care cost savings. One of the reasons for this lack of impact by the HIT likely lies in the complexity of the business process ownership in the hospitals. The goal of our research is to develop a business model-based method for hospital use which would allow doctors to retrieve directly the ad-hoc information from various hospital databases. We have developed a special domain-specific process modelling language called the MedMod. Formally, we define the MedMod language as a profile on UML Class diagrams, but we also demonstrate it on examples, where we explain the semantics of all its elements informally. Moreover, we have developed the Process Query Language (PQL) that is based on MedMod process definition language. The purpose of PQL is to allow a doctor querying (filtering) runtime data of hospital's processes described using MedMod. The MedMod language tries to overcome deficiencies in existing process modeling languages, allowing to specify the loosely-defined sequence of the steps to be performed in the clinical process. The main advantages of PQL are in two main areas - usability and efficiency. They are: 1) the view on data through "glasses" of familiar process, 2) the simple and easy-to-perceive means of setting filtering conditions require no more expertise than using spreadsheet applications, 3) the dynamic response to each step in construction of the complete query that shortens the learning curve greatly and reduces the error rate, and 4) the selected means of filtering and data retrieving allows to execute queries in O(n) time regarding the size of the dataset. We are about to continue developing this project with three further steps. First, we are planning to develop user-friendly graphical editors for the MedMod process modeling and query languages. The second step is to do evaluation of usability the proposed language and tool

  20. Cost reduction for web-based data imputation

    KAUST Repository

    Li, Zhixu

    2014-01-01

    Web-based Data Imputation enables the completion of incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity of these keywords and the data complexity on the Web, different queries may retrieve different answers to the same absent field value. To decide the most probable right answer to each absent filed value, existing method issues quite a few available imputation queries for each absent value, and then vote on deciding the most probable right answer. As a result, we have to issue a large number of imputation queries for filling all absent values in an incomplete data set, which brings a large overhead. In this paper, we work on reducing the cost of Web-based Data Imputation in two aspects: First, we propose a query execution scheme which can secure the most probable right answer to an absent field value by issuing as few imputation queries as possible. Second, we recognize and prune queries that probably will fail to return any answers a priori. Our extensive experimental evaluation shows that our proposed techniques substantially reduce the cost of Web-based Imputation without hurting its high imputation accuracy. © 2014 Springer International Publishing Switzerland.

  1. Cumulative query method for influenza surveillance using search engine data.

    Science.gov (United States)

    Seo, Dong-Woo; Jo, Min-Woo; Sohn, Chang Hwan; Shin, Soo-Yong; Lee, JaeHo; Yu, Maengsoo; Kim, Won Young; Lim, Kyoung Soo; Lee, Sang-Il

    2014-12-16

    Internet search queries have become an important data source in syndromic surveillance system. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development Set 2 and 2011/12 for validation Set 2). Pearson's correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created a cumulative query method n representing the number of cumulative combined queries in descending order of the correlation coefficient. In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, but 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, but 6 of 15 combined queries had an r value of ≥.7. Cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation set.

  2. Mathematical Formula Search using Natural Language Queries

    Directory of Open Access Journals (Sweden)

    YANG, S.

    2014-11-01

    Full Text Available This paper presents how to search mathematical formulae written in MathML when given plain words as a query. Since the proposed method allows natural language queries like the traditional Information Retrieval for the mathematical formula search, users do not need to enter any complicated math symbols and to use any formula input tool. For this, formula data is converted into plain texts, and features are extracted from the converted texts. In our experiments, we achieve an outstanding performance, a MRR of 0.659. In addition, we introduce how to utilize formula classification for formula search. By using class information, we finally achieve an improved performance, a MRR of 0.690.

  3. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

    Science.gov (United States)

    Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J

    2017-09-20

    There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities to be used in Vector Space Model and Cosine Similarity Measures. According to our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detecting of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data source. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly

  4. TEMPORAL QUERY PROCESSIG USING SQL SERVER

    OpenAIRE

    Vali Shaik, Mastan; Sujatha, P

    2017-01-01

    Most data sources in real-life are not static but change their information in time. This evolution of data in time can give valuable insights to business analysts. Temporal data refers to data, where changes over time or temporal aspects play a central role. Temporal data denotes the evaluation of object characteristics over time. One of the main unresolved problems that arise during the data mining process is treating data that contains temporal information. Temporal queries on time evolving...

  5. Advanced SPARQL querying in small molecule databases

    Czech Academy of Sciences Publication Activity Database

    Galgonek, Jakub; Hurt, T.; Michlíková, V.; Onderka, P.; Schwarz, J.; Vondrášek, Jiří

    2016-01-01

    Roč. 8, Jun 6 (2016), č. článku 31. ISSN 1758-2946 R&D Projects: GA MŠk(CZ) LM2015047 Institutional support: RVO:61388963 Keywords : Resource Description Framework * SPARQL query language * Database of small molecules Subject RIV: CF - Physical ; Theoretical Chemistry Impact factor: 4.220, year: 2016 http://jcheminf.springeropen.com/articles/10.1186/s13321-016-0144-4

  6. Using Search Query Surveillance to Monitor Tax Avoidance and Smoking Cessation following the United States' 2009 “SCHIP” Cigarette Tax Increase

    Science.gov (United States)

    Ayers, John W.; Ribisl, Kurt; Brownstein, John S.

    2011-01-01

    Smokers can use the web to continue or quit their habit. Online vendors sell reduced or tax-free cigarettes lowering smoking costs, while health advocates use the web to promote cessation. We examined how smokers' tax avoidance and smoking cessation Internet search queries were motivated by the United States' (US) 2009 State Children's Health Insurance Program (SCHIP) federal cigarette excise tax increase and two other state specific tax increases. Google keyword searches among residents in a taxed geography (US or US state) were compared to an untaxed geography (Canada) for two years around each tax increase. Search data were normalized to a relative search volume (RSV) scale, where the highest search proportion was labeled 100 with lesser proportions scaled by how they relatively compared to the highest proportion. Changes in RSV were estimated by comparing means during and after the tax increase to means before the tax increase, across taxed and untaxed geographies. The SCHIP tax was associated with an 11.8% (95% confidence interval [95%CI], 5.7 to 17.9; ptax levels in Canada during the months after the tax. Tax avoidance searches increased 27.9% (95%CI, 15.9 to 39.9; ptax compared to Canada, respectively, suggesting avoidance is the more pronounced and durable response. Trends were similar for state-specific tax increases but suggest strong interactive processes across taxes. When the SCHIP tax followed Florida's tax, versus not, it promoted more cessation and avoidance searches. Efforts to combat tax avoidance and increase cessation may be enhanced by using interventions targeted and tailored to smokers' searches. Search query surveillance is a valuable real-time, free and public method, that may be generalized to other behavioral, biological, informational or psychological outcomes manifested online. PMID:21436883

  7. SeqWare Query Engine: storing and searching sequence data in the cloud

    Directory of Open Access Journals (Sweden)

    Merriman Barry

    2010-12-01

    Full Text Available Abstract Background Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net. Conclusions The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters

  8. SeqWare Query Engine: storing and searching sequence data in the cloud

    Science.gov (United States)

    2010-01-01

    Background Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). Conclusions The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data

  9. SeqWare Query Engine: storing and searching sequence data in the cloud.

    Science.gov (United States)

    O'Connor, Brian D; Merriman, Barry; Nelson, Stanley F

    2010-12-21

    Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data interface to simplify development of

  10. Retrieval of very large numbers of items in the Web of Science: an exercise to develop accurate search strategies

    NARCIS (Netherlands)

    Arencibia-Jorge, R.; Leydesdorff, L.; Chinchilla-Rodríguez, Z.; Rousseau, R.; Paris, S.W.

    2009-01-01

    The Web of Science interface counts at most 100,000 retrieved items from a single query. If the query results in a dataset containing more than 100,000 items the number of retrieved items is indicated as >100,000. The problem studied here is how to find the exact number of items in a query that

  11. Um levantamento sobre do uso de ferramentas da Web 2.0 entre os estudantes da Ciência da Informação da Universidade Federal de Santa Catarina

    Directory of Open Access Journals (Sweden)

    Eduarda Bodaneze de Oliveira

    2014-04-01

    Full Text Available http://dx.doi.org/10.5007/1518-2924.2014v19n39p153   Nas últimas décadas, puderam-se verificar enormes mudanças nas relações sociais e grandes transformações nos sistemas de comunicação que tornaram mais rápidos os fluxos de informação por meio dos recursos da Internet. A ideia de uma Web 2.0 (a Web colaborativa agrega ideais de colaboração, interação e participação que proporcionam, através de uma organização em rede, uma intensa produção de informação. Este trabalho analisa o uso de ferramentas da Web 2.0 pelos alunos do Departamento de Ciência da Informação da Universidade Federal de Santa Catarina através da caracterização do perfil dos estudantes como usuários destas ferramentas, da identificação das áreas de interesse, das ferramentas mais utilizadas e das principais finalidades de uso das redes sociais Facebook, Twitter e LinkedIn. O estudo concluiu que há um grande potencial, dentro do segmento avaliado, para a troca de informações e difusão de conhecimentos através das ferramentas da Web 2.0. Além disso, estas ferramentas são vistas pelo público-alvo da pesquisa como sendo de grande utilidade, tanto para funções pessoais e de relacionamento quanto para interesses acadêmicos e profissionais.

  12. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    metadata artifacts (e.g., queries) to assist users with the analysis. However, modeling and sharing of most of these artifacts are typically overlooked. Thus, in this paper we focus on the query metadata artifact in the Exploratory OLAP context and propose an RDF-based vocabulary for its representation......, sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply...... the method to a use case of transforming queries from SM4MQ to a vector representation. For the use case, we developed the prototype and performed an evaluation that shows how our approach can significantly ease and support user assistance such as query recommendation....

  13. Accelerating SPARQL Queries and Analytics on RDF Data

    KAUST Repository

    Al-Harbi, Razen

    2016-01-01

    The complexity of SPARQL queries and RDF applications poses great challenges on distributed RDF management systems. SPARQL workloads are dynamic and con- sist of queries with variable complexities. Hence, systems that use static partitioning su

  14. A Revisit of Query Expansion with Different Semantic Levels

    DEFF Research Database (Denmark)

    Zhang, Ce; Cui, Bin; Cong, Gao

    2009-01-01

    Query expansion has received extensive attention in information retrieval community. Although semantic based query expansion appears to be promising in improving retrieval performance, previous research has shown that it cannot consistently improve retrieval performance. It is a tricky problem to...

  15. Semantic querying of data guided by Formal Concept Analysis

    OpenAIRE

    Codocedo , Victor; Lykourentzou , Ioanna; Napoli , Amedeo

    2012-01-01

    International audience; In this paper we present a novel approach to handle querying over a concept lattice of documents and annotations. We focus on the problem of "non-matching documents", which are those that, despite being semantically relevant to the user query, do not contain the query's elements and hence cannot be retrieved by typical string matching approaches. In order to find these documents, we modify the initial user query using the concept lattice as a guide. We achieve this by ...

  16. QUERY RESPONSE TIME COMPARISON NOSQLDB MONGODB WITH SQLDB ORACLE

    Directory of Open Access Journals (Sweden)

    Humasak T. A. Simanjuntak

    2015-01-01

    Full Text Available Penyimpanan data saat ini terdapat dua jenis yakni relational database dan non-relational database. Kedua jenis DBMS (Database Managemnet System tersebut berbeda dalam berbagai aspek seperti per-formansi eksekusi query, scalability, reliability maupun struktur penyimpanan data. Kajian ini memiliki tujuan untuk mengetahui perbandingan performansi DBMS antara Oracle sebagai jenis relational data-base dan MongoDB sebagai jenis non-relational database dalam mengolah data terstruktur. Eksperimen dilakukan untuk mengetahui perbandingan performansi kedua DBMS tersebut untuk operasi insert, select, update dan delete dengan menggunakan query sederhana maupun kompleks pada database Northwind. Untuk mencapai tujuan eksperimen, 18 query yang terdiri dari 2 insert query, 10 select query, 2 update query dan 2 delete query dieksekusi. Query dieksekusi melalui sebuah aplikasi .Net yang dibangun sebagai perantara antara user dengan basis data. Eksperimen dilakukan pada tabel dengan atau tanpa relasi pada Oracle dan embedded atau bukan embedded dokumen pada MongoDB. Response time untuk setiap eksekusi query dibandingkan dengan menggunakan metode statistik. Eksperimen menunjukkan response time query untuk proses select, insert, dan update pada MongoDB lebih cepatdaripada Oracle. MongoDB lebih cepat 64.8 % untuk select query;MongoDB lebihcepat 72.8 % untuk insert query dan MongoDB lebih cepat 33.9 % untuk update query. Pada delete query, Oracle lebih cepat 96.8 % daripada MongoDB untuk table yang berelasi, tetapi MongoDB lebih cepat 83.8 % daripada Oracle untuk table yang tidak memiliki relasi.Untuk query kompleks dengan Map Reduce pada MongoDB lebih lambat 97.6% daripada kompleks query dengan aggregate function pada Oracle.

  17. A Relational Algebra Query Language for Programming Relational Databases

    Science.gov (United States)

    McMaster, Kirby; Sambasivam, Samuel; Anderson, Nicole

    2011-01-01

    In this paper, we describe a Relational Algebra Query Language (RAQL) and Relational Algebra Query (RAQ) software product we have developed that allows database instructors to teach relational algebra through programming. Instead of defining query operations using mathematical notation (the approach commonly taken in database textbooks), students…

  18. Result diversification based on query-specific cluster ranking

    NARCIS (Netherlands)

    He, J.; Meij, E.; de Rijke, M.

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification

  19. Result Diversification Based on Query-Specific Cluster Ranking

    NARCIS (Netherlands)

    J. He (Jiyin); E. Meij; M. de Rijke (Maarten)

    2011-01-01

    htmlabstractResult diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking,

  20. Efficient Processing of Multiple DTW Queries in Time Series Databases

    DEFF Research Database (Denmark)

    Kremer, Hardy; Günnemann, Stephan; Ivanescu, Anca-Maria

    2011-01-01

    . In many of today’s applications, however, large numbers of queries arise at any given time. Existing DTW techniques do not process multiple DTW queries simultaneously, a serious limitation which slows down overall processing. In this paper, we propose an efficient processing approach for multiple DTW...... for multiple DTW queries....

  1. Modeling Large Time Series for Efficient Approximate Query Processing

    DEFF Research Database (Denmark)

    Perera, Kasun S; Hahmann, Martin; Lehner, Wolfgang

    2015-01-01

    query statistics derived from experiments and when running the system. Our approach can also reduce communication load by exchanging models instead of data. To allow seamless integration of model-based querying into traditional data warehouses, we introduce a SQL compatible query terminology. Our...

  2. Extracting Rankings for Spatial Keyword Queries from GPS Data

    DEFF Research Database (Denmark)

    Keles, Ilkcan; Jensen, Christian Søndergaard; Saltenis, Simonas

    2018-01-01

    Studies suggest that many search engine queries have local intent. We consider the evaluation of ranking functions important for such queries. The key challenge is to be able to determine the “best” ranking for a query, as this enables evaluation of the results of ranking functions. We propose...

  3. An Adaptive Directed Query Dissemination Scheme for Wireless Sensor Networks

    NARCIS (Netherlands)

    Chatterjea, Supriyo; De Luigi, Simone; Havinga, Paul J.M.; Sun, M.T.

    This paper describes a directed query dissemination scheme, DirQ that routes queries to the appropriate source nodes based on both constant and dynamicvalued attributes such as sensor types and sensor values. Unlike certain other query dissemination schemes, location information is not essential for

  4. Personalized Metaheuristic Clustering Onto Web Documents

    Institute of Scientific and Technical Information of China (English)

    Wookey Lee

    2004-01-01

    Optimal clustering for the web documents is known to complicated cornbinatorial Optimization problem and it is hard to develop a generally applicable oplimal algorithm. An accelerated simuIated arlneaIing aIgorithm is developed for automatic web document classification. The web document classification problem is addressed as the problem of best describing a match between a web query and a hypothesized web object. The normalized term frequency and inverse document frequency coefficient is used as a measure of the match. Test beds are generated on - line during the search by transforming model web sites. As a result, web sites can be clustered optimally in terms of keyword vectofs of corresponding web documents.

  5. Lost in translation? A multilingual Query Builder improves the quality of PubMed queries: a randomised controlled trial.

    Science.gov (United States)

    Schuers, Matthieu; Joulakian, Mher; Kerdelhué, Gaetan; Segas, Léa; Grosjean, Julien; Darmoni, Stéfan J; Griffon, Nicolas

    2017-07-03

    MEDLINE is the most widely used medical bibliographic database in the world. Most of its citations are in English and this can be an obstacle for some researchers to access the information the database contains. We created a multilingual query builder to facilitate access to the PubMed subset using a language other than English. The aim of our study was to assess the impact of this multilingual query builder on the quality of PubMed queries for non-native English speaking physicians and medical researchers. A randomised controlled study was conducted among French speaking general practice residents. We designed a multi-lingual query builder to facilitate information retrieval, based on available MeSH translations and providing users with both an interface and a controlled vocabulary in their own language. Participating residents were randomly allocated either the French or the English version of the query builder. They were asked to translate 12 short medical questions into MeSH queries. The main outcome was the quality of the query. Two librarians blind to the arm independently evaluated each query, using a modified published classification that differentiated eight types of errors. Twenty residents used the French version of the query builder and 22 used the English version. 492 queries were analysed. There were significantly more perfect queries in the French group vs. the English group (respectively 37.9% vs. 17.9%; p PubMed queries in particular for researchers whose first language is not English.

  6. Head First Mobile Web

    CERN Document Server

    Gardner, Lyza; Grigsby, Jason

    2011-01-01

    Despite the huge number of mobile devices and apps in use today, your business still needs a website. You just need it to be mobile. Head First Mobile Web walks you through the process of making a conventional website work on a variety smartphones and tablets. Put your JavaScript, CSS media query, and HTML5 skills to work-then optimize your site to perform its best in the demanding mobile market. Along the way, you'll discover how to adapt your business strategy to target specific devices. Navigate the increasingly complex mobile landscapeTake both technical and strategic approaches to mobile

  7. Reformulating XQuery queries using GLAV mapping and complex unification

    Directory of Open Access Journals (Sweden)

    Saber Benharzallah

    2016-01-01

    Full Text Available This paper describes an algorithm for reformulation of XQuery queries. The mediation is based on an essential component called mediator. Its main role is to reformulate a user query, written in terms of global schema, into queries written in terms of source schemas. Our algorithm is based on the principle of logical equivalence, simple and complex unification, to obtain a better reformulation. It takes XQuery query, global schema (written in XMLSchema, and mappings GLAV as input parameters and provides resultant query written in terms of source schemas. The results of implementation show the proper functioning of the algorithm.

  8. Approximate furthest neighbor with application to annulus query

    DEFF Research Database (Denmark)

    Pagh, Rasmus; Silvestri, Francesco; Sivertsen, Johan von Tangen

    2016-01-01

    -dimensional Euclidean space. The method builds on the technique of Indyk (SODA 2003), storing random projections to provide sublinear query time for AFN. However, we introduce a different query algorithm, improving on Indyk׳s approximation factor and reducing the running time by a logarithmic factor. We also present......, the query-dependent approach is used for deriving a data structure for the approximate annulus query problem, which is defined as follows: given an input set S and two parameters r>0 and w≥1, construct a data structure that returns for each query point q a point p∈S such that the distance between p and q...

  9. Spatio-temporal databases complex motion pattern queries

    CERN Document Server

    Vieira, Marcos R

    2013-01-01

    This brief presents several new query processing techniques, called complex motion pattern queries, specifically designed for very large spatio-temporal databases of moving objects. The brief begins with the definition of flexible pattern queries, which are powerful because of the integration of variables and motion patterns. This is followed by a summary of the expressive power of patterns and flexibility of pattern queries. The brief then present the Spatio-Temporal Pattern System (STPS) and density-based pattern queries. STPS databases contain millions of records with information about mobi

  10. Optimizing queries in SQL Server 2008

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2010-05-01

    Full Text Available Starting from the need to develop efficient IT systems, we intend to review theoptimization methods and tools that can be used by SQL Server database administratorsand developers of applications based on Microsoft technology, focusing on the latestversion of the proprietary DBMS, SQL Server 2008. We’ll reflect on the objectives tobe considered in improving the performance of SQL Server instances, we will tackle themostly used techniques for analyzing and optimizing queries and we will describe the“Optimize for ad hoc workloads”, “Plan Freezing” and “Optimize for unknown" newoptions, accompanied by relevant code examples.

  11. Downloading Multiple Records Using Query Strings

    Directory of Open Access Journals (Sweden)

    Adam Crymble

    2012-11-01

    Full Text Available Downloading a single record from a website is easy, but downloading many records at a time – an increasingly frequent need for a historian – is much more efficient using a programming language such as Python. In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer. This process involves interpreting and manipulating URL Query Strings. In this case, the tutorial will seek to download sources that contain references to people of African descent that were published in the Old Bailey Proceedings between 1700 and 1750.

  12. A web-based approach to data imputation

    KAUST Repository

    Li, Zhixu

    2013-10-24

    In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, Webput utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, for improving the accuracy and efficiency of WebPut. Moreover, several optimization techniques are also proposed to reduce the cost of estimating the confidence of imputation queries at both the tuple-level and the database-level. Experiments based on several real-world data collections demonstrate not only the effectiveness of WebPut compared to existing approaches, but also the efficiency of our proposed algorithms and optimization techniques. © 2013 Springer Science+Business Media New York.

  13. WebVR: an interactive web browser for virtual environments

    Science.gov (United States)

    Barsoum, Emad; Kuester, Falko

    2005-03-01

    The pervasive nature of web-based content has lead to the development of applications and user interfaces that port between a broad range of operating systems and databases, while providing intuitive access to static and time-varying information. However, the integration of this vast resource into virtual environments has remained elusive. In this paper we present an implementation of a 3D Web Browser (WebVR) that enables the user to search the internet for arbitrary information and to seamlessly augment this information into virtual environments. WebVR provides access to the standard data input and query mechanisms offered by conventional web browsers, with the difference that it generates active texture-skins of the web contents that can be mapped onto arbitrary surfaces within the environment. Once mapped, the corresponding texture functions as a fully integrated web-browser that will respond to traditional events such as the selection of links or text input. As a result, any surface within the environment can be turned into a web-enabled resource that provides access to user-definable data. In order to leverage from the continuous advancement of browser technology and to support both static as well as streamed content, WebVR uses ActiveX controls to extract the desired texture skin from industry strength browsers, providing a unique mechanism for data fusion and extensibility.

  14. Federal Procurement Data System - FAADS API

    Data.gov (United States)

    General Services Administration — The Federal Award Assistance Data System (FAADS) has been re-engineered as a real-time federal enterprise information system. Web services based on SOAP and XML,...

  15. Federal Procurement Data System - FPDS API

    Data.gov (United States)

    General Services Administration — The Federal Procurement Data System (FPDS) Next Generation has been re-engineered as a real-time federal enterprise information system. Web services based on SOAP...

  16. Harnessing the Deep Web: Present and Future

    OpenAIRE

    Madhavan, Jayant; Afanasiev, Loredana; Antova, Lyublena; Halevy, Alon

    2009-01-01

    Over the past few years, we have built a system that has exposed large volumes of Deep-Web content to Google.com users. The content that our system exposes contributes to more than 1000 search queries per-second and spans over 50 languages and hundreds of domains. The Deep Web has long been acknowledged to be a major source of structured data on the web, and hence accessing Deep-Web content has long been a problem of interest in the data management community. In this paper, we report on where...

  17. A few examples go a long way: Constructing query models from elaborate query formulations

    NARCIS (Netherlands)

    Balog, K.; Weerkamp, W.; de Rijke, M.; Myaeng, S.-H.; Oard, D.W.; Sebastiani, F.; Chua, T.-S.; Leong, M.-K.

    2008-01-01

    We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the

  18. Query-Time Optimization Techniques for Structured Queries in Information Retrieval

    Science.gov (United States)

    Cartright, Marc-Allen

    2013-01-01

    The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…

  19. Writing and querying MapReduce views in CouchDB

    CERN Document Server

    Holt, Bradley

    2011-01-01

    If you want to use CouchDB to support real-world applications, you'll need to create MapReduce views that let you query this document-oriented database for meaningful data. With this short and concise ebook, you'll learn how to create a variety of MapReduce views to help you query and aggregate data in CouchDB's large, distributed datasets. You'll get step-by-step instructions and lots of sample code to create and explore several MapReduce views through the course of the book, using an example database you construct. To work with these different views, you'll learn how to use the Futon web a

  20. ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP

    Science.gov (United States)

    Chatziantoniou, Damianos; Sotiropoulos, Yannis

    Modern data analysis has given birth to numerous grouping constructs and programming paradigms, way beyond the traditional group by. Applications such as data warehousing, web log analysis, streams monitoring and social networks understanding necessitated the use of data cubes, grouping variables, windows and MapReduce. In this paper we review the associated set (ASSET) concept and discuss its applicability in both continuous and traditional data settings. Given a set of values B, an associated set over B is just a collection of annotated data multisets, one for each b(B. The goal is to efficiently compute aggregates over these data sets. An ASSET query consists of repeated definitions of associated sets and aggregates of these, possibly correlated, resembling a spreadsheet document. We review systems implementing ASSET queries both in continuous and persistent contexts and argue for associated sets' analytical abilities and optimization opportunities.

  1. GBA manager: an online tool for querying low-complexity regions in proteins.

    Science.gov (United States)

    Bandyopadhyay, Nirmalya; Kahveci, Tamer

    2010-01-01

    Abstract We developed GBA Manager, an online software that facilitates the Graph-Based Algorithm (GBA) we proposed in our earlier work. GBA identifies the low-complexity regions (LCR) of protein sequences. GBA exploits a similarity matrix, such as BLOSUM62, to compute the complexity of the subsequences of the input protein sequence. It uses a graph-based algorithm to accurately compute the regions that have low complexities. GBA Manager is a user friendly web-service that enables online querying of protein sequences using GBA. In addition to querying capabilities of the existing GBA algorithm, GBA Manager computes the p-values of the LCR identified. The p-value gives an estimate of the possibility that the region appears by chance. GBA Manager presents the output in three different understandable formats. GBA Manager is freely accessible at http://bioinformatics.cise.ufl.edu/GBA/GBA.htm .

  2. An Effective Framework for Distributed Geospatial Query Processing in Grids

    Directory of Open Access Journals (Sweden)

    CHEN, B.

    2010-08-01

    Full Text Available The emergence of Internet has greatly revolutionized the way that geospatial information is collected, managed, processed and integrated. There are several important research issues to be addressed for distributed geospatial applications. First, the performance of geospatial applications is needed to be considered in the Internet environment. In this regard, the Grid as an effective distributed computing paradigm is a good choice. The Grid uses a series of middleware to interconnect and merge various distributed resources into a super-computer with capability of high performance computation. Secondly, it is necessary to ensure the secure use of independent geospatial applications in the Internet environment. The Grid just provides the utility of secure access to distributed geospatial resources. Additionally, it makes good sense to overcome the heterogeneity between individual geospatial information systems in Internet. The Open Geospatial Consortium (OGC proposes a number of generalized geospatial standards e.g. OGC Web Services (OWS to achieve interoperable access to geospatial applications. The OWS solution is feasible and widely adopted by both the academic community and the industry community. Therefore, we propose an integrated framework by incorporating OWS standards into Grids. Upon the framework distributed geospatial queries can be performed in an interoperable, high-performance and secure Grid environment.

  3. Introducing glycomics data into the Semantic Web.

    Science.gov (United States)

    Aoki-Kinoshita, Kiyoko F; Bolleman, Jerven; Campbell, Matthew P; Kawano, Shin; Kim, Jin-Dong; Lütteke, Thomas; Matsubara, Masaaki; Okuda, Shujiro; Ranzinger, Rene; Sawaki, Hiromichi; Shikanai, Toshihide; Shinmachi, Daisuke; Suzuki, Yoshinori; Toukach, Philip; Yamada, Issaku; Packer, Nicolle H; Narimatsu, Hisashi

    2013-11-26

    Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans)a, which can, for example, serve as "switches" that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as "proofs-of-concept" to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.

  4. Secure count query on encrypted genomic data.

    Science.gov (United States)

    Hasan, Mohammad Zahidul; Mahdi, Md Safiur Rahman; Sadat, Md Nazmus; Mohammed, Noman

    2018-05-01

    Human genomic information can yield more effective healthcare by guiding medical decisions. Therefore, genomics research is gaining popularity as it can identify potential correlations between a disease and a certain gene, which improves the safety and efficacy of drug treatment and can also develop more effective prevention strategies [1]. To reduce the sampling error and to increase the statistical accuracy of this type of research projects, data from different sources need to be brought together since a single organization does not necessarily possess required amount of data. In this case, data sharing among multiple organizations must satisfy strict policies (for instance, HIPAA and PIPEDA) that have been enforced to regulate privacy-sensitive data sharing. Storage and computation on the shared data can be outsourced to a third party cloud service provider, equipped with enormous storage and computation resources. However, outsourcing data to a third party is associated with a potential risk of privacy violation of the participants, whose genomic sequence or clinical profile is used in these studies. In this article, we propose a method for secure sharing and computation on genomic data in a semi-honest cloud server. In particular, there are two main contributions. Firstly, the proposed method can handle biomedical data containing both genotype and phenotype. Secondly, our proposed index tree scheme reduces the computational overhead significantly for executing secure count query operation. In our proposed method, the confidentiality of shared data is ensured through encryption, while making the entire computation process efficient and scalable for cutting-edge biomedical applications. We evaluated our proposed method in terms of efficiency on a database of Single-Nucleotide Polymorphism (SNP) sequences, and experimental results demonstrate that the execution time for a query of 50 SNPs in a database of 50,000 records is approximately 5 s, where each record

  5. The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries

    Directory of Open Access Journals (Sweden)

    Apweiler Rolf

    2006-02-01

    Full Text Available Abstract Background With the vast amounts of biomedical data being generated by high-throughput analysis methods, controlled vocabularies and ontologies are becoming increasingly important to annotate units of information for ease of search and retrieval. Each scientific community tends to create its own locally available ontology. The interfaces to query these ontologies tend to vary from group to group. We saw the need for a centralized location to perform controlled vocabulary queries that would offer both a lightweight web-accessible user interface as well as a consistent, unified SOAP interface for automated queries. Results The Ontology Lookup Service (OLS was created to integrate publicly available biomedical ontologies into a single database. All modified ontologies are updated daily. A list of currently loaded ontologies is available online. The database can be queried to obtain information on a single term or to browse a complete ontology using AJAX. Auto-completion provides a user-friendly search mechanism. An AJAX-based ontology viewer is available to browse a complete ontology or subsets of it. A programmatic interface is available to query the webservice using SOAP. The service is described by a WSDL descriptor file available online. A sample Java client to connect to the webservice using SOAP is available for download from SourceForge. All OLS source code is publicly available under the open source Apache Licence. Conclusion The OLS provides a user-friendly single entry point for publicly available ontologies in the Open Biomedical Ontology (OBO format. It can be accessed interactively or programmatically at http://www.ebi.ac.uk/ontology-lookup/.

  6. Semantic Web Without SPARQL.pdf

    OpenAIRE

    Szekely, Pedro

    2016-01-01

    Discuss the creation of large Semantic Web applications with billions of triples. Instead of using a traditional SPARQL endpoint, our toolchain is a pure JSON toolchain using JSON-LD and ElasticSearch to support queries. The toolchain is familiar to all developers, does not require knowledge of Semantic Web technologies, and performance is 10X better than using SPARQL endpoints. The presentation illustrates the approach in the context of an application to fight human trafficking, using data f...

  7. Web Services and Data Enhancements at the Northern California Earthquake Data Center

    Science.gov (United States)

    Neuhauser, D. S.; Zuzlewski, S.; Lombard, P. N.; Allen, R. M.

    2013-12-01

    The Northern California Earthquake Data Center (NCEDC) provides data archive and distribution services for seismological and geophysical data sets that encompass northern California. The NCEDC is enhancing its ability to deliver rapid information through Web Services. NCEDC Web Services use well-established web server and client protocols and REST software architecture to allow users to easily make queries using web browsers or simple program interfaces and to receive the requested data in real-time rather than through batch or email-based requests. Data are returned to the user in the appropriate format such as XML, RESP, simple text, or MiniSEED depending on the service and selected output format. The NCEDC offers the following web services that are compliant with the International Federation of Digital Seismograph Networks (FDSN) web services specifications: (1) fdsn-dataselect: time series data delivered in MiniSEED format, (2) fdsn-station: station and channel metadata and time series availability delivered in StationXML format, (3) fdsn-event: earthquake event information delivered in QuakeML format. In addition, the NCEDC offers the the following IRIS-compatible web services: (1) sacpz: provide channel gains, poles, and zeros in SAC format, (2) resp: provide channel response information in RESP format, (3) dataless: provide station and channel metadata in Dataless SEED format. The NCEDC is also developing a web service to deliver timeseries from pre-assembled event waveform gathers. The NCEDC has waveform gathers for ~750,000 northern and central California events from 1984 to the present, many of which were created by the USGS NCSN prior to the establishment of the joint NCSS (Northern California Seismic System). We are currently adding waveforms to these older event gathers with time series from the UCB networks and other networks with waveforms archived at the NCEDC, and ensuring that the waveform for each channel in the event gathers have the highest

  8. DATA EXTRACTION AND LABEL ASSIGNMENT FOR WEB DATABASES

    OpenAIRE

    T. Rajesh; T. Prathap; S.Naveen Nambi; A.R. Arunachalam

    2015-01-01

    Deep Web contents are accessed by queries submitted to Web databases and the returned data records are en wrapped in dynamically generated Web pages (they will be called deep Web pages in this paper). The structured data that Extracting from deep Web pages is a challenging problem due to the underlying intricate structures of such pages. Until now, a too many number of techniques have been proposed to address this problem, but all of them have limitations because they are Web-page-programming...

  9. Linked data-as-a-service: The semantic web redeployed

    NARCIS (Netherlands)

    Rietveld, Laurens; Verborgh, Ruben; Beek, Wouter; Vander Sande, Miel; Schlobach, Stefan

    2015-01-01

    Ad-hoc querying is crucial to access information from Linked Data, yet publishing queryable RDF datasets on the Web is not a trivial exercise. The most compelling argument to support this claim is that the Web contains hundreds of thousands of data documents, while only 260 queryable SPARQL

  10. GMB: An Efficient Query Processor for Biological Data

    Directory of Open Access Journals (Sweden)

    Taha Kamal

    2011-06-01

    Full Text Available Bioinformatics applications manage complex biological data stored into distributed and often heterogeneous databases and require large computing power. These databases are too big and complicated to be rapidly queried every time a user submits a query, due to the overhead involved in decomposing the queries, sending the decomposed queries to remote databases, and composing the results. There is also considerable communication costs involved. This study addresses the mentioned problems in Grid-based environment for bioinformatics. We propose a Grid middleware called GMB that alleviates these problems by caching the results of Frequently Used Queries (FUQ. Queries are classified based on their types and frequencies. FUQ are answered from the middleware, which improves their response time. GMB acts as a gateway to TeraGrid Grid: it resides between users’ applications and TeraGrid Grid. We evaluate GMB experimentally.

  11. Evaluation of Sub Query Performance in SQL Server

    Science.gov (United States)

    Oktavia, Tanty; Sujarwo, Surya

    2014-03-01

    The paper explores several sub query methods used in a query and their impact on the query performance. The study uses experimental approach to evaluate the performance of each sub query methods combined with indexing strategy. The sub query methods consist of in, exists, relational operator and relational operator combined with top operator. The experimental shows that using relational operator combined with indexing strategy in sub query has greater performance compared with using same method without indexing strategy and also other methods. In summary, for application that emphasized on the performance of retrieving data from database, it better to use relational operator combined with indexing strategy. This study is done on Microsoft SQL Server 2012.

  12. Scale-Independent Relational Query Processing

    Science.gov (United States)

    Armbrust, Michael Paul

    2013-01-01

    An increasingly common pattern is for newly-released web applications to succumb to a "Success Disaster". In this scenario, overloaded database machines and resultant high response times destroy a previously good user experience, just as a site is becoming popular. Unfortunately, the data independence provided by a traditional relational…

  13. The SQL++ Query Language: Configurable, Unifying and Semi-structured

    OpenAIRE

    Ong, Kian Win; Papakonstantinou, Yannis; Vernoux, Romain

    2014-01-01

    NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, the many variations, and the lack of formal semantics inhibit deep understanding of the query languages, and also impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON native stores and SQL databases. The SQL++ sem...

  14. Can Internet search queries help to predict stock market volatility?

    OpenAIRE

    Dimpfl, Thomas; Jank, Stephan

    2011-01-01

    This paper studies the dynamics of stock market volatility and retail investor attention measured by internet search queries. We find a strong co-movement of stock market indices’ realized volatility and the search queries for their names. Furthermore, Granger causality is bi-directional: high searches follow high volatility, and high volatility follows high searches. Using the latter feedback effect to predict volatility we find that search queries contain additional information about market...

  15. REVIEW PAPER ON THE DEEP WEB DATA EXTRACTION

    OpenAIRE

    Prof. V. S. Patil*1, Miss Sneha Sitafale2, Miss Priyanka Kale3, Miss Poonam Bhujbal 4 , Miss Mohini Dandge 5 .

    2018-01-01

    Deep web data extraction is the process of extracting a set of data records and the items that they contain from a query result page. Such structured data can be later integrated into results from other data sources and given to the user in a single, cohesive view. Domain identification is used to identify the query interfaces related to the domain from the forms obtained in the search process. The surface web contains a large amount of unfiltered information, whereas the deep web includes hi...

  16. AQBE — QBE Style Queries for Archetyped Data

    Science.gov (United States)

    Sachdeva, Shelly; Yaginuma, Daigo; Chu, Wanming; Bhalla, Subhash

    Large-scale adoption of electronic healthcare applications requires semantic interoperability. The new proposals propose an advanced (multi-level) DBMS architecture for repository services for health records of patients. These also require query interfaces at multiple levels and at the level of semi-skilled users. In this regard, a high-level user interface for querying the new form of standardized Electronic Health Records system has been examined in this study. It proposes a step-by-step graphical query interface to allow semi-skilled users to write queries. Its aim is to decrease user effort and communication ambiguities, and increase user friendliness.

  17. VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans

    Science.gov (United States)

    Wang, Song; Gupta, Chetan; Mehta, Abhay

    There are data streams all around us that can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling of complex query plans with resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex DAG query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex directed acyclic graph (DAG) query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the "Chain" pipelining scheduling approach to complex DAG query plans.

  18. Multi-Dimensional Top-k Dominating Queries

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Mamoulis, Nikos

    2009-01-01

    The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top......-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate...

  19. PAQ: Persistent Adaptive Query Middleware for Dynamic Environments

    Science.gov (United States)

    Rajamani, Vasanth; Julien, Christine; Payton, Jamie; Roman, Gruia-Catalin

    Pervasive computing applications often entail continuous monitoring tasks, issuing persistent queries that return continuously updated views of the operational environment. We present PAQ, a middleware that supports applications' needs by approximating a persistent query as a sequence of one-time queries. PAQ introduces an integration strategy abstraction that allows composition of one-time query responses into streams representing sophisticated spatio-temporal phenomena of interest. A distinguishing feature of our middleware is the realization that the suitability of a persistent query's result is a function of the application's tolerance for accuracy weighed against the associated overhead costs. In PAQ, programmers can specify an inquiry strategy that dictates how information is gathered. Since network dynamics impact the suitability of a particular inquiry strategy, PAQ associates an introspection strategy with a persistent query, that evaluates the quality of the query's results. The result of introspection can trigger application-defined adaptation strategies that alter the nature of the query. PAQ's simple API makes developing adaptive querying systems easily realizable. We present the key abstractions, describe their implementations, and demonstrate the middleware's usefulness through application examples and evaluation.

  20. Group-by Skyline Query Processing in Relational Engines

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Luk, Ming-Hay; Lo, Eric

    2009-01-01

    the missing cost model for the BBS algorithm. Experimental results show that our techniques are able to devise the best query plans for a variety of group-by skyline queries. Our focus is on algorithms that can be directly implemented in today's commercial database systems without the addition of new access......The skyline operator was first proposed in 2001 for retrieving interesting tuples from a dataset. Since then, 100+ skyline-related papers have been published; however, we discovered that one of the most intuitive and practical type of skyline queries, namely, group-by skyline queries remains...

  1. Linked Registries: Connecting Rare Diseases Patient Registries through a Semantic Web Layer.

    Science.gov (United States)

    Sernadela, Pedro; González-Castro, Lorena; Carta, Claudio; van der Horst, Eelke; Lopes, Pedro; Kaliyaperumal, Rajaram; Thompson, Mark; Thompson, Rachel; Queralt-Rosinach, Núria; Lopez, Estrella; Wood, Libby; Robertson, Agata; Lamanna, Claudia; Gilling, Mette; Orth, Michael; Merino-Martinez, Roxana; Posada, Manuel; Taruscio, Domenica; Lochmüller, Hanns; Robinson, Peter; Roos, Marco; Oliveira, José Luís

    2017-01-01

    Patient registries are an essential tool to increase current knowledge regarding rare diseases. Understanding these data is a vital step to improve patient treatments and to create the most adequate tools for personalized medicine. However, the growing number of disease-specific patient registries brings also new technical challenges. Usually, these systems are developed as closed data silos, with independent formats and models, lacking comprehensive mechanisms to enable data sharing. To tackle these challenges, we developed a Semantic Web based solution that allows connecting distributed and heterogeneous registries, enabling the federation of knowledge between multiple independent environments. This semantic layer creates a holistic view over a set of anonymised registries, supporting semantic data representation, integrated access, and querying. The implemented system gave us the opportunity to answer challenging questions across disperse rare disease patient registries. The interconnection between those registries using Semantic Web technologies benefits our final solution in a way that we can query single or multiple instances according to our needs. The outcome is a unique semantic layer, connecting miscellaneous registries and delivering a lightweight holistic perspective over the wealth of knowledge stemming from linked rare disease patient registries.

  2. Web corpus construction

    CERN Document Server

    Schafer, Roland

    2013-01-01

    The World Wide Web constitutes the largest existing source of texts written in a great variety of languages. A feasible and sound way of exploiting this data for linguistic research is to compile a static corpus for a given language. There are several adavantages of this approach: (i) Working with such corpora obviates the problems encountered when using Internet search engines in quantitative linguistic research (such as non-transparent ranking algorithms). (ii) Creating a corpus from web data is virtually free. (iii) The size of corpora compiled from the WWW may exceed by several orders of magnitudes the size of language resources offered elsewhere. (iv) The data is locally available to the user, and it can be linguistically post-processed and queried with the tools preferred by her/him. This book addresses the main practical tasks in the creation of web corpora up to giga-token size. Among these tasks are the sampling process (i.e., web crawling) and the usual cleanups including boilerplate removal and rem...

  3. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories

    Science.gov (United States)

    Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

    2017-01-01

    To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution. PMID:29854239

  4. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories.

    Science.gov (United States)

    Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

    2017-01-01

    To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution.

  5. CytoscapeRPC: a plugin to create, modify and query Cytoscape networks from scripting languages.

    Science.gov (United States)

    Bot, Jan J; Reinders, Marcel J T

    2011-09-01

    CytoscapeRPC is a plugin for Cytoscape which allows users to create, query and modify Cytoscape networks from any programming language which supports XML-RPC. This enables them to access Cytoscape functionality and visualize their data interactively without leaving the programming environment with which they are familiar. Install through the Cytoscape plugin manager or visit the web page: http://wiki.nbic.nl/index.php/CytoscapeRPC for the user tutorial and download. j.j.bot@tudelft.nl; j.j.bot@tudelft.nl.

  6. Policy Analysis Screening System (PASS) demonstration: sample queries and terminal instructions

    Energy Technology Data Exchange (ETDEWEB)

    None

    1979-10-16

    This document contains the input and output for the Policy Analysis Screening System (PASS) demonstration. This demonstration is stored on a portable disk at the Environmental Impacts Division. Sample queries presented here include: (1) how to use PASS; (2) estimated 1995 energy consumption from Mid-Range Energy-Forecasting System (MEFS) data base; (3) pollution projections from Strategic Environmental Assessment System (SEAS) data base; (4) diesel auto regulations; (5) diesel auto health effects; (6) oil shale health and safety measures; (7) water pollution effects of SRC; (8) acid rainfall from Energy Environmental Statistics (EES) data base; 1990 EIA electric generation by fuel type; sulfate concentrations by Federal region; forecast of 1995 SO/sub 2/ emissions in Region III; and estimated electrical generating capacity in California to 1990. The file name for each query is included.

  7. NOAA's Data Catalog and the Federal Open Data Policy

    Science.gov (United States)

    Wengren, M. J.; de la Beaujardiere, J.

    2014-12-01

    The 2013 Open Data Policy Presidential Directive requires Federal agencies to create and maintain a 'public data listing' that includes all agency data that is currently or will be made publicly-available in the future. The directive requires the use of machine-readable and open formats that make use of 'common core' and extensible metadata formats according to the best practices published in an online repository called 'Project Open Data', to use open licenses where possible, and to adhere to existing metadata and other technology standards to promote interoperability. In order to meet the requirements of the Open Data Policy, the National Oceanic and Atmospheric Administration (NOAA) has implemented an online data catalog that combines metadata from all subsidiary NOAA metadata catalogs into a single master inventory. The NOAA Data Catalog is available to the public for search and discovery, providing access to the NOAA master data inventory through multiple means, including web-based text search, OGC CS-W endpoint, as well as a native Application Programming Interface (API) for programmatic query. It generates on a daily basis the Project Open Data JavaScript Object Notation (JSON) file required for compliance with the Presidential directive. The Data Catalog is based on the open source Comprehensive Knowledge Archive Network (CKAN) software and runs on the Amazon Federal GeoCloud. This presentation will cover topics including mappings of existing metadata in standard formats (FGDC-CSDGM and ISO 19115 XML ) to the Project Open Data JSON metadata schema, representation of metadata elements within the catalog, and compatible metadata sources used to feed the catalog to include Web Accessible Folder (WAF), Catalog Services for the Web (CS-W), and Esri ArcGIS.com. It will also discuss related open source technologies that can be used together to build a spatial data infrastructure compliant with the Open Data Policy.

  8. CYCLOSA: Decentralizing Private Web Search Through SGX-Based Browser Extensions

    OpenAIRE

    Pires, Rafael; Goltzsche, David; Mokhtar, Sonia Ben; Bouchenak, Sara; Boutet, Antoine; Felber, Pascal; Kapitza, Rüdiger; Pasin, Marcelo; Schiavoni, Valerio

    2018-01-01

    By regularly querying Web search engines, users (unconsciously) disclose large amounts of their personal data as part of their search queries, among which some might reveal sensitive information (e.g. health issues, sexual, political or religious preferences). Several solutions exist to allow users querying search engines while improving privacy protection. However, these solutions suffer from a number of limitations: some are subject to user re-identification attacks, while others lack scala...

  9. Web Mining

    Science.gov (United States)

    Fürnkranz, Johannes

    The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. This chapter provides a brief overview of web mining techniques and research areas, most notably hypertext classification, wrapper induction, recommender systems and web usage mining.

  10. Query-Driven Visualization and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Ruebel, Oliver; Bethel, E. Wes; Prabhat, Mr.; Wu, Kesheng

    2012-11-01

    This report focuses on an approach to high performance visualization and analysis, termed query-driven visualization and analysis (QDV). QDV aims to reduce the amount of data that needs to be processed by the visualization, analysis, and rendering pipelines. The goal of the data reduction process is to separate out data that is "scientifically interesting'' and to focus visualization, analysis, and rendering on that interesting subset. The premise is that for any given visualization or analysis task, the data subset of interest is much smaller than the larger, complete data set. This strategy---extracting smaller data subsets of interest and focusing of the visualization processing on these subsets---is complementary to the approach of increasing the capacity of the visualization, analysis, and rendering pipelines through parallelism. This report discusses the fundamental concepts in QDV, their relationship to different stages in the visualization and analysis pipelines, and presents QDV's application to problems in diverse areas, ranging from forensic cybersecurity to high energy physics.

  11. Similarity queries for temporal toxicogenomic expression profiles.

    Directory of Open Access Journals (Sweden)

    Adam A Smith

    2008-07-01

    Full Text Available We present an approach for answering similarity queries about gene expression time series that is motivated by the task of characterizing the potential toxicity of various chemicals. Our approach involves two key aspects. First, our method employs a novel alignment algorithm based on time warping. Our time warping algorithm has several advantages over previous approaches. It allows the user to impose fairly strong biases on the form that the alignments can take, and it permits a type of local alignment in which the entirety of only one series has to be aligned. Second, our method employs a relaxed spline interpolation to predict expression responses for unmeasured time points, such that the spline does not necessarily exactly fit every observed point. We evaluate our approach using expression time series from the Edge toxicology database. Our experiments show the value of using spline representations for sparse time series. More significantly, they show that our time warping method provides more accurate alignments and classifications than previous standard alignment methods for time series.

  12. Query by image example: The CANDID approach

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, P.M.; Cannon, M. [Los Alamos National Lab., NM (United States). Computer Research and Applications Group; Hush, D.R. [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Electrical and Computer Engineering

    1995-02-01

    CANDID (Comparison Algorithm for Navigating Digital Image Databases) was developed to enable content-based retrieval of digital imagery from large databases using a query-by-example methodology. A user provides an example image to the system, and images in the database that are similar to that example are retrieved. The development of CANDID was inspired by the N-gram approach to document fingerprinting, where a ``global signature`` is computed for every document in a database and these signatures are compared to one another to determine the similarity between any two documents. CANDID computes a global signature for every image in a database, where the signature is derived from various image features such as localized texture, shape, or color information. A distance between probability density functions of feature vectors is then used to compare signatures. In this paper, the authors present CANDID and highlight two results from their current research: subtracting a ``background`` signature from every signature in a database in an attempt to improve system performance when using inner-product similarity measures, and visualizing the contribution of individual pixels in the matching process. These ideas are applicable to any histogram-based comparison technique.

  13. On (dynamic) range minimum queries in external memory

    DEFF Research Database (Denmark)

    Arge, L.; Fischer, Johannes; Sanders, Peter

    2013-01-01

    We study the one-dimensional range minimum query (RMQ) problem in the external memory model. We provide the first space-optimal solution to the batched static version of the problem. On an instance with N elements and Q queries, our solution takes Θ(sort(N + Q)) = Θ( N+QB log M /B N+QB ) I...

  14. Dataflow Query Execution in a Parallel, Main-memory Environment

    NARCIS (Netherlands)

    Wilschut, A.N.; Apers, Peter M.G.

    In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step into the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others,

  15. Dataflow Query Execution in a Parallel Main-Memory Environment

    NARCIS (Netherlands)

    Wilschut, A.N.; Apers, Peter M.G.

    1991-01-01

    The performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results are a step in the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synchronization issues are identified

  16. On the Suitability of Skyline Queries for Data Exploration

    DEFF Research Database (Denmark)

    Chester, Sean; Mortensen, Michael Lind; Assent, Ira

    2014-01-01

    The skyline operator has been studied in database research for multi-criteria decision making. Until now the focus has been on the efficiency or accuracy of single queries. In practice, however, users are increasingly confronted with unknown data collections, where precise query formulation proves...

  17. Mining the SDSS SkyServer SQL queries log

    Science.gov (United States)

    Hirota, Vitor M.; Santos, Rafael; Raddick, Jordan; Thakar, Ani

    2016-05-01

    SkyServer, the Internet portal for the Sloan Digital Sky Survey (SDSS) astronomic catalog, provides a set of tools that allows data access for astronomers and scientific education. One of SkyServer data access interfaces allows users to enter ad-hoc SQL statements to query the catalog. SkyServer also presents some template queries that can be used as basis for more complex queries. This interface has logged over 330 million queries submitted since 2001. It is expected that analysis of this data can be used to investigate usage patterns, identify potential new classes of queries, find similar queries, etc. and to shed some light on how users interact with the Sloan Digital Sky Survey data and how scientists have adopted the new paradigm of e-Science, which could in turn lead to enhancements on the user interfaces and experience in general. In this paper we review some approaches to SQL query mining, apply the traditional techniques used in the literature and present lessons learned, namely, that the general text mining approach for feature extraction and clustering does not seem to be adequate for this type of data, and, most importantly, we find that this type of analysis can result in very different queries being clustered together.

  18. An Experimental Investigation of Complexity in Database Query Formulation Tasks

    Science.gov (United States)

    Casterella, Gretchen Irwin; Vijayasarathy, Leo

    2013-01-01

    Information Technology professionals and other knowledge workers rely on their ability to extract data from organizational databases to respond to business questions and support decision making. Structured query language (SQL) is the standard programming language for querying data in relational databases, and SQL skills are in high demand and are…

  19. Efficient processing of 3-sided range queries with probabilistic guarantees

    DEFF Research Database (Denmark)

    Kaporis, Alexis; Papadopoulos, Apostolos; Sioutas, Spyros

    2010-01-01

    This work studies the problem of 2-dimensional searching for the 3-sided range query of the form [a, b] x (-∞, c] in both main and external memory, by considering a variety of input distributions. A dynamic linear main memory solution is proposed, which answers 3-sided queries in O(log n + t) worst...

  20. Efficient external memory structures for range-aggregate queries

    DEFF Research Database (Denmark)

    Agarwal, P.K.; Yang, J.; Arge, L.

    2013-01-01

    We present external memory data structures for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: Given a set of weighted points in Rd, compute the aggregate of the weights of the points that lie inside a d-dimensional orthogonal query rectangle. The...

  1. Video Stream Retrieval of Unseen Queries using Semantic Memory

    NARCIS (Netherlands)

    Cappallo, S.; Mensink, T.; Snoek, C.G.M.; Wilson, R.C.; Hancock, E.R.; Smith, W.A.P.

    2016-01-01

    Retrieval of live, user-broadcast video streams is an under-addressed and increasingly relevant challenge. The on-line nature of the problem requires temporal evaluation and the unforeseeable scope of potential queries motivates an approach which can accommodate arbitrary search queries. To account

  2. Efficient processing of containment queries on nested sets

    NARCIS (Netherlands)

    Ibrahim, A.; Fletcher, G.H.L.

    2013-01-01

    We study the problem of computing containment queries on sets which can have both atomic and set-valued objects as elements, i.e., nested sets. Containment is a fundamental query pattern with many basic applications. Our study of nested set containment is motivated by the ubiquity of nested data in

  3. A Typed Text Retrieval Query Language for XML Documents.

    Science.gov (United States)

    Colazzo, Dario; Sartiani, Carlo; Albano, Antonio; Manghi, Paolo; Ghelli, Giorgio; Lini, Luca; Paoli, Michele

    2002-01-01

    Discussion of XML focuses on a description of Tequyla-TX, a typed text retrieval query language for XML documents that can search on both content and structures. Highlights include motivations; numerous examples; word-based and char-based searches; tag-dependent full-text searches; text normalization; query algebra; data models and term language;…

  4. Real SQL queries 50 challenges : practice for reporting and analysis

    CERN Document Server

    Cohen, Brian; Mishra, Neerja

    2015-01-01

    Queries improve when challenges are authentic. This book sets your learning on the fast track with realistic problems to solve. Topics span sales, marketing, human resources, purchasing, and production. Real SQL Queries: 50 Challenges is perfect for analysts, report writers, or anyone searching for a hands-on approach to learning SQL Server.

  5. A framework for query optimization to support data mining

    NARCIS (Netherlands)

    S.R. Choenni (Sunil); A.P.J.M. Siebes (Arno)

    1996-01-01

    textabstractIn order to extract knowledge from databases, data mining algorithms heavily query the databases. Inefficient processing of these queries will inevitably have its impact on the performance of these algorithms, making them less valuable. In this paper, we describe an optimization

  6. Aspects of Data Warehouse Technologies for Complex Web Data

    OpenAIRE

    Thomsen, Christian

    2008-01-01

    This thesis is about aspects of specification and development of datawarehouse technologies for complex web data. Today, large amounts of dataexist in different web resources and in different formats. But it is oftenhard to analyze and query the often big and complex data or data about thedata (i.e., metadata). It is therefore interesting to apply Data Warehouse(DW) technology to the data. But to apply DW technology to complex web datais not straightforward and the DW community faces new and ...

  7. A Fuzzy Query Mechanism for Human Resource Websites

    Science.gov (United States)

    Lai, Lien-Fu; Wu, Chao-Chin; Huang, Liang-Tsung; Kuo, Jung-Chih

    Users' preferences often contain imprecision and uncertainty that are difficult for traditional human resource websites to deal with. In this paper, we apply the fuzzy logic theory to develop a fuzzy query mechanism for human resource websites. First, a storing mechanism is proposed to store fuzzy data into conventional database management systems without modifying DBMS models. Second, a fuzzy query language is proposed for users to make fuzzy queries on fuzzy databases. User's fuzzy requirement can be expressed by a fuzzy query which consists of a set of fuzzy conditions. Third, each fuzzy condition associates with a fuzzy importance to differentiate between fuzzy conditions according to their degrees of importance. Fourth, the fuzzy weighted average is utilized to aggregate all fuzzy conditions based on their degrees of importance and degrees of matching. Through the mutual compensation of all fuzzy conditions, the ordering of query results can be obtained according to user's preference.

  8. Multiple k Nearest Neighbor Query Processing in Spatial Network Databases

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2006-01-01

    This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries...... for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case...... where an upper bound on k is known a priori and then extends the techniques to the case where this is not so. Based on empirical studies with real-world data, the paper offers insight into the circumstances under which the different proposed techniques can be used with advantage for multiple k nearest...

  9. Pareto-depth for multiple-query image retrieval.

    Science.gov (United States)

    Hsiao, Ko-Jen; Calder, Jeff; Hero, Alfred O

    2015-02-01

    Most content-based image retrieval systems consider either one single query, or multiple queries that include the same object or represent the same semantic information. In this paper, we consider the content-based image retrieval problem for multiple query images corresponding to different image semantics. We propose a novel multiple-query information retrieval algorithm that combines the Pareto front method with efficient manifold ranking. We show that our proposed algorithm outperforms state of the art multiple-query retrieval algorithms on real-world image databases. We attribute this performance improvement to concavity properties of the Pareto fronts, and prove a theoretical result that characterizes the asymptotic concavity of the fronts.

  10. Federal Holidays

    Data.gov (United States)

    Office of Personnel Management — Federal law (5 U.S.C. 6103) establishes the following public holidays for Federal employees. Please note that most Federal employees work on a Monday through Friday...

  11. A Novel Personalized Web Search Model

    Institute of Scientific and Technical Information of China (English)

    ZHU Zhengyu; XU Jingqiu; TIAN Yunyan; REN Xiang

    2007-01-01

    A novel personalized Web search model is proposed.The new system, as a middleware between a user and a Web search engine, is set up on the client machine. It can learn a user's preference implicitly and then generate the user profile automatically. When the user inputs query keywords, the system can automatically generate a few personalized expansion words by computing the term-term associations according to the current user profile, and then these words together with the query keywords are submitted to a popular search engine such as Yahoo or Google.These expansion words help to express accurately the user's search intention. The new Web search model can make a common search engine personalized, that is, the search engine can return different search results to different users who input the same keywords. The experimental results show the feasibility and applicability of the presented work.

  12. Processing SPARQL queries with regular expressions in RDF databases

    Science.gov (United States)

    2011-01-01

    Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns. PMID:21489225

  13. Processing SPARQL queries with regular expressions in RDF databases

    Directory of Open Access Journals (Sweden)

    Cho Hune

    2011-03-01

    Full Text Available Abstract Background As the Resource Description Framework (RDF data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf or Bio2RDF (bio2rdf.org, SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1 We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2 We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3 We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  14. Processing SPARQL queries with regular expressions in RDF databases.

    Science.gov (United States)

    Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

    2011-03-29

    As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users' requests for extracting information from the RDF data as well as the lack of users' knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  15. Macromolecular query language (MMQL): prototype data model and implementation.

    Science.gov (United States)

    Shindyalov, I N; Chang, W; Pu, C; Bourne, P E

    1994-11-01

    Macromolecular query language (MMQL) is an extensible interpretive language in which to pose questions concerning the experimental or derived features of the 3-D structure of biological macromolecules. MMQL portends to be intuitive with a simple syntax, so that from a user's perspective complex queries are easily written. A number of basic queries and a more complex query--determination of structures containing a five-strand Greek key motif--are presented to illustrate the strengths and weaknesses of the language. The predominant features of MMQL are a filter and pattern grammar which are combined to express a wide range of interesting biological queries. Filters permit the selection of object attributes, for example, compound name and resolution, whereas the patterns currently implemented query primary sequence, close contacts, hydrogen bonding, secondary structure, conformation and amino acid properties (volume, polarity, isoelectric point, hydrophobicity and different forms of exposure). MMQL queries are processed by MMQLlib; a C++ class library, to which new query methods and pattern types are easily added. The prototype implementation described uses PDBlib, another C(++)-based class library from representing the features of biological macromolecules at the level of detail parsable from a PDB file. Since PDBlib can represent data stored in relational and object-oriented databases, as well as PDB files, once these data are loaded they too can be queried by MMQL. Performance metrics are given for queries of PDB files for which all derived data are calculated at run time and compared to a preliminary version of OOPDB, a prototype object-oriented database with a schema based on a persistent version of PDBlib which offers more efficient data access and the potential to maintain derived information. MMQLlib, PDBlib and associated software are available via anonymous ftp from cuhhca.hhmi.columbia.edu.

  16. cPath: open source software for collecting, storing, and querying biological pathways

    Directory of Open Access Journals (Sweden)

    Gross Benjamin E

    2006-11-01

    Full Text Available Abstract Background Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. Results We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. Conclusion cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling.

  17. The OLAP-XML Federation System

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    We present the logical “OLAP-XML Federation System” that enables the external data available in XML format to be used as virtual dimensions. Unlike the complex and time-consuming physical integration of OLAP and external data in current OLAP systems, our system makes OLAP queries referencing fast...

  18. jQuery UI 1.7 the user interface library for jQuery

    CERN Document Server

    Wellman, Dan

    2009-01-01

    An example-based approach leads you step-by-step through the implementation and customization of each library component and its associated resources in turn. To emphasize the way that jQuery UI takes the difficulty out of user interface design and implementation, each chapter ends with a 'fun with' section that puts together what you've learned throughout the chapter to make a usable and fun page. In these sections you'll often get to experiment with the latest associated technologies like AJAX and JSON. This book is for front-end designers and developers who need to quickly learn how to use t

  19. How popular is waterpipe tobacco smoking? Findings from internet search queries.

    Science.gov (United States)

    Salloum, Ramzi G; Osman, Amira; Maziak, Wasim; Thrasher, James F

    2015-09-01

    Waterpipe tobacco smoking (WTS), a traditional tobacco consumption practice in the Middle East, is gaining popularity worldwide. Estimates of population-level interest in WTS over time are not documented. We assessed the popularity of WTS using World Wide Web search query results across four English-speaking countries. We analysed trends in Google search queries related to WTS, comparing these trends with those for electronic cigarettes between 2004 and 2013 in Australia, Canada, the UK and the USA. Weekly search volumes were reported as percentages relative to the week with the highest volume of searches. Web-based searches for WTS have increased steadily since 2004 in all four countries. Search volume for WTS was higher than for e-cigarettes in three of the four nations, with the highest volume in the USA. Online searches were primarily targeted at WTS products for home use, followed by searches for WTS cafés/lounges. Online demand for information on WTS-related products and venues is large and increasing. Given the rise in WTS popularity, increasing evidence of exposure-related harms, and relatively lax government regulation, WTS is a serious public health concern and could reach epidemic levels in Western societies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  20. The DEDUCE Guided Query tool: providing simplified access to clinical data for research and quality improvement.

    Science.gov (United States)

    Horvath, Monica M; Winfield, Stephanie; Evans, Steve; Slopek, Steve; Shang, Howard; Ferranti, Jeffrey

    2011-04-01

    In many healthcare organizations, comparative effectiveness research and quality improvement (QI) investigations are hampered by a lack of access to data created as a byproduct of patient care. Data collection often hinges upon either manual chart review or ad hoc requests to technical experts who support legacy clinical systems. In order to facilitate this needed capacity for data exploration at our institution (Duke University Health System), we have designed and deployed a robust Web application for cohort identification and data extraction--the Duke Enterprise Data Unified Content Explorer (DEDUCE). DEDUCE is envisioned as a simple, web-based environment that allows investigators access to administrative, financial, and clinical information generated during patient care. By using business intelligence tools to create a view into Duke Medicine's enterprise data warehouse, DEDUCE provides a Guided Query functionality using a wizard-like interface that lets users filter through millions of clinical records, explore aggregate reports, and, export extracts. Researchers and QI specialists can obtain detailed patient- and observation-level extracts without needing to understand structured query language or the underlying database model. Developers designing such tools must devote sufficient training and develop application safeguards to ensure that patient-centered clinical researchers understand when observation-level extracts should be used. This may mitigate the risk of data being misunderstood and consequently used in an improper fashion. Copyright © 2010 Elsevier Inc. All rights reserved.

  1. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.

  2. Query construction, entropy, and generalization in neural-network models

    Science.gov (United States)

    Sollich, Peter

    1994-05-01

    We study query construction algorithms, which aim at improving the generalization ability of systems that learn from examples by choosing optimal, nonredundant training sets. We set up a general probabilistic framework for deriving such algorithms from the requirement of optimizing a suitable objective function; specifically, we consider the objective functions entropy (or information gain) and generalization error. For two learning scenarios, the high-low game and the linear perceptron, we evaluate the generalization performance obtained by applying the corresponding query construction algorithms and compare it to training on random examples. We find qualitative differences between the two scenarios due to the different structure of the underlying rules (nonlinear and ``noninvertible'' versus linear); in particular, for the linear perceptron, random examples lead to the same generalization ability as a sequence of queries in the limit of an infinite number of examples. We also investigate learning algorithms which are ill matched to the learning environment and find that, in this case, minimum entropy queries can in fact yield a lower generalization ability than random examples. Finally, we study the efficiency of single queries and its dependence on the learning history, i.e., on whether the previous training examples were generated randomly or by querying, and the difference between globally and locally optimal query construction.

  3. Research in Mobile Database Query Optimization and Processing

    Directory of Open Access Journals (Sweden)

    Agustinus Borgy Waluyo

    2005-01-01

    Full Text Available The emergence of mobile computing provides the ability to access information at any time and place. However, as mobile computing environments have inherent factors like power, storage, asymmetric communication cost, and bandwidth limitations, efficient query processing and minimum query response time are definitely of great interest. This survey groups a variety of query optimization and processing mechanisms in mobile databases into two main categories, namely: (i query processing strategy, and (ii caching management strategy. Query processing includes both pull and push operations (broadcast mechanisms. We further classify push operation into on-demand broadcast and periodic broadcast. Push operation (on-demand broadcast relates to designing techniques that enable the server to accommodate multiple requests so that the request can be processed efficiently. Push operation (periodic broadcast corresponds to data dissemination strategies. In this scheme, several techniques to improve the query performance by broadcasting data to a population of mobile users are described. A caching management strategy defines a number of methods for maintaining cached data items in clients' local storage. This strategy considers critical caching issues such as caching granularity, caching coherence strategy and caching replacement policy. Finally, this survey concludes with several open issues relating to mobile query optimization and processing strategy.

  4. Chapter 18: Web-based Tools - NED VO Services

    Science.gov (United States)

    Mazzarella, J. M.; NED Team

    The NASA/IPAC Extragalactic Database (NED) is a thematic, web-based research facility in widespread use by scientists, educators, space missions, and observatory operations for observation planning, data analysis, discovery, and publication of research about objects beyond our Milky Way galaxy. NED is a portal into a systematic fusion of data from hundreds of sky surveys and tens of thousands of research publications. The contents and services span the entire electromagnetic spectrum from gamma rays through radio frequencies, and are continuously updated to reflect the current literature and releases of large-scale sky survey catalogs. NED has been on the Internet since 1990, growing in content, automation and services with the evolution of information technology. NED is the world's largest database of crossidentified extragalactic objects. As of December 2006, the system contains approximately 10 million objects and 15 million multi-wavelength cross-IDs. Over 4 thousand catalogs and published lists covering the entire electromagnetic spectrum have had their objects cross-identified or associated, with fundamental data parameters federated for convenient queries and retrieval. This chapter describes the interoperability of NED services with other components of the Virtual Observatory (VO). Section 1 is a brief overview of the primary NED web services. Section 2 provides a tutorial for using NED services currently available through the NVO Registry. The "name resolver" provides VO portals and related internet services with celestial coordinates for objects specified by catalog identifier (name); any alias can be queried because this service is based on the source cross-IDs established by NED. All major services have been updated to provide output in VOTable (XML) format that can be accessed directly from the NED web interface or using the NVO registry. These include access to images via SIAP, Cone- Search queries, and services providing fundamental, multi

  5. In-route skyline querying for location-based services

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Kristian S.

    2005-01-01

    With the emergence of an infrastructure for location-aware mobile services, the processing of advanced, location-based queries that are expected to underlie such services is gaining in relevance, While much work has assumed that users move in Euclidean space, this paper assumes that movement...... their efficient computation. The queries take into account several spatial preferences. and they intuitively return a set of most interesting results for each result returned by the corresponding non-skyline queries. The paper also covers a performance study of the proposed techniques based on real point...

  6. Intelligent query processing for semantic mediation of information systems

    Directory of Open Access Journals (Sweden)

    Saber Benharzallah

    2011-11-01

    Full Text Available We propose an intelligent and an efficient query processing approach for semantic mediation of information systems. We propose also a generic multi agent architecture that supports our approach. Our approach focuses on the exploitation of intelligent agents for query reformulation and the use of a new technology for the semantic representation. The algorithm is self-adapted to the changes of the environment, offers a wide aptitude and solves the various data conflicts in a dynamic way; it also reformulates the query using the schema mediation method for the discovered systems and the context mediation for the other systems.

  7. A new weighted fuzzy grammar on object oriented database queries

    Directory of Open Access Journals (Sweden)

    Ali Haroonabadi

    2012-08-01

    Full Text Available The fuzzy object oriented database model is often used to handle the existing imprecise and complicated objects for many real-world applications. The main focus of this paper is on fuzzy queries and tries to analyze a complicated and complex query to get more meaningful and closer responses. The method permits the user to provide the possibility of allocating the weight to various parts of the query, which makes it easier to follow better goals and return the target objects.

  8. Relaxing rdf queries based on user and domain preferences

    DEFF Research Database (Denmark)

    Dolog, Peter; Stueckenschmidt, Heiner; Wache, Holger

    2009-01-01

    Research in cooperative query answering is triggered by the observation that users are often not able to correctly formulate queries to databases such that they return the intended result. Due to lacking knowledge about the contents and the structure of a database, users will often only be able t...... application in the context of e-learning systems....... knowledge and user preferences. We describe a framework for information access that combines query refinement and relaxation in order to provide robust, personalized access to heterogeneous resource description framework data as well as an implementation in terms of rewriting rules and explain its...

  9. Blink and it's done: Interactive queries on very large data

    OpenAIRE

    Agarwal, Sameer; Iyer, Anand P.; Panda, Aurojit; Mozafari, Barzan; Stoica, Ion; Madden, Samuel R.

    2012-01-01

    In this demonstration, we present BlinkDB, a massively parallel, sampling-based approximate query processing framework for running interactive queries on large volumes of data. The key observation in BlinkDB is that one can make reasonable decisions in the absence of perfect answers. BlinkDB extends the Hive/HDFS stack and can handle the same set of SPJA (selection, projection, join and aggregate) queries as supported by these systems. BlinkDB provides real-time answers along with statistical...

  10. A semantic-based approach for querying linked data using natural language

    KAUST Repository

    Paredes-Valverde, Mario Andrés

    2016-01-11

    The semantic Web aims to provide to Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be only used by expert users. In this work, we propose a natural language interface that allows non-expert users the access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question\\'s structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  11. A semantic-based approach for querying linked data using natural language

    KAUST Repository

    Paredes-Valverde, Mario André s; Valencia-Garcí a, Rafael; Rodriguez-Garcia, Miguel Angel; Colomo-Palacios, Ricardo; Alor-Herná ndez, Giner

    2016-01-01

    The semantic Web aims to provide to Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be only used by expert users. In this work, we propose a natural language interface that allows non-expert users the access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  12. ALGORITMA RC4 DALAM PROTEKSI TRANSMISI DAN HASIL QUERY UNTUK ORDBMS POSTGRESQL

    Directory of Open Access Journals (Sweden)

    Yuri Ariyanto

    2009-01-01

    Full Text Available In this research will be worked through about how cryptography RC4's algorithm implementation in protection to query result and of query, security by encryption and descryption up to both is in network. Implementation of this research which is build software in client that function access databases that is placed by the side of server. Software that building to have facility for encryption and descryption query result and of query that is sent from client goes to server and. transmission query result and of query can secure its security. Well guaranted transmission security him of query result and of query can be told to succeed if success software can encryption query result and of query which transmission so that in the event of scanning to both, scanning will not understand data content. Conclusion of this research that is woke up software succeed encryption query and result of query which transmission between application of client and of server databases. Abstract in Bahasa Indonesia: Pada penelitian ini dibahas mengenai bagaimana mengimplementasikan algoritma kriptografi RC4 dalam proteksi terhadap query dan hasil query, pengamanan dilakukan dengan cara melakukan enkripsi dan dekripsi selama keduanya berada di dalam jaringan. Pengimplementasian dari penelitian ini yaitu membangun sebuah software yang akan diletakkan di sisi client yang berfungsi mengakses database yang diletakkan di sisi server. Software yang dibangun memiliki fasilitas untuk mengenkripsi dan mendektipsi query dan hasil query yang dikirimkan dari client ke server dan juga sebaliknya. Dengan demikian tramsmisi query dan hasil query dapat terjamin keamanannya.Terjaminnya keamanan transmisi query dan hasil query dapat dikatakan berhasil jika software berhasil mengenkripsi query dan hasil query yang ditransmisikan sehingga apabila terjadi penyadapan terhadap keduanya, penyadap tidak akan mengerti isi data tersebut. Kesimpulan dari penelitian ini yaitu software yang dibangun

  13. Building FLOW Federating Libraries on the Web

    CERN Document Server

    Keller-Gold, A; Le Meur, Jean-Yves; Baldridge, K K

    2002-01-01

    Individuals, teams, organizations, and networks can be thought of as tiers or classes within the complex grid of technology and practice in which research documentation is both consumed and generated. The panoply of possible classes share with the others a common need for document management tools and practices. The distinctive document management tools and practices used within each represent boundaries across which information could flow openly if technology and metadata standards were to provide an accessible digital framework. The CERN Document Server (CDS), implemented by a research partnership at the San Diego Supercomputer Center (SDSC), establishes a prototype tiered repository system for such a panoply. Research suggests modifications to enable cross-domain information flow and is represented as a metadata grid.

  14. Technologies for conceptual modelling and intelligent query formulation

    CSIR Research Space (South Africa)

    Alberts, R

    2008-11-01

    Full Text Available The aim of the project is to devise and evaluate algorithms, methodologies, techniques and interaction paradigms to build a tool for conceptual modelling and query management of complex data repositories based on a framework with solid formal...

  15. External Data Structures for Shortest Path Queries on Planar Digraphs

    DEFF Research Database (Denmark)

    Arge, Lars; Toma, Laura

    2005-01-01

    In this paper we present space-query trade-offs for external memory data structures that answer shortest path queries on planar directed graphs. For any S = Ω(N 1 + ε) and S = O(N2/B), our main result is a family of structures that use S space and answer queries in O(N2/ S B) I/Os, thus obtaining...... optimal space-query product O(N2/B). An S space structure can be constructed in O(√S · sort(N)) I/Os, where sort(N) is the number of I/Os needed to sort N elements, B is the disk block size, and N is the size of the graph....

  16. An Approach to Assist Designers With Their Queries and Designs

    DEFF Research Database (Denmark)

    Ahmed, Saeema

    2006-01-01

    Recent research investigating how engineers search for information has concluded that engineering designers acquire assistance when formulating queries. An approach to assist designers with their queries is presented. This approach forms part of a knowledge management system, where indexed...... documents are entered into the system (or are automatically indexed by tools within a system). The method builds up a network based upon indices assigned to documents. The network (or chunk) is presented back to the user once a search for knowledge has been completed. The network is build up as indexed...... documents are entered in to a knowledge-based system and is generated dynamically. The network can be used to assist a designer in searching for information; reformulating a query and; to prompt design tasks. This paper presents an approach to prompt designers with their design queries, along with some...

  17. An introduction to XML query processing and keyword search

    CERN Document Server

    Lu, Jiaheng

    2013-01-01

    This book systematically and comprehensively covers the latest advances in XML data searching. It presents an extensive overview of the current query processing and keyword search techniques on XML data.

  18. Determinacy in Static Analysis of jQuery

    DEFF Research Database (Denmark)

    Andreasen, Esben; Møller, Anders

    2014-01-01

    Static analysis for JavaScript can potentially help programmers find errors early during development. Although much progress has been made on analysis techniques, a major obstacle is the prevalence of libraries, in particular jQuery, which apply programming patterns that have detrimental conseque......Static analysis for JavaScript can potentially help programmers find errors early during development. Although much progress has been made on analysis techniques, a major obstacle is the prevalence of libraries, in particular jQuery, which apply programming patterns that have detrimental...... present a static dataflow analysis for JavaScript that infers and exploits determinacy information on-the-fly, to enable analysis of some of the most complex parts of jQuery. The techniques are implemented in the TAJS analysis tool and evaluated on a collection of small programs that use jQuery. Our...

  19. Web archives

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    This article deals with general web archives and the principles for selection of materials to be preserved. It opens with a brief overview of reasons why general web archives are needed. Section two and three present major, long termed web archive initiatives and discuss the purposes and possible...... values of web archives and asks how to meet unknown future needs, demands and concerns. Section four analyses three main principles in contemporary web archiving strategies, topic centric, domain centric and time-centric archiving strategies and section five discuss how to combine these to provide...... a broad and rich archive. Section six is concerned with inherent limitations and why web archives are always flawed. The last sections deal with the question how web archives may fit into the rapidly expanding, but fragmented landscape of digital repositories taking care of various parts...

  20. Matching health information seekers' queries to medical terms.

    Science.gov (United States)

    Soualmia, Lina F; Prieur-Gaston, Elise; Moalla, Zied; Lecroq, Thierry; Darmoni, Stéfan J

    2012-01-01

    The Internet is a major source of health information but most seekers are not familiar with medical vocabularies. Hence, their searches fail due to bad query formulation. Several methods have been proposed to improve information retrieval: query expansion, syntactic and semantic techniques or knowledge-based methods. However, it would be useful to clean those queries which are misspelled. In this paper, we propose a simple yet efficient method in order to correct misspellings of queries submitted by health information seekers to a medical online search tool. In addition to query normalizations and exact phonetic term matching, we tested two approximate string comparators: the similarity score function of Stoilos and the normalized Levenshtein edit distance. We propose here to combine them to increase the number of matched medical terms in French. We first took a sample of query logs to determine the thresholds and processing times. In the second run, at a greater scale we tested different combinations of query normalizations before or after misspelling correction with the retained thresholds in the first run. According to the total number of suggestions (around 163, the number of the first sample of queries), at a threshold comparator score of 0.3, the normalized Levenshtein edit distance gave the highest F-Measure (88.15%) and at a threshold comparator score of 0.7, the Stoilos function gave the highest F-Measure (84.31%). By combining Levenshtein and Stoilos, the highest F-Measure (80.28%) is obtained with 0.2 and 0.7 thresholds respectively. However, queries are composed by several terms that may be combination of medical terms. The process of query normalization and segmentation is thus required. The highest F-Measure (64.18%) is obtained when this process is realized before spelling-correction. Despite the widely known high performance of the normalized edit distance of Levenshtein, we show in this paper that its combination with the Stoilos algorithm improved