WorldWideScience

Sample records for continuous range queries

  1. Range-clustering queries

    DEFF Research Database (Denmark)

    Abrahamsen, Mikkel; de Berg, Mark; Buchin, Kevin

    2017-01-01

    an optimal k-clustering for S P ∩ Q. We obtain the following results. • We present a general method to compute a (1 + ϵ)-approximation to a range-clustering query, where ϵ > 0 is a parameter that can be specified as part of the query. Our method applies to a large class of clustering problems, including k...

  2. Cooperative Scalable Moving Continuous Query Processing

    DEFF Research Database (Denmark)

    Li, Xiaohui; Karras, Panagiotis; Jensen, Christian S.

    2012-01-01

    A range of applications call for a mobile client to continuously monitor others in close proximity. Past research on such problems has covered two extremes: It has offered totally centralized solutions, where a server takes care of all queries, and totally distributed solutions, in which...... there is no central authority at all. Unfortunately, none of these two solutions scales to intensive moving object tracking applications, where each client poses a query. In this paper, we formulate the moving continuous query (MCQ) problem and propose a balanced model where servers cooperatively take care...... and computation cost for both servers and clients. An experimental study demonstrates that our approaches offer better scalability than competitors...

  3. Modelling Spatio-Temporal Relevancy in Urban Context-Aware Pervasive Systems Using Voronoi Continuous Range Query and Multi-Interval Algebra

    Directory of Open Access Journals (Sweden)

    Najmeh Neysani Samany

    2013-01-01

    Space and time are two dominant factors in context-aware pervasive systems which determine whether an entity is related to the moving user or not. This paper specifically addresses the use of spatio-temporal relations for detecting spatio-temporally relevant contexts to the user. The main contribution of this work is that the proposed model is sensitive to the velocity and direction of the user and applies customized Multi Interval Algebra (MIA) with Voronoi Continuous Range Query (VCRQ) to introduce spatio-temporally relevant contexts according to their arrangement in space. In this implementation the Spatio-Temporal Relevancy Model for Context-Aware Systems (STRMCAS) helps the tourist to find his/her preferred areas that are spatio-temporally relevant. The experimental results in a scenario of tourist navigation are evaluated with respect to the accuracy of the model, performance time and satisfaction of users in 30 iterations of the algorithm. The evaluation process demonstrated the efficiency of the model in real-world applications.

  4. Efficient processing of 3-sided range queries with probabilistic guarantees

    DEFF Research Database (Denmark)

    Kaporis, Alexis; Papadopoulos, Apostolos; Sioutas, Spyros

    2010-01-01

    case time and scales with O (log log n) expected with high probability update time, under continuous μ-random distributions of the x and y coordinates, where n is the current number of stored points and t is the size of the query output. Our expected update bound constitutes a considerable improvement...... probability and an update time of O(log log n) expected with high probability, under the assumption that the x-coordinates are continuously drawn from a smooth distribution and the y-coordinates are continuously drawn from a more restricted class of distributions. The total space is linear. Finally, we......This work studies the problem of 2-dimensional searching for the 3-sided range query of the form [a, b] x (-∞, c] in both main and external memory, by considering a variety of input distributions. A dynamic linear main memory solution is proposed, which answers 3-sided queries in O(log n + t) worst...
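
As a point of reference for the query form [a, b] x (-∞, c] described above, here is a minimal Python sketch. It is not the distribution-sensitive structure from the paper: it simply keeps the points sorted by x, binary-searches the x-interval, and filters on y. The class name `ThreeSidedIndex` and its methods are hypothetical.

```python
import bisect
from typing import List, Tuple

Point = Tuple[float, float]


class ThreeSidedIndex:
    """Static baseline for 3-sided queries [a, b] x (-inf, c]."""

    def __init__(self, points: List[Point]) -> None:
        # Sort once by x (ties broken by y) so the x-interval is a contiguous slice.
        self.points = sorted(points)
        self.xs = [p[0] for p in self.points]

    def query(self, a: float, b: float, c: float) -> List[Point]:
        """Report every point (x, y) with a <= x <= b and y <= c."""
        lo = bisect.bisect_left(self.xs, a)   # first index with x >= a
        hi = bisect.bisect_right(self.xs, b)  # first index with x > b
        return [p for p in self.points[lo:hi] if p[1] <= c]


index = ThreeSidedIndex([(1, 5), (2, 1), (3, 7), (4, 2), (6, 0)])
hits = index.query(a=2, b=5, c=2)
```

The query cost here is O(log n) plus the number of points in the x-interval, which can be far larger than the output size t; closing that gap is what the structures in the abstract are designed for.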

  5. Efficient external memory structures for range-aggregate queries

    DEFF Research Database (Denmark)

    Agarwal, P.K.; Yang, J.; Arge, L.

    2013-01-01

    We present external memory data structures for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: Given a set of weighted points in R^d, compute the aggregate of the weights of the points that lie inside a d-dimensional orthogonal query rectangle. The...
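
The problem definition above can be made concrete with a brute-force reference implementation. This is only a sketch of the problem statement, not of the external memory structures the paper presents; the function name `range_aggregate`, the closed-rectangle convention, and the default choice of `sum` as the aggregate are assumptions.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

WeightedPoint = Tuple[Sequence[float], float]  # (coordinates in R^d, weight)


def range_aggregate(
    points: List[WeightedPoint],
    lo: Sequence[float],
    hi: Sequence[float],
    aggregate: Callable[[Iterable[float]], float] = sum,
) -> float:
    """Aggregate the weights of points inside the closed box [lo, hi] (linear scan)."""
    inside = [
        weight
        for coords, weight in points
        if all(l <= x <= h for l, x, h in zip(lo, coords, hi))
    ]
    return aggregate(inside)


pts = [((1, 1), 2.0), ((2, 3), 5.0), ((4, 4), 1.0)]
total = range_aggregate(pts, lo=(0, 0), hi=(3, 3))
```

With the default `sum`, an empty query range yields 0; other aggregates such as `max` would need an explicit convention for the empty case.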

  6. On (dynamic) range minimum queries in external memory

    DEFF Research Database (Denmark)

    Arge, L.; Fischer, Johannes; Sanders, Peter

    2013-01-01

    We study the one-dimensional range minimum query (RMQ) problem in the external memory model. We provide the first space-optimal solution to the batched static version of the problem. On an instance with N elements and Q queries, our solution takes Θ(sort(N + Q)) = Θ(((N + Q)/B) log_{M/B}((N + Q)/B)) I...

  7. Evaluation of Content-Matched Range Monitoring Queries over Moving Objects in Mobile Computing Environments.

    Science.gov (United States)

    Jung, HaRim; Song, MoonBae; Youn, Hee Yong; Kim, Ung Mo

    2015-09-18

    A content-matched (CM) range monitoring query over moving objects continually retrieves the moving objects (i) whose non-spatial attribute values are matched to given non-spatial query values; and (ii) that are currently located within a given spatial query range. In this paper, we propose a new query indexing structure, called the group-aware query region tree (GQR-tree) for efficient evaluation of CM range monitoring queries. The primary role of the GQR-tree is to help the server leverage the computational capabilities of moving objects in order to improve the system performance in terms of the wireless communication cost and server workload. Through a series of comprehensive simulations, we verify the superiority of the GQR-tree method over the existing methods.
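
The two conditions (i) and (ii) that define a CM range monitoring query can be illustrated with a single evaluation pass. The sketch below is a naive server-side filter, not the GQR-tree; the `MovingObject` type, the rectangular query range, and the exact-match semantics for attributes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MovingObject:
    oid: str
    x: float
    y: float
    attrs: Dict[str, object] = field(default_factory=dict)


Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def cm_range_query(
    objects: List[MovingObject], query_attrs: Dict[str, object], rect: Rect
) -> List[str]:
    """Return ids of objects matching all query attributes and lying inside rect."""
    xmin, ymin, xmax, ymax = rect
    return [
        o.oid
        for o in objects
        # (i) every non-spatial query value must match the object's attribute
        if all(o.attrs.get(k) == v for k, v in query_attrs.items())
        # (ii) the object's current position must fall in the query range
        and xmin <= o.x <= xmax
        and ymin <= o.y <= ymax
    ]


fleet = [
    MovingObject("taxi1", 1.0, 1.0, {"type": "taxi"}),
    MovingObject("taxi2", 5.0, 5.0, {"type": "taxi"}),
    MovingObject("bus1", 1.0, 2.0, {"type": "bus"}),
]
matches = cm_range_query(fleet, {"type": "taxi"}, (0.0, 0.0, 3.0, 3.0))
```

A monitoring query would re-run this check whenever positions change; avoiding that repeated full scan and the associated communication is the motivation for query indexing structures such as the one proposed in the abstract.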

  8. Efficient continuously moving top-k spatial keyword query processing

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Jensen, Christian S.

    2011-01-01

    Web users and content are increasingly being geo-positioned. This development gives prominence to spatial keyword queries, which involve both the locations and textual descriptions of content. We study the efficient processing of continuously moving top-k spatial keyword (MkSK) queries over spatial...... keyword data. State-of-the-art solutions for moving queries employ safe zones that guarantee the validity of reported results as long as the user remains within a zone. However, existing safe zone methods focus solely on spatial locations and ignore text relevancy. We propose two algorithms for computing...

  9. Succinct Representations of Binary Trees for Range Minimum Queries

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Raman, Rajeev; Satti, Srinivasa

    2012-01-01

    We provide two succinct representations of binary trees that can be used to represent the Cartesian tree of an array A of size n. Both the representations take the optimal 2n + o(n) bits of space in the worst case and support range minimum queries (RMQs) in O(1) time. The first one is a modificat...
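
To show how a Cartesian tree relates to range minimum queries, here is a plain pointer-based Python sketch. It is not succinct and does not answer queries in O(1) time as the representations in the abstract do; it only demonstrates that the answer to RMQ(i, j) is the first node on the path from the root whose index lies in [i, j]. Function names and the leftmost-minimum tie rule are assumptions.

```python
from typing import List, Tuple


def build_cartesian_tree(a: List[int]) -> Tuple[int, List[int], List[int]]:
    """Build the min-Cartesian tree of a in O(n) with a stack.

    Returns (root, left, right), where left[i] / right[i] are child indices or -1.
    Ties are broken so that the leftmost minimum becomes the ancestor.
    """
    n = len(a)
    left, right = [-1] * n, [-1] * n
    stack: List[int] = []  # indices on the rightmost path, values non-decreasing
    for i in range(n):
        last = -1
        while stack and a[stack[-1]] > a[i]:
            last = stack.pop()
        left[i] = last            # popped subtree becomes the left child of i
        if stack:
            right[stack[-1]] = i  # i becomes the right child of the new stack top
        stack.append(i)
    root = stack[0] if stack else -1
    return root, left, right


def rmq(root: int, left: List[int], right: List[int], i: int, j: int) -> int:
    """Index of a minimum of a[i..j]; walks down from the root (O(depth) time)."""
    node = root
    while not (i <= node <= j):
        node = left[node] if j < node else right[node]
    return node


values = [3, 1, 4, 1, 5, 9, 2]
root, left, right = build_cartesian_tree(values)
pos = rmq(root, left, right, 4, 5)
```

Because the in-order traversal of the tree visits indices in increasing order, the tree shape alone determines every RMQ answer, which is why encoding only the tree (2n + o(n) bits) suffices.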

  10. Two Dimensional Range Minimum Queries and Fibonacci Lattices

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Davoodi, Pooya; Lewenstein, Moshe

    2012-01-01

    technique, the discrepancy properties of Fibonacci lattices, we give an indexing data structure for 2D-RMQs that uses O(N/c) bits additional space with O(c log c (log log c)^2) query time, for any parameter c, 4 ≤ c ≤ N. Also, when the entries of the input matrix are from {0,1}, we show that the query time can...

  11. CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks.

    Science.gov (United States)

    Dai, Hua; Ye, Qingqun; Yang, Geng; Xu, Jia; He, Ruiliang

    2016-02-20

    In recent years, we have seen many applications of secure query in two-tiered wireless sensor networks. Storage nodes are responsible for storing data from nearby sensor nodes and answering queries from Sink. It is critical to protect data security from a compromised storage node. In this paper, the Communication-efficient Secure Range Query (CSRQ), a privacy and integrity preserving range query protocol, is proposed to prevent attackers from gaining information of both data collected by sensor nodes and queries issued by Sink. To preserve privacy and integrity, in addition to employing the encoding mechanisms, a novel data structure called encrypted constraint chain is proposed, which embeds the information of integrity verification. Sink can use this encrypted constraint chain to verify the query result. The performance evaluation shows that CSRQ has lower communication cost than the current range query protocols.

  12. CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks

    Science.gov (United States)

    Dai, Hua; Ye, Qingqun; Yang, Geng; Xu, Jia; He, Ruiliang

    2016-01-01

    In recent years, we have seen many applications of secure query in two-tiered wireless sensor networks. Storage nodes are responsible for storing data from nearby sensor nodes and answering queries from Sink. It is critical to protect data security from a compromised storage node. In this paper, the Communication-efficient Secure Range Query (CSRQ)—a privacy and integrity preserving range query protocol—is proposed to prevent attackers from gaining information of both data collected by sensor nodes and queries issued by Sink. To preserve privacy and integrity, in addition to employing the encoding mechanisms, a novel data structure called encrypted constraint chain is proposed, which embeds the information of integrity verification. Sink can use this encrypted constraint chain to verify the query result. The performance evaluation shows that CSRQ has lower communication cost than the current range query protocols. PMID:26907293

  13. CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks

    Directory of Open Access Journals (Sweden)

    Hua Dai

    2016-02-01

    In recent years, we have seen many applications of secure query in two-tiered wireless sensor networks. Storage nodes are responsible for storing data from nearby sensor nodes and answering queries from Sink. It is critical to protect data security from a compromised storage node. In this paper, the Communication-efficient Secure Range Query (CSRQ), a privacy and integrity preserving range query protocol, is proposed to prevent attackers from gaining information of both data collected by sensor nodes and queries issued by Sink. To preserve privacy and integrity, in addition to employing the encoding mechanisms, a novel data structure called encrypted constraint chain is proposed, which embeds the information of integrity verification. Sink can use this encrypted constraint chain to verify the query result. The performance evaluation shows that CSRQ has lower communication cost than the current range query protocols.

  14. Data Structures: Sequence Problems, Range Queries, and Fault Tolerance

    DEFF Research Database (Denmark)

    Jørgensen, Allan Grønlund

    a certain function on the elements in a given query subsequence. There are many types of functions that have been considered in connection with input from different sources. The input could be ip-data sorted by ip-address, real estate prices sorted by zip code, advertising cost sorted by time etc. We consider...

  15. Linear-Space Data Structures for Range Mode Query in Arrays

    DEFF Research Database (Denmark)

    Chan, Timothy M.; Durocher, Stephane; Larsen, Kasper Green

    2012-01-01

    that supports range mode queries in O(sqrt(n / log n)) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt(n) cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt(n) by sqrt(n) matrices reduces to n range...
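
For readers unfamiliar with the problem, a range mode query asks for a most frequent element of a subarray. The sketch below is a brute-force baseline that scans the range on every query; it is not the linear-space O(sqrt(n / log n)) structure from the paper. The function name and the rule of returning the earliest-occurring mode on ties are assumptions.

```python
from collections import Counter
from typing import List, Tuple


def range_mode(a: List[int], i: int, j: int) -> Tuple[int, int]:
    """Return (mode, frequency) for a[i..j] inclusive, scanning the whole range."""
    window = a[i:j + 1]
    counts = Counter(window)
    best = max(counts.values())
    # Among elements with the maximum count, report the one that appears first.
    for x in window:
        if counts[x] == best:
            return x, best
    raise ValueError("empty range")


arr = [1, 2, 2, 3, 1, 1]
mode_all = range_mode(arr, 0, 5)
mode_mid = range_mode(arr, 1, 3)
```

Each query costs time linear in the range length, which is the bound the data structures in the abstract improve on.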

  16. Linear-Space Data Structures for Range Mode Query in Arrays

    DEFF Research Database (Denmark)

    Chan, Timothy M.; Durocher, Stephane; Larsen, Kasper Green

    2014-01-01

    that supports range mode queries in O(sqrt(n / log n)) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt(n) cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt(n) by sqrt(n) matrices reduces to n range...

  17. Efficient Verifiable Range and Closest Point Queries in Zero-Knowledge

    Directory of Open Access Journals (Sweden)

    Ghosh Esha

    2016-10-01

    We present an efficient method for answering one-dimensional range and closest-point queries in a verifiable and privacy-preserving manner. We consider a model where a data owner outsources a dataset of key-value pairs to a server, who answers range and closest-point queries issued by a client and provides proofs of the answers. The client verifies the correctness of the answers while learning nothing about the dataset besides the answers to the current and previous queries. Our work yields for the first time a zero-knowledge privacy assurance to authenticated range and closest-point queries. Previous work leaked the size of the dataset and used an inefficient proof protocol. Our construction is based on hierarchical identity-based encryption. We prove its security and analyze its efficiency both theoretically and with experiments on synthetic and real data (Enron email and Boston taxi datasets).

  18. Efficient Processing of Continuous Skyline Query over Smarter Traffic Data Stream for Cloud Computing

    Directory of Open Access Journals (Sweden)

    Wang Hanning

    2013-01-01

    The analysis and processing of multisource real-time transportation data streams lay a foundation for smart transportation's sensibility, interconnection, integration, and real-time decision making. The strong computing ability and effective mass data management provided by cloud computing make it feasible to handle continuous Skyline queries over massive, distributed, uncertain transportation data streams. In this paper, we give a layered architecture for smart transportation data processing and formalize the description of continuous Skyline queries over smart transportation data. In addition, we propose the mMR-SUDS algorithm (a Skyline query algorithm for uncertain transportation stream data based on micro-batch in MapReduce), which is based on sliding window division and the proposed architecture.

  19. Practical Forward-Secure Range and Sort Queries with Update-Oblivious Linked Lists

    Directory of Open Access Journals (Sweden)

    Blass Erik-Oliver

    2015-06-01

    We revisit the problem of privacy-preserving range search and sort queries on encrypted data in the face of an untrusted data store. Our new protocol RASP has several advantages over existing work. First, RASP strengthens privacy by ensuring forward security: after a query for range [a, b], any new record added to the data store is indistinguishable from random, even if the new record falls within range [a, b]. We are able to accomplish this using only traditional hash and block cipher operations, abstaining from expensive asymmetric cryptography and bilinear pairings. Consequently, RASP is highly practical, even for large database sizes. Additionally, we require only cloud storage and not a computational cloud like related works, which can reduce monetary costs significantly. At the heart of RASP, we develop a new update-oblivious bucket-based data structure. We allow for data to be added to buckets without leaking into which bucket it has been added. As long as a bucket is not explicitly queried, the data store does not learn anything about bucket contents. Furthermore, no information is leaked about data additions following a query. Besides formally proving RASP’s privacy, we also present a practical evaluation of RASP on Amazon Dynamo, demonstrating its efficiency and real-world applicability.

  20. Collusion-Aware Privacy-Preserving Range Query in Tiered Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Xiaoying Zhang

    2014-12-01

    Wireless sensor networks (WSNs) are indispensable building blocks for the Internet of Things (IoT). With the development of WSNs, privacy issues have drawn more attention. Existing work on the privacy-preserving range query mainly focuses on privacy preservation and integrity verification in two-tiered WSNs in the case of compromised master nodes, but neglects the damage of node collusion. In this paper, we propose a series of collusion-aware privacy-preserving range query protocols in two-tiered WSNs. To the best of our knowledge, this paper is the first to consider collusion attacks for a range query in tiered WSNs while fulfilling the preservation of privacy and integrity. To preserve the privacy of data and queries, we propose a novel encoding scheme to conceal sensitive information. To preserve the integrity of the results, we present a verification scheme using the correlation among data. In addition, two schemes are further presented to improve result accuracy and reduce communication cost. Finally, theoretical analysis and experimental results confirm the efficiency, accuracy and privacy of our proposals.

  1. Optimizing Cost of Continuous Overlapping Queries over Data Streams by Filter Adaption

    KAUST Repository

    Xie, Qing

    2016-01-12

    The problem we aim to address is the optimization of cost management for executing multiple continuous queries on data streams, where each query is defined by several filters, each of which monitors a certain status of the data stream. In particular, a filter can be shared by different queries and can be expensive to evaluate. The conventional objective for such a problem is to minimize the overall execution cost of solving all queries by planning the order of filter evaluation under a shared strategy. However, in a streaming scenario, the characteristics of data items may change during processing, which can bring some uncertainty to the outcome of individual filter evaluations and affect the plan of query execution as well as the overall execution cost. In our work, considering the influence of the uncertain variation of data characteristics, we propose a framework to deal with the dynamic adjustment of filter ordering for query execution on data streams, and focus on the issues of cost management. By incrementally monitoring and analyzing the results of filter evaluation, our proposed approach can adapt effectively to varied stream behavior and adjust the ordering of filter evaluation so as to optimize the execution cost. In order to achieve satisfactory performance and efficiency, we also discuss the trade-off between the adaptivity of our framework and the overhead incurred by filter adaption. The experimental results on synthetic and two real data sets (traffic and multimedia) show that our framework can effectively reduce and balance the overall query execution cost and keep high adaptivity in a streaming scenario.
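
The idea of re-ordering filters as observed outcomes drift can be sketched for the simplest case: one conjunctive query whose filters are assumed independent. The code below is not the framework from the paper (it ignores filter sharing across queries and the adaptivity/overhead trade-off). It swaps in the textbook rule of evaluating filters in increasing order of cost / (1 - pass rate), with pass rates estimated from running counts; the class and function names are hypothetical.

```python
from typing import Callable, List, Tuple


class AdaptiveFilter:
    """A filter with a fixed evaluation cost and a running pass-rate estimate."""

    def __init__(self, name: str, predicate: Callable[[int], bool], cost: float) -> None:
        self.name = name
        self.predicate = predicate
        self.cost = cost
        self.seen = 0
        self.passed = 0

    def pass_rate(self) -> float:
        # Laplace smoothing keeps the estimate strictly between 0 and 1.
        return (self.passed + 1) / (self.seen + 2)

    def rank(self) -> float:
        # Cheap filters that reject often get a low rank and run first.
        return self.cost / (1.0 - self.pass_rate())

    def evaluate(self, item: int) -> bool:
        ok = self.predicate(item)
        self.seen += 1
        self.passed += int(ok)
        return ok


def run_query(filters: List[AdaptiveFilter], item: int) -> Tuple[bool, float]:
    """Evaluate filters in current rank order; stop at the first rejection."""
    spent = 0.0
    for f in sorted(filters, key=AdaptiveFilter.rank):
        spent += f.cost
        if not f.evaluate(item):
            return False, spent
    return True, spent


filters = [
    AdaptiveFilter("always_pass", lambda item: True, cost=1),
    AdaptiveFilter("always_reject", lambda item: False, cost=1),
]
trace = [run_query(filters, item) for item in range(3)]
```

In this toy run the first item pays for both filters, after which the rejecting filter is promoted to the front and later items cost half as much. Note that only filters that actually run update their statistics, which is one source of the estimation error an adaptive scheme has to manage.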

  2. Scalable Continuous Range Monitoring of Moving Objects in Symbolic Indoor Space

    DEFF Research Database (Denmark)

    Yang, Bin; Lu, Hua; Jensen, Christian Søndergaard

    2009-01-01

    Indoor spaces accommodate large populations of individuals. The continuous range monitoring of such objects can be used as a foundation for a wide variety of applications, e.g., space planning, way finding, and security. Indoor space differs from outdoor space in that symbolic locations, e.g., rooms, rather than Euclidean positions or spatial network locations are important. In addition, positioning based on presence sensing devices, rather than, e.g., GPS, is assumed. Such devices report the objects in their activation ranges. We propose an incremental, query-aware continuous range query...... processing technique for objects moving in this setting. A set of critical devices is determined for each query, and only the observations from those devices are used to continuously maintain the query result. Due to the limitations of the positioning devices, queries contain certain and uncertain results...

  3. Performance of Point and Range Queries for In-memory Databases using Radix Trees on GPUs

    Energy Technology Data Exchange (ETDEWEB)

    Alam, Maksudul [ORNL; Yoginath, Srikanth B [ORNL; Perumalla, Kalyan S [ORNL

    2016-01-01

    In in-memory database systems augmented by hardware accelerators, accelerating the index searching operations can greatly increase the runtime performance of database queries. Recently, adaptive radix trees (ART) have been shown to provide very fast index search implementation on the CPU. Here, we focus on an accelerator-based implementation of ART. We present a detailed performance study of our GPU-based adaptive radix tree (GRT) implementation over a variety of key distributions, synthetic benchmarks, and actual keys from music and book data sets. The performance is also compared with other index-searching schemes on the GPU. GRT on modern GPUs achieves some of the highest rates of index searches reported in the literature. For point queries, a throughput of up to 106 million and 130 million lookups per second is achieved for sparse and dense keys, respectively. For range queries, GRT yields 600 million and 1000 million lookups per second for sparse and dense keys, respectively, on a large dataset of 64 million 32-bit keys.

  4. Manycore processing of repeated range queries over massive moving objects observations

    DEFF Research Database (Denmark)

    Lettich, Francesco; Orlando, Salvatore; Silvestri, Claudio

    2014-01-01

    processing we devise a hybrid CPU/GPU pipeline that compresses data output and saves query processing work. The devised system relies on an ad-hoc spatial index leading to a problem decomposition that results in a set of independent data-parallel tasks. The index is based on a point-region quadtree space...

  5. Superfund Query

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Superfund Query allows users to retrieve data from the Comprehensive Environmental Response, Compensation, and Liability Information System (CERCLIS) database.

  6. Smart query answering for marine sensor data.

    Science.gov (United States)

    Shahriar, Md Sumon; de Souza, Paulo; Timms, Greg

    2011-01-01

    We review existing query answering systems for sensor data. We then propose an extended query answering approach termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses a query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query is beneficial for building data and information systems for marine sensor networks.

  7. Smart Query Answering for Marine Sensor Data

    Directory of Open Access Journals (Sweden)

    Paulo de Souza

    2011-03-01

    We review existing query answering systems for sensor data. We then propose an extended query answering approach termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses a query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query is beneficial for building data and information systems for marine sensor networks.

  8. Scribble query

    DEFF Research Database (Denmark)

    Nielsen, Matthias; Elmqvist, Niklas; Grønbæk, Kaj

    2016-01-01

    The wide availability of touch-enabled devices is a unique opportunity for visualization research to invent novel techniques to fluently explore, analyse, and understand complex and large-scale data. In this paper, we introduce Scribble Query, a novel interaction technique for fluid freehand scri...... visualization with Scribble Query. The studies suggest that Scribble Query has a low entry barrier facilitating easy adoption, casual and infrequent usage, and in one case, enabled live dissemination of findings by the domain expert to managers in the organization....... scribbling (casual drawing) on touch-enabled devices to support interactive querying in data visualizations. Inspired by the low-entry yet rich interaction of touch drawing applications, a Scribble Query can be created with a single touch stroke yet have the expressiveness of multiple brushes (a...

  9. Query responses

    Directory of Open Access Journals (Sweden)

    Paweł Łupkowski

    2017-05-01

    In this article we consider the phenomenon of answering a query with a query. Although such answers are common, no large scale, corpus-based characterization exists, with the exception of clarification requests. After briefly reviewing different theoretical approaches on this subject, we present a corpus study of query responses in the British National Corpus and develop a taxonomy for query responses. We point at a variety of response categories that have not been formalized in previous dialogue work, particularly those relevant to adversarial interaction. We show that different response categories have significantly different rates of subsequent answer provision. We provide a formal analysis of the response categories in the framework of KoS.

  10. jQuery Mobile

    CERN Document Server

    Reid, Jon

    2011-01-01

    Native apps have distinct advantages, but the future belongs to mobile web apps that function on a broad range of smartphones and tablets. Get started with jQuery Mobile, the touch-optimized framework for creating apps that look and behave consistently across many devices. This concise book provides HTML5, CSS3, and JavaScript code examples, screen shots, and step-by-step guidance to help you build a complete working app with jQuery Mobile. If you're already familiar with the jQuery JavaScript library, you can use your existing skills to build cross-platform mobile web apps right now. This b

  11. A Survey on Efficient Power Consumption Method for Continuous Location-Based Spatial Queries in Mobile Environment

    Directory of Open Access Journals (Sweden)

    Vijay Kumar

    2014-07-01

    In today’s growing world, saving time and energy matters a great deal. Mobile devices are in common use, not only for calls but also for other purposes, e.g., finding a particular place in an unknown city. This saves both time and energy when searching for the place. Much research has been done in this regard, but existing approaches have problems such as the time and speed of searching for a location by mobile. Approach: This paper proposes an algorithm based on a circular location finder (CLF). Many algorithms are available, such as proxy-based location search for continuous near neighbor (CNN), estimated valid region (EVR), and estimated window vector (EWV) for region search. These are not efficient in terms of time and energy consumption. Results: Based on our study, the circular location finder (CLF) increases speed by approximately 68% and reduces the power consumed by the mobile application by a factor of 3. The CLF algorithm is efficient in both speed and power consumption.

  12. QUERY SUPPORT FOR GMZ

    Directory of Open Access Journals (Sweden)

    A. Khandelwal

    2017-07-01

    Generic text-based compression models are simple and fast but there are two issues that need to be addressed. They cannot leverage the structure that exists in data to achieve better compression and there is an unnecessary decompression step before the user can actually use the data. To address these issues, we came up with GMZ, a lossless compression model aimed at achieving high compression ratios. The decision to design GMZ (Khandelwal and Rajan, 2017) exclusively for GML's Simple Features Profile (SFP) seems fair because of the high use of SFP in WFS and because it facilitates high optimisation of the compression model. This is an extension of our work on GMZ. In a typical server-client model such as Web Feature Service, the server is the primary creator and provider of GML, and therefore requires compression and query capabilities. On the other hand, the client is the primary consumer of GML, and therefore requires decompression and visualisation capabilities. In the first part of our work, we demonstrated compression using a Python script that can be plugged into a server architecture, and decompression and visualisation in a web browser using a Firefox addon. The focus of this work is to develop the already existing tools to provide query capability to the server. Our model provides the ability to decompress individual features in isolation, which is an essential requirement for realising query in compressed state. We construct an R-Tree index for spatial data and a custom index for non-spatial data and store these in a separate index file to prevent altering the compression model. This facilitates independent use of the compressed GMZ file, where the index can be constructed when required. The focus of this work is the bounding-box or range query commonly used in webGIS with provision for other spatial and non-spatial queries. The decrement in compression ratios due to the new index file is in the range of 1–3 percent which is trivial considering

  13. Continuous magnetic refrigeration in the superfluid helium range

    International Nuclear Information System (INIS)

    Lacaze, Alain.

    1982-10-01

    An experimental prototype magnetic refrigerator based on the well known adiabatic demagnetization principle is described. A continuous process is employed in which gadolinium garnet follows successive magnetization-demagnetization cycles between a hot liquid helium source at 4.2K and a cold superfluid helium source at T [fr

  14. Instant jQuery selectors

    CERN Document Server

    De Rosa, Aurelio

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. Instant jQuery Selectors follows a simple how-to format with recipes aimed at making you well versed with the wide range of selectors that jQuery has to offer through a myriad of examples. Instant jQuery Selectors is for web developers who want to delve into jQuery from its very starting point: selectors. Even if you're already familiar with the framework and its selectors, you could find several tips and tricks that you aren't aware of, especially about performance and how jQuery ac

  15. Predecessor queries in dynamic integer sets

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting

    1997-01-01

    We consider the problem of maintaining a set of n integers in the range 0..2^w−1 under the operations of insertion, deletion, predecessor queries, minimum queries and maximum queries on a unit cost RAM with word size w bits. Let f (n) be an arbitrary nondecreasing smooth function satisfying n...
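
The five operations listed above can be pinned down with a simple baseline built on a sorted Python list. This is only an interface sketch, not the word-RAM structure from the paper: insertion and deletion cost O(n) here because of list shifting. The class name `SortedIntSet` and the convention that `predecessor(x)` returns the largest stored key strictly smaller than x are assumptions.

```python
import bisect
from typing import List, Optional


class SortedIntSet:
    """Dynamic integer set backed by a sorted list (baseline, not word-RAM optimal)."""

    def __init__(self) -> None:
        self._keys: List[int] = []

    def insert(self, x: int) -> None:
        i = bisect.bisect_left(self._keys, x)
        if i == len(self._keys) or self._keys[i] != x:
            self._keys.insert(i, x)  # O(n) shift; duplicates are ignored

    def delete(self, x: int) -> None:
        i = bisect.bisect_left(self._keys, x)
        if i < len(self._keys) and self._keys[i] == x:
            self._keys.pop(i)

    def predecessor(self, x: int) -> Optional[int]:
        """Largest stored key strictly smaller than x, or None."""
        i = bisect.bisect_left(self._keys, x)
        return self._keys[i - 1] if i > 0 else None

    def minimum(self) -> Optional[int]:
        return self._keys[0] if self._keys else None

    def maximum(self) -> Optional[int]:
        return self._keys[-1] if self._keys else None


s = SortedIntSet()
for key in (5, 1, 9):
    s.insert(key)
before = s.predecessor(6)
s.delete(5)
after = s.predecessor(6)
```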

  16. Approximate dictionary queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Gasieniec, Leszek

    1996-01-01

    Given a set of n binary strings of length m each, we consider the problem of answering d-queries. Given a binary query string of length m, a d-query is to report if there exists a string in the set within Hamming distance d of the query string. We present a data structure of size O(nm) supporting 1-queries in ti...
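
The definition of a d-query can be checked against a brute-force scan. The sketch below compares the query with every stored string in O(nm) time per query; it is not the data structure from the paper, and the function names are hypothetical.

```python
from typing import Iterable


def hamming(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(u, v))


def d_query(strings: Iterable[str], query: str, d: int) -> bool:
    """True if some stored string is within Hamming distance d of query."""
    return any(hamming(s, query) <= d for s in strings)


dictionary = ["0000", "1111", "1010"]
near = d_query(dictionary, "1000", 1)  # "0000" differs in one position
far = d_query(dictionary, "0101", 1)   # every stored string differs in >= 2 positions
```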

  17. The role of economics in the QUERI program: QUERI Series

    Directory of Open Access Journals (Sweden)

    Smith Mark W

    2008-04-01

    Full Text Available Abstract Background The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Conclusion Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  18. Optimizing Temporal Queries

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2003-01-01

    , these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the-art relational products. This paper presents an optimization technique that produces more efficient...... translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages....

  19. Condorcet query engine: A query engine for coordinated index terms

    NARCIS (Netherlands)

    van der Vet, P.E.; Mars, Nicolaas

    1999-01-01

    On-line information retrieval systems often offer their users some means to tune the query to match the level of granularity of the information request. Users can be offered a far greater range of possibilities, however, if documents are indexed with coordinated index concepts. Coordinated index

  20. Recommending Multidimensional Queries

    Science.gov (United States)

    Giacometti, Arnaud; Marcel, Patrick; Negre, Elsa

    Interactive analysis of datacube, in which a user navigates a cube by launching a sequence of queries is often tedious since the user may have no idea of what the forthcoming query should be in his current analysis. To better support this process we propose in this paper to apply a Collaborative Work approach that leverages former explorations of the cube to recommend OLAP queries. The system that we have developed adapts Approximate String Matching, a technique popular in Information Retrieval, to match the current analysis with the former explorations and help suggesting a query to the user. Our approach has been implemented with the open source Mondrian OLAP server to recommend MDX queries and we have carried out some preliminary experiments that show its efficiency for generating effective query recommendations.
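
    The idea of matching the current analysis against former explorations can be sketched with plain Levenshtein distance over sequences of query labels. This is an assumption-laden illustration: the abstract does not say which approximate string matching variant is used or how MDX queries are compared, and the names edit_distance and recommend_next are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (e.g. lists of query labels)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def recommend_next(current, past_sessions):
    """Suggest the query that followed the best-matching prefix of a past session."""
    best = None  # (distance, suggested query)
    for session in past_sessions:
        for k in range(1, len(session)):
            dist = edit_distance(current, session[:k])
            if best is None or dist < best[0]:
                best = (dist, session[k])
    return None if best is None else best[1]
```

    Queries are treated as opaque labels here; a real system would need a similarity measure between individual OLAP queries rather than exact equality.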

  1. Indexing for summary queries

    DEFF Research Database (Denmark)

    Yi, Ke; Wang, Lu; Wei, Zhewei

    2014-01-01

    returned by reporting queries. In this article, we design indexing techniques that allow for extracting a statistical summary of all the records in the query. The summaries we support include frequent items, quantiles, and various sketches, all of which are of central importance in massive data analysis....... Our indexes require linear space and extract a summary with the optimal or near-optimal query cost. We illustrate the efficiency and usefulness of our designs through extensive experiments and a system demonstration....

  2. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 12th International Conference on Flexible Query Answering Systems, FQAS 2017, held in London, UK, in June 2017. The 21 full papers presented in this book together with 4 short papers were carefully reviewed and selected from 43 submissions....... The papers cover the following topics: foundations of flexible querying; recommendation and ranking; technologies for flexible representations and querying; knowledge discovery and information/data retrieval; intuitionistic sets; and generalized net model....

  3. Unemployment Insurance Query (UIQ)

    Data.gov (United States)

    Social Security Administration — The Unemployment Insurance Query (UIQ) provides State Unemployment Insurance agencies real-time online access to SSA data. This includes SSN verification and Title...

  4. Mastering jQuery

    CERN Document Server

    Libby, Alex

    2015-01-01

    If you are a developer who is already familiar with using jQuery and wants to push your skill set further, then this book is for you. The book assumes an intermediate knowledge level of jQuery, JavaScript, HTML5, and CSS.

  5. Query complexity in expectation

    NARCIS (Netherlands)

    Kaniewski, J.; Lee, T.; de Wolf, R.; Halldórsson, M.M.; Iwama, K.; Kobayashi, N.; Speckmann, B.

    2015-01-01

    We study the query complexity of computing a function f: {0,1}^n → R_+ in expectation. This requires the algorithm on input x to output a nonnegative random variable whose expectation equals f(x), using as few queries to the input x as possible. We exactly characterize both the randomized and the quantum

  6. Querying Workflow Logs

    Directory of Open Access Journals (Sweden)

    Yan Tang

    2018-01-01

    Full Text Available A business process or workflow is an assembly of tasks that accomplishes a business goal. Business process management is the study of the design, configuration/implementation, enactment and monitoring, analysis, and re-design of workflows. The traditional methodology for the re-design and improvement of workflows relies on the well-known sequence of extract, transform, and load (ETL), data/process warehousing, and online analytical processing (OLAP) tools. In this paper, we study the ad hoc querying of process enactments for (data-centric) business processes, bypassing the traditional methodology for more flexibility in querying. We develop an algebraic query language based on “incident patterns” with four operators inspired from Business Process Model and Notation (BPMN) representation, allowing the user to formulate ad hoc queries directly over workflow logs. A formal semantics of this query language, a preliminary query evaluation algorithm, and a group of elementary properties of the operators are provided.

  7. Multiple k Nearest Neighbor Query Processing in Spatial Network Databases

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2006-01-01

    This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries...... for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case...... neighbor query processing....

  8. XML Multidimensional Modelling and Querying

    OpenAIRE

    Boucher, Serge; Verhaegen, Boris; Zimányi, Esteban

    2009-01-01

    As XML becomes ubiquitous and XML storage and processing becomes more efficient, the range of use cases for these technologies widens daily. One promising area is the integration of XML and data warehouses, where an XML-native database stores multidimensional data and processes OLAP queries written in the XQuery interrogation language. This paper explores issues arising in the implementation of such a data warehouse. We first compare approaches for multidimensional data modelling in XML, then...

  9. Querying metabolism under different physiological constraints.

    Science.gov (United States)

    Cakmak, Ali; Ozsoyoglu, Gultekin; Hanson, Richard W

    2010-04-01

    Metabolism is a representation of the biochemical principles that govern the production, consumption, degradation, and biosynthesis of metabolites in living cells. Organisms respond to changes in their physiological conditions or environmental perturbations (i.e. constraints) via cooperative implementation of such principles. Querying inner working principles of metabolism under different constraints provides invaluable insights for both researchers and educators. In this paper, we propose a metabolism query language (MQL) and discuss its query processing. MQL enables researchers to explore the behavior of the metabolism with a wide range of predicates including dietary and physiological condition specifications. The query results of MQL are enriched with both textual and visual representations, and its query processing is completely tailored based on the underlying metabolic principles.

  10. The Medical Query Language

    Science.gov (United States)

    Shusman, Daniel J.; Morgan, Mary M.; Zielstorff, Rita; Barnett, G. Octo

    1983-01-01

    The Medical Query Language (MQL) is an English-like query language with which a user with little or no training in programming or computer science can formulate and satisfy inquiries on data contained in his/her Standard MUMPS database. To date, major applications of MQL have been in the areas of quality assurance, medical research, and practice administration at sites using the COmputer STored Ambulatory Record (COSTAR) database system.

  11. A family of operators with discontinuous ranges and approximation and restoration of continuous functions

    Science.gov (United States)

    Khromov, A. P.; Khromova, G. V.

    2013-10-01

    A class of operators whose construction is similar to but simpler than the construction of Landau operators is the basis for the design of a modified family of operators with discontinuous ranges. This family has good approximation properties with respect to approximation and restoration problems for continuous functions.

  12. Code query by example

    Science.gov (United States)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  13. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    are organized in a general session train and a parallel special session track. The general session train covers the following topics: querying-answering systems; semantic technology; patterns and classification; personalization and recommender systems; searching and ranking; and Web and human......-computer interaction. The special track covers some specific and, typically, newer fields, namely: environmental scanning for strategic early warning; generating linguistic descriptions of data; advances in fuzzy querying and fuzzy databases: theory and applications; fusion and ensemble techniques for on......This book constitutes the refereed proceedings of the 10th International Conference on Flexible Query Answering Systems, FQAS 2013, held in Granada, Spain, in September 2013. The 59 full papers included in this volume were carefully reviewed and selected from numerous submissions. The papers...

  14. Robust Optimization of Database Queries

    Indian Academy of Sciences (India)

    JAYANT

    2011-07-06

    Jul 6, 2011 ... join order [ ((S R) C) or ((R C) S) ? ] join techniques [ Nested-Loops or Sort-Merge or Hash ? ] ○ DBMS query optimizer identifies the optimal evaluation strategy: “query execution plan”. July 2011. Robust Query Optimization (IASc Mid-year Meeting). 6 ...

  15. Collective spatial keyword querying

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2011-01-01

    With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the quer...

  16. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    are organized in a general session train and a parallel special session track. The general session train covers the following topics: querying-answering systems; semantic technology; patterns and classification; personalization and recommender systems; searching and ranking; and Web and human......This book constitutes the refereed proceedings of the 10th International Conference on Flexible Query Answering Systems, FQAS 2013, held in Granada, Spain, in September 2013. The 59 full papers included in this volume were carefully reviewed and selected from numerous submissions. The papers...

  17. Learning jQuery

    CERN Document Server

    Chaffer, Jonathan

    2013-01-01

    Step through each of the core concepts of the jQuery library, building an overall picture of its capabilities. Once you have thoroughly covered the basics, the book returns to each concept to cover more advanced examples and techniques. This book is for web designers who want to create interactive elements for their designs, and for developers who want to create the best user interface for their web applications. Basic JavaScript programming and knowledge of HTML and CSS is required. No knowledge of jQuery is assumed, nor is experience with any other JavaScript libraries.

  18. Efficient Approximate OLAP Querying Over Time Series

    DEFF Research Database (Denmark)

    Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang

    2016-01-01

    The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP...... queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume...... are either costly or require continuous maintenance. In this paper we propose an approach for approximate OLAP querying of time series that offers constant latency and is maintenance-free. To achieve this, we identify similarities between aggregation cuboids and propose algorithms that eliminate...

  19. Medical Query Language

    OpenAIRE

    Morgan, Mary M.; Beaman, Peter D.; Shusman, Daniel J.; Hupp, Jon A.; Zielstorff, Rita D.; Barnett, G. Octo

    1981-01-01

    This paper describes the Medical Query Language (MQL), a “formal” language which enables unsophisticated users, having no background in programming or computer science, to express information retrieval and analysis questions of their data bases. MQL is designed to access any MUMPS data base. Most MQL applications to date have dealt with the COmputer STored Ambulatory Record (COSTAR) data base.

  20. Spatial Keyword Querying

    DEFF Research Database (Denmark)

    Cao, Xin; Chen, Lisi; Cong, Gao

    2012-01-01

    The web is increasingly being used by mobile users. In addition, it is increasingly becoming possible to accurately geo-position mobile users and web content. This development gives prominence to spatial web data management. Specifically, a spatial keyword query takes a user location and user...

  1. Approximating terminological queries

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Van Harmelen, Frank

    2002-01-01

    Current proposals for languages to encode terminological knowledge in intelligent systems support logical reasoning for answering user queries about objects and classes. An application of these languages on the World Wide Web, however, is hampered by the limitations of logical reasoning in terms

  2. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim Mansour

    2017-05-07

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has generally focused on finite concept classes or toy problems, with a limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method while the second algorithm is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and to synthesize queries, afterward, via spectral decomposition. Moreover, I also describe how these algorithms can be extended to other settings, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distribution provides reliable predictions with a high probability. This algorithm is then justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this

  3. Capacitance variation measurement method with a continuously variable measuring range for a micro-capacitance sensor

    International Nuclear Information System (INIS)

    Lü, Xiaozhou; Xie, Kai; Xue, Dongfeng; Zhang, Feng; Qi, Liang; Tao, Yebo; Li, Teng; Bao, Weimin; Wang, Songlin; Li, Xiaoping; Chen, Renjie

    2017-01-01

    Micro-capacitance sensors are widely applied in industrial applications for the measurement of mechanical variations. The measurement accuracy of micro-capacitance sensors is highly dependent on the capacitance measurement circuit. To overcome the inability of commonly used methods to directly measure capacitance variation and deal with the conflict between the measurement range and accuracy, this paper presents a capacitance variation measurement method which is able to measure the output capacitance variation (relative value) of the micro-capacitance sensor with a continuously variable measuring range. We present the principles and analyze the non-ideal factors affecting this method. To implement the method, we developed a capacitance variation measurement circuit and carried out experiments to test the circuit. The result shows that the circuit is able to measure a capacitance variation range of 0–700 pF linearly with a maximum relative accuracy of 0.05% and a capacitance range of 0–2 nF (with a baseline capacitance of 1 nF) with a constant resolution of 0.03%. The circuit is proposed as a new method to measure capacitance and is expected to have applications in micro-capacitance sensors for measuring capacitance variation with a continuously variable measuring range. (paper)

  4. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment for the interoperability of corpus analysis systems, which lack a common protocol....... In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized...

  5. Precision improvement of frequency-modulated continuous-wave laser ranging system with two auxiliary interferometers

    Science.gov (United States)

    Shi, Guang; Wang, Wen; Zhang, Fumin

    2018-03-01

    The measurement precision of frequency-modulated continuous-wave (FMCW) laser distance measurement should be proportional to the scanning range of the tunable laser. However, the commercial external cavity diode laser (ECDL) is not an ideal tunable laser source in practical applications. Due to the unavoidable mode hopping and scanning nonlinearity of the ECDL, the measurement precision of FMCW laser distance measurements can be substantially affected. Therefore, an FMCW laser ranging system with two auxiliary interferometers is proposed in this paper. Moreover, to eliminate the effects of ECDL, the frequency-sampling method and mode hopping influence suppression method are employed. Compared with a fringe counting interferometer, this FMCW laser ranging system has a measuring error of ± 20 μm at the distance of 5.8 m.

  6. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets. Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit

  7. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    is motivated by an obvious need for users to survey huge volumes of objects in query answers. An ontology formalism and a special notion of "instantiated ontology" are introduced. The latter is a structure reflecting the content in the document collection in that it is a restriction of a general world......We present here an approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach...... knowledge ontology to the concepts instantiated in the collection. The notion of ontology-based similarity is briefly described, language constructs for direct navigation and retrieval of concepts in the ontology are discussed and approaches to conceptual summarization are presented....

  8. COMPLEX QUERY AND METADATA

    OpenAIRE

    Nakatoh, Tetsuya; Omori, Keisuke; Yamada, Yasuhiro; Hirokawa, Sachio

    2003-01-01

    We are developing a search system DAISEn which integrates multiple search engines and generates a metasearch engine automatically. The target search engines of DAISEn are not general search engines, but are search engines specialized in some area. Integration of such engines yields efficiency and quality. There are search engines of new type which accept complex query and return structured data. Integration of such search engines is much harder than that of simple search engines which accept ...

  9. Effective Density Queries of Continuously Moving Objects

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Lin, D.; Ooi, B.C.

    2006-01-01

    control system, we need to identify the places that are or would be affected by a traffic jam, and report this information to drivers so that they can choose a less congested route. As a naive way to solve the problem is prohibitively expensive, we first introduce a framework which makes the problem...

  10. Mastering jQuery mobile

    CERN Document Server

    Lambert, Chip

    2015-01-01

    You've started down the path of jQuery Mobile, now begin mastering some of jQuery Mobile's higher level topics. Go beyond jQuery Mobile's documentation and master one of the hottest mobile technologies out there. Previous JavaScript and PHP experience can help you get the most out of this book.

  11. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung

    2013-08-26

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model, query language, and query execution engine presented earlier. Deco's objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco's query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco's query optimizer include a cost model distinguishing between "free" existing data versus paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco's query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

  12. Instant Cassandra query language

    CERN Document Server

    Singh, Amresh

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. It's an Instant Starter guide. Instant Cassandra Query Language is great for those who are working with Cassandra databases and who want to either learn CQL to check data from the console or build serious applications using CQL. If you're looking for something that helps you get started with CQL in record time and you hate the idea of learning a new language syntax, then this book is for you.

  13. PAQ: Persistent Adaptive Query Middleware for Dynamic Environments

    Science.gov (United States)

    Rajamani, Vasanth; Julien, Christine; Payton, Jamie; Roman, Gruia-Catalin

    Pervasive computing applications often entail continuous monitoring tasks, issuing persistent queries that return continuously updated views of the operational environment. We present PAQ, a middleware that supports applications' needs by approximating a persistent query as a sequence of one-time queries. PAQ introduces an integration strategy abstraction that allows composition of one-time query responses into streams representing sophisticated spatio-temporal phenomena of interest. A distinguishing feature of our middleware is the realization that the suitability of a persistent query's result is a function of the application's tolerance for accuracy weighed against the associated overhead costs. In PAQ, programmers can specify an inquiry strategy that dictates how information is gathered. Since network dynamics impact the suitability of a particular inquiry strategy, PAQ associates an introspection strategy with a persistent query, that evaluates the quality of the query's results. The result of introspection can trigger application-defined adaptation strategies that alter the nature of the query. PAQ's simple API makes developing adaptive querying systems easily realizable. We present the key abstractions, describe their implementations, and demonstrate the middleware's usefulness through application examples and evaluation.
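
    The core idea of approximating a persistent query by a sequence of one-time queries, combined through an integration strategy, can be sketched as follows. None of these names come from PAQ's API, which the abstract does not show: persistent_query, union_integration, and their parameters are hypothetical, and the inquiry, introspection, and adaptation strategies are left out.

```python
import time


def persistent_query(one_time_query, integrate, rounds, interval_s=0.0):
    """Approximate a persistent query by repeating a one-time query.

    Yields the integrated view after each round.
    """
    view = None
    for _ in range(rounds):
        response = one_time_query()        # one snapshot of the environment
        view = integrate(view, response)   # fold it into the running view
        yield view
        if interval_s:
            time.sleep(interval_s)


def union_integration(view, response):
    """Hypothetical integration strategy: accumulate every item ever seen."""
    return set(response) if view is None else view | set(response)
```

    For example, feeding three snapshots {"a"}, {"b"}, {"a", "c"} through union_integration gives a final view containing all three items; a different strategy could instead keep only items seen in the most recent rounds.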

  14. The continual reassessment method: comparison of Bayesian stopping rules for dose-ranging studies.

    Science.gov (United States)

    Zohar, S; Chevret, S

    2001-10-15

    The continual reassessment method (CRM) provides a Bayesian estimation of the maximum tolerated dose (MTD) in phase I clinical trials and is also used to estimate the minimal efficacy dose (MED) in phase II clinical trials. In this paper we propose Bayesian stopping rules for the CRM, based on either posterior or predictive probability distributions that can be applied sequentially during the trial. These rules aim at early detection of either the mis-choice of dose range or a prefixed gain in the point estimate or accuracy of estimated probability of response associated with the MTD (or MED). They were compared through a simulation study under six situations that could represent the underlying unknown dose-response (either toxicity or failure) relationship, in terms of sample size, probability of correct selection and bias of the response probability associated to the MTD (or MED). Our results show that the stopping rules act correctly, with early stopping by using the two first rules based on the posterior distribution when the actual underlying dose-response relationship is far from that initially supposed, while the rules based on predictive gain functions provide a discontinuation of inclusions whatever the actual dose-response curve after 20 patients on average, that is, depending mostly on the accumulated data. The stopping rules were then applied to a data set from a dose-ranging phase II clinical trial aiming at estimating the MED dose of midazolam in the sedation of infants during cardiac catheterization. All these findings suggest the early use of the two first rules to detect a mis-choice of dose range, while they confirm the requirement of including at least 20 patients at the same dose to reach an accurate estimate of MTD (MED). A two-stage design is under study. Copyright 2001 John Wiley & Sons, Ltd.

  15. From Questions to Queries

    Directory of Open Access Journals (Sweden)

    M. Drlík

    2007-12-01

    Full Text Available The extension of (Internet) databases forces everyone to become more familiar with techniques of data storage and retrieval because users’ success often depends on their ability to pose the right questions and to be able to interpret their answers. University programs pay more attention to developing database programming skills than to data exploitation skills. To educate our students to become “database users”, the authors intensively exploit supportive tools simplifying the production of database elements as tables, queries, forms, reports, web pages, and macros. Video sequences demonstrating “standard operations” for completing them have been prepared to enhance out-of-classroom learning. The use of SQL and other professional tools is reduced to the cases when the wizards are unable to generate the intended construct.

  16. Research Issues in Mobile Querying

    DEFF Research Database (Denmark)

    Breunig, M.; Jensen, Christian Søndergaard; Klein, M.

    2004-01-01

    This document reports on key aspects of the discussions conducted within the working group. In particular, the document aims to offer a structured and somewhat digested summary of the group's discussions. The document first offers concepts that enable characterization of "mobile queries" as well...... as the types of systems that enable such queries. It explores the notion of context in mobile queries. The document ends with a few observations, mainly regarding challenges....

  17. The CMS DBS Query Language

    CERN Document Server

    Kuznetsov, Valentin; Afaq, Anzar; Sekhri, Vijay; Guo, Yuyi; Lueking, Lee

    2009-01-01

    The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to the underlying database. We will describe the design of the query system, provid...

  18. User perspectives on query difficulty

    DEFF Research Database (Denmark)

    Lioma, Christina; Larsen, Birger; Schütze, Hinrich

    2011-01-01

    The difficulty of a user query can affect the performance of Information Retrieval (IR) systems. What makes a query difficult and how one may predict this is an active research area, focusing mainly on factors relating to the retrieval algorithm, to the properties of the retrieval data...

  19. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return...

  20. jQuery Pocket Reference

    CERN Document Server

    Flanagan, David

    2010-01-01

    "As someone who uses jQuery on a regular basis, it was surprising to discover how much of the library I'm not using. This book is indispensable for anyone who is serious about using jQuery for non-trivial applications."-- Raffaele Cecco, longtime developer of video games, including Cybernoid, Exolon, and Stormlord jQuery is the "write less, do more" JavaScript library. Its powerful features and ease of use have made it the most popular client-side JavaScript framework for the Web. This book is jQuery's trusty companion: the definitive "read less, learn more" guide to the library. jQuery P

  1. jQuery UI cookbook

    CERN Document Server

    Boduch, Adam

    2013-01-01

    Filled with a practical collection of recipes, jQuery UI Cookbook is full of clear, step-by-step instructions that will help you harness the powerful UI framework in jQuery. Depending on your needs, you can dip in and out of the Cookbook and its recipes, or follow the book from start to finish. If you are a jQuery UI developer looking to improve your existing applications, extract ideas for your new application, or to better understand the overall widget architecture, then jQuery UI Cookbook is a must-have for you. The reader should at least have a rudimentary understanding of what jQuery UI is

  2. Query-Time Optimization Techniques for Structured Queries in Information Retrieval

    Science.gov (United States)

    Cartright, Marc-Allen

    2013-01-01

    The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…

  3. Top-k Spatial Preference Queries in Directed Road Networks

    Directory of Open Access Journals (Sweden)

    Muhammad Attique

    2016-09-01

    Full Text Available Top-k spatial preference queries rank objects based on the score of feature objects in their spatial neighborhood. Top-k preference queries are crucial for a wide range of location based services such as hotel browsing and apartment searching. In recent years, a lot of research has been conducted on processing of top-k spatial preference queries in Euclidean space. While a few algorithms study top-k preference queries in road networks, they all focus on undirected road networks. In this paper, we investigate the problem of processing the top-k spatial preference queries in directed road networks where each road segment has a particular orientation. Computation of data object scores requires examining the scores of each feature object in its spatial neighborhood. This may cause computational delay, resulting in a high query processing time. In this paper, we address this problem by proposing a pruning and grouping of feature objects to reduce the number of feature objects. Furthermore, we present an efficient algorithm called TOPS that can process top-k spatial preference queries in directed road networks. Experimental results indicate that our algorithm significantly reduces the query processing time compared to period solution for a wide range of problem settings.

  4. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit

    2017-05-02

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  5. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between...... the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  6. jQuery For Dummies

    CERN Document Server

    Beighley, Lynn

    2010-01-01

    Learn how jQuery can make your Web page or blog stand out from the crowd! jQuery is free, open source software that allows you to extend and customize Joomla!, Drupal, AJAX, and WordPress via plug-ins. Assuming no previous programming experience, Lynn Beighley takes you through the basics of jQuery from the very start. You'll discover how the jQuery library separates itself from other JavaScript libraries through its ease of use, compactness, and friendliness if you're a beginner programmer. Written in the easy-to-understand style of the For Dummies brand, this book demonstrates how you can a

  7. Schedule Sales Query Raw Data

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  8. Stability diagrams for continuous wide-range control of two mutually delay-coupled semiconductor lasers

    Science.gov (United States)

    Junges, Leandro; Gallas, Jason A. C.

    2015-05-01

    The dynamics of two mutually delay-coupled semiconductor lasers has been frequently studied experimentally, numerically, and analytically either for weak or strong detuning between the lasers. Here, we present a systematic numerical investigation spanning all detuning ranges. We report high-resolution stability diagrams for wide ranges of the main control parameters of the laser, as described by the Lang-Kobayashi model. In particular, we detail the parameter influence on dynamical performance and map the distribution of chaotic pulsations and self-generated periodic spiking with arbitrary periodicity. Special attention is given to the unfolding of regular pulse packages for both symmetric and non-symmetric configurations with respect to detuning. The influence of the delay time on the self-organization of periodic and chaotic laser phases as a function of the coupling and detuning is also described in detail.

  9. Graphical modeling and query language for hospitals.

    Science.gov (United States)

    Barzdins, Janis; Barzdins, Juris; Rencis, Edgars; Sostaks, Agris

    2013-01-01

    So far there has been little evidence that implementation of the health information technologies (HIT) is leading to health care cost savings. One of the reasons for this lack of impact by the HIT likely lies in the complexity of the business process ownership in the hospitals. The goal of our research is to develop a business model-based method for hospital use which would allow doctors to retrieve directly the ad-hoc information from various hospital databases. We have developed a special domain-specific process modelling language called the MedMod. Formally, we define the MedMod language as a profile on UML Class diagrams, but we also demonstrate it on examples, where we explain the semantics of all its elements informally. Moreover, we have developed the Process Query Language (PQL) that is based on MedMod process definition language. The purpose of PQL is to allow a doctor to query (filter) runtime data of the hospital's processes described using MedMod. The MedMod language tries to overcome deficiencies in existing process modeling languages by allowing the loosely defined sequence of steps performed in the clinical process to be specified. The main advantages of PQL are in two main areas - usability and efficiency. They are: 1) the view on data through "glasses" of familiar process, 2) the simple and easy-to-perceive means of setting filtering conditions require no more expertise than using spreadsheet applications, 3) the dynamic response to each step in construction of the complete query that shortens the learning curve greatly and reduces the error rate, and 4) the selected means of filtering and data retrieving allow queries to be executed in O(n) time regarding the size of the dataset. We are about to continue developing this project with three further steps. First, we are planning to develop user-friendly graphical editors for the MedMod process modeling and query languages. The second step is to do evaluation of the usability of the proposed language and tool

  10. Macromolecular query language (MMQL): prototype data model and implementation.

    Science.gov (United States)

    Shindyalov, I N; Chang, W; Pu, C; Bourne, P E

    1994-11-01

    Macromolecular query language (MMQL) is an extensible interpretive language in which to pose questions concerning the experimental or derived features of the 3-D structure of biological macromolecules. MMQL portends to be intuitive with a simple syntax, so that from a user's perspective complex queries are easily written. A number of basic queries and a more complex query--determination of structures containing a five-strand Greek key motif--are presented to illustrate the strengths and weaknesses of the language. The predominant features of MMQL are a filter and pattern grammar which are combined to express a wide range of interesting biological queries. Filters permit the selection of object attributes, for example, compound name and resolution, whereas the patterns currently implemented query primary sequence, close contacts, hydrogen bonding, secondary structure, conformation and amino acid properties (volume, polarity, isoelectric point, hydrophobicity and different forms of exposure). MMQL queries are processed by MMQLlib, a C++ class library, to which new query methods and pattern types are easily added. The prototype implementation described uses PDBlib, another C++-based class library for representing the features of biological macromolecules at the level of detail parsable from a PDB file. Since PDBlib can represent data stored in relational and object-oriented databases, as well as PDB files, once these data are loaded they too can be queried by MMQL. Performance metrics are given for queries of PDB files for which all derived data are calculated at run time and compared to a preliminary version of OOPDB, a prototype object-oriented database with a schema based on a persistent version of PDBlib which offers more efficient data access and the potential to maintain derived information. MMQLlib, PDBlib and associated software are available via anonymous ftp from cuhhca.hhmi.columbia.edu.

  11. Fingerprinting Keywords in Search Queries over Tor

    Directory of Open Access Journals (Sweden)

    Oh Se Eun

    2017-10-01

    Full Text Available Search engine queries contain a great deal of private and potentially compromising information about users. One technique to prevent search engines from identifying the source of a query, and Internet service providers (ISPs) from identifying the contents of queries, is to query the search engine over an anonymous network such as Tor.

  12. Mining Longitudinal Web Queries: Trends and Patterns.

    Science.gov (United States)

    Wang, Peiling; Berry, Michael W.; Yang, Yiheng

    2003-01-01

    Analyzed user queries submitted to an academic Web site during a four-year period, using a relational database, to examine users' query behavior, to identify problems they encounter, and to develop techniques for optimizing query analysis and mining. Linguistic analyses focus on query structures, lexicon, and word associations using statistical…

  13. MyBestQuery - A serious game to collect manual query reformulation

    OpenAIRE

    Chifu, Adrian-Gabriel; Molina, Serge; Mothe, Josiane

    2016-01-01

    This paper presents MyBestQuery, a serious game designed to collect query reformulations from players. Query reformulation is a hot topic in information retrieval and covers many aspects. One of them is query reformulation analysis which is based on users' session. It can be used to understand user's intent or to measure his satisfaction with regards to the results he obtained when querying the search engine. Automatic query reformulation is another aspect of query reformulation. It automatic...

  14. Head First jQuery

    CERN Document Server

    Benedetti, Ryan

    2011-01-01

    Want to add more interactivity and polish to your websites? Discover how jQuery can help you build complex scripting functionality in just a few lines of code. With Head First jQuery, you'll quickly get up to speed on this amazing JavaScript library by learning how to navigate HTML documents while handling events, effects, callbacks, and animations. By the time you've completed the book, you'll be incorporating Ajax apps, working seamlessly with HTML and CSS, and handling data with PHP, MySQL and JSON. If you want to learn-and understand-how to create interactive web pages, unobtrusive scrip

  15. Multi-Dimensional Path Queries

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    that connects a pair of paths. A path expression is a function that maps a set of path sets into a path set. Path sets can be joined, filtering conditions can restrict the set of qualifying paths, and aggregation functions can be applied to path elements. In particular, the aggregation function SET can be used...... to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments...

  16. Parameter Curation for Benchmark Queries

    NARCIS (Netherlands)

    Gubichev, Andrey; Boncz, Peter

    2014-01-01

    In this paper we consider the problem of generating parameters for benchmark queries so these have stable behavior despite being executed on datasets (real-world or synthetic) with skewed data distributions and value correlations. We show that uniform random sampling of the substitution parameters

  17. Automatically Preparing Safe SQL Queries

    Science.gov (United States)

    Bisht, Prithvi; Sistla, A. Prasad; Venkatakrishnan, V. N.

    We present the first sound program source transformation approach for automatically transforming the code of a legacy web application to employ PREPARE statements in place of unsafe SQL queries. Our approach therefore opens the way for eradicating the SQL injection threat vector from legacy web applications.
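
    The target pattern described above, replacing string-built SQL with prepared (parameterized) statements, can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module rather than the authors' transformation tool; the `users` table, the function names, and the injection payload are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Unsafe: user input is concatenated into the SQL text, so the input
    # can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_prepared(conn, name):
    # Parameterized: the SQL text is fixed and the input is bound as data,
    # so it cannot alter the query structure.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

    With the payload `x' OR '1'='1`, the unsafe version returns every row, while the parameterized version treats the payload as a literal name and returns no rows.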

  18. Fuzzy Querying: Issues and Perspectives..

    Czech Academy of Sciences Publication Activity Database

    Kacprzyk, J.; Pasi, G.; Vojtáš, Peter; Zadrozny, S.

    2000-01-01

    Vol. 36, No. 6 (2000), pp. 605-616 ISSN 0023-5954 Institutional research plan: AV0Z1030915 Keywords : flexible querying * information retrieval * fuzzy databases Subject RIV: BA - General Mathematics http://dml.cz/handle/10338.dmlcz/135376

  19. Enhancing Recall in Semantic Querying

    DEFF Research Database (Denmark)

    Rouces, Jacobo

    2013-01-01

    RDF and SPARQL are currently state-of-the-art W3C standards to respectively represent and query structured information, especially when information from different sources must be federated. However, there are various reasons for which the same knowledge can be modeled in RDF graphs that are both ...

  20. Querying Large Biological Network Datasets

    Science.gov (United States)

    Gulsoy, Gunhan

    2013-01-01

    New experimental methods have resulted in an increasing amount of genetic interaction data being generated every day. Biological networks are used to store the genetic interaction data gathered. The increasing amount of data available requires fast large-scale analysis methods. Therefore, we address the problem of querying large biological network datasets.…

  1. Design and analysis of stochastic DSS query optimizers in a distributed database system

    Directory of Open Access Journals (Sweden)

    Manik Sharma

    2016-07-01

    Full Text Available Query optimization is a stimulating task of any database system. A number of heuristics have been applied in recent times, which proposed new algorithms for substantially improving the performance of a query. The hunt for a better solution still continues. The imperishable developments in the field of Decision Support System (DSS) databases are presenting data at an exceptional rate. The massive volume of DSS data is consequential only when it can be accessed and analyzed by distinct researchers. Here, an innovative stochastic framework of DSS query optimizer is proposed to further optimize the design of existing query optimization genetic approaches. The results of Entropy Based Restricted Stochastic Query Optimizer (ERSQO) are compared with the results of Exhaustive Enumeration Query Optimizer (EAQO), Simple Genetic Query Optimizer (SGQO), Novel Genetic Query Optimizer (NGQO) and Restricted Stochastic Query Optimizer (RSQO). In terms of Total Costs, EAQO outperforms SGQO, NGQO, RSQO and ERSQO. However, stochastic approaches dominate in terms of runtime. The Total Costs produced by ERSQO are better than those of SGQO, NGQO and RSQO by 12%, 8% and 5% respectively. Moreover, the effect of replicating data on the Total Costs of DSS query is also examined. In addition, the statistical analysis revealed a 2-tailed significant correlation between the number of join operations and the Total Costs of distributed DSS query. Finally, in regard to the consistency of stochastic query optimizers, the results of SGQO, NGQO, RSQO and ERSQO are 96.2%, 97.2%, 97.45% and 97.8% consistent respectively.

  2. Optimizing Temporal Queries: Efficient Handling of Duplicates

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2001-01-01

    , these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the-art relational products. This paper presents an optimization technique that produces more efficient...... translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages....

  3. Querying Natural Logic Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker

    2017-01-01

    This paper describes the principles of a system applying natural logic as a knowledge base language. Natural logics are regimented fragments of natural language employing high level inference rules. We advocate the use of natural logic for knowledge bases dealing with querying of classes...... in ontologies and class-relationships such as are common in life-science descriptions. The paper adopts a version of natural logic with recursive restrictive clauses such as relative clauses and adnominal prepositional phrases. It includes passive as well as active voice sentences. We outline a prototype...

  4. Flexible Query Answering Systems 2006

    DEFF Research Database (Denmark)

    This volume constitutes the proceedings of the Seventh International Conference on Flexible Query Answering Systems, FQAS 2006, held in Milan, Italy, on June 7--10, 2006. FQAS is the premier conference for researchers and practitioners concerned with the vital task of providing easy, flexible......, and intuitive access to information for every type of need. This multidisciplinary conference draws on several research areas, including information retrieval, database management, information filtering, knowledge representation, soft computing, management of multimedia information, and human...... submissions, relating to the topic of users posing queries and systems producing answers. The papers cover the fields: Database Management, Information Retrieval, Domain Modeling, Knowledge Representation and Ontologies, Knowledge Discovery and Data Mining, Artificial Intelligence, Classical and Non...

  5. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    A new language is introduced for describing hypotheses about fluctuations of measurable properties in streams of timestamped data, and as a prime example, we consider trends of emotions in the constantly flowing stream of Twitter messages. The language, called EmoEpisodes, has a precise semantics...... that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return...... instantiations for topics and emotions as well as time intervals that provide the largest deflections in this measurement. Experiments are performed on a selection of Twitter data to demonstrate the usefulness of the approach....

  6. A Comparison between Boundary and Continuous Conduction Modes in Single Phase PFC Using 600V Range Devices

    DEFF Research Database (Denmark)

    Hernandez Botella, Juan Carlos; Petersen, Lars Press; Andersen, Michael A. E.

    2015-01-01

    This paper presents an analysis and comparison of boundary conduction mode (BCM) and continuous conduction mode (CCM) in single phase power factor correction (PFC) applications. The comparison is based on double pulse tester (DPT) characterization results of state-of-the-art superjunction devices...... in the 600V range. The measured switching energy is used to evaluate the devices performance in a conventional PFC. This data is used together with a mathematical model for prediction of the conducted electromagnetic interference (EMI). This allows comparing the different devices in BCM and CCM operation...... modes and evaluating the performance as a function of the PFC power density and efficiency....

  7. A comparison of peer-to-peer query response modes

    CERN Document Server

    Hoschek, W

    2002-01-01

    In a large distributed system spanning many administrative domains such as a Grid (Foster et al., 2001), it is desirable to maintain and query dynamic and timely information about active participants such as services, resources and user communities. However, in such a database system, the set of information tuples in the universe is partitioned over one or more distributed nodes, for reasons including autonomy, scalability, availability, performance and security. This suggests the use of peer-to-peer (P2P) query technology. A variety of query response modes can be used to return matching query results from P2P nodes to an originator. Although from the functional perspective all response modes are equivalent, no mode is optimal under all circumstances. Which query response modes allow to express suitable trade-offs for a wide range of P2P application? We answer this question by systematically describing and characterizing four query response modes for the unified peer-to-peer database framework (UPDF) proposed ...

  8. The effect of large decoherence on mixing time in continuous-time quantum walks on long-range interacting cycles

    Energy Technology Data Exchange (ETDEWEB)

    Salimi, S; Radgohar, R, E-mail: shsalimi@uok.ac.i, E-mail: r.radgohar@uok.ac.i [Faculty of Science, Department of Physics, University of Kurdistan, Pasdaran Ave, Sanandaj (Iran, Islamic Republic of)

    2010-01-28

    In this paper, we consider decoherence in continuous-time quantum walks on long-range interacting cycles (LRICs), which are the extensions of the cycle graphs. For this purpose, we use Gurvitz's model and assume that every node is monitored by the corresponding point-contact induced by the decoherence process. Then, we focus on large rates of decoherence and calculate the probability distribution analytically and obtain the lower and upper bounds of the mixing time. Our results prove that the mixing time is proportional to the rate of decoherence and the inverse of the square of the distance parameter (m). This shows that the mixing time decreases with increasing range of interaction. Also, what we obtain for m = 0 is in agreement with Fedichkin, Solenov and Tamon's results [48] for cycle, and we see that the mixing time of CTQWs on cycle improves with adding interacting edges.

  9. Colored Range Searching in Linear Space

    DEFF Research Database (Denmark)

    Grossi, Roberto; Vind, Søren Juhl

    2014-01-01

    In colored range searching, we are given a set of n colored points in d ≥ 2 dimensions to store, and want to support orthogonal range queries taking colors into account. In the colored range counting problem, a query must report the number of distinct colors found in the query range, while...
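
    The colored range counting problem stated above can be pinned down with a brute-force baseline. This is a minimal sketch of the problem definition only, not the linear-space data structure of the paper; the function name and the point representation (a coordinate tuple paired with a color) are assumptions made for illustration. It scans all n points per query.

```python
def colored_range_count(points, lo, hi):
    """Count distinct colors among the points inside the axis-aligned box.

    points: iterable of (coords, color), where coords is a d-tuple.
    lo, hi: d-tuples giving the inclusive lower and upper box corners.
    """
    colors = set()
    for coords, color in points:
        # A point is in the orthogonal range if every coordinate lies
        # within the corresponding [lo, hi] interval.
        if all(l <= c <= h for l, c, h in zip(lo, coords, hi)):
            colors.add(color)
    return len(colors)
```

    For example, with two red points and one blue point inside the query box, the answer is 2 rather than 3, since colors and not points are counted.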

  10. Identifying Aspects for Web-Search Queries

    OpenAIRE

    Wu, Fei; Madhavan, Jayant; Halevy, Alon

    2014-01-01

    Many web-search queries serve as the beginning of an exploration of an unknown space of information, rather than looking for a specific web page. To answer such queries effectively, the search engine should attempt to organize the space of relevant information in a way that facilitates exploration. We describe the Aspector system that computes aspects for a given query. Each aspect is a set of search queries that together represent a distinct information need relevant to the original search...

  11. Efficient and Flexible KNN Query Processing in Real-Life Road Networks

    DEFF Research Database (Denmark)

    Lu, Yang; Bui, Bin; Zhao, Jiakui

    2008-01-01

    Along with the developments of mobile services, effectively modeling road networks and efficiently indexing and querying network constrained objects has become a challenging problem. In this paper, we first introduce a road network model which captures real-life road networks better than previous...... models. Then, based on the proposed model, we propose a novel index named the RNG (Road Network Grid) index for accelerating KNN queries and continuous KNN queries over road network constrained data points. In contrast to conventional methods, speed limitations and blocking information of roads...... are included into the RNG index, which enables the index to support both distance-based and time-based KNN queries and continuous KNN queries. Our work extends previous ones by taking into account more practical scenarios, such as complexities in real-life road networks and time-based KNN queries. Extensive...
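
    The difference between distance-based and time-based KNN queries mentioned above can be sketched with a plain Dijkstra expansion over a road graph. This is not the RNG index from the paper: the graph encoding (edges carrying a length in km and a speed limit in km/h), the placement of data objects at nodes, and the function name are all assumptions made for illustration, and blocking information is not modeled.

```python
import heapq


def network_knn(graph, objects, source, k, metric="distance"):
    """Return up to k (object, cost) pairs nearest to source in the network.

    graph:   {node: [(neighbor, length_km, speed_kmh), ...]}
    objects: {node: [object_id, ...]}  (objects assumed to sit at nodes)
    metric:  "distance" uses edge length; "time" uses length / speed.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    result = []
    while heap and len(result) < k:
        cost, node = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        # Nodes are settled in non-decreasing cost order, so objects found
        # here are the next nearest ones.
        for obj in objects.get(node, []):
            if len(result) < k:
                result.append((obj, cost))
        for nxt, length, speed in graph.get(node, []):
            weight = length if metric == "distance" else length / speed
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return result
```

    On a graph where a short road is slow and a longer road is fast, the two metrics can return different nearest neighbors, which is the practical reason for supporting both query types.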

  12. How Good Are Query Optimizers, Really?

    NARCIS (Netherlands)

    Leis, Viktor; Gubichev, Andrey; Mirchev, Atanas; Boncz, Peter; Kemper, Alfons; Neumann, Thomas

    2016-01-01

    Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the

  13. Heuristics-based query optimisation for SPARQL

    NARCIS (Netherlands)

    P. Tsialiamanis (Petros); E. Sidirourgos (Eleftherios); I. Fundulaki; V. Christophides; P.A. Boncz (Peter)

    2012-01-01

    Query optimization in RDF Stores is a challenging problem as SPARQL queries typically contain many more joins than equivalent relational plans, and hence lead to a large join order search space. In such cases, cost-based query optimization often is not possible. One practical reason for

  14. How Good Are Query Optimizers, Really?

    NARCIS (Netherlands)

    V. Leis (Viktor); A. Gubichev (Andrey); A. Mirchev (Atanas); P.A. Boncz (Peter); T. Neumann (Thomas); A. Kemper (Alfons)

    2015-01-01

    Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We

  15. Web development with jQuery

    CERN Document Server

    York, Richard

    2015-01-01

    Newly revised and updated resource on jQuery's many features and advantages. Web Development with jQuery offers a major update to the popular Beginning JavaScript and CSS Development with jQuery from 2009. More than half of the content is new or updated, and reflects recent innovations with regard to mobile applications, jQuery mobile, and the spectrum of associated plugins. Readers can expect thorough revisions with expanded coverage of events, CSS, AJAX, animation, and drag and drop. New chapters bring developers up to date on popular features like jQuery UI, navigation, tables, interacti

  16. Energy-Efficient Query Management Scheme for a Wireless Sensor Database System

    Directory of Open Access Journals (Sweden)

    Guofang Nan

    2010-01-01

    Minimizing the communication overhead to reduce the energy consumption is an essential consideration in sensor network applications, and existing research has mostly concentrated on data aggregation and in-network processing. However, effective query management to optimize the query aggregation plan at the gateway side is also a significant approach to energy saving in practice. In this paper, we present a multiquery management framework to support historical and continuous queries, where the key idea is to reduce common tasks in a collection of queries through merging and aggregation, according to query region, attribute, time duration, and frequency, by executing the common subqueries only once. In this framework, we propose a query management scheme to support query partitioning, region aggregation and approximate processing, time partitioning and aggregation rules, multirate queries, and historical database. In order to validate the performance of our algorithm, a heuristic routing protocol is also described. The performance simulation results show that the overall energy consumption for forwarding and answering a collection of queries can be significantly reduced by applying our query management scheme. The advantages and disadvantages of the proposed scheme are discussed, together with open research issues.

  17. A Streams-Based Framework for Defining Location-Based Queries

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Xuegang, Huang

    2007-01-01

    An infrastructure is emerging that supports the delivery of on-line, location-enabled services to mobile users. Such services involve novel database queries, and the database research community is quite active in proposing techniques for the efficient processing of such queries. In parallel to this......, the management of data streams has become an active area of research. While most research in mobile services concerns performance issues, this paper aims to establish a formal framework for defining the semantics of queries encountered in mobile services, most notably the so-called continuous queries...... that are particularly relevant in this context. Rather than inventing an entirely new framework, the paper proposes a framework that builds on concepts from data streams and temporal databases. Definitions of example queries demonstrate how the framework enables clear formulation of query semantics and the comparison...

  18. Towards A Streams-Based Framework for Defining Location-Based Queries

    DEFF Research Database (Denmark)

    Huang, Xuegang; Jensen, Christian S.

    2004-01-01

    An infrastructure is emerging that supports the delivery of on-line, location-enabled services to mobile users. Such services involve novel database queries, and the database research community is quite active in proposing techniques for the efficient processing of such queries. In parallel...... to this, the management of data streams has become an active area of research. While most research in mobile services concerns performance issues, this paper aims to establish a formal framework for defining the semantics of queries encountered in mobile services, most notably the so-called continuous...... queries that are particularly relevant in this context. Rather than inventing an entirely new framework, the paper proposes a framework that builds on concepts from data streams and temporal databases. Definitions of example queries demonstrate how the framework enables clear formulation of query...

  19. Object-Extended OLAP Querying

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Gu, Junmin; Shoshani, Arie

    2009-01-01

    On-line analytical processing (OLAP) systems based on a dimensional view of data have found widespread use in business applications and are being used increasingly in non-standard applications. These systems provide good performance and ease-of-use. However, the complex structures and relationships...... inherent in data in non-standard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, "multi-model" federated...... system that enables OLAP users to exploit simultaneously the features of OLAP and object systems. The system allows data to be handled using the most appropriate data model and technology: OLAP systems for dimensional data and object database systems for more complex, general data. This allows data...

  20. Predicting Drug Recalls From Internet Search Engine Queries.

    Science.gov (United States)

    Yom-Tov, Elad

    2017-01-01

    Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine, which mentioned one of the 5195 pharmaceutical drugs during 2015 and all recall notifications issued by the Food and Drug Administration (FDA) during that year. By using attributes that quantify the change in query volume at the state level, we attempted to predict if a recall of a specific drug will be ordered by the FDA in a time horizon ranging from 1 to 40 days in the future. Our results show that future drug recalls can indeed be identified with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as prediction is made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium-risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate early warning of faulty batches of medicines.

  1. Adding query privacy to robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2012-01-01

    Interest in anonymous communication over distributed hash tables (DHTs) has increased in recent years. However, almost all known solutions solely aim at achieving sender or requestor anonymity in DHT queries. In many application scenarios, it is crucial that the queried key remains secret from...... intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... compromising spam resistance. Although our OT-based approach can work over any DHT, we concentrate on robust DHTs that can tolerate Byzantine faults and resist spam. We choose the best-known robust DHT construction, and employ an efficient OT protocol well-suited for achieving our goal of obtaining query...

  2. EquiX-A Search and Query Language for XML.

    Science.gov (United States)

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  3. Compressed Representations of Conjunctive Query Results

    OpenAIRE

    Deep, Shaleen; Koutris, Paraschos

    2017-01-01

    Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset. In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline. Motivated by this problem, we study the construction of space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting the efficient access of the intermediate compressed result for a giv...

  4. Hierarchical Fuzzy Sets To Query Possibilistic Databases

    OpenAIRE

    Thomopoulos, Rallou; Buche, Patrice; Haemmerlé, Ollivier

    2008-01-01

    Within the framework of flexible querying of possibilistic databases, based on the fuzzy set theory, this chapter focuses on the case where the vocabulary used both in the querying language and in the data is hierarchically organized, which occurs in systems that use ontologies. We give an overview of previous works concerning two issues: firstly, flexible querying of imprecise data in the relational model; secondly, the introduction of fuzziness in hierarchies. Concerning the latter point, w...

  5. jQuery Tools UI Library

    CERN Document Server

    Libby, Alex

    2012-01-01

    A practical tutorial with powerful yet simple projects that are quick to implement. This book is aimed at developers who have prior jQuery knowledge, but may not have any prior experience with jQuery Tools. It is possible that they may have started with the basics of jQuery Tools, but want to learn more about how it can be used, as well as get ideas for future projects.

  6. A structural query system for Han characters

    DEFF Research Database (Denmark)

    Skala, Matthew

    2016-01-01

    The IDSgrep structural query system for Han character dictionaries is presented. This dictionary search system represents the spatial structure of Han characters using Extended Ideographic Description Sequences (EIDSes), a data model and syntax based on the Unicode IDS concept. It includes a query...... language for EIDS databases, with a freely available implementation and format translation from popular third-party IDS and XML character databases. The system is designed to suit the needs of font developers and foreign language learners. The search algorithm includes a bit vector index inspired by Bloom...... filters to support faster query operations. Experimental results are presented, evaluating the effect of the indexing on query performance....

  7. Secure Skyline Queries on Cloud Platform.

    Science.gov (United States)

    Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian

    2017-04-01

    Outsourcing data and computation to a cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can also be used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.
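
    For readers unfamiliar with the underlying query, the sketch below computes a skyline over plaintext data; it does not implement the secure protocols the record describes. It assumes the common convention that smaller values are better in every dimension, and the names `dominates` and `skyline`, as well as the sample data, are illustrative only.

```python
def dominates(a, b):
    """Return True if point a dominates point b.

    Assumption: smaller is better in every dimension. a dominates b when
    a is no worse than b everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance) pairs for five hotels.
hotels = [(50, 8), (80, 2), (60, 5), (90, 3), (70, 6)]
print(skyline(hotels))
```

    In this toy data set, (90, 3) is dominated by (80, 2) and (70, 6) by (60, 5), so neither appears in the result. The pairwise dominance test is the step that the record's secure dominance protocol would have to perform without revealing the values.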

  8. Thresholded Range Aggregation in Sensor Networks

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Lin, Zhifeng; Mamoulis, Nikos

    2010-01-01

    The recent advances in wireless sensor technologies (e.g., Mica, Telos motes) enable the economic deployment of lightweight sensors for capturing data from their surrounding environment, serving various monitoring tasks, like forest wildfire alarming and volcano activity. We propose a novel query...... called thresholded range aggregate query (TRA), which retrieves the IDs of the sensors for which the average measurement in their neighborhood exceeds a user-given threshold. This query provides results that are robust against individual sensor abnormality, and yet precisely summarize the sensors...... of other nodes in its neighborhood region. Furthermore, we extend our protocols for continuous evaluation of the TRA query. Experimental results show that our proposed solutions indeed offer substantial energy savings for both real and synthetic sensor networks.
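
    As a rough illustration of the query semantics only (not the energy-saving in-network protocols the record refers to), a centralized evaluation of a TRA query might look like the sketch below. The function name `tra_query`, the circular neighborhood of radius `r`, the inclusion of each sensor in its own neighborhood, and the sample readings are all assumptions made for this example.

```python
from math import hypot


def tra_query(sensors, r, threshold):
    """Return sorted IDs of sensors whose neighborhood average exceeds threshold.

    sensors: dict mapping sensor ID -> (x, y, measurement).
    Assumption: a sensor's neighborhood is every sensor within Euclidean
    distance r of it, including the sensor itself.
    """
    result = []
    for sid, (x, y, _) in sensors.items():
        # Collect the measurements of all sensors inside the neighborhood.
        values = [m for (ox, oy, m) in sensors.values()
                  if hypot(ox - x, oy - y) <= r]
        if sum(values) / len(values) > threshold:
            result.append(sid)
    return sorted(result)


# Hypothetical readings: s3 is a single abnormal spike among cool neighbors,
# while s4 and s5 form a genuinely hot region.
sensors = {
    "s1": (0.0, 0.0, 20.0),
    "s2": (1.0, 0.0, 22.0),
    "s3": (0.0, 1.0, 95.0),
    "s4": (10.0, 10.0, 70.0),
    "s5": (10.0, 11.0, 75.0),
}
print(tra_query(sensors, r=1.5, threshold=60.0))
```

    Averaging over the neighborhood is what makes the result robust to the single abnormal reading at s3: its neighborhood average stays below the threshold, so only the sensors in the hot region are reported.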

  9. Dissolution and Precipitation Behaviour during Continuous Heating of Al–Mg–Si Alloys in a Wide Range of Heating Rates

    Science.gov (United States)

    Osten, Julia; Milkereit, Benjamin; Schick, Christoph; Kessler, Olaf

    2015-01-01

    In the present study, the dissolution and precipitation behaviour of four different aluminium alloys (EN AW-6005A, EN AW-6082, EN AW-6016, and EN AW-6181) in four different initial heat treatment conditions (T4, T6, overaged, and soft annealed) was investigated during heating in a wide dynamic range. Differential scanning calorimetry (DSC) was used to record heating curves between 20 and 600 °C. Heating rates were studied from 0.01 K/s to 5 K/s. We paid particular attention to control baseline stability, generating flat baselines and allowing accurate quantitative evaluation of the resulting DSC curves. As the heating rate increases, the individual dissolution and precipitation reactions shift to higher temperatures. The reactions during heating are significantly superimposed and partially run simultaneously. In addition, precipitation and dissolution reactions are increasingly suppressed as the heating rate increases, whereby exothermic precipitation reactions are suppressed earlier than endothermic dissolution reactions. Integrating the heating curves allowed the enthalpy levels of the different initial microstructural conditions to be quantified. Referring to time–temperature–austenitisation diagrams for steels, continuous heating dissolution diagrams for aluminium alloys were constructed to summarise the results in graphical form. These diagrams may support process optimisation in heat treatment shops.

  10. An Improved Continuous-Time Model Predictive Control of Permanent Magnetic Synchronous Motors for a Wide-Speed Range

    Directory of Open Access Journals (Sweden)

    Dandan Su

    2017-12-01

    This paper proposes an improved continuous-time model predictive control (CTMPC) of permanent magnetic synchronous motors (PMSMs) for a wide-speed range, including the constant torque region and the flux-weakening (FW) region. In the constant torque region, the mathematic models of PMSMs in dq-axes are decoupled without the limitation of DC-link voltage. However, in the FW region, the mathematic models of PMSMs in dq-axes are cross-coupled together with the limitation of DC-link voltage. A nonlinear PMSMs mathematic model in the FW region is presented based on the voltage angle. The solving of the nonlinear mathematic model of PMSMs in FW region will lead to heavy computation load for digital signal processing (DSP). To overcome such a problem, a linearization method of the voltage angle is also proposed to reduce the computation load. The selection of transiting points between the constant torque region and FW regions is researched to improve the performance of the driven system. Compared with the proportional integral (PI) controller, the proposed CTMPC has obvious advantages in dealing with systems’ nonlinear constraints and improving system performance by restraining overshoot current under step torque changing. Both simulation and experimental results confirm the effectiveness of the proposed method in achieving good steady-state performance and smooth switching between the constant torque and FW regions.

  11. Correlated continuous time random walks: combining scale-invariance with long-range memory for spatial and temporal dynamics

    International Nuclear Information System (INIS)

    Schulz, Johannes H P; Chechkin, Aleksei V; Metzler, Ralf

    2013-01-01

    Standard continuous time random walk (CTRW) models are renewal processes in the sense that at each jump a new, independent pair of jump length and waiting time are chosen. Globally, anomalous diffusion emerges through scale-free forms of the jump length and/or waiting time distributions by virtue of the generalized central limit theorem. Here we present a modified version of recently proposed correlated CTRW processes, where we incorporate a power-law correlated noise on the level of both jump length and waiting time dynamics. We obtain a very general stochastic model, that encompasses key features of several paradigmatic models of anomalous diffusion: discontinuous, scale-free displacements as in Lévy flights, scale-free waiting times as in subdiffusive CTRWs, and the long-range temporal correlations of fractional Brownian motion (FBM). We derive the exact solutions for the single-time probability density functions and extract the scaling behaviours. Interestingly, we find that different combinations of the model parameters lead to indistinguishable shapes of the emerging probability density functions and identical scaling laws. Our model will be useful for describing recent experimental single particle tracking data that feature a combination of CTRW and FBM properties. (paper)

  12. Response of a continuous anaerobic digester to temperature transitions: A critical range for restructuring the microbial community structure and function.

    Science.gov (United States)

    Kim, Jaai; Lee, Changsoo

    2016-02-01

    Temperature is a crucial factor that significantly influences the microbial activity and so the methanation performance of an anaerobic digestion (AD) process. Therefore, how to control the operating temperature for optimal activity of the microbes involved is a key to stable AD. This study examined the response of a continuous anaerobic reactor to a series of temperature shifts over a wide range of 35-65 °C using a dairy-processing byproduct as model wastewater. During the long-term experiment for approximately 16 months, the reactor was subjected to stepwise temperature increases by 5 °C at a fixed HRT of 15 days. The reactor showed stable performance within the temperature range of 35-45 °C, with the methane production rate and yield being maximum at 45 °C (18% and 26% greater, respectively, than at 35 °C). However, the subsequent increase to 50 °C induced a sudden performance deterioration with a complete cessation of methane recovery, indicating that the temperature range between 45 °C and 50 °C had a critical impact on the transition of the reactor's methanogenic activity from mesophilic to thermophilic. This serious process perturbation was associated with a severe restructuring of the reactor microbial community structure, particularly of methanogens, quantitatively as well as qualitatively. Once restored by interrupted feeding for about two months, the reactor maintained fairly stable performance under thermophilic conditions until it was upset again at 65 °C. Interestingly, in contrast to most previous reports, hydrogenotrophs largely dominated the methanogen community at mesophilic temperatures while acetotrophs emerged as a major group at thermophilic temperature. This implies that the primary methanogenesis route of the reactor shifted from hydrogen- to acetate-utilizing pathways with the temperature shifts from mesophilic to thermophilic temperatures. Our observations suggest that a mesophilic digester may not need to be cooled at up

  13. I/O-Efficient Dynamic Planar Range Skyline Queries

    DEFF Research Database (Denmark)

    Kejlberg-Rasmussen, Casper; Tsakalidis, Konstantinos; Tsichlas, Kostas

    {n}{B^{1-\epsilon}})$ blocks of space, for $n$ input planar points, $t$ reported points, and parameter $0 \leq \epsilon \leq 1$. We obtain the result by extending Sundar's priority queues with attrition to support the operations \textsc{DeleteMin} and \textsc{CatenateAndAttrite} in $O(1)$ worst case I...

  14. Efficient Evaluation of Probabilistic Advanced Spatial Queries on Existentially Uncertain Data

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Mamoulis, Nikos; Dai, Xiangyuan

    2009-01-01

    that exceeds a threshold. Accordingly, a ranking probabilistic spatial query selects the objects with the highest probabilities to qualify the spatial predicates. We propose adaptations of spatial access methods and search algorithms for probabilistic versions of range queries, nearest neighbors, spatial...

  15. Predicting leaf gravimetric water content from foliar reflectance across a range of plant species using continuous wavelet analysis.

    Science.gov (United States)

    Cheng, Tao; Rivard, Benoit; Sánchez-Azofeifa, Arturo G; Féret, Jean-Baptiste; Jacquemoud, Stephane; Ustin, Susan L

    2012-08-15

    Leaf water content is an important variable for understanding plant physiological properties. This study evaluates a spectral analysis approach, continuous wavelet analysis (CWA), for the spectroscopic estimation of leaf gravimetric water content (GWC, %) and determines robust spectral indicators of GWC across a wide range of plant species from different ecosystems. CWA is applied to both the Leaf Optical Properties Experiment (LOPEX) data set and a synthetic data set consisting of leaf reflectance spectra simulated using the leaf optical properties spectra (PROSPECT) model. The results for the two data sets, including wavelet feature selection and GWC prediction derived using those features, are compared to the results obtained from a previous study for leaf samples collected in the Republic of Panamá (PANAMA), to assess the predictive capabilities and robustness of CWA across species. Furthermore, predictive models of GWC using wavelet features derived from PROSPECT simulations are examined to assess their applicability to measured data. The two measured data sets (LOPEX and PANAMA) reveal five common wavelet feature regions that correlate well with leaf GWC. All three data sets display common wavelet features in three wavelength regions that span 1732-1736 nm at scale 4, 1874-1878 nm at scale 6, and 1338-1341 nm at scale 7 and produce accurate estimates of leaf GWC. This confirms the applicability of the wavelet-based methodology for estimating leaf GWC for leaves representative of various ecosystems. The PROSPECT-derived predictive models perform well on the LOPEX data set but are less successful on the PANAMA data set. The selection of high-scale and low-scale features emphasizes significant changes in both overall amplitude over broad spectral regions and local spectral shape over narrower regions in response to changes in leaf GWC. The wavelet-based spectral analysis tool adds a new dimension to the modeling of plant physiological properties with

  16. Mapping the Ultrafast Changes of Continuous Shape Measures in Photoexcited Spin Crossover Complexes without Long-Range Order

    Energy Technology Data Exchange (ETDEWEB)

    Canton, S. E.; Zhang, X.; Lawson Daku, M. L.; Liu, Y.; Zhang, J.; Alvarez, S.

    2015-01-30

    Establishing a tractable yet complete reaction coordinate for the spin-state interconversion in d⁴-d⁷ transition metal complexes is an integral aspect of controlling the dynamics that govern their functionality. For spin crossover phenomena, the limitations of a single-mode approximation that solely accounts for an isotropic increase in the metal-ligand bond length have long been recognized for all but the simple octahedral monodentate Fe(II) compounds. However, identifying the coupled deformations that also impact on the unimolecular rate constants remains experimentally and theoretically challenging, especially for samples that do not display long-range order or when crystallization profoundly alters the dynamics. Owing to the rapid progress in ultrafast X-ray absorption spectroscopy (XAS), it is now possible to obtain transient structural information in any physical phase with unprecedented details. Using picosecond XAS and DFT modeling, the structure adopted by the photoinduced high-spin state of solvated [Fe(terpy)₂]²⁺ (terpy: 2,2':6',2''-terpyridine) has been recently established. Based on these results, the methodology of the continuous shape measure is applied to classify and quantify the short-lived distortion of the first coordination shell. The reaction coordinate of the spin-state interconversion is clearly identified as a double axial bending. This finding sets a benchmark for gauging the influence of first-sphere and second-sphere interactions in the family of Fe(II) complexes that incorporate terpy derivatives. Some implications for the optimization of related photoactive Fe(II) complexes are also outlined.

  17. A general approach to query flattening

    NARCIS (Netherlands)

    van Ruth, J.

    The translation of queries from complex data models to simpler data models is a recurring theme in the construction of efficient data management systems. In this paper we propose a general framework to guide the translation from data models with nested types to a flat relational model (query

  18. The Data Cyclotron query processing scheme

    NARCIS (Netherlands)

    Goncalves, R.; Kersten, M.

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  19. The Data Cyclotron query processing scheme.

    NARCIS (Netherlands)

    R.A. Goncalves (Romulo); M.L. Kersten (Martin)

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  20. Querying and Mining Strings Made Easy

    KAUST Repository

    Sahli, Majed

    2017-10-13

    With the advent of large string datasets in several scientific and business applications, there is a growing need to perform ad-hoc analysis on strings. Currently, strings are stored, managed, and queried using procedural codes. This limits users to certain operations supported by existing procedural applications and requires manual query planning with limited tuning opportunities. This paper presents StarQL, a generic and declarative query language for strings. StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization. String analytic queries are too intricate to be solved on one machine. Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures. Our evaluation shows that StarQL is able to express workloads of application-specific tools, such as BLAST and KAT in bioinformatics, and to mine Wikipedia text for interesting patterns using declarative queries. Furthermore, the StarQL query optimizer shows an order of magnitude reduction in query execution time.

  1. Fuzzy Query Processing Using Clustering Techniques.

    Science.gov (United States)

    Kamel, M.; And Others

    1990-01-01

    Discusses the problem of processing fuzzy queries in databases and information retrieval systems and presents a prototype of a fuzzy query processing system for databases that is based on data clustering and uses Pascal programming language. Clustering schemes are explained, and the system architecture that uses natural language is described. (14…

  2. Adding Query Privacy to Robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2011-01-01

    Interest in anonymous communication over distributed hash tables (DHTs) has increased in recent years. However, almost all known solutions solely aim at achieving sender or requestor anonymity in DHT queries. In many application scenarios, it is crucial that the queried key remains secret from...... intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... compromising spam resistance. Although our OT-based approach can work over any DHT, we concentrate on communication over robust DHTs that can tolerate Byzantine faults and resist spam. We choose the best-known robust DHT construction, and employ an efficient OT protocol well-suited for achieving our goal...

  3. A NOVEL APPROACH OF INDEXING AND RETRIEVING SPATIAL POLYGONS FOR EFFICIENT SPATIAL REGION QUERIES

    Directory of Open Access Journals (Sweden)

    J. H. Zhao

    2017-10-01

    Full Text Available Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get a response in real time. Spatial indexes are usually used in this situation. In this paper, based on the k-d tree, we introduce a distributed KD-Tree (DKD-Tree) suitable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in-memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of remote sensing image metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region queries, but also has good scalability. Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.
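
The abstract above does not say what the two steps of its query algorithm are. The sketch below assumes the common filter-and-refine pattern: a cheap bounding-box test selects candidate polygons, and an exact geometric test then confirms them. For brevity it answers a point query rather than a region query and scans the boxes linearly instead of using a k-d tree; the names `bbox`, `point_in_polygon` and `two_step_query` are illustrative and not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def bbox(poly: Polygon) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(p: Point, poly: Polygon) -> bool:
    """Exact test by ray casting: count edge crossings to the right of p."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def two_step_query(p: Point, polygons: List[Polygon]) -> List[int]:
    """Step 1 (filter): bounding boxes. Step 2 (refine): exact geometry."""
    x, y = p
    candidates = []
    for i, poly in enumerate(polygons):
        min_x, min_y, max_x, max_y = bbox(poly)
        if min_x <= x <= max_x and min_y <= y <= max_y:
            candidates.append(i)
    return [i for i in candidates if point_in_polygon(p, polygons[i])]


# An L-shaped polygon whose bounding box contains (3, 3) but whose area does not,
# and a square that really contains the point.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
square = [(2, 2), (5, 2), (5, 5), (2, 5)]
result = two_step_query((3, 3), [l_shape, square])
```

The L-shaped polygon passes the filter step and is rejected only by the refine step, which is the case where the exact test earns its cost.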

  4. A Novel Approach of Indexing and Retrieving Spatial Polygons for Efficient Spatial Region Queries

    Science.gov (United States)

    Zhao, J. H.; Wang, X. Z.; Wang, F. Y.; Shen, Z. H.; Zhou, Y. C.; Wang, Y. L.

    2017-10-01

    Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get a response in real time. Spatial indexes are usually used in this situation. In this paper, based on the k-d tree, we introduce a distributed KD-Tree (DKD-Tree) suitable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in-memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of remote sensing image metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region queries, but also has good scalability. Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.

  5. 40 CFR 60.4410 - How do I establish a valid parameter range if I have chosen to continuously monitor parameters?

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 6 2010-07-01 2010-07-01 false How do I establish a valid parameter... § 60.4410 How do I establish a valid parameter range if I have chosen to continuously monitor... continuously monitored and recorded during each run of the initial performance test, to establish acceptable...

  6. Spatio-Temporal Queries for moving objects Data warehousing

    OpenAIRE

    Esheiba, Leila; Mokhtar, Hoda M. O.; El-Sharkawi, Mohamed

    2013-01-01

    In the last decade, Moving Object Databases (MODs) have attracted a lot of attention from researchers. Several research works were conducted to extend traditional database techniques to accommodate the new requirements imposed by the continuous change in location information of moving objects. Managing, querying, storing, and mining moving objects were the key research directions. This extensive interest in moving objects is a natural consequence of the recent ubiquitous location-aware device...

  7. Characterization and Evaluation of 600 V Range Devices for Active Power Factor Correction in Boundary and Continuous Conduction Modes

    DEFF Research Database (Denmark)

    Hernandez Botella, Juan Carlos; Petersen, Lars Press; Andersen, Michael A. E.

    2015-01-01

    Traditional characterization of semiconductor switching dynamics is performed based on clamped inductive load measurements using the double pulse tester (DPT) configuration. This approach is valid for converters operating in continuous conduction mode (CCM); however, in boundary conduction mode...

  8. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  9. Maximum-Likelihood Estimation for Frequency-Modulated Continuous-Wave Laser Ranging using Photon-Counting Detectors

    Science.gov (United States)

    2013-03-21

    instruments where frequency estimates are calculated from coherently detected fields, e.g., coherent Doppler LIDAR. Our CRB results reveal that the best...field with the local reference field on a beam splitter and detecting the resultant beat modulation. In conventional FMCW ranging, the source modulation... [Fig. 5: Block diagram of FMCW ranging experiment.]

  10. Responsive web design with jQuery

    CERN Document Server

    Carlos, Gilberto

    2013-01-01

    Responsive Web Design with jQuery follows a standard tutorial-based approach, covering various aspects of responsive web design by building a comprehensive website. "Responsive Web Design with jQuery" is aimed at web designers who are interested in building device-agnostic websites. You should have a grasp of standard HTML, CSS, and JavaScript development, and have a familiarity with graphic design. Some exposure to jQuery and HTML5 will be beneficial but isn't essential.

  11. Experimental quantum private queries with linear optics

    International Nuclear Information System (INIS)

    De Martini, Francesco; Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo; Nagali, Eleonora; Sansoni, Linda; Sciarrino, Fabio

    2009-01-01

    The quantum private query is a quantum cryptographic protocol to recover information from a database, preserving both user and data privacy: the user can test whether someone has retained information on which query was asked and the database provider can test the amount of information released. Here we discuss a variant of the quantum private query algorithm that admits a simple linear optical implementation: it employs the photon's momentum (or time slot) as address qubits and its polarization as bus qubit. A proof-of-principle experimental realization is implemented.

  12. Instant MDX queries for SQL Server 2012

    CERN Document Server

    Emond, Nicholas

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. This short, focused guide is a great way to get started with writing MDX queries. New developers can use this book as a reference for how to use functions and the syntax of a query as well as how to use Calculated Members and Named Sets. This book is great for new developers who want to learn the MDX query language from scratch and install SQL Server 2012 with Analysis Services.

  13. Federated query processing for the semantic web

    CERN Document Server

    Buil-Aranda, C

    2014-01-01

    In recent years, the amount of RDF data on the Web, exposed via SPARQL endpoints, has increased exponentially. These SPARQL endpoints allow users to direct SPARQL queries to the RDF data. Federated SPARQL query processing allows querying several of these RDF databases as if they were a single one, integrating the results from all of them. This is a key concept in the Web of Data and it is also a hot topic in the community. In addition, the W3C SPARQL-WG has standardized it in the new Recommendation SPARQL 1.1. This book provides a formalisation of the W3C proposed recommendation. Thi

  14. OntoQuery: easy-to-use web-based OWL querying

    Science.gov (United States)

    Tudose, Ilinca; Hastings, Janna; Muthukrishnan, Venkatesh; Owen, Gareth; Turner, Steve; Dekker, Adriano; Kale, Namrata; Ennis, Marcus; Steinbeck, Christoph

    2013-01-01

    Summary: The Web Ontology Language (OWL) provides a sophisticated language for building complex domain ontologies and is widely used in bio-ontologies such as the Gene Ontology. The Protégé-OWL ontology editing tool provides a query facility that allows composition and execution of queries with the human-readable Manchester OWL syntax, with syntax checking and entity label lookup. No equivalent query facility such as the Protégé Description Logics (DL) query yet exists in web form. However, many users interact with bio-ontologies such as chemical entities of biological interest and the Gene Ontology using their online Web sites, within which DL-based querying functionality is not available. To address this gap, we introduce the OntoQuery web-based query utility. Availability and implementation: The source code for this implementation together with instructions for installation is available at http://github.com/IlincaTudose/OntoQuery. OntoQuery software is fully compatible with all OWL-based ontologies and is available for download (CC-0 license). The ChEBI installation, ChEBI OntoQuery, is available at http://www.ebi.ac.uk/chebi/tools/ontoquery. Contact: hastings@ebi.ac.uk PMID:24008420

  15. Schedule Sales Query Report Generation System

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  16. Querying temporal databases via OWL 2 QL

    CSIR Research Space (South Africa)

    Klarman, S

    2014-06-01

    Full Text Available SQL:2011, the most recently adopted version of the SQL query language, has unprecedentedly standardized the representation of temporal data in relational databases. Following the successful paradigm of ontology-based data access, we develop a...

  17. Pro PHP and jQuery

    CERN Document Server

    Lengstorf, Jason

    2010-01-01

    This book is for intermediate programmers interested in building AJAX web applications using jQuery and PHP. Along with teaching some advanced PHP techniques, it will teach you how to take your dynamic applications to the next level by adding a JavaScript layer with jQuery. * Learn to utilize built-in PHP functions to build calendar tools. * Learn how jQuery can be used for AJAX, animation, client-side validation, and more. What you'll learn: * Use PHP to build a calendar application that allows users to post, view, edit, and delete events. * Use jQuery to allow the calendar app to be viewed and ed

  18. Clean Air Markets - Compliance Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Compliance Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://ampd.epa.gov/ampd/. The Compliance module provides...

  19. Clean Air Markets - Allowances Query Wizard

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Allowances Query Wizard is part of a suite of Clean Air Markets-related tools that are accessible at http://camddataandmaps.epa.gov/gdm/index.cfm. The Allowances...

  20. ANSWERING GEOSPARQL QUERIES OVER RELATIONAL DATA

    Directory of Open Access Journals (Sweden)

    K. Bereta

    2017-07-01

    Full Text Available In this paper we present the system Ontop-spatial, which is able to answer GeoSPARQL queries on top of geospatial relational databases, performing on-the-fly GeoSPARQL-to-SQL translation using ontologies and mappings. GeoSPARQL is a geospatial extension of the query language SPARQL, standardized by OGC for querying geospatial RDF data. Our approach goes beyond relational databases and covers all data that can have a relational structure even at the logical level. Our purpose is to enable on-the-fly GeoSPARQL querying that integrates multiple geospatial sources, without converting and materializing the original data as RDF and then storing them in a triple store. This approach is more suitable in cases where the original datasets are stored in large relational databases (or generally in files with relational structure) and/or get frequently updated.

  1. Path-based Queries on Trajectory Data

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Pelekis, Nikos; Theodoridis, Yannis

    2014-01-01

    In traffic research, management, and planning a number of path-based analyses are heavily used, e.g., for computing turn-times, evaluating green waves, or studying traffic flow. These analyses require retrieving the trajectories that follow the full path being analyzed. Existing path queries cannot...... sufficiently support such path-based analyses because they retrieve all trajectories that touch any edge in the path. In this paper, we define and formalize the strict path query. This is a novel query type tailored to support path-based analysis, where trajectories must follow all edges in the path...... a specific path by only retrieving data from the first and last edge in the path. To correctly answer strict path queries existing network-constrained trajectory indexes must retrieve data from all edges in the path. An extensive performance study of NETTRA using a very large real-world trajectory data set...

  2. Superfund Chemical Data Matrix (SCDM) Query

    Science.gov (United States)

    This site allows you to easily query the Superfund Chemical Data Matrix (SCDM) and generate a list of the corresponding Hazard Ranking System (HRS) factor values, benchmarks, and data elements that you need.

  3. Query-Driven Visualization and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Ruebel, Oliver; Bethel, E. Wes; Prabhat, Mr.; Wu, Kesheng

    2012-11-01

    This report focuses on an approach to high performance visualization and analysis, termed query-driven visualization and analysis (QDV). QDV aims to reduce the amount of data that needs to be processed by the visualization, analysis, and rendering pipelines. The goal of the data reduction process is to separate out data that is "scientifically interesting" and to focus visualization, analysis, and rendering on that interesting subset. The premise is that for any given visualization or analysis task, the data subset of interest is much smaller than the larger, complete data set. This strategy, extracting smaller data subsets of interest and focusing the visualization processing on these subsets, is complementary to the approach of increasing the capacity of the visualization, analysis, and rendering pipelines through parallelism. This report discusses the fundamental concepts in QDV and their relationship to different stages in the visualization and analysis pipelines, and presents QDV's application to problems in diverse areas, ranging from forensic cybersecurity to high energy physics.

  4. Menangkal Serangan SQL Injection Dengan Parameterized Query

    Directory of Open Access Journals (Sweden)

    Yulianingsih Yulianingsih

    2016-06-01

    Full Text Available As information services grow, so does the security vulnerability of information sources. This paper presents an experimental study of SQL injection attacks on databases. The attack is carried out through the authentication page, because this page is the first point of access and should be adequately defended. The Parameterized Query method is then tested experimentally as a solution to this problem. Keywords: information services, attacks, experiment, SQL Injection, Parameterized Query.
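
As a minimal sketch of the contrast this abstract describes, the example below runs the same login check twice against an in-memory SQLite database, once with string concatenation and once with a parameterized query. The `users` table, the plaintext password column, and the function names are assumptions made for illustration; they do not come from the paper, and a real system would store password hashes.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    # Plaintext password for illustration only.
    conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))
    return conn


def login_unsafe(conn: sqlite3.Connection, name: str, password: str) -> bool:
    """Vulnerable: user input is spliced directly into the SQL text."""
    sql = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(sql).fetchone()[0] > 0


def login_safe(conn: sqlite3.Connection, name: str, password: str) -> bool:
    """Parameterized: input is bound as data and never parsed as SQL."""
    sql = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, password)).fetchone()[0] > 0


conn = make_db()
attack = "' OR '1'='1"  # classic injection string typed into the password field
unsafe_result = login_unsafe(conn, "alice", attack)
safe_result = login_safe(conn, "alice", attack)
```

With the concatenated query the injected `OR '1'='1'` makes the WHERE clause true for every row, so the attacker is let in; with bound parameters the same string is compared literally against the stored password and the login fails.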

  5. Queryll: Java Database Queries through Bytecode Rewriting

    OpenAIRE

    Iu, Christopher Ming-Yee; Zwaenepoel, Willy

    2006-01-01

    When interfacing Java with other systems such as databases, programmers must often program in special interface languages like SQL. Code written in these languages often needs to be embedded in strings where they cannot be error-checked at compile-time, or the Java compiler needs to be altered to directly recognize code written in these languages. We have taken a different approach to adding database query facilities to Java. Bytecode rewriting allows us to add query facilities to Java whose ...

  6. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    With wireless communications and geo-positioning being widely available, it becomes possible to offer new e-services that provide mobile users with information about other mobile objects. This paper concerns active, ordered k-nearest neighbor queries for query and data objects that are moving in ...... for the nearest neighbor search in the prototype is presented in detail. In addition, the paper reports on results from experiments with the prototype system....

  7. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen

    2015-08-01

    Distributed RDF systems partition data across multiple computer nodes. Partitioning is typically based on heuristics that minimize inter-node communication and it is performed in an initial, data pre-processing phase. Therefore, the resulting partitions are static and do not adapt to changes in the query workload; as a result, existing systems are unable to consistently avoid communication for queries that are not favored by the initial data partitioning. Furthermore, for very large RDF knowledge bases, the partitioning phase becomes prohibitively expensive, leading to high startup costs. In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems come online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-second time. In this demonstration, the audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.
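
The hash partitioning step mentioned above can be illustrated with a short sketch. It assumes triples are assigned to workers by a hash of the subject, which is why joins on a shared subject need no communication; the CRC32 hash, the function name, and the sample triples are illustrative choices, not details taken from AdHash.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def partition_by_subject(triples: List[Triple], num_workers: int) -> Dict[int, List[Triple]]:
    """Assign each triple to a worker using a deterministic hash of its subject."""
    parts: Dict[int, List[Triple]] = defaultdict(list)
    for s, p, o in triples:
        # zlib.crc32 is stable across runs, unlike Python's salted built-in hash().
        worker = zlib.crc32(s.encode("utf-8")) % num_workers
        parts[worker].append((s, p, o))
    return parts


triples = [
    ("ex:alice", "ex:worksAt", "ex:kaust"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:epfl"),
    ("ex:bob", "ex:knows", "ex:carol"),
]
parts = partition_by_subject(triples, num_workers=3)

# Map each subject to the set of workers that hold any of its triples.
workers_per_subject: Dict[str, set] = defaultdict(set)
for worker, rows in parts.items():
    for s, _, _ in rows:
        workers_per_subject[s].add(worker)
```

Every subject ends up on exactly one worker, so a pattern such as `?x ex:worksAt ?y . ?x ex:knows ?z` can be evaluated on each worker independently. Joins that connect an object to a subject may still cross workers, which is the case the abstract handles with its planner and adaptive redistribution.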

  8. Minimizing I/O Costs of Multi-Dimensional Queries with Bitmap Indices

    Energy Technology Data Exchange (ETDEWEB)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2006-03-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. A common approach for reducing the size of a bitmap index for high cardinality attributes is to group ranges of values of an attribute into bins and then build a bitmap for each bin rather than a bitmap for each value of the attribute. Binning reduces storage costs; however, results of queries based on bins often require additional filtering to discard false positives, i.e., records in the result that do not satisfy the query constraints. This additional filtering, also known as "candidate checking," requires access to the base data on disk and involves significant I/O costs. This paper studies strategies for minimizing the I/O costs of candidate checking for multi-dimensional queries. This is done by determining the number of bins allocated for each dimension and then placing bin boundaries in optimal locations. Our algorithms use knowledge of data distribution and query workload. We derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
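
To make binning and candidate checking concrete, here is a minimal one-dimensional sketch. It assumes half-open bins `[boundaries[i], boundaries[i+1])`, values that all fall inside the outermost boundaries, and Python integers used as bitsets; the function names and the sample data are illustrative, and the paper's boundary-placement optimization is not reproduced.

```python
import bisect
from typing import List, Tuple


def build_binned_index(values: List[float], boundaries: List[float]) -> List[int]:
    """One bitmap per bin; bit `row` is set if values[row] falls in that bin."""
    bitmaps = [0] * (len(boundaries) - 1)
    for row, v in enumerate(values):
        b = bisect.bisect_right(boundaries, v) - 1  # index of the bin holding v
        bitmaps[b] |= 1 << row
    return bitmaps


def range_query(values: List[float], boundaries: List[float], bitmaps: List[int],
                lo: float, hi: float) -> Tuple[List[int], int]:
    """Rows with lo <= value < hi, plus how many rows needed a candidate check."""
    hits: List[int] = []
    checked = 0
    for b, bitmap in enumerate(bitmaps):
        b_lo, b_hi = boundaries[b], boundaries[b + 1]
        if b_hi <= lo or b_lo >= hi:
            continue  # bin lies entirely outside the query range
        fully_inside = lo <= b_lo and b_hi <= hi
        row = 0
        while bitmap:
            if bitmap & 1:
                if fully_inside:
                    hits.append(row)  # answered from the index alone
                else:
                    checked += 1  # edge bin: must read the base data
                    if lo <= values[row] < hi:
                        hits.append(row)
            bitmap >>= 1
            row += 1
    return sorted(hits), checked


values = [5, 12, 18, 25, 31, 7]
boundaries = [0, 10, 20, 30, 40]
bitmaps = build_binned_index(values, boundaries)
rows, checked = range_query(values, boundaries, bitmaps, lo=15, hi=30)
```

Only the bin `[10, 20)` straddles the query edge here, so two rows are read from the base data and one of them is discarded as a false positive. Moving a bin boundary to 15 would remove that check entirely, which is the kind of placement decision the abstract is about.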

  9. Continuing inflation at Three Sisters volcanic center, central Oregon Cascade Range, USA, from GPS, leveling, and InSAR observations

    Science.gov (United States)

    Dzurisin, Daniel; Lisowski, Michael; Wicks, Charles W.

    2009-12-01

    Uplift of a broad area centered ~6 km west of the summit of South Sister volcano started in September 1997 (onset estimated from model discussed in this paper) and was continuing when surveyed in August 2006. Surface displacements were measured whenever possible since August 1992 with satellite radar interferometry (InSAR), annually since August 2001 with GPS and leveling surveys, and with continuous GPS since May 2001. The average maximum displacement rate from InSAR decreased from 3-5 cm/yr during 1998-2001 to ~1.4 cm/yr during 2004-2006. The other datasets show a similar pattern, i.e., surface uplift and extension rates decreased over time but deformation continued through August 2006. Our best-fit model to the deformation data is a vertical, prolate, spheroidal point-pressure source located 4.9-5.4 km below the surface. The source inflation rate decreased exponentially during 2001-2006 with a 1/e decay time of 5.3 ± 1.1 years. The net increase in source volume from September 1997 to August 2006 was 36.5-41.9 × 10^6 m^3. A swarm of ~300 small (M_max = 1.9) earthquakes occurred beneath the deforming area in March 2004; no other unusual seismicity has been noted. Similar deformation episodes in the past probably would have gone unnoticed if, as we suspect, most are small intrusions that do not culminate in eruptions.

  10. Coherent MUSIC technique for range/angle information retrieval: Application to a frequency modulated continuous wave MIMO radar

    NARCIS (Netherlands)

    Belfiori, F.; Rossum, W. van; Hoogeboom, P.

    2014-01-01

    A coherent two-dimensional (2D) multiple signal classification (MUSIC) processing for the simultaneous estimation of angular and range target positions has been presented. A 2D spatial smoothing technique is also introduced to cope with the coherent behaviour of the received echoes, which may result

  11. Continuous fast focusing in trapezoidal void channel based on bidirectional isotachophoresis in wide pH range

    Czech Academy of Sciences Publication Activity Database

    Šťastná, Miroslava; Šlais, Karel

    2015-01-01

    Vol. 36, No. 20 (2015), pp. 2579-2586 ISSN 0173-0835 R&D Projects: GA MV VG20112015021 Institutional support: RVO:68081715 Keywords: bidirectional isotachophoresis * trapezoidal void channel * wide pH range * proteins Subject RIV: CB - Analytical Chemistry, Separation Impact factor: 2.482, year: 2015 http://hdl.handle.net/11104/0250164

  12. A Query Cache Tool for Optimizing Repeatable and Parallel OLAP Queries

    Science.gov (United States)

    Santos, Ricardo Jorge; Bernardino, Jorge

    On-line analytical processing against data warehouse databases is a common way of obtaining decision-making information in almost every business field. Decision support information often concerns periodic values based on regular attributes, such as sales amounts, percentages, most frequently transacted items, etc. This means that many similar OLAP instructions are periodically repeated, sometimes simultaneously, by several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries simultaneously executed by different users, putting the most recent ones on hold until the first finishes and returns the results for all of them. The stored query results are held until they are considered outdated, then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.
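
The idea of running a repeated query only against rows inserted since its last execution can be sketched as follows. The sketch assumes an append-only fact table held as a Python list and a distributive aggregate (SUM per group), so that old and new partial results can simply be added; the class and method names are hypothetical and do not describe the authors' tool.

```python
from typing import Dict, List, Tuple

FactRow = Tuple[str, float]  # (group key, amount)


class IncrementalSumCache:
    """Caches SUM-per-group results and processes only newly appended rows."""

    def __init__(self) -> None:
        # query id -> (number of rows already processed, cached totals)
        self._cache: Dict[str, Tuple[int, Dict[str, float]]] = {}
        self.rows_scanned = 0  # total rows read, to show the saving

    def total_by_group(self, query_id: str, fact_rows: List[FactRow]) -> Dict[str, float]:
        seen, cached = self._cache.get(query_id, (0, {}))
        totals = dict(cached)
        for group, amount in fact_rows[seen:]:  # only rows added since last run
            totals[group] = totals.get(group, 0) + amount
            self.rows_scanned += 1
        self._cache[query_id] = (len(fact_rows), totals)
        return dict(totals)  # return a copy so callers cannot corrupt the cache


sales: List[FactRow] = [("north", 10), ("south", 5)]
cache = IncrementalSumCache()
first = cache.total_by_group("sales_by_region", sales)

sales.append(("north", 7))  # new data arrives in the warehouse
second = cache.total_by_group("sales_by_region", sales)
```

The second execution reads one row instead of three. Aggregates that are not distributive, or tables with updates and deletes, would need extra bookkeeping; the abstract covers stale results by expiring cached entries.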

  13. EnviroMeter: A Platform for Querying Community-Sensed Data

    OpenAIRE

    Sathe, Saket; Oviedo, Arthur; Chakraborty, Dipanjan; Aberer, Karl

    2013-01-01

    Efficiently querying data collected from Large-area Community-driven Sensor Networks (LCSNs) is a new and challenging problem. In our previous works, we proposed adaptive techniques for learning models (e.g., statistical, non-parametric, etc.) from such data, considering the fact that LCSN data is typically geo-temporally skewed. In this paper, we present a demonstration of EnviroMeter. EnviroMeter uses our adaptive model creation techniques for processing continuous queries on community-sense...

  14. Continuing Inflation at Three Sisters Volcanic Center, Central Oregon Cascade Range, USA, From GPS, InSAR, and Leveling Observations

    Science.gov (United States)

    Lisowski, M.; Dzurisin, D.; Wicks, C. W.

    2007-12-01

    Uplift of a broad area centered ~5 km west of South Sister volcano in central Oregon started sometime after fall 1996, accelerated after fall 1998, and was continuing when last surveyed with GPS and leveling in fall 2006. Surface displacements were measured whenever possible since 1992 with satellite radar interferometry (InSAR), annually since 2001 with GPS and leveling campaigns, and with a continuous GPS station since 2001. The average maximum displacement rate from InSAR was 3 to 5 cm/yr during 1998-2001 and ~1.4 cm/yr during 2004-2006. The other three datasets show a similar pattern, i.e., surface dilation and uplift rates decreased over time but deformation continued through 2006. Our best-fit model is a spherical point pressure (Mogi) source located 6.0-6.5 km below the surface and 4.5-5 km west-southwest of the summit of South Sister volcano. Any marginal improvement gained by using a more complicated source shape is not constrained by the data. This same model fits the deformation data for 2001-2003 and 2003-2006 equally well, so there is no indication that the location or shape of the source has changed. However, the source inflation rate has decreased exponentially since 2001 with a 1/e decay time of about 4 years. The net increase in source volume from the beginning of the episode (~1997) through 2006 was 60 × 10^6 m^3 ± 10 × 10^6 m^3. The only unusual seismicity near the deforming area was a swarm of about 300 small earthquakes on March 23-26, 2004, the first notable seismicity for at least two decades. Timing of the swarm generally coincides with slowing of surface deformation, but any link between the two, if one exists, is not understood. Similar episodes in the past probably would have gone unnoticed if, as we suspect, most were small intrusions that do not culminate in eruptions.

  15. Web search queries can predict stock market volumes.

    Science.gov (United States)

    Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

    2012-01-01

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. A few recent works have applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which enables us to also investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.
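
The claim that query volumes anticipate trading volumes by a day or more is the kind of statement a lagged correlation can test. The sketch below computes the Pearson correlation between query volume on day t and trading volume on day t + lag. The series are synthetic and built so that trading exactly follows queries by one day; they are not the paper's data, and the function names are illustrative.

```python
from math import sqrt
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def lagged_correlation(queries: List[float], volumes: List[float], lag: int) -> float:
    """Correlation of queries[t] with volumes[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(queries, volumes)
    return pearson(queries[:-lag], volumes[lag:])


# Synthetic example: trading volume repeats the query volume one day later.
queries = [1, 5, 2, 8, 3, 9, 4]
volumes = [0] + queries[:-1]

same_day = lagged_correlation(queries, volumes, lag=0)
one_day_ahead = lagged_correlation(queries, volumes, lag=1)
```

On this constructed series the one-day-ahead correlation is essentially 1 while the same-day correlation is much lower. A high lagged correlation on real data would still call for significance testing before reading it as predictive power.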

  16. Web Search Queries Can Predict Stock Market Volumes

    Science.gov (United States)

    Bordino, Ilaria; Battiston, Stefano; Caldarelli, Guido; Cristelli, Matthieu; Ukkonen, Antti; Weber, Ingmar

    2012-01-01

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people’s actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemics spreading. A few recent works have applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which enables us to also investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www. PMID:22829871

  17. Web search queries can predict stock market volumes.

    Directory of Open Access Journals (Sweden)

    Ilaria Bordino

    We live in a computerized and networked society where many of our actions leave a digital trace and affect other people's actions. This has led to the emergence of a new data-driven research field: mathematical methods of computer science, statistical physics and sociometry provide insights on a wide range of disciplines ranging from social science to human mobility. A recent important discovery is that search engine traffic (i.e., the number of requests submitted by users to search engines on the www) can be used to track and, in some cases, to anticipate the dynamics of social phenomena. Successful examples include unemployment levels, car and home sales, and epidemic spreading. Few recent works applied this approach to stock prices and market sentiment. However, it remains unclear if trends in financial markets can be anticipated by the collective wisdom of on-line users on the web. Here we show that daily trading volumes of stocks traded in NASDAQ-100 are correlated with daily volumes of queries related to the same stocks. In particular, query volumes anticipate in many cases peaks of trading by one day or more. Our analysis is carried out on a unique dataset of queries, submitted to an important web search engine, which also enables us to investigate user behavior. We show that the query volume dynamics emerges from the collective but seemingly uncoordinated activity of many users. These findings contribute to the debate on the identification of early warnings of financial systemic risk, based on the activity of users of the www.

  18. Enabling Semantic Queries Against the Spatial Database

    Directory of Open Access Journals (Sweden)

    PENG, X.

    2012-02-01

    The spatial database based upon the object-relational database management system (ORDBMS) has the merits of a clear data model, good operability and high query efficiency. That is why it has been widely used in spatial data organization and management. However, it cannot express the semantic relationships among geospatial objects, which makes it difficult for query results to meet the user's requirements. Therefore, this paper represents an attempt to combine Semantic Web technology with the spatial database so as to make up for the traditional database's disadvantages. In this way, on the one hand, users can take advantage of the ORDBMS to store and manage spatial data; on the other hand, if the spatial database is released in the form of the Semantic Web, users can describe a query more concisely, following a cognitive pattern similar to that of daily life. As a consequence, this methodology makes the benefits of both the Semantic Web and the object-relational database (ORDB) available. The paper systematically discusses the architecture, key technologies and implementation of the semantically enriched spatial database. Subsequently, we demonstrate the function of spatial semantic queries via a practical prototype system. The query results indicate that the method used in this study is feasible.

  19. Index and query methods in road networks

    CERN Document Server

    Feng, Jun

    2015-01-01

    This book presents index and query techniques for road networks and for moving objects that are constrained to a road network. Here, the road network of non-Euclidean space has its unique characteristics such that two moving objects may be very close in a straight line distance. The index used in two-dimensional Euclidean space is not always appropriate for moving objects on a road network. Therefore, the index structure needs to be improved in order to obtain suitable indexing methods, explore the shortest path and acquire nearest neighbor query and aggregation query methods under the new index structures. Chapter 1 of this book introduces the present situation of intelligent traffic and indexing in road networks; Chapter 2 introduces the relevant existing spatial indexing methods. Chapters 3-5 focus on several issues of road networks and queries: traffic road network models (see Chapter 3), index structures (see Chapter 4) and aggregate query methods (see Chapter 5). Finally, in Chapter 6, the book briefly de...

  20. Direct Observation of Long-Range Transport Using Continuously Sounding Balloons and Near-Real-Time Trajectory Modeling

    Science.gov (United States)

    Voss, P. B.; Zaveri, R. A.; Berkowitz, C. M.

    2009-12-01

    Controlled Meteorological (CMET) balloons have been used in several recent studies to measure long-range transport over periods as long as 30 hours and distances up to 1000 kilometers. By repeatedly performing shallow soundings as they drift, CMET balloons can quantify evolving atmospheric structure, mixing events, shear advection, and dispersion during transport. In addition, the quasi-Lagrangian wind profiles can be used to drive a multi-layer trajectory model in which the advected air parcels follow the underlying terrain, or are constrained by altitude, potential temperature, or tracer concentration. Data from a coordinated balloon-aircraft study of long range transport over Texas (SETTS 2005) show that the reconstructed trajectories accurately track residual-layer urban outflow (and at times even its fine-scale structure) over distances of many hundreds of kilometers. The reconstructed trajectories and evolving profile visualizations are increasingly being made available in near-real time during balloon flights, supporting data-driven flight planning and sophisticated process studies relevant to atmospheric chemistry and climate. Multilayer trajectories (black grids) derived from CMET balloon flight paths (grey lines) for a transport event across Texas in 2005.

  1. Improving Academic Achievement through Continuous Assessment Methods: In the Case of Year Two Students of Animal and Range Sciences Department in Wolaita Sodo University, Ethiopia

    Science.gov (United States)

    Sarka, Samuel; Lijalem, Tsegay; Shibiru, Tilaye

    2017-01-01

    The aim of this study was to assess and implement continuous assessment to enhance the academic performance of 2nd year Animal and Range Sciences department students in Wolaita Sodo University; and to take action (train) to raise the academic performance to a desirable state. For the purpose of surveying the students' level of performance…

  2. Processing of Extreme Moving-Object Update and Query Workloads in Main Memory

    DEFF Research Database (Denmark)

    Sidlauskas, Darius; Saltenis, Simonas; Jensen, Christian Søndergaard

    2014-01-01

    concurrency anomalies and to ensure correct system behavior, conflicting update and query operations must be serialized. In this setting, it is a key concern to avoid that operations are blocked, which leaves processing cores idle. To enable efficient processing, we first examine concurrency degrees from...... traditional transaction processing in the context of our target domain and propose new semantics that enable a high degree of parallelism and ensure up-to-date query results. We define the new semantics for range and k-nearest neighbor queries. Then we present a main-memory indexing technique called PGrid...

  3. Heuristic query optimization for query multiple table and multiple clausa on mobile finance application

    Science.gov (United States)

    Indrayana, I. N. E.; P, N. M. Wirasyanti D.; Sudiartha, I. KG

    2018-01-01

    Mobile applications allow many users to access data without being limited by space and time. Over time, the volume of data in such an application grows. Data access time becomes a problem once the data reaches tens of thousands to millions of records. The objective of this research is to maintain data access performance for large numbers of records. One way to do so is to apply query optimization; the method used in this research is heuristic query optimization. The application built is a mobile-based financial application using a MySQL database with stored procedures. It is used by more than one business entity in one database, which leads to rapid data growth. The stored procedures contain queries optimized using the heuristic method. Query optimization is performed on “Select” queries that involve more than one table with multiple clauses. Evaluation is done by calculating the average access time using optimized and unoptimized queries. Access time is also measured as the amount of data in the database increases. The evaluation results show that data execution with heuristic query optimization is relatively faster than execution without query optimization.

  4. jQuery Mobile Up and Running

    CERN Document Server

    Firtman, Maximiliano

    2012-01-01

    Would you like to build one mobile web application that works on iPad and Kindle Fire as well as iPhone and Android smartphones? This introductory guide to jQuery Mobile shows you how. Through a series of hands-on exercises, you'll learn the best ways to use this framework's many interface components to build customizable, multiplatform apps. You don't need any programming skills or previous experience with jQuery to get started. By the time you finish this book, you'll know how to create responsive, Ajax-based interfaces that work on a variety of smartphones and tablets, using jQuery Mobile

  5. A Query System for Texts with Macros

    Science.gov (United States)

    Kwon, Keehang; Kang, Dae-Seong; Kim, Jinsoo

    We propose a query language based on extended regular expressions. This language extends texts with text-generating macros. These macros make it possible to define languages in a compressed, elegant way. This paper also extends queries with linear implications and additive (classical) conjunctions. To be precise, it allows goals of the form D ⊸ G and G1 & G2 where D is a text or a macro and G is a query. The first goal is solved by adding D to the current text and then solving G. This goal is flexible in controlling the current text dynamically. The second goal is solved by solving both G1 and G2 from the current text. This goal is particularly useful for internet search.

  6. Optimal Planar Orthogonal Skyline Counting Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Larsen, Kasper Green

    2014-01-01

    The skyline of a set of points in the plane is the subset of maximal points, where a point (x,y) is maximal if no other point (x',y') satisfies x' ≥ x and y' ≥ y. We consider the problem of preprocessing a set P of n points into a space efficient static data structure supporting orthogonal skyline...... counting queries, i.e. given a query rectangle R to report the size of the skyline of P ∩ R. We present a data structure for storing n points with integer coordinates having query time O(lg n/lglg n) and space usage O(n). The model of computation is a unit cost RAM with logarithmic word size. We prove...
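    To make the query in this record concrete, the following is a minimal Python sketch of an orthogonal skyline counting query. It is a naive sort-and-sweep baseline that costs O(n log n) per query, not the O(lg n/lglg n) data structure the abstract describes. The function name `skyline_count`, the rectangle encoding `(x1, y1, x2, y2)` with closed boundaries, and the assumption that all points are distinct are illustrative choices, not taken from the paper.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), closed on all sides (assumed encoding)


def skyline_count(points: Iterable[Point], rect: Rect) -> int:
    """Count the maximal points of the input restricted to rect.

    A point (x, y) is maximal if no other point (x', y') in the rectangle
    has x' >= x and y' >= y. Points are assumed to be distinct.
    """
    x1, y1, x2, y2 = rect
    inside = [(x, y) for x, y in points if x1 <= x <= x2 and y1 <= y <= y2]

    # Sweep from the largest x to the smallest; among equal x, visit the
    # largest y first. A point is maximal exactly when its y is strictly
    # larger than every y seen so far.
    inside.sort(key=lambda p: (-p[0], -p[1]))

    count = 0
    best_y = None
    for _, y in inside:
        if best_y is None or y > best_y:
            count += 1
            best_y = y
    return count


if __name__ == "__main__":
    pts = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 2)]
    print(skyline_count(pts, (0, 0, 5, 5)))
    print(skyline_count(pts, (0, 0, 2, 5)))
```

    The sweep is only meant as a reference against which a faster structure could be checked; it does no preprocessing and rescans the point set on every query.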

  7. jQuery for designers beginner's guide

    CERN Document Server

    MacLees, Natalie

    2014-01-01

    A step-by-step guide that spices up your web pages and designs them in the way you want using the most widely used JavaScript library, jQuery. The beginner-friendly and easy-to-understand approach of the book will help get to grips with jQuery in no time. If you know the fundamentals of HTML and CSS, and want to extend your knowledge by learning to use JavaScript, then this is just the book for you. jQuery makes JavaScript straightforward and approachable - you'll be surprised at how easy it can be to add animations and special effects to your beautifully designed pages.

  8. Evaluating Trajectory Queries over Imprecise Location Data

    DEFF Research Database (Denmark)

    Xie, Scott, Xike; Cheng, Reynold; Yiu, Man Lung

    2012-01-01

    Trajectory queries, which retrieve nearby objects for every point of a given route, can be used to identify alerts of potential threats along a vessel route, or monitor the adjacent rescuers to a travel path. However, the locations of these objects (e.g., threats, succours) may not be precisely...... obtained due to hardware limitations of measuring devices, as well as the constantly-changing nature of the external environment. Ignoring data uncertainty can render low query quality, and cause undesirable consequences such as missing alerts of threats and poor response time in rescue operations. Also...

  9. Query Optimization Techniques in Microsoft SQL Server

    Directory of Open Access Journals (Sweden)

    Costel Gabriel CORLATAN

    2014-09-01

    Microsoft SQL Server is a relational database management system, having MS-SQL and Transact-SQL as primary structured programming languages. They rely on relational algebra, which is mainly used for data insertion, modification, deletion and retrieval, as well as for data access control. The problem of getting the expected results is handled by the management system, which has the purpose of finding the best execution plan; this process is called optimization. The most frequently used queries are those of data retrieval through the SELECT command. We have to take into consideration that not only the select queries need optimization, but also other objects, such as indexes, views or statistics.

  10. Fundamentals of Physical Design and Query Compilation

    CERN Document Server

    Toman, David

    2011-01-01

    Query compilation is the problem of translating user requests formulated over purely conceptual and domain specific ways of understanding data, commonly called logical designs, to efficient executable programs called query plans. Such plans access various concrete data sources through their low-level often iterator-based interfaces. An appreciation of the concrete data sources, their interfaces and how such capabilities relate to logical design is commonly called a physical design. This book is an introduction to the fundamental methods underlying database technology that solves the problem of

  11. Implementation of Quantum Private Queries Using Nuclear Magnetic Resonance

    International Nuclear Information System (INIS)

    Wang Chuan; Hao Liang; Zhao Lian-Jie

    2011-01-01

    We present a modified protocol for the realization of a quantum private query process on a classical database. Using one-qubit query and CNOT operation, the query process can be realized in a two-mode database. In the query process, the data privacy is preserved as the sender would not reveal any information about the database besides her query information, and the database provider cannot retain any information about the query. We implement the quantum private query protocol in a nuclear magnetic resonance system. The density matrices of the memory registers are constructed.

  12. Lazy Toggle PRM: A single-query approach to motion planning

    KAUST Repository

    Denny, Jory

    2013-05-01

    Probabilistic RoadMaps (PRMs) are quite successful in solving complex and high-dimensional motion planning problems. While particularly suited for multiple-query scenarios and expansive spaces, they lack efficiency in both solving single-query scenarios and mapping narrow spaces. Two PRM variants separately tackle these gaps. Lazy PRM reduces the computational cost of roadmap construction for single-query scenarios by delaying roadmap validation until query time. Toggle PRM is well suited for mapping narrow spaces by mapping both Cfree and Cobst, which gives certain theoretical benefits. However, fully validating the two resulting roadmaps can be costly. We present a strategy, Lazy Toggle PRM, for integrating these two approaches into a method which is both suited for narrow passages and efficient single-query calculations. This simultaneously addresses two challenges of PRMs. Like Lazy PRM, Lazy Toggle PRM delays validation of roadmaps until query time, but if no path is found, the algorithm augments the roadmap using the Toggle PRM methodology. We demonstrate the effectiveness of Lazy Toggle PRM in a wide range of scenarios, including those with narrow passages and high descriptive complexity (e.g., those described by many triangles), concluding that it is more effective than existing methods in solving difficult queries. © 2013 IEEE.

  13. Evolutionary Algorithms for Boolean Queries Optimization

    Czech Academy of Sciences Publication Activity Database

    Húsek, Dušan; Snášel, Václav; Neruda, Roman; Owais, S.S.J.; Krömer, P.

    2006-01-01

    Vol. 3, No. 1 (2006), pp. 15-20 ISSN 1790-0832 R&D Projects: GA AV ČR 1ET100300414 Institutional research plan: CEZ:AV0Z10300504 Keywords : evolutionary algorithms * genetic algorithms * information retrieval * Boolean query Subject RIV: BA - General Mathematics

  14. Boolean Queries Optimization by Genetic Algorithms

    Czech Academy of Sciences Publication Activity Database

    Húsek, Dušan; Owais, S.S.J.; Krömer, P.; Snášel, Václav

    2005-01-01

    Vol. 15, - (2005), pp. 395-409 ISSN 1210-0552 R&D Projects: GA AV ČR 1ET100300414 Institutional research plan: CEZ:AV0Z10300504 Keywords : evolutionary algorithms * genetic algorithms * genetic programming * information retrieval * Boolean query Subject RIV: BB - Applied Statistics, Operational Research

  15. An Ensemble Approach for Expanding Queries

    Science.gov (United States)

    2012-11-01

    nephritis ” in query number 145. lupus nephritis ( nephritis OR lupus lupus OR glomerulonephritis mycophenolate OR mofetil glomerulonephritis OR... lupus cyclophosphamide membranous OR lupus OR nephritis OR syndrome diffuse OR lupus OR glomerulonephritis OR syndrome sle OR...document collection (Table 1). Table 1. Stop words. High frequency words Common English stop words but treatment normal him who after over

  16. Flattening Queries over Nested Data Types

    NARCIS (Netherlands)

    van Ruth, J.

    2006-01-01

    The theory developed in this thesis provides a method to improve the efficiency of querying nested data. The roots of this research lie in the tension between data model expressiveness and performance. Obviously, more expressive data models are more convenient for application programmers. For many

  17. Web-Based Distributed XML Query Processing

    NARCIS (Netherlands)

    Smiljanic, M.; Feng, L.; Jonker, Willem; Blanken, Henk; Grabs, T.; Schek, H-J.; Schenkel, R.; Weikum, G.

    2003-01-01

    Web-based distributed XML query processing has gained in importance in recent years due to the widespread popularity of XML on the Web. Unlike centralized and tightly coupled distributed systems, Web-based distributed database systems are highly unpredictable and uncontrollable, with a rather

  18. Path Minima Queries in Dynamic Weighted Trees

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao

    2011-01-01

    update time?} in the comparison and the RAM models. These structures also support inserting a node on an edge, inserting a leaf, and contracting edges. When only insertion and deletion of leaves are desired, we give data structures in the comparison and the RAM models, with optimal query time...

  19. Enabling Incremental Query Re-Optimization.

    Science.gov (United States)

    Liu, Mengmeng; Ives, Zachary G; Loo, Boon Thau

    2016-01-01

    As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations.

  20. Query-by-Emoji Video Search

    NARCIS (Netherlands)

    Cappallo, S.; Mensink, T.; Snoek, C.G.M.

    2015-01-01

    This technical demo presents Emoji2Video, a query-by-emoji interface for exploring video collections. Ideogram-based video search and representation presents an opportunity for an intuitive, visual interface and concise non-textual summary of video contents, in a form factor that is ideal for small

  1. Beginning SQL queries from novice to professional

    CERN Document Server

    Churcher, Clare

    2016-01-01

    Anyone who does any work at all with databases needs to know something of SQL. This is a friendly and easy-to-read guide to writing queries with the all-important - in the database world - SQL language. The author writes with exceptional clarity.

  2. Approximate Nearest Neighbor Queries among Parallel Segments

    DEFF Research Database (Denmark)

    Emiris, Ioannis Z.; Malamatos, Theocharis; Tsigaridas, Elias

    2010-01-01

    We develop a data structure for answering efficiently approximate nearest neighbor queries over a set of parallel segments in three dimensions. We connect this problem to approximate nearest neighbor searching under weight constraints and approximate nearest neighbor searching on historical data...

  3. Query and document models for enterprise search

    NARCIS (Netherlands)

    Balog, K.; Hofmann, K.; Weerkamp, W.; de Rijke, M.; Voorhees, E.M.; Buckland, L.P.

    2008-01-01

    We describe our participation in the TREC 2007 Enterprise track and detail our language modeling-based approaches. For document search, our focus was on estimating a mixture model using a standard web collection, and on constructing query models by employing blind relevance feedback and using the

  4. Exploiting cost distributions for query optimization

    NARCIS (Netherlands)

    F. Waas; A.J. Pellenkoft (Jan)

    1998-01-01

    Large-scale query optimization is, besides its practical relevance, a hard test case for optimization techniques. Since exact methods cannot be applied due to the combinatorial explosion of the search space, heuristics and probabilistic strategies have been deployed for more than a

  5. CUFID-query: accurate network querying through random walk based network flow estimation.

    Science.gov (United States)

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2017-12-28

    Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive

  6. Indexing, Query Processing, and Clustering of Spatio-Temporal Text Objects

    DEFF Research Database (Denmark)

    Skovsgaard, Anders

    With the increasing mobile use of the web from geo-positioned devices, the Internet is increasingly acquiring a spatial aspect, with still more types of content being geo-tagged. As a result of this development, a wide range of location-aware queries and applications have emerged. The large amoun...... partial results. The results show excellent indexing and query execution performance on a standard DBMS......) spatio-temporal aggregates, and (iii) spatio-textual region querying without special purpose index structures. First, two novel techniques to perform grouping of spatio-textual objects are presented. In the first technique, top-k groups of objects are returned while taking into account aspects......, the grouping of spatio-textual objects is done without considering query locations, and a clustering approach is proposed that takes into account both the spatial and textual attributes of the objects. The technique expands clusters based on a proposed quality function that enables clusters of arbitrary shape...

  7. On the evaluation of fuzzy quantified queries in a database management system

    Science.gov (United States)

    Bosc, Patrick; Pivert, Olivier

    1992-01-01

    Many propositions to extend database management systems have been made in the last decade. Some of them aim at the support of a wider range of queries involving fuzzy predicates. Unfortunately, these queries are somewhat complex and the question of their efficiency is a subject under discussion. In this paper, we focus on a particular subset of queries, namely those using fuzzy quantified predicates. More precisely, we will consider the case where such predicates apply to individual elements as well as to sets of elements. Thanks to some interesting properties of alpha-cuts of fuzzy sets, we are able to show that the evaluation of these queries can be significantly improved with respect to a naive strategy based on exhaustive scans of sets or files.

  8. jQuery UI 1.10 the user interface library for jQuery

    CERN Document Server

    Libby, Alex

    2013-01-01

    This book consists of an easy-to-follow, example-based approach that leads you step-by-step through the implementation and customization of each library component. This book is for frontend designers and developers who need to learn how to use jQuery UI quickly. To get the most out of this book, you should have a good working knowledge of HTML, CSS, and JavaScript, and should ideally be comfortable using jQuery.

  9. The survey of large-scale query classification

    Science.gov (United States)

    Zhou, Sanduo; Cheng, Kefei; Men, Lijun

    2017-04-01

    In recent years, much research has been done on query classification. The paper introduces recent research on query classification in detail, mainly covering the sources of query logs, the category systems, the feature extraction methods, the classification methods and the evaluation methodology. It then discusses the issues of large-scale query classification and their solutions in combination with big data analysis systems. The research results show that several problems and challenges remain, such as the lack of an authoritative classification system and evaluation methodology, the efficiency of the feature extraction methods, the uncertainty of performance on large-scale query logs, and further query classification on big data platforms.

  10. Cumulative query method for influenza surveillance using search engine data.

    Science.gov (United States)

    Seo, Dong-Woo; Jo, Min-Woo; Sohn, Chang Hwan; Shin, Soo-Yong; Lee, JaeHo; Yu, Maengsoo; Kim, Won Young; Lim, Kyoung Soo; Lee, Sang-Il

    2014-12-16

    Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development set 2 and 2011/12 for validation set 2). Pearson's correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created a cumulative query method n representing the number of cumulative combined queries in descending order of the correlation coefficient. In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, but 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, but 6 of 15 combined queries had an r value of ≥.7. The cumulative query methods showed relatively higher correlation with national influenza surveillance data than the combined queries in the development and validation sets.
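    The selection procedure in this record can be sketched in a few lines of Python. This is a hedged reading of the abstract rather than the authors' code: it assumes that "cumulative query method n" means summing the time series of the top n selected queries, and the names `pearson`, `cumulative_query_methods`, `query_series` and `ili` are hypothetical. The toy numbers are invented purely to exercise the code and are not data from the study.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equal-length series.

    Assumes neither series is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def cumulative_query_methods(
    query_series: Dict[str, List[float]],
    ili: List[float],
    threshold: float = 0.7,
) -> List[Tuple[int, str, float]]:
    """Return (n, last query added, r of the summed top-n series vs. ILI)."""
    # Rank queries by their individual correlation with the ILI series
    # and keep those at or above the threshold, in descending order.
    ranked = sorted(
        ((pearson(series, ili), name) for name, series in query_series.items()),
        reverse=True,
    )
    selected = [name for r, name in ranked if r >= threshold]

    results = []
    running = [0.0] * len(ili)
    for n, name in enumerate(selected, start=1):
        # Assumed interpretation: method n is the element-wise sum of the
        # n best-correlated query series.
        running = [a + b for a, b in zip(running, query_series[name])]
        results.append((n, name, pearson(running, ili)))
    return results


if __name__ == "__main__":
    ili = [1.0, 2.0, 3.0, 4.0, 5.0]            # toy weekly ILI values
    queries = {
        "a": [2.0, 4.0, 6.0, 8.0, 10.0],       # r = 1.0 with ili
        "b": [1.0, 3.0, 2.0, 5.0, 4.0],        # r = 0.8 with ili
        "c": [5.0, 1.0, 4.0, 2.0, 3.0],        # r = -0.3, filtered out
    }
    for n, name, r in cumulative_query_methods(queries, ili):
        print(n, name, round(r, 3))
```

    In the study the coefficients are computed on a development year and then checked on a separate validation year; the sketch uses a single series for brevity.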

  11. Date restricted queries in web search engines

    OpenAIRE

    Lewandowski, Dirk

    2004-01-01

    Search engines usually offer a date restricted search on their advanced search pages. But determining the actual update of a web page is not without problems. We conduct a study testing date restricted queries on the search engines Google, Teoma and Yahoo!. We find that these searches fail to work properly in the examined engines. We discuss implications of this for further research and search engine development.

  12. Advanced SPARQL querying in small molecule databases

    Czech Academy of Sciences Publication Activity Database

    Galgonek, Jakub; Hurt, T.; Michlíková, V.; Onderka, P.; Schwarz, J.; Vondrášek, Jiří

    2016-01-01

    Vol. 8, Jun 6 (2016), article no. 31. ISSN 1758-2946 R&D Projects : GA MŠk(CZ) LM2015047 Institutional support: RVO:61388963 Keywords : Resource Description Framework * SPARQL query language * Database of small molecules Subject RIV: CF - Physical ; Theoretical Chemistry Impact factor: 4.220, year: 2016 http://jcheminf.springeropen.com/articles/10.1186/s13321-016-0144-4

  13. Query-Structure Based Web Page Indexing

    Science.gov (United States)

    2012-11-01

    task. ...finding, Entity finding, and Web pages classification. The design of highly-scalable indexing algorithms is needed, especially with an estimate of one...content, e.g., "Fibromyalgia" or "Lipoma". • Combining: this type of query is processed using primitive keywords from URLs and/or titles that imply

  14. TEMPORAL QUERY PROCESSING USING SQL SERVER

    OpenAIRE

    Vali Shaik, Mastan; Sujatha, P

    2017-01-01

    Most data sources in real life are not static but change their information over time. This evolution of data over time can give valuable insights to business analysts. Temporal data refers to data in which changes over time or temporal aspects play a central role. Temporal data denotes the evolution of object characteristics over time. One of the main unresolved problems that arise during the data mining process is treating data that contains temporal information. Temporal queries on time evolving...

  15. A Practical Python API for Querying AFLOWLIB

    OpenAIRE

    Rosenbrock, Conrad W.

    2017-01-01

    Large databases such as aflowlib.org provide valuable data sources for discovering material trends through machine learning. Although a REST API and query language are available, there is a learning curve associated with the AFLUX language that acts as a barrier for new users. Additionally, the data is stored using non-standard serialization formats. Here we present a high-level API that allows immediate access to the aflowlib data using standard Python operators and language features. It pro...

  16. STOQS: The Spatial Temporal Oceanographic Query System

    Science.gov (United States)

    McCann, M. P.; Schramm, R.

    2010-12-01

    The Spatial-Temporal Oceanographic Query System (STOQS) has been developed at the Monterey Bay Aquarium Research Institute to improve access to and visualization of a multi-decadal archive of upper water column observations. STOQS consists of a set of applications, operational procedures, and a geospatial relational database. Borrowing a database schema from the Geographic Information System community, we've implemented a database that is tuned for efficient queries across several dimensions of the data model. An Object Relational Mapping (ORM) tool was used to hide the complexity of SQL that results from our highly normalized data model. The Python scripting language is used to write the Extract, Transform, Load (ETL) programs for populating the database with data from our long-term operational archives. These archives include collections of Climate Forecast convention netCDF files of mooring and autonomous underwater vehicle data and other special purpose relational databases. This poster describes the specific tools and techniques used to implement STOQS. Though still in development, the system already provides benefits to users through a Google Earth interface and an ability to conduct fast queries across multiple previously non-interoperable data sets.

  17. Approximate furthest neighbor with application to annulus query

    DEFF Research Database (Denmark)

    Pagh, Rasmus; Silvestri, Francesco; Sivertsen, Johan von Tangen

    2016-01-01

    Much recent work has been devoted to approximate nearest neighbor queries. Motivated by applications in recommender systems, we consider approximate furthest neighbor (AFN) queries and present a simple, fast, and highly practical data structure for answering AFN queries in high-dimensional Euclid...... a variation based on a query-independent ordering of the database points; while this does not have the provable approximation factor of the query-dependent data structure, it offers significant improvement in time and space complexity. We give a theoretical analysis and experimental results. As an application...

  18. Late Triassic granites from Bangka, Indonesia: A continuation of the Main Range granite province of the South-East Asian Tin Belt

    Science.gov (United States)

    Ng, Samuel Wai-Pan; Whitehouse, Martin J.; Roselee, Muhammad H.; Teschner, Claudia; Murtadha, Sayed; Oliver, Grahame J. H.; Ghani, Azman A.; Chang, Su-Chin

    2017-05-01

    The South-East Asian Tin Belt is one of the most tin-productive regions in the world. It comprises three north-south oriented granite provinces, of which the arc-related Eastern granite province and the collision-related Main Range granite province run across Thailand, Singapore, and Indonesia. These tin-producing granite provinces with different mineral assemblages are separated by Paleo-Tethyan sutures exposed in Thailand and Malaysia. The Eastern Province is usually characterised by granites with biotite ± hornblende. Main Range granites are sometimes characterised by the presence of biotite ± muscovite. However, the physical boundary between the two types of granite is not well-defined on the Indonesian Tin Islands, because the Paleo-Tethyan suture is not exposed on land there. Both hornblende-bearing (previously interpreted as I-type) and hornblende-barren (previously interpreted as S-type) granites are apparently randomly distributed on the Indonesian Tin Islands. Granites exposed on Bangka, the largest and southernmost Tin Island, whether hornblende-bearing or hornblende-barren, are geochemically similar to Malaysian Main Range granites. The average ɛNd(t) value obtained from the granites from Bangka (average ɛNd(t) = -8.2) falls within the range of the Main Range Province (-9.6 to -5.4). These granites have SIMS zircon U-Pb ages of ca. 225 Ma and ca. 220 Ma, respectively, both within the period of Main Range magmatism (∼226-201 Ma) in Peninsular Malaysia. We suggest that the granites exposed on Bangka represent the continuation of the Main Range Province, and that the Paleo-Tethyan suture lies to the east of the island.

  19. Seasonal trends in restless legs symptomatology: evidence from Internet search query data.

    Science.gov (United States)

    Ingram, David G; Plante, David T

    2013-12-01

    Patients with Willis-Ekbom disease (restless legs syndrome [RLS]) frequently report seasonal worsening of their symptoms; however, seasonal patterns in this disorder have not been systematically evaluated. The purpose of our investigation was to utilize Internet search query data to test the hypothesis that restless legs symptoms vary by season, with worsening in the summer months. Internet search query data were obtained from Google Trends. Monthly normalized search volume was determined for the term restless legs between January 2004 and December 2012. Using cosinor analysis, seasonal effects were tested for data from the United States, Australia, Germany, the United Kingdom, and Canada. Cosinor analysis revealed statistically significant seasonal effects on search queries in the United States (P=.005), Australia (P=.00007), Germany (P=.00009), and the United Kingdom (P=.003), while only a non-significant trend was present in the search data from Canada (P=.098). Search queries peaked in summer months in both northern (June and July) and southern (January) hemispheres. Search query volume increased by 24-40% during summer relative to winter months across all evaluated countries. Evidence from Internet search queries across a wide range of dates and geographic areas suggested a seasonality of restless legs symptomatology with a peak in summer months. Our novel finding in RLS epidemiology needs to be confirmed in additional samples, and underlying mechanisms must be elucidated. Copyright © 2013 Elsevier B.V. All rights reserved.
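
    For readers unfamiliar with cosinor analysis, the following is a minimal sketch of a single-component cosinor fit on monthly data. It assumes equally spaced samples that cover a whole number of 12-month periods, which permits a closed-form least-squares solution. It does not reproduce the significance tests reported in the abstract, and the function names and synthetic series are hypothetical.

```python
from math import atan2, cos, hypot, pi, sin


def cosinor_fit(values, period=12):
    """Fit y_t ~ M + A*cos(2*pi*t/period + phi) for t = 0, 1, ..., n-1.

    Closed-form least squares, valid only when the samples are equally
    spaced and n is a multiple of the period (assumed here).
    Returns (mesor M, amplitude A, acrophase phi in radians)."""
    n = len(values)
    if period < 3 or n == 0 or n % period != 0:
        raise ValueError("need whole periods of equally spaced samples")
    w = 2 * pi / period
    mesor = sum(values) / n
    beta = 2 / n * sum(y * cos(w * t) for t, y in enumerate(values))
    gamma = 2 / n * sum(y * sin(w * t) for t, y in enumerate(values))
    amplitude = hypot(beta, gamma)
    acrophase = atan2(-gamma, beta)
    return mesor, amplitude, acrophase


def peak_time(acrophase, period=12):
    """Time index within one period at which the fitted cosine peaks."""
    return (-acrophase * period / (2 * pi)) % period


# Synthetic two-year monthly series peaking at month index 5, for illustration.
series = [50 + 10 * cos(2 * pi * (t - 5) / 12) for t in range(24)]
mesor, amplitude, acrophase = cosinor_fit(series)
```

    The ratio of the fitted amplitude to the mesor gives a rough sense of the size of the seasonal swing, comparable in spirit to the summer-versus-winter increase quoted above.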

  20. Scalable Top-k Spatio-Temporal Term Querying

    DEFF Research Database (Denmark)

    Skovsgaard, Anders; Sidlauskas, Darius; Jensen, Christian S.

    2014-01-01

    With the rapidly increasing deployment of Internet-connected, location-aware mobile devices, very large and increasing amounts of geo-tagged and timestamped user-generated content, such as microblog posts, are being generated. We present indexing, update, and query processing techniques...... that are capable of providing the top-k terms seen in posts in a user-specified spatio-temporal range. The techniques enable interactive response times in the millisecond range in a realistic setting where the arrival rate of posts exceeds today's average tweet arrival rate by a factor of 4-10. The techniques...... adaptively maintain the most frequent items at various spatial and temporal granularities. They extend existing frequent item counting techniques to maintain exact counts rather than approximations. An extensive empirical study with a large collection of geo-tagged tweets shows that the proposed techniques...
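
    The query type in this record, top-k terms within a spatio-temporal range, can be illustrated with a deliberately simple baseline. The sketch below is not the authors' index: it keeps exact term counts per fixed grid cell and time bucket (a single granularity, whereas the abstract describes several), and answers a query at cell resolution by merging the buckets it overlaps. The class name, parameters, and sample posts are hypothetical.

```python
from collections import Counter, defaultdict


class GridTermIndex:
    """Exact term counts per (grid cell, time bucket); a single-granularity baseline."""

    def __init__(self, cell_size=1.0, bucket_seconds=3600):
        self.cell_size = cell_size
        self.bucket_seconds = bucket_seconds
        self.counts = defaultdict(Counter)

    def _key(self, x, y, t):
        # Map a point in space and time to its cell and bucket indices.
        return (
            int(x // self.cell_size),
            int(y // self.cell_size),
            int(t // self.bucket_seconds),
        )

    def add_post(self, x, y, t, terms):
        """Update step: add the post's terms to the counter of its cell and bucket."""
        self.counts[self._key(x, y, t)].update(terms)

    def top_k(self, x_min, x_max, y_min, y_max, t_min, t_max, k):
        """Merge counters of all cells and buckets overlapping the range.
        The answer is exact for the covered cells, not for the raw rectangle."""
        cx0, cy0, b0 = self._key(x_min, y_min, t_min)
        cx1, cy1, b1 = self._key(x_max, y_max, t_max)
        total = Counter()
        for (cx, cy, b), counter in self.counts.items():
            if cx0 <= cx <= cx1 and cy0 <= cy <= cy1 and b0 <= b <= b1:
                total.update(counter)
        return total.most_common(k)


# Hypothetical posts: (x, y, timestamp in seconds, terms).
index = GridTermIndex()
index.add_post(0.5, 0.5, 10, ["rain", "bus"])
index.add_post(0.7, 0.2, 20, ["rain"])
index.add_post(5.5, 5.5, 30, ["sun"])
index.add_post(0.1, 0.9, 4000, ["rain", "rain", "bus"])
```

    A linear scan over all buckets keeps the sketch short; a practical index would enumerate only the overlapping keys and maintain coarser pre-aggregated levels.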

  1. Semantic querying of data guided by Formal Concept Analysis

    OpenAIRE

    Codocedo , Victor; Lykourentzou , Ioanna; Napoli , Amedeo

    2012-01-01

    International audience; In this paper we present a novel approach to handle querying over a concept lattice of documents and annotations. We focus on the problem of "non-matching documents", which are those that, despite being semantically relevant to the user query, do not contain the query's elements and hence cannot be retrieved by typical string matching approaches. In order to find these documents, we modify the initial user query using the concept lattice as a guide. We achieve this by ...

  2. Range Selection and Median

    DEFF Research Database (Denmark)

    Jørgensen, Allan Grønlund; Larsen, Kasper Green

    2011-01-01

    Range selection is the problem of preprocessing an input array A of n unique integers, such that given a query (i, j, k), one can report the k'th smallest integer in the subarray A[i], A[i+1], ..., A[j]. In this paper we consider static data structures in the word-RAM for range selection and several natural special cases thereof. The first special case is known as range median, which arises when k is fixed to ⌊(j − i + 1)/2⌋. The second case, denoted prefix selection, arises when i is fixed to 0. Finally, we also consider the bounded rank prefix selection problem and the fixed rank range selection problem. In the former, data structures must support prefix selection queries under the assumption that k ≤ κ for some value κ ≤ n given at construction time, while in the latter, data structures must support range selection queries where k is fixed beforehand for all queries. We prove cell probe lower bounds...
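
    To make the query semantics concrete, here is a naive reference implementation that sorts the subarray on every query. It is only a correctness baseline, not one of the data structures studied in the paper. The 0-based rank convention is an assumption, chosen so that the median rank (j − i + 1) // 2 is valid for a one-element range; the function names are hypothetical.

```python
def range_select(A, i, j, k):
    """Return the element of rank k (0-based, assumed) among A[i], ..., A[j] inclusive.

    Sorting per query costs O(m log m) for a subarray of length m."""
    if not 0 <= i <= j < len(A):
        raise IndexError("invalid range")
    window = sorted(A[i:j + 1])
    if not 0 <= k < len(window):
        raise IndexError("rank out of range")
    return window[k]


def range_median(A, i, j):
    """Range median: the special case where k is fixed to floor((j - i + 1) / 2)."""
    return range_select(A, i, j, (j - i + 1) // 2)


def prefix_select(A, j, k):
    """Prefix selection: the special case where i is fixed to 0."""
    return range_select(A, 0, j, k)


A = [7, 2, 9, 4, 1, 8]
```

    For example, `range_select(A, 1, 4, 0)` inspects the subarray [2, 9, 4, 1] and returns its smallest element.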

  3. Parallelizing Federated SPARQL Queries in Presence of Replicated Data

    DEFF Research Database (Denmark)

    Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala

    2017-01-01

    Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication aware federated query engines mainly focus on pruning sources during the source selection and query decomposition in order to reduce intermediate results...

  4. Multiple Query Evaluation Based on an Enhanced Genetic Algorithm.

    Science.gov (United States)

    Tamine, Lynda; Chrisment, Claude; Boughanem, Mohand

    2003-01-01

    Explains the use of genetic algorithms to combine results from multiple query evaluations to improve relevance in information retrieval. Discusses niching techniques, relevance feedback techniques, and evolution heuristics, and compares retrieval results obtained by both genetic multiple query evaluation and classical single query evaluation…

  5. User Simulations for Interactive Search : Evaluating Personalized Query Suggestion

    NARCIS (Netherlands)

    Verberne, S.; Sappelli, M.; Järvelin, K.; Kraaij, W.

    2015-01-01

    In this paper, we address the question “what is the influence of user search behaviour on the effectiveness of personalized query suggestion?”. We implemented a method for query suggestion that generates candidate follow-up queries from the documents clicked by the user. This is a potentially

  6. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the "query relevance." Combining these two ideas, we can alleviate the bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29%) was improved to 60.38%, which was 1.13 points better than the result of the conventional unsupervised adaptation method (59.25%).

  7. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Akinori Ito

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the “query relevance.” Combining these two ideas, we can alleviate the bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29%) was improved to 60.38%, which was 1.13 points better than the result of the conventional unsupervised adaptation method (59.25%).

  8. The I4 Online Query Tool for Earth Observations Data

    Science.gov (United States)

    Stefanov, William L.; Vanderbloemen, Lisa A.; Lawrence, Samuel J.

    2015-01-01

    The NASA Earth Observation System Data and Information System (EOSDIS) delivers an average of 22 terabytes per day of data collected by orbital and airborne sensor systems to end users through an integrated online search environment (the Reverb/ECHO system). Earth observations data collected by sensors on the International Space Station (ISS) are not currently included in the EOSDIS system, and are only accessible through various individual online locations. This increases the effort required by end users to query multiple datasets, and limits the opportunity for data discovery and innovations in analysis. The Earth Science and Remote Sensing Unit of the Exploration Integration and Science Directorate at NASA Johnson Space Center has collaborated with the School of Earth and Space Exploration at Arizona State University (ASU) to develop the ISS Instrument Integration Implementation (I4) data query tool to provide end users a clean, simple online interface for querying both current and historical ISS Earth Observations data. The I4 interface is based on the Lunaserv and Lunaserv Global Explorer (LGE) open-source software packages developed at ASU for query of lunar datasets. In order to avoid mirroring existing databases - and the need to continually sync/update those mirrors - our design philosophy is for the I4 tool to be a pure query engine only. Once an end user identifies a specific scene or scenes of interest, I4 transparently takes the user to the appropriate online location to download the data. The tool consists of two public-facing web interfaces. The Map Tool provides a graphic geobrowser environment where the end user can navigate to an area of interest and select single or multiple datasets to query. The Map Tool displays active image footprints for the selected datasets (Figure 1). Selecting a footprint will open a pop-up window that includes a browse image and a link to available image metadata, along with a link to the online location to order or

  9. Secure Count Query on Encrypted Genomic Data.

    Science.gov (United States)

    Hasan, Mohammad Zahidul; Rahman Mahdi, Md Safiur; Sadat, Md Nazmus; Mohammed, Noman

    2018-03-14

    Human genomic information can yield more effective healthcare by guiding medical decisions. Therefore, genomics research is gaining popularity as it can identify potential correlations between a disease and a certain gene, which improves the safety and efficacy of drug treatment and can also develop more effective prevention strategies [1]. To reduce the sampling error and to increase the statistical accuracy of this type of research projects, data from different sources need to be brought together since a single organization does not necessarily possess required amount of data. In this case, data sharing among multiple organizations must satisfy strict policies (for instance, HIPAA and PIPEDA) that have been enforced to regulate privacy-sensitive data sharing. Storage and computation on the shared data can be outsourced to a third party cloud service provider, equipped with enormous storage and computation resources. However, outsourcing data to a third party is associated with a potential risk of privacy violation of the participants, whose genomic sequence or clinical profile is used in these studies. In this article, we propose a method for secure sharing and computation on genomic data in a semi-honest cloud server. In particular, there are two main contributions. Firstly, the proposed method can handle biomedical data containing both genotype and phenotype. Secondly, our proposed index tree scheme reduces the computational overhead significantly for executing secure count query operation. In our proposed method, the confidentiality of shared data is ensured through encryption, while making the entire computation process efficient and scalable for cutting-edge biomedical applications. We evaluated our proposed method in terms of efficiency on a database of Single-Nucleotide Polymorphism (SNP) sequences, and experimental results demonstrate that the execution time for a query of 50 SNPs in a database of 50000 records is approximately 5 seconds, where each

  10. Advanced Query and Data Mining Capabilities for MaROS

    Science.gov (United States)

    Wang, Paul; Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Hy, Franklin H.

    2013-01-01

    The Mars Relay Operational Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay network. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. As part of MaROS, the innovators have developed and implemented a feature set that operates on several levels of the software architecture. This new feature is an advanced querying capability through either the Web-based user interface, or through a back-end REST interface to access all of the data gathered from the network. This software is not meant to replace the REST interface, but to augment and expand the range of available data. The current REST interface provides specific data that is used by the MaROS Web application to display and visualize the information; however, the returned information from the REST interface has typically been pre-processed to return only a subset of the entire information within the repository, particularly only the information that is of interest to the GUI (graphical user interface). The new, advanced query and data mining capabilities allow users to retrieve the raw data and/or to perform their own data processing. The query language used to access the repository is a restricted subset of the structured query language (SQL) that can be built safely from the Web user interface, or entered as freeform SQL by a user. The results are returned in a CSV (Comma Separated Values) format for easy exporting to third party tools and applications that can be used for data mining or user-defined visualization and interpretation. This is the first time that a service is capable of providing access to all cross-project relay data from a single Web resource. 
    Because MaROS contains the data for a variety of missions from the Mars network, which span both NASA and ESA, the software also establishes an access control list (ACL) on each data record
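
    The idea of accepting a restricted subset of SQL and returning CSV can be sketched as below. This is not the MaROS implementation: the SELECT-only check is a naive stand-in for whatever restrictions MaROS applies, the in-memory SQLite database stands in for its repository, and the table and column names are hypothetical.

```python
import csv
import io
import sqlite3


def is_restricted_select(sql):
    """Naive gate (assumption): accept one SELECT statement and nothing else.
    A real service would parse the query and also enforce access control."""
    stripped = sql.strip().rstrip(";").strip()
    return stripped.lower().startswith("select") and ";" not in stripped


def query_to_csv(conn, sql):
    """Run an accepted query and return the result as CSV text with a header row."""
    if not is_restricted_select(sql):
        raise ValueError("only single SELECT statements are accepted")
    cursor = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


# Hypothetical relay data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relay_pass (orbiter TEXT, lander TEXT, data_volume_mb INTEGER)")
conn.executemany(
    "INSERT INTO relay_pass VALUES (?, ?, ?)",
    [("MRO", "MSL", 120), ("ODY", "MSL", 45)],
)
csv_text = query_to_csv(
    conn, "SELECT orbiter, data_volume_mb FROM relay_pass WHERE data_volume_mb > 100"
)
```

    The resulting CSV text can be loaded directly into third-party tools, which is the export path the abstract describes.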

  11. Spatio-temporal databases complex motion pattern queries

    CERN Document Server

    Vieira, Marcos R

    2013-01-01

    This brief presents several new query processing techniques, called complex motion pattern queries, specifically designed for very large spatio-temporal databases of moving objects. The brief begins with the definition of flexible pattern queries, which are powerful because of the integration of variables and motion patterns. This is followed by a summary of the expressive power of patterns and flexibility of pattern queries. The brief then presents the Spatio-Temporal Pattern System (STPS) and density-based pattern queries. STPS databases contain millions of records with information about mobi

  12. Deep web query interface understanding and integration

    CERN Document Server

    Dragut, Eduard C; Yu, Clement T

    2012-01-01

    There are millions of searchable data sources on the Web and to a large extent their contents can only be reached through their own query interfaces. There is an enormous interest in making the data in these sources easily accessible. There are primarily two general approaches to achieve this objective. The first is to surface the contents of these sources from the deep Web and add the contents to the index of regular search engines. The second is to integrate the searching capabilities of these sources and support integrated access to them. In this book, we introduce the state-of-the-art tech

  13. Downloading Multiple Records Using Query Strings

    Directory of Open Access Journals (Sweden)

    Adam Crymble

    2012-11-01

    Full Text Available Downloading a single record from a website is easy, but downloading many records at a time – an increasingly frequent need for a historian – is much more efficient using a programming language such as Python. In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer. This process involves interpreting and manipulating URL Query Strings. In this case, the tutorial will seek to download sources that contain references to people of African descent that were published in the Old Bailey Proceedings between 1700 and 1750.
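
    As a hedged illustration of what manipulating a URL query string involves, the sketch below builds search URLs with Python's standard `urllib.parse` module and steps a paging parameter to cover multiple result pages. The base URL and parameter names are placeholders, not the Old Bailey Online's actual interface, and nothing is downloaded.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; the real site's URL and parameter names may differ.
BASE_URL = "https://example.org/search"


def build_search_url(keyword, from_year, to_year, start=0, count=10):
    """Encode the search criteria as a query string appended to the base URL."""
    params = {
        "text": keyword,
        "fromYear": from_year,
        "toYear": to_year,
        "start": start,
        "count": count,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def page_urls(keyword, from_year, to_year, total, count=10):
    """One URL per page of results, advancing 'start' by 'count' each time."""
    return [
        build_search_url(keyword, from_year, to_year, start, count)
        for start in range(0, total, count)
    ]


url = build_search_url("sample keyword", 1700, 1750)
pages = page_urls("sample keyword", 1700, 1750, total=25)
```

    Reading a query string back with `parse_qs(urlsplit(url).query)` is a convenient way to interpret an existing search URL before changing its parameters.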

  14. Optimizing queries in SQL Server 2008

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2010-05-01

    Full Text Available Starting from the need to develop efficient IT systems, we intend to review the optimization methods and tools that can be used by SQL Server database administrators and developers of applications based on Microsoft technology, focusing on the latest version of the proprietary DBMS, SQL Server 2008. We’ll reflect on the objectives to be considered in improving the performance of SQL Server instances, we will tackle the most commonly used techniques for analyzing and optimizing queries, and we will describe the new “Optimize for ad hoc workloads”, “Plan Freezing” and “Optimize for unknown” options, accompanied by relevant code examples.

  15. Mobile Information Access with Spoken Query Answering

    DEFF Research Database (Denmark)

    Brøndsted, Tom; Larsen, Henrik Legind; Larsen, Lars Bo

    2006-01-01

    This paper addresses the problem of information and service accessibility in mobile devices with limited resources. A solution is developed and tested through a prototype that applies state-of-the-art Distributed Speech Recognition (DSR) and knowledge-based Information Retrieval (IR) processing...... for spoken query answering. For the DSR part, a configurable DSR system is implemented on the basis of the ETSI-DSR advanced front-end and the SPHINX IV recognizer. For the knowledge-based IR part, a distributed system solution is developed for fast retrieval of the most relevant documents, with a text...

  16. An empirical study using range of motion and pain score as determinants for continuous passive motion: outcomes following total knee replacement surgery in an adult population.

    Science.gov (United States)

    Tabor, Danielle

    2013-01-01

    The continuous passive motion (CPM) machine is one means by which to rehabilitate the knee after total knee replacement surgery. This study sought to determine which total knee replacement patients, if any, benefit from the use of the CPM machine. For the study period, most patients received active physical therapy. Patients were placed in the CPM machine if, on postoperative day 1, they had a range of motion less than or equal to 45° and/or pain score of 8 or greater on a numeric rating scale of 0-10, 0 being no pain and 10 being the worst pain. Both groups of patients healed at similar rates. The incidence of adverse events, length of stay, and functional outcomes was comparable between groups. Given the demonstrated lack of relative benefit to the patient and the cost of the CPM, this study supported discontinuing the routine use of the CPM.

  17. Room-temperature continuous-wave operation in the telecom wavelength range of GaSb-based lasers monolithically grown on Si

    Science.gov (United States)

    Castellano, A.; Cerutti, L.; Rodriguez, J. B.; Narcy, G.; Garreau, A.; Lelarge, F.; Tournié, E.

    2017-06-01

    We report on electrically pumped GaSb-based laser diodes monolithically grown on Si and operating in continuous wave (cw) in the telecom wavelength range. The laser structures were grown by molecular-beam epitaxy on 6°-off (001) substrates. The devices were processed in coplanar contact geometry. 100 μm × 1 mm laser diodes exhibited a threshold current density of 1 kA/cm² measured under pulsed operation at 20 °C. CW operation was achieved up to 35 °C with 10 μm × 1 mm diodes. The output power at 20 °C was around 3 mW/uncoated facet, and the cw emission wavelength 1.59 μm, in the C/L-band of telecom systems.

  18. A 1-V 60-μW 85-dB dynamic range continuous-time third-order sigma-delta modulator

    International Nuclear Information System (INIS)

    Li Yuanwen; Qi Da; Dong Yifeng; Xu Jun; Ren Junyan

    2009-01-01

    A 1-V third-order one-bit continuous-time (CT) ΣΔ modulator is presented. Designed in the SMIC mixed-signal 0.13-μm CMOS process, the modulator utilizes active RC integrators to implement the loop filter. An efficient circuit design methodology for the CT ΣΔ modulator is proposed and verified. Low power dissipation is achieved through the use of two-stage class A/AB amplifiers. The presented modulator achieves 81.4-dB SNDR and 85-dB dynamic range in a 20-kHz bandwidth with an oversampling ratio of 128. The total power consumption of the modulator is only 60 μW from a 1-V power supply and the prototype occupies an active area of 0.12 mm². (semiconductor integrated circuits)
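
    As a worked check on the figures quoted above, the sampling frequency and effective number of bits can be derived from the stated bandwidth, oversampling ratio, and SNDR. The relations used are the standard textbook definitions, not values taken from the paper; the symbols f_s, f_B, and ENOB are introduced here for illustration.

```latex
% Oversampling ratio: OSR = f_s / (2 f_B), with f_B the signal bandwidth.
f_s = 2 \cdot \mathrm{OSR} \cdot f_B = 2 \times 128 \times 20\ \mathrm{kHz} = 5.12\ \mathrm{MHz}

% Effective number of bits from the reported SNDR of 81.4 dB.
\mathrm{ENOB} = \frac{\mathrm{SNDR} - 1.76\ \mathrm{dB}}{6.02\ \mathrm{dB/bit}}
             = \frac{81.4 - 1.76}{6.02} \approx 13.2\ \mathrm{bits}
```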

  19. Lost in translation? A multilingual Query Builder improves the quality of PubMed queries: a randomised controlled trial.

    Science.gov (United States)

    Schuers, Matthieu; Joulakian, Mher; Kerdelhué, Gaetan; Segas, Léa; Grosjean, Julien; Darmoni, Stéfan J; Griffon, Nicolas

    2017-07-03

    MEDLINE is the most widely used medical bibliographic database in the world. Most of its citations are in English and this can be an obstacle for some researchers to access the information the database contains. We created a multilingual query builder to facilitate access to the PubMed subset using a language other than English. The aim of our study was to assess the impact of this multilingual query builder on the quality of PubMed queries for non-native English speaking physicians and medical researchers. A randomised controlled study was conducted among French speaking general practice residents. We designed a multi-lingual query builder to facilitate information retrieval, based on available MeSH translations and providing users with both an interface and a controlled vocabulary in their own language. Participating residents were randomly allocated either the French or the English version of the query builder. They were asked to translate 12 short medical questions into MeSH queries. The main outcome was the quality of the query. Two librarians blind to the arm independently evaluated each query, using a modified published classification that differentiated eight types of errors. Twenty residents used the French version of the query builder and 22 used the English version. 492 queries were analysed. There were significantly more perfect queries in the French group vs. the English group (respectively 37.9% vs. 17.9%; p < 0.01). It took significantly more time for the members of the English group than the members of the French group to build each query, respectively 194 sec vs. 128 sec; p < 0.01. This multi-lingual query builder is an effective tool to improve the quality of PubMed queries in particular for researchers whose first language is not English.

  20. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Directory of Open Access Journals (Sweden)

    Toni U Wagner

    Full Text Available Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  1. Optimal Static Range Reporting in One Dimension

    DEFF Research Database (Denmark)

    Alstrup, Stephen; Brodal, Gerth Stølting; Rauhe, Theis

    2001-01-01

    We consider static one-dimensional range searching problems. These problems are to build static data structures for an integer set S ⊆ U, where U = {0, 1, ..., 2^w − 1}, which support various queries for integer intervals of U. For the query of reporting all integers in S contained within...
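
    As a point of reference for the problem statement above, the following is a minimal baseline sketch of static one-dimensional range reporting using a sorted array and binary search. It does not attempt to reproduce the data structure of the paper; the class and method names are hypothetical, and the query cost here is O(log n + k) for k reported integers.

```python
from bisect import bisect_left, bisect_right


class StaticRangeReporter:
    """Baseline static structure: a sorted array queried by binary search.

    Hypothetical illustration only; not the structure from the paper.
    """

    def __init__(self, values):
        # Sort once at build time; the set is static afterwards.
        self._sorted = sorted(set(values))

    def report(self, lo, hi):
        """Return all stored integers x with lo <= x <= hi, in increasing order."""
        left = bisect_left(self._sorted, lo)
        right = bisect_right(self._sorted, hi)
        return self._sorted[left:right]

    def is_empty(self, lo, hi):
        """Return True if no stored integer lies in the interval [lo, hi]."""
        left = bisect_left(self._sorted, lo)
        return left == len(self._sorted) or self._sorted[left] > hi
```

    For example, a reporter built over the values 5, 1, 9, 3 answers the interval [2, 9] with the list 3, 5, 9.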

  2. Extending OLAP Querying to External Object

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach; Shoshani, Arie; Gu, Junmin

    On-Line Analytical Processing (OLAP) systems based on a dimensional view of data have found widespread use in business applications and are being used increasingly in non-standard applications. These systems provide good performance and ease-of-use. However, the complex structures and relationships...... inherent in data in nonstandard applications are not accommodated well by OLAP systems. In contrast, object database systems are built to handle such complexity, but do not support OLAP-type querying well. This paper presents the concepts and techniques underlying a flexible, multi-model federated system...... that enables OLAP users to exploit simultaneously the features of OLAP and object systems. The system allows data to be handled using the most appropriate data model and technology: OLAP systems for dimensional data and object database systems for more complex, general data. Additionally, physical data...

  3. Node Query Preservation for Deterministic Linear Top-Down Tree Transducers

    Directory of Open Access Journals (Sweden)

    Kazuki Miyahara

    2013-11-01

    Full Text Available This paper discusses the decidability of node query preservation problems for XML document transformations. We assume a transformation given by a deterministic linear top-down data tree transducer (abbreviated as DLT^V) and an n-ary query based on runs of a tree automaton. We say that a DLT^V Tr strongly preserves a query Q if there is a query Q' such that for every document t, the answer set of Q' for Tr(t) is equal to the answer set of Q for t. Also we say that Tr weakly preserves Q if there is a query Q' such that for every t_d in the range of Tr, the answer set of Q' for t_d is equal to the union of the answer set of Q for t such that t_d = Tr(t). We show that the weak preservation problem is coNP-complete and the strong preservation problem is in 2-EXPTIME.

  4. Querying and Extracting Timeline Information from Road Traffic Sensor Data

    Science.gov (United States)

    Imawan, Ardi; Indikawati, Fitri Indra; Kwon, Joonho; Rao, Praveen

    2016-01-01

    The escalation of traffic congestion in urban cities has urged many countries to use intelligent transportation system (ITS) centers to collect historical traffic sensor data from multiple heterogeneous sources. By analyzing historical traffic data, we can obtain valuable insights into traffic behavior. Many existing applications have been proposed with limited analysis results because of the inability to cope with several types of analytical queries. In this paper, we propose the QET (querying and extracting timeline information) system—a novel analytical query processing method based on a timeline model for road traffic sensor data. To address query performance, we build a TQ-index (timeline query-index) that exploits spatio-temporal features of timeline modeling. We also propose an intuitive timeline visualization method to display congestion events obtained from specified query parameters. In addition, we demonstrate the benefit of our system through a performance evaluation using a Busan ITS dataset and a Seattle freeway dataset. PMID:27563900

  5. A novel adaptive Cuckoo search for optimal query plan generation.

    Science.gov (United States)

    Gomathi, Ramalingam; Sharmila, Dhandapani

    2014-01-01

    The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To enhance the efficiency in the execution time for querying large RDF graphs, the evolving metaheuristic algorithms become an alternate to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plan for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying number of predicates. The experimental results have exposed that the proposed approach has provided significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.

  6. A Novel Adaptive Cuckoo Search for Optimal Query Plan Generation

    Directory of Open Access Journals (Sweden)

    Ramalingam Gomathi

    2014-01-01

    Full Text Available The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To enhance the efficiency in the execution time for querying large RDF graphs, the evolving metaheuristic algorithms become an alternate to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plan for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying number of predicates. The experimental results have exposed that the proposed approach has provided significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.

  7. Structured Query Translation in Peer to Peer Database Sharing Systems

    Directory of Open Access Journals (Sweden)

    Mehedi Masud

    2009-10-01

    Full Text Available This paper presents a query translation mechanism between heterogeneous peers in Peer to Peer Database Sharing Systems (PDSSs). A PDSS combines a database management system with P2P functionalities. The local databases on peers are called peer databases. In a PDSS, each peer chooses its own data model and schema and maintains data independently without any global coordinator. One of the problems in such a system is translating queries between peers, taking into account both the schema and data heterogeneity. Query translation is the problem of rewriting a query posed in terms of one peer schema to a query in terms of another peer schema. This paper proposes a query translation mechanism between peers where peers are acquainted in data sharing systems through data-level mappings for sharing data.

  8. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    Science.gov (United States)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.

  9. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    On-Line Analytical Processing (OLAP) is a data analysis approach to support decision-making. On top of that, Exploratory OLAP is a novel initiative for the convergence of OLAP and the Semantic Web (SW) that enables the use of OLAP techniques on SW data. Moreover, OLAP approaches exploit different...... metadata artifacts (e.g., queries) to assist users with the analysis. However, modeling and sharing of most of these artifacts are typically overlooked. Thus, in this paper we focus on the query metadata artifact in the Exploratory OLAP context and propose an RDF-based vocabulary for its representation......, sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply...

  10. Evaluation of Sub Query Performance in SQL Server

    Directory of Open Access Journals (Sweden)

    Oktavia Tanty

    2014-03-01

    Full Text Available The paper explores several sub query methods used in a query and their impact on query performance. The study uses an experimental approach to evaluate the performance of each sub query method combined with an indexing strategy. The sub query methods consist of IN, EXISTS, a relational operator, and a relational operator combined with the TOP operator. The experiment shows that using a relational operator combined with an indexing strategy in a sub query performs better than using the same method without an indexing strategy, and better than the other methods. In summary, for applications that emphasize the performance of retrieving data from a database, it is better to use a relational operator combined with an indexing strategy. This study was done on Microsoft SQL Server 2012.

  11. Multi-Dimensional Top-k Dominating Queries

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Mamoulis, Nikos

    2009-01-01

    The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top......-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate...... of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness...
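
    The definition in the abstract above (return the k objects that dominate the most other objects) can be illustrated with a brute-force sketch. This is not the algorithm of the paper; it is a quadratic-time illustration that assumes smaller values are preferred in every dimension, and the function names are hypothetical.

```python
def dominates(p, q):
    """Return True if p dominates q.

    Assumption for this sketch: smaller is better in every dimension, so p
    dominates q when p is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def top_k_dominating(points, k):
    """Return the k points with the highest domination counts as (point, count) pairs."""
    scored = [(p, sum(dominates(p, q) for q in points)) for p in points]
    # Stable sort: ties keep their original input order.
    scored.sort(key=lambda pair: -pair[1])
    return scored[:k]
```

    Note that the output size is fixed by k and no ranking function is supplied by the caller, which matches the properties (i) and (ii) listed in the abstract.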

  12. VMQL: A Visual Language for Ad-Hoc Model Querying

    DEFF Research Database (Denmark)

    Störrle, Harald

    2011-01-01

    facilities are inadequate. The Visual Model Query Language (VMQL) is a novel approach that uses the respective modeling language of the source model as the query language, too. The semantics of VMQL is defined formally based on graphs, so that query execution can be defined as graph matching. VMQL has been...... applied to several visual modeling languages, implemented, and validated in small case studies, and several controlled experiments....

  13. Adaptive and Optimized RDF Query Interface for Distributed WFS Data

    Directory of Open Access Journals (Sweden)

    Tian Zhao

    2017-04-01

    Full Text Available Web Feature Service (WFS) is a protocol for accessing geospatial data stores such as databases and Shapefiles over the Web. However, WFS does not provide direct access to data distributed in multiple servers. In addition, WFS features extracted from their original sources are not convenient for user access due to the lack of connection to high-level concepts. Users are facing the choices of either querying each WFS server first and then integrating the results, or converting the data from all WFS servers to a more expressive format such as RDF (Resource Description Framework) and then querying the integrated data. The first choice requires additional programming while the second choice is not practical for large or frequently updated datasets. The new contribution of this paper is that we propose a novel adaptive and optimized RDF query interface to overcome the aforementioned limitation. Specifically, in this paper, we propose a novel algorithm to query and synthesize distributed WFS data through an RDF query interface, where users can specify data requests to multiple WFS servers using a single RDF query. Users can also define a simple configuration to associate WFS feature types, attributes, and values with RDF classes, properties, and values so that user queries can be written using a more uniform and informative vocabulary. The algorithm translates each RDF query written in SPARQL-like syntax to multiple WFS GetFeature requests, and then converts and integrates the multiple WFS results to get the answers to the original query. The generated GetFeature requests are sent asynchronously and simultaneously to WFS servers to take advantage of the server parallelism. The results of each GetFeature request are cached to improve query response time for subsequent queries that involve one or more of the cached requests. A JavaScript-based prototype is implemented and experimental results show that the query response time can be greatly reduced through...

  14. SeqWare Query Engine: storing and searching sequence data in the cloud

    Directory of Open Access Journals (Sweden)

    Merriman Barry

    2010-12-01

    , and a common data interface to simplify development of analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a natural fit for storing and searching ever-growing genome sequence datasets.

  15. SeqWare Query Engine: storing and searching sequence data in the cloud.

    Science.gov (United States)

    O'Connor, Brian D; Merriman, Barry; Nelson, Stanley F

    2010-12-21

    analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a natural fit for storing and searching ever-growing genome sequence datasets.

  16. Group-by Skyline Query Processing in Relational Engines

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Luk, Ming-Hay; Lo, Eric

    2009-01-01

    The skyline operator was first proposed in 2001 for retrieving interesting tuples from a dataset. Since then, 100+ skyline-related papers have been published; however, we discovered that one of the most intuitive and practical type of skyline queries, namely, group-by skyline queries remains...... the missing cost model for the BBS algorithm. Experimental results show that our techniques are able to devise the best query plans for a variety of group-by skyline queries. Our focus is on algorithms that can be directly implemented in today's commercial database systems without the addition of new access...

  17. The Query Complexity of Finding a Hidden Permutation

    DEFF Research Database (Denmark)

    Afshani, Peyman; Agrawal, Manindra; Doerr, Benjamin

    2012-01-01

    We study the query complexity of determining a hidden permutation. More specifically, we study the problem of learning a secret (z, π) consisting of a binary string z of length n and a permutation π of [n]. The secret must be unveiled by asking queries x ∈ {0,1}^n, and for each query asked, we are returned the score f_{z,π}(x) defined as f_{z,π}(x) := max{ i ∈ [0..n] | ∀ j ≤ i : z_{π(j)} = x_{π(j)} }; i.e., the length of the longest common prefix of x and z with respect to π. The goal is to minimize the number of queries asked. Our main results are matching upper and lower bounds for this problem, both for deterministic and randomized query schemes...... applications in many other query complexity problems.......
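
    The score returned for each query, i.e. the length of the longest common prefix of x and z when positions are visited in the order given by the hidden permutation, can be sketched as follows. The function name, the name pi for the permutation, and the 0-based indexing are assumptions made for this illustration.

```python
def score(z, pi, x):
    """Longest common prefix of x and z with respect to the permutation pi.

    z and x are equal-length sequences of bits. pi is a permutation of
    range(len(z)), 0-based in this sketch. The result is the largest i such
    that x and z agree on positions pi[0], ..., pi[i - 1].
    """
    agreed = 0
    for position in pi:
        if z[position] != x[position]:
            break
        agreed += 1
    return agreed
```

    A query equal to the secret string scores n, while a query that already differs from z at position pi[0] scores 0, regardless of how many other positions agree.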

  18. The effect of query complexity on Web searching results

    Directory of Open Access Journals (Sweden)

    B.J. Jansen

    2000-01-01

    Full Text Available This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc.) and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.

  19. Efficient Processing of Multiple DTW Queries in Time Series Databases

    DEFF Research Database (Denmark)

    Kremer, Hardy; Günnemann, Stephan; Ivanescu, Anca-Maria

    2011-01-01

    Dynamic Time Warping (DTW) is a widely used distance measure for time series that has been successfully used in science and many other application domains. As DTW is computationally expensive, there is a strong need for efficient query processing algorithms. Such algorithms exist for single queries....... In many of today’s applications, however, large numbers of queries arise at any given time. Existing DTW techniques do not process multiple DTW queries simultaneously, a serious limitation which slows down overall processing. In this paper, we propose an efficient processing approach for multiple DTW...
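
    To make the cost of a single DTW computation concrete, here is a minimal sketch of the textbook dynamic-programming recurrence. It is not the multi-query technique proposed in the paper; the function name is hypothetical and the absolute difference is assumed as the local cost between two samples.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    Textbook O(len(a) * len(b)) recurrence, with |a_i - b_j| as the assumed
    local cost. cost[i][j] holds the DTW distance between a[:i] and b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

    Every query fills a table of size proportional to the product of the two sequence lengths, which is why processing many queries one at a time becomes expensive.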

  20. An Approach to Assist Designers With Their Queries and Designs

    DEFF Research Database (Denmark)

    Ahmed, Saeema

    2006-01-01

    Recent research investigating how engineers search for information has concluded that engineering designers require assistance when formulating queries. An approach to assist designers with their queries is presented. This approach forms part of a knowledge management system, where indexed...... documents are entered into a knowledge-based system and is generated dynamically. The network can be used to assist a designer in searching for information, to reformulate a query, and to prompt design tasks. This paper presents an approach to prompt designers with their design queries, along with some...

  1. Web Database Schema Identification through Simple Query Interface

    Science.gov (United States)

    Lin, Ling; Zhou, Lizhu

    Web databases provide different types of query interfaces to access the data records stored in the backend databases. While most existing works exploit a complex query interface with multiple input fields to perform schema identification of the Web databases, little attention has been paid to how to identify the schema of web databases by a simple query interface (SQI), which has only one single query text input field. This paper proposes a new method of instance-based query probing to identify WDBs' interface and result schema for SQI. The interface schema identification problem is defined as generating the full-condition query of SQI and a novel query probing strategy is proposed. The result schema is also identified based on the result webpages of SQI's full-condition query, and an extended identification of the non-query attributes is proposed to improve the attribute recall rate. Experimental results on web databases of online shopping for book, movie and mobile phone show that our method is effective and efficient.

  2. Substring Range Reporting

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li

    2011-01-01

    We revisit various string indexing problems with range reporting features, namely, position-restricted substring searching, indexing substrings with gaps, and indexing substrings with intervals. We obtain the following main results. – We give efficient reductions for each of the above problems...... to a new problem, which we call substring range reporting. Hence, we unify the previous work by showing that we may restrict our attention to a single problem rather than studying each of the above problems individually. – We show how to solve substring range reporting with optimal query time and little...... space. Combined with our reductions this leads to significantly improved time-space trade-offs for the above problems. In particular, for each problem we obtain the first solutions with optimal time query and O(n log^O(1) n) space, where n is the length of the indexed string. Our bounds for substring...

  3. Substring Range Reporting

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li

    2014-01-01

    We revisit various string indexing problems with range reporting features, namely, position-restricted substring searching, indexing substrings with gaps, and indexing substrings with intervals. We obtain the following main results. We give efficient reductions for each of the above problems...... to a new problem, which we call substring range reporting. Hence, we unify the previous work by showing that we may restrict our attention to a single problem rather than studying each of the above problems individually. We show how to solve substring range reporting with optimal query time and little...... space. Combined with our reductions this leads to significantly improved time-space trade-offs for the above problems. In particular, for each problem we obtain the first solutions with optimal time query and O(n log^O(1) n) space, where n is the length of the indexed string. We show that our techniques...

  4. Query by image example: The CANDID approach

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, P.M.; Cannon, M. [Los Alamos National Lab., NM (United States). Computer Research and Applications Group; Hush, D.R. [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Electrical and Computer Engineering

    1995-02-01

    CANDID (Comparison Algorithm for Navigating Digital Image Databases) was developed to enable content-based retrieval of digital imagery from large databases using a query-by-example methodology. A user provides an example image to the system, and images in the database that are similar to that example are retrieved. The development of CANDID was inspired by the N-gram approach to document fingerprinting, where a "global signature" is computed for every document in a database and these signatures are compared to one another to determine the similarity between any two documents. CANDID computes a global signature for every image in a database, where the signature is derived from various image features such as localized texture, shape, or color information. A distance between probability density functions of feature vectors is then used to compare signatures. In this paper, the authors present CANDID and highlight two results from their current research: subtracting a "background" signature from every signature in a database in an attempt to improve system performance when using inner-product similarity measures, and visualizing the contribution of individual pixels in the matching process. These ideas are applicable to any histogram-based comparison technique.
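
    The idea of subtracting a background signature before applying an inner-product similarity measure can be illustrated with plain histograms. The sketch below is a generic illustration under stated assumptions, not the CANDID implementation: signatures are assumed to be equal-length lists of bin counts, the similarity is assumed to be the normalized inner product (cosine), and the function name is hypothetical.

```python
import math


def normalized_inner_product(sig_a, sig_b, background=None):
    """Cosine-style similarity between two histogram signatures.

    If a background signature is given, it is subtracted from both inputs
    first, so that content shared by every image stops dominating the score.
    """
    if background is not None:
        sig_a = [a - g for a, g in zip(sig_a, background)]
        sig_b = [b - g for b, g in zip(sig_b, background)]
    dot = sum(a * b for a, b in zip(sig_a, sig_b))
    norm_a = math.sqrt(sum(a * a for a in sig_a))
    norm_b = math.sqrt(sum(b * b for b in sig_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no signal left to compare
    return dot / (norm_a * norm_b)
```

    With two signatures that share one large common bin and differ only in small bins, the raw similarity is close to 1, while the background-subtracted similarity drops sharply and exposes the difference.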

  5. Modeling Large Time Series for Efficient Approximate Query Processing

    DEFF Research Database (Denmark)

    Perera, Kasun S; Hahmann, Martin; Lehner, Wolfgang

    2015-01-01

    -wise aggregation to derive the models. These models are initially created from the original data and are kept in the database along with it. Subsequent queries are answered using the stored models rather than scanning and processing the original datasets. In order to support model query processing, we maintain...

  6. Video Stream Retrieval of Unseen Queries using Semantic Memory

    NARCIS (Netherlands)

    Cappallo, S.; Mensink, T.; Snoek, C.G.M.; Wilson, R.C.; Hancock, E.R.; Smith, W.A.P.

    2016-01-01

    Retrieval of live, user-broadcast video streams is an under-addressed and increasingly relevant challenge. The on-line nature of the problem requires temporal evaluation and the unforeseeable scope of potential queries motivates an approach which can accommodate arbitrary search queries. To account

  7. Real SQL queries 50 challenges : practice for reporting and analysis

    CERN Document Server

    Cohen, Brian; Mishra, Neerja

    2015-01-01

    Queries improve when challenges are authentic. This book sets your learning on the fast track with realistic problems to solve. Topics span sales, marketing, human resources, purchasing, and production. Real SQL Queries: 50 Challenges is perfect for analysts, report writers, or anyone searching for a hands-on approach to learning SQL Server.

  8. The Odyssey Approach for Optimizing Federated SPARQL Queries

    DEFF Research Database (Denmark)

    Montoya, Gabriela; Skaf-Molli, Hala; Hose, Katja

    2017-01-01

    because (ii) there is only limited access to statistics about schema and instance data of remote sources. To overcome these challenges, most federated query engines rely on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans...

  9. Formal specification of a query expression generator using RSL ...

    African Journals Online (AJOL)

    Formal methods are used for the specification of the query generator which is not the usual practice in the specification of query generators. We use RSL, the RAISE Specification Language, to formally specify our generator. From the specification, an implementation of our generator is generated in C++ using a command ...

  10. Query Classification and Study of University Students' Search Trends

    Science.gov (United States)

    Maabreh, Majdi A.; Al-Kabi, Mohammed N.; Alsmadi, Izzat M.

    2012-01-01

    Purpose: This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet. Design/methodology/approach: The web log files were collected from one of the higher…

  11. Investigating queries and search failures in academic search

    NARCIS (Netherlands)

    Li, X.; Schijvenaars, B.J.A.; de Rijke, M.

    Academic search concerns the retrieval and profiling of information objects in the domain of academic research. In this paper we reveal important observations of academic search queries, and provide an algorithmic solution to address a type of failure during search sessions: null queries. We start

  12. Mining Web Query Logs to Analyze Political Issues

    NARCIS (Netherlands)

    Weber, I.; Garimella, V.R.K.; Borra, E.; Contractor, N.; Uzzi, B.

    2012-01-01

    We present a novel approach to using anonymized web search query logs to analyze and visualize political issues. Our starting point is a list of politically annotated blogs (left vs. right). We use this list to assign a numerical political leaning to queries leading to clicks on these blogs.

  13. On the Suitability of Skyline Queries for Data Exploration

    DEFF Research Database (Denmark)

    Chester, Sean; Mortensen, Michael Lind; Assent, Ira

    2014-01-01

    The skyline operator has been studied in database research for multi-criteria decision making. Until now the focus has been on the efficiency or accuracy of single queries. In practice, however, users are increasingly confronted with unknown data collections, where precise query formulation proves...

  14. A framework for query optimization to support data mining

    NARCIS (Netherlands)

    S.R. Choenni (Sunil); A.P.J.M. Siebes (Arno)

    1996-01-01

    In order to extract knowledge from databases, data mining algorithms heavily query the databases. Inefficient processing of these queries will inevitably have its impact on the performance of these algorithms, making them less valuable. In this paper, we describe an optimization

  15. Pareto-depth for multiple-query image retrieval.

    Science.gov (United States)

    Hsiao, Ko-Jen; Calder, Jeff; Hero, Alfred O

    2015-02-01

    Most content-based image retrieval systems consider either one single query, or multiple queries that include the same object or represent the same semantic information. In this paper, we consider the content-based image retrieval problem for multiple query images corresponding to different image semantics. We propose a novel multiple-query information retrieval algorithm that combines the Pareto front method with efficient manifold ranking. We show that our proposed algorithm outperforms state of the art multiple-query retrieval algorithms on real-world image databases. We attribute this performance improvement to concavity properties of the Pareto fronts, and prove a theoretical result that characterizes the asymptotic concavity of the fronts.

  16. Query Log Analysis of an Electronic Health Record Search Engine

    Science.gov (United States)

    Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.

    2011-01-01

    We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150

  17. Processing SPARQL queries with regular expressions in RDF databases.

    Science.gov (United States)

    Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

    2011-03-29

    As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users' requests for extracting information from the RDF data as well as the lack of users' knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  18. ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP

    Science.gov (United States)

    Chatziantoniou, Damianos; Sotiropoulos, Yannis

    Modern data analysis has given birth to numerous grouping constructs and programming paradigms, way beyond the traditional group by. Applications such as data warehousing, web log analysis, streams monitoring and social networks understanding necessitated the use of data cubes, grouping variables, windows and MapReduce. In this paper we review the associated set (ASSET) concept and discuss its applicability in both continuous and traditional data settings. Given a set of values B, an associated set over B is just a collection of annotated data multisets, one for each b ∈ B. The goal is to efficiently compute aggregates over these data sets. An ASSET query consists of repeated definitions of associated sets and aggregates of these, possibly correlated, resembling a spreadsheet document. We review systems implementing ASSET queries both in continuous and persistent contexts and argue for associated sets' analytical abilities and optimization opportunities.

  19. jQuery UI 1.7 the user interface library for jQuery

    CERN Document Server

    Wellman, Dan

    2009-01-01

    An example-based approach leads you step-by-step through the implementation and customization of each library component and its associated resources in turn. To emphasize the way that jQuery UI takes the difficulty out of user interface design and implementation, each chapter ends with a 'fun with' section that puts together what you've learned throughout the chapter to make a usable and fun page. In these sections you'll often get to experiment with the latest associated technologies like AJAX and JSON. This book is for front-end designers and developers who need to quickly learn how to use t

  20. Research in Mobile Database Query Optimization and Processing

    Directory of Open Access Journals (Sweden)

    Agustinus Borgy Waluyo

    2005-01-01

    Full Text Available The emergence of mobile computing provides the ability to access information at any time and place. However, as mobile computing environments have inherent factors like power, storage, asymmetric communication cost, and bandwidth limitations, efficient query processing and minimum query response time are definitely of great interest. This survey groups a variety of query optimization and processing mechanisms in mobile databases into two main categories, namely: (i) query processing strategy, and (ii) caching management strategy. Query processing includes both pull and push operations (broadcast mechanisms). We further classify push operation into on-demand broadcast and periodic broadcast. Push operation (on-demand broadcast) relates to designing techniques that enable the server to accommodate multiple requests so that the requests can be processed efficiently. Push operation (periodic broadcast) corresponds to data dissemination strategies. In this scheme, several techniques to improve the query performance by broadcasting data to a population of mobile users are described. A caching management strategy defines a number of methods for maintaining cached data items in clients' local storage. This strategy considers critical caching issues such as caching granularity, caching coherence strategy and caching replacement policy. Finally, this survey concludes with several open issues relating to mobile query optimization and processing strategy.

  1. Summarization of Text Document Using Query Dependent Parsing Techniques

    Science.gov (United States)

    Rokade, P. P.; Mrunal, Bewoor; Patil, S. H.

    2010-11-01

    The World Wide Web is the largest source of information, and a huge amount of data is present on it. There has been a great amount of work on query-independent summarization of documents. However, due to the success of Web search engines, query-specific document summarization (query result snippets) has become an important problem. This paper discusses a method to create query-specific summaries by identifying the most query-relevant fragments and combining them using the semantic associations within the document. In particular, a structure is first added to the documents in the preprocessing stage, converting them to document graphs. The present research work focuses on an analytical study of different document clustering and summarization techniques; currently, most research is focused on query-independent summarization. The main aim of this research work is to combine the two approaches of document clustering and query-dependent summarization. This mainly involves applying different clustering algorithms to a text document, creating a weighted document graph of the resulting graph based on the keywords, and using the document graph to obtain the summary of the document. The performance of the summary using different clustering techniques will be analyzed and the optimal approach will be suggested.

  2. Dynamic range majority data structures

    DEFF Research Database (Denmark)

    Elmasry, Amr Ahmed Abd Elmoneim; He, Meng; Munro, J. Ian

    2011-01-01

    Given a set P of n coloured points on the real line, we study the problem of answering range α-majority (or "heavy hitter") queries on P. More specifically, for a query range Q, we want to return each colour that is assigned to more than an α-fraction of the points contained in Q. We present a new......((lg n/(α lglg n)). For constant values of α, this improved query time matches an existing lower bound, for any data structure with polylogarithmic update time. We also generalize our data structure to handle sets of points in d-dimensions, for d ≥ 2, as well as dynamic arrays, in which each entry...
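
    The query semantics defined above can be sketched with a naive baseline. The Python below is not the data structure from the paper: it simply locates the query range by binary search and counts colours, so each query costs time linear in the number of points in the range. The function name and the sample points are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def range_alpha_majority(points, lo, hi, alpha):
    """Return every colour held by more than an alpha-fraction of the points in [lo, hi].

    `points` is a list of (position, colour) pairs sorted by position.
    """
    positions = [x for x, _ in points]
    start = bisect_left(positions, lo)   # first point with position >= lo
    end = bisect_right(positions, hi)    # one past the last point with position <= hi
    m = end - start                      # number of points contained in the query range
    counts = Counter(colour for _, colour in points[start:end])
    return {colour for colour, k in counts.items() if k > alpha * m}


# Hypothetical coloured points on the real line.
points = sorted([(1, "red"), (2, "blue"), (3, "red"), (5, "red"), (8, "green"), (9, "blue")])

print(range_alpha_majority(points, 1, 5, 0.5))
print(range_alpha_majority(points, 1, 9, 0.3))
```

    At most 1/α colours can each exceed an α-fraction of a range, so the answer size is bounded regardless of how many points the range contains.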

  3. Random and directed walk-based top-(k) queries in wireless sensor networks.

    Science.gov (United States)

    Fu, Jun-Song; Liu, Yun

    2015-05-26

    In wireless sensor networks, filter-based top-k query approaches are the state-of-the-art solutions and have been extensively researched in the literature; however, they are very sensitive to the network parameters, including the size of the network, dynamics of the sensors' readings and declines in the overall range of all the readings. In this work, a random walk-based top-k query approach called RWTQ and a directed walk-based top-k query approach called DWTQ are proposed. At the beginning of a top-k query, one or several tokens are sent to the specific node(s) in the network by the base station. Then, each token walks in the network independently to record and process the readings in a random or directed way. A strategy of choosing the "right" way in DWTQ is carefully designed for the token(s) to arrive at the high-value regions as soon as possible. When designing the walking strategy for DWTQ, the spatial correlations of the readings are also considered. Theoretical analysis and simulation results indicate that RWTQ and DWTQ are both very robust against the parameters discussed above. In addition, DWTQ outperforms TAG, FILA and EXTOK in transmission cost, energy consumption and network lifetime.

  4. Intelligent query processing for semantic mediation of information systems

    Directory of Open Access Journals (Sweden)

    Saber Benharzallah

    2011-11-01

    Full Text Available We propose an intelligent and efficient query processing approach for semantic mediation of information systems. We also propose a generic multi-agent architecture that supports our approach. Our approach focuses on the exploitation of intelligent agents for query reformulation and the use of a new technology for the semantic representation. The algorithm adapts itself to changes in the environment, offers a wide aptitude and solves the various data conflicts in a dynamic way; it also reformulates the query using the schema mediation method for the discovered systems and the context mediation for the other systems.

  5. A new weighted fuzzy grammar on object oriented database queries

    Directory of Open Access Journals (Sweden)

    Ali Haroonabadi

    2012-08-01

    Full Text Available The fuzzy object oriented database model is often used to handle the existing imprecise and complicated objects for many real-world applications. The main focus of this paper is on fuzzy queries and tries to analyze a complicated and complex query to get more meaningful and closer responses. The method permits the user to provide the possibility of allocating the weight to various parts of the query, which makes it easier to follow better goals and return the target objects.

  6. jQuery 2.0 animation techniques beginner's guide

    CERN Document Server

    Culpepper, Adam

    2013-01-01

    This book is a guide to help you create attractive web page animations using jQuery. Written in a friendly and engaging style, this book is designed to be placed alongside your computer as a mentor. If you are a web designer or a frontend developer, or if you want to learn how to animate the user interface of your web applications with jQuery, this book is for you. Experience with jQuery or JavaScript would be helpful, but a solid knowledge of HTML and CSS is assumed.

  7. MEDLINE clinical queries are robust when searching in recent publishing years.

    Science.gov (United States)

    Wilczynski, Nancy L; McKibbon, K Ann; Walter, Stephen D; Garg, Amit X; Haynes, R Brian

    2013-01-01

    To determine if the PubMed and Ovid MEDLINE clinical queries (which were developed in the publishing year 2000, for the purpose categories therapy, diagnosis, prognosis, etiology, and clinical prediction guides) perform as well when searching in current publishing years. A gold standard database of recently published research literature was created using the McMaster health knowledge refinery (http://hiru.mcmaster.ca/hiru/HIRU_McMaster_HKR.aspx) and its continuously updated database, McMaster PLUS (http://hiru.mcmaster.ca/hiru/HIRU_McMaster_PLUS_projects.aspx). This database contains articles from over 120 clinical journals that are tagged for meeting or not meeting criteria for scientific merit and clinical relevance. The clinical queries sensitive ('broad') and specific ('narrow') search filters were tested in this gold standard database, and sensitivity and specificity were calculated and compared with those originally reported for the clinical queries. In all cases, the sensitivity of the highly sensitive search filters and the specificity of the highly specific search filters did not differ substantively when comparing results derived in 2000 with those derived in a more current database. In addition, in all cases, the specificities for the highly sensitive search filters and the sensitivities for the highly specific search filters remained above 50% when testing them in the current database. These results are reassuring for modern-day searchers. The clinical queries that were derived in the year 2000 perform equally well a decade later. The PubMed and Ovid MEDLINE clinical queries have been revalidated and remain a useful public resource for searching the world's medical literature for research that is most relevant to clinical care.

  8. A journey to Semantic Web query federation in the life sciences.

    Science.gov (United States)

    Cheung, Kei-Hoi; Frost, H Robert; Marshall, M Scott; Prud'hommeaux, Eric; Samwald, Matthias; Zhao, Jun; Paschke, Adrian

    2009-10-01

    As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and Vocabulary of Interlinked Datasets have emerged in recent years. In addition to the data warehouse construction, these technological approaches can be used to support dynamic query federation. As a community effort, the BioRDF task force, within the Semantic Web for Health Care and Life Sciences Interest Group, is exploring how these emerging approaches can be utilized to execute distributed queries across different neuroscience data sources. We have created two health care and life science knowledge bases. We have explored a variety of Semantic Web approaches to describe, map, and dynamically query multiple datasets. We have demonstrated several federation approaches that integrate diverse types of information about neurons and receptors that play an important role in basic, clinical, and translational neuroscience research. Particularly, we have created a prototype receptor explorer which uses OWL mappings to provide an integrated list of receptors and executes individual queries against different SPARQL endpoints. We have also employed the AIDA Toolkit, which is directed at groups of knowledge workers who cooperatively search, annotate, interpret, and enrich large collections of heterogeneous documents from diverse locations. We have explored a tool called "FeDeRate", which enables a global SPARQL query to be decomposed into subqueries against the remote databases offering either SPARQL or SQL query interfaces. Finally, we have explored how to use the vocabulary of interlinked Datasets (voiD) to create metadata for describing datasets exposed as Linked Data URIs or SPARQL endpoints. We have demonstrated the use of a set of novel and state-of-the-art Semantic Web technologies in support of a neuroscience query

  9. ALGORITMA RC4 DALAM PROTEKSI TRANSMISI DAN HASIL QUERY UNTUK ORDBMS POSTGRESQL

    Directory of Open Access Journals (Sweden)

    Yuri Ariyanto

    2009-01-01

    Full Text Available This research discusses how to implement the RC4 cryptographic algorithm to protect queries and query results. Protection is achieved by encrypting and decrypting both while they are in transit over the network. The implementation consists of building software that is placed on the client side and accesses a database placed on the server side. The software provides facilities for encrypting and decrypting the queries and query results sent from the client to the server and vice versa, so that the transmission of queries and query results can be secured. Transmission security is considered successful if the software encrypts the transmitted queries and query results so that, if both are intercepted, the eavesdropper cannot understand the data content. The research concludes that the software built successfully encrypts the queries and query results transmitted between the client application and the database server.
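
    For reference, the RC4 stream cipher itself fits in a few lines. The Python sketch below shows the standard key-scheduling and keystream-generation steps and applies them to a query string. It is not the software described in the abstract: the key, the query text, and the idea of encrypting the raw SQL bytes directly are assumptions made for illustration only.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data` with RC4 (the operation is its own inverse)."""
    # Key-scheduling algorithm (KSA): permute the state array using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA): XOR each byte with the keystream.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Hypothetical shared key and query; a real deployment would need key exchange.
key = b"demo-key"
query = b"SELECT name FROM customers WHERE id = 42"

ciphertext = rc4(key, query)       # what would travel over the network
recovered = rc4(key, ciphertext)   # the receiving side applies the same function
print(ciphertext.hex())
print(recovered.decode())
```

    Because encryption and decryption are the same XOR with the same keystream, client and server can share one routine for both directions of traffic.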

  10. Development and Validation of Queries Using Structured Query Language (SQL) to Determine the Utilization of Comparison Imaging in Radiology Reports Stored on PACS

    OpenAIRE

    Lakhani, Paras; Menschik, Elliot D.; Goldszal, Alberto F.; Murray, Joseph P.; Weiner, Mark G.; Langlotz, Curtis P.

    2006-01-01

    The purpose of this research was to develop queries that quantify the utilization of comparison imaging in free-text radiology reports. The queries searched for common phrases that indicate whether comparison imaging was utilized, not available, or not mentioned. The queries were iteratively refined and tested on random samples of 100 reports with human review as a reference standard until the precision and recall of the queries did not improve significantly between iterations. Then, query ac...

  11. A Revisit of Query Expansion with Different Semantic Levels

    DEFF Research Database (Denmark)

    Zhang, Ce; Cui, Bin; Cong, Gao

    2009-01-01

    Query expansion has received extensive attention in information retrieval community. Although semantic based query expansion appears to be promising in improving retrieval performance, previous research has shown that it cannot consistently improve retrieval performance. It is a tricky problem...... to automatically determine whether to do query expansion for a given query. In this paper, we introduce Compact Concept Ontology (CCO) and provide users the option of exploring different semantic levels by using different CCOs. Experimental results show our approach is superior to previous work in many cases....... Additionally, we integrate the proposed methods into a text-based video search system (iVSearcher), to improve the user’s experience and retrieval performance significantly. To the best of our knowledge, this is the first system that integrates semantic information into video search and explores different...

  12. An introduction to XML query processing and keyword search

    CERN Document Server

    Lu, Jiaheng

    2013-01-01

    This book systematically and comprehensively covers the latest advances in XML data searching. It presents an extensive overview of the current query processing and keyword search techniques on XML data.

  13. Determinacy in Static Analysis of jQuery

    DEFF Research Database (Denmark)

    Andreasen, Esben; Møller, Anders

    2014-01-01

    Static analysis for JavaScript can potentially help programmers find errors early during development. Although much progress has been made on analysis techniques, a major obstacle is the prevalence of libraries, in particular jQuery, which apply programming patterns that have detrimental conseque...... present a static dataflow analysis for JavaScript that infers and exploits determinacy information on-the-fly, to enable analysis of some of the most complex parts of jQuery. The techniques are implemented in the TAJS analysis tool and evaluated on a collection of small programs that use jQuery. Our...

  14. Parasol: An Architecture for Cross-Cloud Federated Graph Querying

    Energy Technology Data Exchange (ETDEWEB)

    Lieberman, Michael; Choudhury, Sutanay; Hughes, Marisa; Patrone, Dennis; Hider, Sandy; Piatko, Christine; Chapman, Matthew; Marple, JP; Silberberg, David

    2014-06-22

    Large scale data fusion of multiple datasets can often provide insights that examining datasets individually cannot. However, when these datasets reside in different data centers and cannot be collocated due to technical, administrative, or policy barriers, a unique set of problems arises that hampers querying and data fusion. To address these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol’s design is flexible and requires only minimal assumptions for participant clouds. Query optimization techniques are also described that are compatible with Parasol’s lightweight architecture. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.

  15. Human Cell and Tissue Establishment Registration Public Query

    Data.gov (United States)

    U.S. Department of Health & Human Services — This application provides Human Cell and Tissue registration information for registered, inactive, and pre-registered firms. Query options are by Establishment Name,...

  16. Nearest private query based on quantum oblivious key distribution

    Science.gov (United States)

    Xu, Min; Shi, Run-hua; Luo, Zhen-yu; Peng, Zhen-wan

    2017-12-01

    Nearest private query is a special private query which involves two parties, a user and a data owner, where the user has a private input (e.g., an integer) and the data owner has a private data set, and the user wants to query which element in the owner's private data set is the nearest to his input without either party revealing its private information. In this paper, we first present a quantum protocol for nearest private query, which is based on quantum oblivious key distribution (QOKD). Compared with related classical protocols, our protocol offers higher security and better feasibility, and therefore has better prospects for application.

  17. External Data Structures for Shortest Path Queries on Planar Digraphs

    DEFF Research Database (Denmark)

    Arge, Lars; Toma, Laura

    2005-01-01

    In this paper we present space-query trade-offs for external memory data structures that answer shortest path queries on planar directed graphs. For any S = Ω(N^{1+ε}) and S = O(N^2/B), our main result is a family of structures that use S space and answer queries in O(N^2/(SB)) I/Os, thus obtaining...... optimal space-query product O(N^2/B). An S space structure can be constructed in O(√S · sort(N)) I/Os, where sort(N) is the number of I/Os needed to sort N elements, B is the disk block size, and N is the size of the graph....

  18. Multiresolution Cube Estimators for Sensor Network Aggregate Queries

    OpenAIRE

    Meliou, Alexandra; Guestrin, Carlos; Hellerstein, Joseph M.

    2010-01-01

    In this work we present in-network techniques to improve the efficiency of spatial aggregate queries. Such queries are very common in a sensornet setting, demanding more targeted techniques for their handling. Our approach constructs and maintains multi-resolution cube hierarchies inside the network, which can be constructed in a distributed fashion. In case of failures, recovery can also be performed with in-network decisions. In this paper we demonstrate how in-network cube hierarchies can ...

  19. The retrieval effectiveness of search engines on navigational queries

    OpenAIRE

    Lewandowski, Dirk

    2011-01-01

    Purpose - To test major Web search engines on their performance on navigational queries, i.e. searches for homepages. Design/methodology/approach - 100 real user queries are posed to six search engines (Google, Yahoo, MSN, Ask, Seekport, and Exalead). Users described the desired pages, and the results position of these is recorded. Measured success N and mean reciprocal rank are calculated. Findings - Performance of the major search engines Google, Yahoo, and MSN is best, with around 90 perce...
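
    The two measures named in the abstract can be computed from the rank at which each desired page was found. The Python sketch below uses the usual definitions of mean reciprocal rank and of success within the top N results; the rank values are invented for illustration and are not data from the study.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; a query with no hit (None) contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def success_at_n(ranks, n):
    """Fraction of queries whose desired page appears within the top n results."""
    return sum(1 for r in ranks if r is not None and r <= n) / len(ranks)


# Invented example: 1-based rank of the desired homepage for five queries,
# with None meaning the page was not found in the examined results.
ranks = [1, 2, None, 1, 4]

print(mean_reciprocal_rank(ranks))
print(success_at_n(ranks, 1))
print(success_at_n(ranks, 5))
```

    For navigational queries there is a single correct target per query, which is why rank-of-first-hit measures such as these are a natural fit.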

  20. A distributed query execution engine of big attributed graphs.

    Science.gov (United States)

    Batarfi, Omar; Elshawi, Radwa; Fayoumi, Ayman; Barnawi, Ahmed; Sakr, Sherif

    2016-01-01

    A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations including reachability, pattern matching and shortest path, where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation and performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine for G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores while other parts of the query plan, as necessary, are evaluated via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets, in addition to its ability to outperform Apache Giraph, a popular distributed graph processing system, by orders of magnitude.

  1. Inductive queries for a drug designing robot scientist

    OpenAIRE

    King, Ross D.; Schierz, Amanda C.; Clare, Amanda; Rowland, Jem; Sparkes, Andrew; Nijssen, Siegfried; Ramon, Jan

    2010-01-01

    It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We in...

  2. Interactive Query Formulation for Object Search

    NARCIS (Netherlands)

    Gevers, T.; Smeulders, A.W.M.; Huijsmans, D.P.; Smeulders, A.W.M.

    1999-01-01

    Snakes provide high-level information in the form of continuity constraints and minimum energy constraints related to the contour shape and image features. In this paper, we aim at using color invariant gradient information, as image features, to guide the deformation process. We focus on using

  3. Representation and alignment of sung queries for music information retrieval

    Science.gov (United States)

    Adams, Norman H.; Wakefield, Gregory H.

    2005-09-01

    The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a "smooth" pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
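
    The quadratic alignment cost mentioned above can be seen in a standard dynamic time warping (DTW) routine. The abstract does not name the alignment method, so DTW is used here only as a common stand-in; the pitch sequences (as MIDI note numbers) and the function name are hypothetical.

```python
def dtw_distance(query, target):
    """Dynamic time warping distance between two numeric sequences.

    Fills an (n+1) x (m+1) table, so time and memory grow with n * m.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(query[i - 1] - target[j - 1])
            # Extend the cheapest of: repeat target, repeat query, or advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical sung query that lingers on some notes, and the stored melody.
sung = [60, 60, 62, 64, 64, 65]
melody = [60, 62, 64, 65]

print(dtw_distance(sung, melody))
print(dtw_distance([60, 62], [61, 62]))
```

    Since the table has one cell per pair of samples, halving the length of both sequences cuts the work to a quarter, which is the motivation for aligning coarse representations first and refining only promising candidates.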

  4. Comparison of Rumen Fluid pH by Continuous Telemetry System and Bench pH Meter in Sheep with Different Ranges of Ruminal pH

    Directory of Open Access Journals (Sweden)

    Leonardo F. Reis

    2014-01-01

    Full Text Available We aimed to compare the measurements of sheep ruminal pH using a continuous telemetry system or a bench pH meter using sheep with different degrees of ruminal pH. Ruminal lactic acidosis was induced in nine adult crossbred Santa Ines sheep by the administration of 15 g of sucrose per kg/BW. Samples of rumen fluid were collected at the baseline, before the induction of acidosis (T0) and at six, 12, 18, 24, 48, and 72 hours after the induction for pH measurement using a bench pH meter. During this 72-hour period, all animals had electrodes for the continuous measurement of pH. The results were compared using the Bland-Altman analysis of agreement, Pearson coefficients of correlation and determination, and paired analysis of variance with Student’s t-test. The measurement methods presented a strong correlation (r=0.94, P<0.05) but the rumen pH that was measured continuously using a telemetry system resulted in lower values than the bench pH meter (overall mean of 5.38 and 5.48, resp., P=0.0001). The telemetry system was able to detect smaller changes in rumen fluid pH and was more accurate in diagnosing both subacute ruminal lactic acidosis and acute ruminal lactic acidosis in sheep.
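
    A Bland-Altman analysis of agreement, as named in the abstract, summarizes paired measurements by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. The Python sketch below shows that calculation; the pH readings are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)    # systematic difference between the two methods
    sd = stdev(diffs)     # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented paired ruminal pH readings (same samples, two instruments).
telemetry = [5.2, 5.4, 5.6, 5.1, 5.5]
bench = [5.3, 5.5, 5.7, 5.3, 5.5]

bias, lower, upper = bland_altman(telemetry, bench)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

    A negative bias in this layout means the first method reads lower on average, which is the direction the abstract reports for the telemetry system relative to the bench meter.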

  5. Comparison of rumen fluid pH by continuous telemetry system and bench pH meter in sheep with different ranges of ruminal pH.

    Science.gov (United States)

    Reis, Leonardo F; Minervino, Antonio H H; Araújo, Carolina A S C; Sousa, Rejane S; Oliveira, Francisco L C; Rodrigues, Frederico A M L; Meira-Júnior, Enoch B S; Barrêto-Júnior, Raimundo A; Mori, Clara S; Ortolani, Enrico L

    2014-01-01

    We aimed to compare the measurements of sheep ruminal pH using a continuous telemetry system or a bench pH meter using sheep with different degrees of ruminal pH. Ruminal lactic acidosis was induced in nine adult crossbred Santa Ines sheep by the administration of 15 g of sucrose per kg/BW. Samples of rumen fluid were collected at the baseline, before the induction of acidosis (T 0) and at six, 12, 18, 24, 48, and 72 hours after the induction for pH measurement using a bench pH meter. During this 72-hour period, all animals had electrodes for the continuous measurement of pH. The results were compared using the Bland-Altman analysis of agreement, Pearson coefficients of correlation and determination, and paired analysis of variance with Student's t-test. The measurement methods presented a strong correlation (r = 0.94, P < 0.05) but the rumen pH that was measured continuously using a telemetry system resulted in lower values than the bench pH meter (overall mean of 5.38 and 5.48, resp., P = 0.0001). The telemetry system was able to detect smaller changes in rumen fluid pH and was more accurate in diagnosing both subacute ruminal lactic acidosis and acute ruminal lactic acidosis in sheep.

  6. Operating range, hold-up, droplet size and axial mixing of pulsed plate columns in highly disperse and low-continuity volume flows

    International Nuclear Information System (INIS)

    Schmidt, H.; Miller, H.

    Operating behavior, hold-up, droplet size and axial mixing are investigated in highly disperse and slightly continuous volume flows in a pulsed plate column. The geometry of the column of 4-m length and 10-cm inside diameter was held constant. The hole shape of the column bases was changed, whereby the cylindrical, sharp-edge drilled hole is compared with the punched, nozzle-shaped hole in their effects on the fluid-dynamic behavior. In this case we varied the volume flows, the ratio of volume flows, the pulse frequency and the operating temperature. The operation was held constant for the aqueous, the organic, the continuous and the disperse phases. The objective was to demonstrate the applicability of pulsed plate columns with very large differences between the organic disperse and the aqueous continuous volume flow, to obtain design data for such columns and to perform a scale-up to industrial reprocessing plant-size. 18 references, 11 figures, 3 tables

  7. Comparison of Rumen Fluid pH by Continuous Telemetry System and Bench pH Meter in Sheep with Different Ranges of Ruminal pH

    OpenAIRE

    Reis, Leonardo F.; Minervino, Antonio H. H.; Araújo, Carolina A. S. C.; Sousa, Rejane S.; Oliveira, Francisco L. C.; Rodrigues, Frederico A. M. L.; Meira-Júnior, Enoch B. S.; Barrêto-Júnior, Raimundo A.; Mori, Clara S.; Ortolani, Enrico L.

    2014-01-01

    We aimed to compare the measurements of sheep ruminal pH using a continuous telemetry system or a bench pH meter using sheep with different degrees of ruminal pH. Ruminal lactic acidosis was induced in nine adult crossbred Santa Ines sheep by the administration of 15 g of sucrose per kg/BW. Samples of rumen fluid were collected at the baseline, before the induction of acidosis (T 0) and at six, 12, 18, 24, 48, and 72 hours after the induction for pH measurement using a bench pH meter. During ...

  8. BioMart – biological queries made easy

    Directory of Open Access Journals (Sweden)

    Thorisson Gudmundur

    2009-01-01

    Full Text Available Abstract Background Biologists need to perform complex queries, often across a variety of databases. Typically, each data resource provides an advanced query interface, each of which must be learnt by the biologist before they can begin to query them. Frequently, more than one data source is required and for high-throughput analysis, cutting and pasting results between websites is certainly very time consuming. Therefore, many groups rely on local bioinformatics support to process queries by accessing the resource's programmatic interfaces if they exist. This is not an efficient solution in terms of cost and time. Instead, it would be better if the biologist only had to learn one generic interface. BioMart provides such a solution. Results BioMart enables scientists to perform advanced querying of biological data sources through a single web interface. The power of the system comes from integrated querying of data sources regardless of their geographical locations. Once these queries have been defined, they may be automated with its "scripting at the click of a button" functionality. BioMart's capabilities are extended by integration with several widely used software packages such as BioConductor, DAS, Galaxy, Cytoscape, Taverna. In this paper, we describe all aspects of BioMart from a user's perspective and demonstrate how it can be used to solve real biological use cases such as SNP selection for candidate gene screening or annotation of microarray results. Conclusion BioMart is an easy to use, generic and scalable system and therefore, has become an integral part of large data resources including Ensembl, UniProt, HapMap, Wormbase, Gramene, Dictybase, PRIDE, MSD and Reactome. BioMart is freely accessible to use at http://www.biomart.org.

  9. Policy Analysis Screening System (PASS) demonstration: sample queries and terminal instructions

    Energy Technology Data Exchange (ETDEWEB)

    None

    1979-10-16

    This document contains the input and output for the Policy Analysis Screening System (PASS) demonstration. This demonstration is stored on a portable disk at the Environmental Impacts Division. Sample queries presented here include: (1) how to use PASS; (2) estimated 1995 energy consumption from Mid-Range Energy-Forecasting System (MEFS) data base; (3) pollution projections from Strategic Environmental Assessment System (SEAS) data base; (4) diesel auto regulations; (5) diesel auto health effects; (6) oil shale health and safety measures; (7) water pollution effects of SRC; (8) acid rainfall from Energy Environmental Statistics (EES) data base; 1990 EIA electric generation by fuel type; sulfate concentrations by Federal region; forecast of 1995 SO2 emissions in Region III; and estimated electrical generating capacity in California to 1990. The file name for each query is included.

  10. 3D Hilbert Space Filling Curves in 3D City Modeling for Faster Spatial Queries

    DEFF Research Database (Denmark)

    Ujang, Uznir; Antón Castro, Francesc/François; Azri, Suhaibah

    2014-01-01

    The advantages of three dimensional (3D) city models can be seen in various applications including photogrammetry, urban and regional planning, computer games, etc. They expand the visualization and analysis capabilities of Geographic Information Systems on cities, and they can be developed using...... objects. In this research, the authors propose an opponent data constellation technique of space-filling curves (3D Hilbert curves) for 3D city model data representation. Unlike previous methods, that try to project 3D or n-dimensional data down to 2D or 3D using Principal Component Analysis (PCA......) or Hilbert mappings, in this research, they extend the Hilbert space-filling curve to one higher dimension for 3D city model data implementations. The query performance was tested for single object, nearest neighbor and range search queries using a CityGML dataset of 1,000 building blocks and the results...

  11. Accelerating SPARQL Queries and Analytics on RDF Data

    KAUST Repository

    Al-Harbi, Razen

    2016-11-09

    The complexity of SPARQL queries and RDF applications poses great challenges on distributed RDF management systems. SPARQL workloads are dynamic and consist of queries with variable complexities. Hence, systems that use static partitioning suffer from communication overhead for workloads that generate excessive communication. Concurrently, RDF applications are becoming more sophisticated, mandating analytical operations that extend beyond SPARQL queries. Being primarily designed and optimized to execute SPARQL queries, which lack procedural capabilities, existing systems are not suitable for rich RDF analytics. This dissertation tackles the problem of accelerating SPARQL queries and RDF analytics on distributed shared-nothing RDF systems. First, a distributed RDF engine, coined AdPart, is introduced. AdPart uses lightweight hash partitioning for sharding triples using their subject values; rendering its startup overhead very low. The locality-aware query optimizer of AdPart takes full advantage of the partitioning to (i) support the fully parallel processing of join patterns on subjects and (ii) minimize data communication for general queries by applying hash distribution of intermediate results instead of broadcasting, wherever possible. By exploiting hash-based locality, AdPart achieves better or comparable performance to systems that employ sophisticated partitioning schemes. To cope with workloads dynamism, AdPart is extended to dynamically adapt to workload changes. AdPart monitors the data access patterns and dynamically redistributes and replicates the instances of the most frequent patterns among workers. Consequently, the communication cost for future queries is drastically reduced or even eliminated. Experiments with synthetic and real data verify that AdPart starts faster than all existing systems and gracefully adapts to the query load.
Finally, to support and accelerate rich RDF analytical tasks, a vertex-centric RDF analytics framework is
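
    The subject-based hash partitioning described above can be illustrated with a minimal sketch. This is not AdPart's code: the function names, the use of MD5 as the hash, and the sample triples are assumptions made only for illustration. The point it shows is that all triples sharing a subject land on the same worker, so a join on the subject needs no communication between workers.

```python
import hashlib
from collections import defaultdict

def worker_for(subject: str, num_workers: int) -> int:
    """Map a subject to a worker id with a stable hash (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers

def partition(triples, num_workers: int):
    """Shard (subject, predicate, object) triples by the hash of their subject."""
    shards = defaultdict(list)
    for s, p, o in triples:
        shards[worker_for(s, num_workers)].append((s, p, o))
    return shards

# Hypothetical sample data.
triples = [
    ("alice", "worksAt", "kaust"),
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "kaust"),
    ("bob", "knows", "carol"),
    ("carol", "livesIn", "jeddah"),
]
shards = partition(triples, num_workers=3)

# A subject-subject join such as "?x worksAt ?u . ?x knows ?y" can be
# evaluated inside each shard independently, because every triple of a
# given subject is stored on one worker.
for worker_id, shard in sorted(shards.items()):
    print(worker_id, shard)
```

    Joins on other positions (for example object-subject) do not enjoy this locality, which is where the abstract's hash distribution of intermediate results comes in.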

  12. Secure Nearest Neighbor Query on Crowd-Sensing Data

    Directory of Open Access Journals (Sweden)

    Ke Cheng

    2016-09-01

    Full Text Available Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of the crowd-sensing data. On the one hand, sensing data terminals as the data owner are numerous and mistrustful, while, on the other hand, in most cases, the terminals find it difficult to finish many safety operations due to computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed by protocols of secure two-party computation and secure Voronoi diagram algorithm. It not only preserves the data confidentiality and query privacy but also effectively resists the collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between the security and query performance compared to other schemes.

  13. Application of Machine Learning Algorithms for the Query Performance Prediction

    Directory of Open Access Journals (Sweden)

    MILICEVIC, M.

    2015-08-01

    Full Text Available This paper analyzes the relationship between the system load/throughput and the query response time in a real Online transaction processing (OLTP) system environment. Although OLTP systems are characterized by short transactions, which normally entail high availability and consistent short response times, the need for operational reporting may jeopardize these objectives. We suggest a new approach to performance prediction for concurrent database workloads, based on the system state vector which consists of 36 attributes. There is no bias to the importance of certain attributes, but the machine learning methods are used to determine which attributes better describe the behavior of the particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, which are selected to represent frequent business processes. The possibility of the accurate response time prediction may be a foundation for automated decision-making for database (DB) query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS) management or real-time dynamic query scheduling (e.g. estimation of the optimal moment for a complex query execution).

  14. Parallel Index and Query for Large Scale Data Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chou, Jerry; Wu, Kesheng; Ruebel, Oliver; Howison, Mark; Qiang, Ji; Prabhat; Austin, Brian; Bethel, E. Wes; Ryne, Rob D.; Shoshani, Arie

    2011-07-18

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for processing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.

  15. Six Years in the Life of a Mother Bear - The Longest Continuous Heart Rate Recordings from a Free-Ranging Mammal

    Science.gov (United States)

    Laske, Timothy G.; Iaizzo, Paul A.; Garshelis, David L.

    2017-01-01

    Physiological monitoring of free-ranging wild animals is providing new insights into their adaptations to a changing environment. American black bears (Ursus americanus) are highly adaptable mammals, spending up to half the year hibernating, and the remainder of the year attempting to gain weight on a landscape with foods that vary seasonally and year to year. We recorded heart rate (HR) and corresponding activity of an adult female black bear over the course of six years, using an implanted monitor. Despite yearly differences in food, and an every-other year reproductive cycle, this bear exhibited remarkable consistency in HR and activity. HR increased for 12 weeks in spring, from minimal hibernation levels (mean 20-25 beats/minute [bpm]; min 10 bpm) to summer active levels (July daytime: mean 95 bpm). Timing was delayed following one cold winter. In August the bear switched from primarily diurnal to nocturnal, coincident with the availability of baits set by legal hunters. Activity in autumn was higher when the bear was with cubs. Birthing of cubs in January was identified by a transient increase in HR and activity. Long-term physiological and behavioral monitoring is valuable for understanding adaptations of free-ranging animals to climate change, food availability, and human-related stressors.

  16. A Novel Two-Tier Cooperative Caching Mechanism for the Optimization of Multi-Attribute Periodic Queries in Wireless Sensor Networks

    Science.gov (United States)

    Zhou, ZhangBing; Zhao, Deng; Shu, Lei; Tsang, Kim-Fung

    2015-01-01

    Wireless sensor networks, serving as an important interface between physical environments and computational systems, have been used extensively for supporting domain applications, where multiple-attribute sensory data are queried from the network continuously and periodically. Usually, certain sensory data may not vary significantly within a certain time duration for certain applications. In this setting, sensory data gathered at a certain time slot can be used for answering concurrent queries and may be reused for answering the forthcoming queries when the variation of these data is within a certain threshold. To address this challenge, a popularity-based cooperative caching mechanism is proposed in this article, where the popularity of sensory data is calculated according to the queries issued in recent time slots. This popularity reflects the possibility that sensory data are interested in the forthcoming queries. Generally, sensory data with the highest popularity are cached at the sink node, while sensory data that may not be interested in the forthcoming queries are cached in the head nodes of divided grid cells. Leveraging these cooperatively cached sensory data, queries are answered through composing these two-tier cached data. Experimental evaluation shows that this approach can reduce the network communication cost significantly and increase the network capability. PMID:26131665

  17. A new method suitable for calculating accurately wetting temperature over a wide range of conditions: Based on the adaptation of continuation algorithm to classical DFT

    Science.gov (United States)

    Zhou, Shiqi

    2017-11-01

    A new scheme is put forward to determine the wetting temperature (Tw) by using the adaptation of the arc-length continuation algorithm to classical density functional theory (DFT) originally used by Frink and Salinger. Its advantages are summarized in four points: (i) The new scheme is applicable whether the wetting occurs near a planar or a non-planar surface, whereas the zero contact angle method is considered applicable only to a perfectly flat solid surface, as demonstrated previously and in this work, and is essentially unfit for non-planar surfaces. (ii) The new scheme is free of the uncertainty that plagues the pre-wetting extrapolation method and that originates from the unattainability of an infinitely thick film in the theoretical calculation. (iii) The new scheme can be applied just as easily to extreme instances characterized by lower temperatures and/or a stronger surface attraction force field, which cannot be handled by the pre-wetting extrapolation method because the pre-wetting transition is mixed with many layering transitions and the various surface phase transitions are difficult to differentiate. (iv) The new scheme still works when the wetting transition occurs close to the bulk critical temperature; this case cannot be managed at all by the pre-wetting extrapolation method because near the bulk critical temperature the pre-wetting region is extremely narrow, and not enough pre-wetting data are available for the extrapolation procedure.

  18. A Low-Power Single-Bit Continuous-Time ΔΣ Converter with 92.5 dB Dynamic Range for Biomedical Applications

    Directory of Open Access Journals (Sweden)

    Vishal Saxena

    2012-07-01

    Full Text Available A third-order single-bit CT-ΔΣ modulator for generic biomedical applications is implemented in a 0.15 µm FDSOI CMOS process. The overall power efficiency is attained by employing a single-bit ΔΣ and a subthreshold FDSOI process. The loop-filter coefficients are determined using a systematic design centering approach by accounting for the integrator non-idealities. The single-bit CT-ΔΣ modulator consumes 110 µW power from a 1.5 V power supply when clocked at 6.144 MHz. The simulation results for the modulator exhibit a dynamic range of 94.4 dB and peak SNDR of 92.4 dB for 6 kHz signal bandwidth. The figure of merit (FoM) for the third-order, single-bit CT-ΔΣ modulator is 0.271 pJ/level.
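
    The abstract does not say how the figure of merit is defined. As a plausibility check, the sketch below assumes the commonly used Walden form, FoM = P / (2 · BW · 2^ENOB) with ENOB = (SNDR − 1.76) / 6.02, and plugs in the power, bandwidth and peak SNDR quoted above. The function names are hypothetical, and the choice of formula is an assumption rather than something the source confirms.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w: float, bandwidth_hz: float, sndr_db: float) -> float:
    """Energy per conversion level in joules, assuming the Walden definition."""
    return power_w / (2.0 * bandwidth_hz * 2.0 ** enob(sndr_db))

# Values quoted in the abstract: 110 uW, 6 kHz bandwidth, 92.4 dB peak SNDR.
fom = walden_fom(power_w=110e-6, bandwidth_hz=6e3, sndr_db=92.4)
print(f"ENOB = {enob(92.4):.2f} bits")
print(f"FoM  = {fom * 1e12:.3f} pJ/level")
```

    Under this assumed definition the result comes out at roughly 0.27 pJ/level, in line with the reported 0.271 pJ/level; the small residual difference presumably reflects rounding of the quoted inputs.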

  19. Elastic Spatial Query Processing in OpenStack Cloud Computing Environment for Time-Constraint Data Analysis

    Directory of Open Access Journals (Sweden)

    Wei Huang

    2017-03-01

    Full Text Available Geospatial big data analysis (GBDA) is extremely significant for time-constraint applications such as disaster response. However, the time-constraint analysis is not yet a trivial task in the cloud computing environment. Spatial query processing (SQP) is typically computation-intensive and indispensable for GBDA, and the spatial range query, join query, and the nearest neighbor query algorithms are not scalable without using MapReduce-like frameworks. Parallel SQP algorithms (PSQPAs) are trapped in screw-processing, which is a known issue in Geoscience. To satisfy time-constrained GBDA, we propose an elastic SQP approach in this paper. First, Spark is used to implement PSQPAs. Second, Kubernetes-managed Core Operation System (CoreOS) clusters provide self-healing Docker containers for running Spark clusters in the cloud. Spark-based PSQPAs are submitted to Docker containers, where Spark master instances reside. Finally, the horizontal pod auto-scaler (HPA) would scale-out and scale-in Docker containers for supporting on-demand computing resources. Combined with an auto-scaling group of virtual instances, HPA helps to find each of the five nearest neighbors for 46,139,532 query objects from 834,158 spatial data objects in less than 300 s. The experiments conducted on an OpenStack cloud demonstrate that auto-scaling containers can satisfy time-constraint GBDA in clouds.

  20. Browsing schematics: Query-filtered graphs with context nodes

    Science.gov (United States)

    Ciccarelli, Eugene C.; Nardi, Bonnie A.

    1988-01-01

    The early results of a research project to create tools for building interfaces to intelligent systems on the NASA Space Station are reported. One such tool is the Schematic Browser which helps users engaged in engineering problem solving find and select schematics from among a large set. Users query for schematics with certain components, and the Schematic Browser presents a graph whose nodes represent the schematics with those components. The query greatly reduces the number of choices presented to the user, filtering the graph to a manageable size. Users can reformulate and refine the query serially until they locate the schematics of interest. To help users maintain orientation as they navigate a large body of data, the graph also includes nodes that are not matches but provide global and local context for the matching nodes. Context nodes include landmarks, ancestors, siblings, children and previous matches.

  1. An Optimal Dynamic Data Structure for Stabbing-Semigroup Queries

    DEFF Research Database (Denmark)

    Agarwal, Pankaj K.; Arge, Lars; Kaplan, Haim

    2012-01-01

    {R}$, the stabbing-semigroup query asks for computing $\sum_{s \in S(q)} \omega(s)$. We propose a linear-size dynamic data structure, under the pointer-machine model, that answers queries in worst-case $O(\log n)$ time and supports both insertions and deletions of intervals in amortized $O(\log n)$ time....... It is the first data structure that attains the optimal $O(\log n)$ bound for all three operations. Furthermore, our structure can easily be adapted to external memory, where we obtain a linear-size structure that answers queries and supports updates in $O(\log_B n)$ I/Os, where B is the disk block size....... For the restricted case of a nested family of intervals (either every pair of intervals is disjoint or one contains the other), we present a simpler solution based on dynamic trees...
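
    To make the query itself concrete, here is a naive linear-scan baseline. It is not the paper's data structure, which answers the same query in O(log n) time. The snippet is truncated before S(q) is defined, so the sketch assumes S(q) is the set of intervals that contain the query point q; the (lo, hi, weight) tuple layout and the function name are likewise assumptions.

```python
import operator
from functools import reduce

def stabbing_semigroup(intervals, q, op=operator.add):
    """Combine the weights of all closed intervals [lo, hi] containing q.

    `op` is any associative, commutative operation (a semigroup), such as
    addition or max. Returns None when no interval is stabbed, since a
    semigroup need not have an identity element. Runs in O(n) per query.
    """
    weights = [w for (lo, hi, w) in intervals if lo <= q <= hi]
    return reduce(op, weights) if weights else None

# Hypothetical weighted intervals: (lo, hi, weight).
intervals = [(1, 5, 2), (3, 8, 4), (6, 9, 1)]

print(stabbing_semigroup(intervals, 4))          # intervals [1,5] and [3,8]
print(stabbing_semigroup(intervals, 7))          # intervals [3,8] and [6,9]
print(stabbing_semigroup(intervals, 4, op=max))  # same intervals, max instead of sum
```

    Passing `op=max` shows why the problem is stated for semigroups rather than sums only: no inverse (subtraction) is available, which rules out simple prefix-sum tricks.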

  2. Investigation in Query System Framework for High Energy Physics

    CERN Document Server

    Jatuphattharachat, Thanat

    2017-01-01

    We summarize an investigation into a query system framework for HEP (High Energy Physics). Our work investigated the distributed server part of Femtocode, a query language that lets physicists make plots and other aggregations in real time. To make the system more robust and capable of processing large amounts of data quickly, it is necessary to deploy the system on a redundant and distributed computing cluster. This project aims to investigate third-party coordination and resource management frameworks which fit into the design of a real-time distributed query system. Zookeeper, Mesos and Marathon are the main frameworks for this investigation. The results indicate that Zookeeper is well suited to job coordination and job tracking, as it provides a robust, fast, simple and transparent read and write process for all connecting clients across distributed Zookeeper servers. Furthermore, it also supports high-availability access and a consistency guarantee within a specific time bound.

  3. Optimizing RDF Data Cubes for Efficient Processing of Analytical Queries

    DEFF Research Database (Denmark)

    Jakobsen, Kim Ahlstrøm; Andersen, Alex B.; Hose, Katja

    2015-01-01

    data warehouses and data cubes. Today, external data sources are essential for analytics and, as the Semantic Web gains popularity, more and more external sources are available in native RDF. With the recent SPARQL 1.1 standard, performing analytical queries over RDF data sources has finally become......In today’s data-driven world, analytical querying, typically based on the data cube concept, is the cornerstone of answering important business questions and making data-driven decisions. Traditionally, the underlying analytical data was mostly internal to the organization and stored in relational...... feasible. However, unlike their relational counterparts, RDF data cubes stores lack optimizations that enable fast querying. In this paper, we present an approach to optimizing RDF data cubes that is based on three novel cube patterns that optimize RDF data cubes, as well as associated algorithms...

  4. Nowcasting Mobile Games Ranking Using Web Search Query Data

    Directory of Open Access Journals (Sweden)

    Yoones A. Sekhavat

    2016-01-01

    Full Text Available In recent years, the Internet has become embedded into the purchasing decision of consumers. The purpose of this paper is to study whether the Internet behavior of users correlates with their actual behavior in the computer games market. Rather than proposing the most accurate model for computer game sales, we aim to investigate to what extent web search query data can be exploited to nowcast (a contraction of “now” and “forecasting”, referring to techniques used to make short-term forecasts), that is, to predict the present status of the ranking of mobile games in the world. Google search query data is used for this purpose, since this data can provide a real-time view on the topics of interest. Various statistical techniques are used to show the effectiveness of using web search query data to nowcast mobile games ranking.

  5. A Foundation for Efficient Indoor Distance-Aware Query Processing

    DEFF Research Database (Denmark)

    Lu, Hua; Cao, Xin; Jensen, Christian Søndergaard

    2012-01-01

    model that integrates indoor distance seamlessly. To enable the use of the model as a foundation for query processing, we develop accompanying, efficient algorithms that compute indoor distances for different indoor entities like doors as well as locations. We also propose an indexing framework...... indoor distances. However, existing indoor space models do not account well for indoor distances. To address this shortcoming, we propose a data management infrastructure that captures indoor distance and facilitates distance-aware query processing. In particular, we propose a distance-aware indoor space...... that accommodates indoor distances that are pre-computed using the proposed algorithms. On top of this foundation, we develop efficient algorithms for typical indoor, distance-aware queries. The results of an extensive experimental evaluation demonstrate the efficacy of the proposals....

  6. Query Expansion: Is It Necessary In Textual Case-Based Reasoning ...

    African Journals Online (AJOL)

    Query expansion (QE) is the process of transforming a seed query to improve retrieval performance in information retrieval operations. It is often intended to overcome a vocabulary mismatch between the query and the document collection. Query expansion is known to improve retrieval effectiveness of some information ...

  7. Digital Library Query Clearing Using Clustering and Fuzzy Decision-Making.

    Science.gov (United States)

    Heywood, M. I.; Zincir-Heywood, A. N.; Chatwin, C. R.

    2000-01-01

    Proposes and analyzes a method for servicing keyword queries expressed in a digital library. Topics include efficiency via the concept of customers and producers; grouping queries into clusters of similar concepts; information density of the library; query delay; query priorities; and fuzzy decision-making. (Author/LRW)

  8. Flexible and Efficient Resolution of Skyline Query Size Constraints

    DEFF Research Database (Denmark)

    Lu, Hua; Jensen, Christian S.; Zhang, Zhenjie

    2011-01-01

    , the former often incurs too many ties in its ranking, and the latter is inapplicable for k > s. Based on these observations, the paper proposes a new approach, called skyline ordering, that forms a skyline-based partitioning of a given data set such that an order exists among the partitions. Then, set......Given a set of multidimensional points, a skyline query returns the interesting points that are not dominated by other points. It has been observed that the actual cardinality (s) of a skyline query result may differ substantially from the desired result cardinality (k), which has prompted studies...
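
    The definition of a skyline query given above can be sketched in a few lines. This is a quadratic-time illustration of the definition only, not the skyline ordering approach the paper proposes. It assumes that smaller values are preferred in every dimension; the function names and the sample data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]

def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical hotels as (price, distance to beach).
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 9)]
result = skyline(hotels)
print(result)       # the non-dominated hotels
print(len(result))  # the actual skyline cardinality s
```

    Here the skyline has s = 3 points, and nothing in the query lets the user ask for, say, k = 2 or k = 4 results instead. That mismatch between s and k is the size-constraint problem the abstract addresses.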

  9. Tag cloud generation for results of multiple keywords queries

    DEFF Research Database (Denmark)

    Leginus, Martin; Dolog, Peter; Lage, Ricardo Gomes

    2013-01-01

    In this paper we study tag cloud generation for retrieved results of multiple keyword queries. It is motivated by many real world scenarios such as personalization tasks, surveillance systems and information retrieval tasks defined with multiple keywords. We adjust the state-of-the-art tag cloud...... generation techniques for multiple keywords query results. Consequently, we conduct the extensive evaluation on top of three distinct collaborative tagging systems. The graph-based methods perform significantly better for the Movielens and Bibsonomy datasets. Tag cloud generation based on maximal coverage...

  10. Instant jQuery Flot visual data analysis

    CERN Document Server

    Peiris, Brian

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. A quick, instruction-based guide full of examples that details on the various aspects of Flot and how users can apply it to data groups for interactive data representation techniques. If you are a data visualization developer, mapping and presentation software developer, or anyone with an interest in jQuery visualization, this book is ideal for you. If you have a working knowledge of jQuery and JavaScript, you can use this book to add sophisticated visualizations to your web applicat

  11. Web page sorting algorithm based on query keyword distance relation

    Science.gov (United States)

    Yang, Han; Cui, Hong Gang; Tang, Hao

    2017-08-01

    To improve web page sorting, a query-keyword clustering idea is proposed, based on the characteristics of the relationships between the search keywords within a web page. This idea is converted into the degree of aggregation of the search keywords in the web page. Building on the PageRank algorithm, a clustering degree factor for the query keywords is added so that it can take part in the quantitative calculation. This paper thus proposes an improved PageRank algorithm based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.

  12. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    metadata artifacts (e.g., queries) to assist users with the analysis. However, modeling and sharing of most of these artifacts are typically overlooked. Thus, in this paper we focus on the query metadata artifact in the Exploratory OLAP context and propose an RDF-based vocabulary for its representation......On-Line Analytical Processing (OLAP) is a data analysis approach to support decision-making. On top of that, Exploratory OLAP is a novel initiative for the convergence of OLAP and the Semantic Web (SW) that enables the use of OLAP techniques on SW data. Moreover, OLAP approaches exploit different...

  13. Location-Dependent Query Processing Under Soft Real-Time Constraints

    Directory of Open Access Journals (Sweden)

    Zoubir Mammeri

    2009-01-01

    Full Text Available In recent years, mobile devices and applications have seen increasing development. In the database field, this development has required methods to consider new query types like location-dependent queries (i.e. the query results depend on the query issuer location). Although several studies have addressed problems related to location-dependent query processing, few works have considered timing requirements that may be associated with queries (i.e., the query results must be delivered to mobile clients on time). The main objective of this paper is to propose a solution for location-dependent query processing under soft real-time constraints. Hence, we propose methods to take into account client location-dependency and to maximize the percentage of queries respecting their deadlines. We validate our proposal by implementing a prototype based on Oracle DBMS. Performance evaluation results show that the proposed solution optimizes the percentage of queries meeting their deadlines and the communication cost.

  14. Seasonal trends in sleep-disordered breathing: evidence from Internet search engine query data.

    Science.gov (United States)

    Ingram, David G; Matthews, Camilla K; Plante, David T

    2015-03-01

    The primary aim of the current study was to test the hypothesis that there is a seasonal component to snoring and obstructive sleep apnea (OSA) through the use of Google search engine query data. Internet search engine query data were retrieved from Google Trends from January 2006 to December 2012. Monthly normalized search volume was obtained over that 7-year period in the USA and Australia for the following search terms: "snoring" and "sleep apnea". Seasonal effects were investigated by fitting cosinor regression models. In addition, the search terms "snoring children" and "sleep apnea children" were evaluated to examine seasonal effects in pediatric populations. Statistically significant seasonal effects were found using cosinor analysis in both USA and Australia for "snoring" (p search term in Australia (p = 0.13). Seasonal patterns for "snoring children" and "sleep apnea children" were observed in the USA (p = 0.002 and p search volume to examine these search terms in Australia. All searches peaked in the winter or early spring in both countries, with the magnitude of seasonal effect ranging from 5 to 50 %. Our findings indicate that there are significant seasonal trends for both snoring and sleep apnea internet search engine queries, with a peak in the winter and early spring. Further research is indicated to determine the mechanisms underlying these findings, whether they have clinical impact, and if they are associated with other comorbid medical conditions that have similar patterns of seasonal exacerbation.

  15. Relaxing rdf queries based on user and domain preferences

    DEFF Research Database (Denmark)

    Dolog, Peter; Stueckenschmidt, Heiner; Wache, Holger

    2009-01-01

    knowledge and user preferences. We describe a framework for information access that combines query refinement and relaxation in order to provide robust, personalized access to heterogeneous resource description framework data as well as an implementation in terms of rewriting rules and explain its...

  16. From Nested-Loop to Join Queries in OODB

    NARCIS (Netherlands)

    Steenhagen, H.J.; Apers, Peter M.G.; Blanken, Henk; de By, R.A.

    Most declarative SQL-like query languages for object-oriented database systems are orthogonal languages allowing for arbitrary nesting of expressions in the select-, from-, and where-clause. Expressions in the from-clause may be base tables as well as set-valued attributes. In this paper, we propose

  17. MOCQL: A Declarative Language for Ad-Hoc Model Querying

    DEFF Research Database (Denmark)

    Störrle, Harald

    2013-01-01

    Language (MOCQL), an experimental declarative textual language to express queries (and constraints) on models. We introduce MOCQL by examples and its grammar, evaluate its usability by means of controlled experiments, and find that modelers perform better and experience less cognitive load when working...

  18. CIRQuL: Complex Information Retrieval Query Language

    NARCIS (Netherlands)

    Mihajlovic, V.; Hiemstra, Djoerd; Apers, Peter M.G.

    In this paper we will present a new framework for the retrieval of XML documents. We will describe the extension for existing query languages (XPath and XQuery) geared toward ranked information retrieval and full-text search in XML documents. Furthermore we will present language models for ranked

  19. Method of and device for querying of protected structured data

    NARCIS (Netherlands)

    Brinkman, Richard; Doumen, J.M.; Jonker, Willem; Schoenmakers, B.

    Method of and device for querying of protected data structured in the form of a tree. A corresponding tree of node polynomials is constructed such that each node polynomial evaluates to zero for an input equal to an identifier assigned to a node name occurring in a branch of the data tree starting

  20. Method of and device for querying of protected structured data

    NARCIS (Netherlands)

    Jonker, Willem; Brinkman, Richard; Doumen, J.M.; Schoenmakers, Berry

    2005-01-01

    Method of and device for querying of protected data structured in the form of a tree. A corresponding tree of node polynomials is constructed such that each node polynomial evaluates to zero for an input equal to an identifier assigned to a node name occurring in a branch of the data tree starting

  1. Developing responsive web applications with Ajax and jQuery

    CERN Document Server

    Patel, Sandeep Kumar

    2014-01-01

    This book is a standard tutorial for web application developers presented in a comprehensive, step-by-step manner to explain the nuances involved. It has an abundance of code and examples supporting explanations of each feature. This book is intended for Java developers wanting to create rich and responsive applications using AJAX. Basic experience of using jQuery is assumed.

  2. Constraint based frequent pattern mining for generalized query ...

    African Journals Online (AJOL)

    The World-Wide Web provides every Internet citizen access to an abundance of information, but identifying the relevant piece of information becomes increasingly difficult. Popular search engines use logs to keep track of user activities, including user queries, click-throughs, and behavior. Research in web mining tries to ...

  3. Dictionary Writing System (DWS) + Corpus Query Package (CQP ...

    African Journals Online (AJOL)

    In this article the integrated corpus query functionality of the dictionary compilation software TshwaneLex is analysed. Attention is given to the handling of both raw corpus data and annotated corpus data. With regard to the latter it is shown how, with a minimum of human effort, machine learning techniques can be employed ...

  4. Most Recent Match Queries in On-Line Suffix Trees

    DEFF Research Database (Denmark)

    Larsson, N. Jesper

    2014-01-01

    A suffix tree is able to efficiently locate a pattern in an indexed string, but not in general the most recent copy of the pattern in an online stream, which is desirable in some applications. We study the most general version of the problem of locating a most recent match: supporting queries...

  5. Optimizing Aggregate SPARQL Queries Using Materialized RDF Views

    DEFF Research Database (Denmark)

    Ibragimov, Dilshod; Hose, Katja; Pedersen, Torben Bach

    2016-01-01

    be created and used as a source of precomputed partial results during query processing. However, materialized view techniques as proposed for relational databases do not support RDF specifics, such as incompleteness and the need to support implicit (derived) information. To overcome these challenges...

  6. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    In today’s OLAP systems, integrating fast changing data physically into a cube is complex and time-consuming. Our solution, the “OLAP-XML Federation System,” makes it possible to reference the fast changing data in XML format in OLAP queries without physical integration. In this paper, we introdu...

  7. Using Description Logics to Model Context Aware Query Preferences

    NARCIS (Netherlands)

    van Bunningen, A.H.; Feng, L.; Apers, Peter M.G.

    Users’ preferences have traditionally been exploited in query personalization to better serve their information needs. With the emerging ubiquitous computing technologies, users will be situated in an Ambient Intelligent (AmI) environment, where users’ database access will not occur at a single

  8. Project Lefty: More Bang for the Search Query

    Science.gov (United States)

    Varnum, Ken

    2010-01-01

    This article describes Project Lefty, a search system that, at a minimum, adds a layer on top of traditional federated search tools that will make the wait for results more worthwhile for researchers. At best, Project Lefty improves search queries and relevance rankings for web-scale discovery tools to make the results themselves more relevant…

  9. Exploring query execution strategies for JIT vectorization and SIMD

    NARCIS (Netherlands)

    T.K. Gubner (Tim); P.A. Boncz (Peter)

    2017-01-01

    This paper partially explores the design space for efficient query processors on future hardware that is rich in SIMD capabilities. It departs from two well-known approaches: (1) interpreted block-at-a-time execution (a.k.a. "vectorization") and (2) "data-centric" JIT compilation, as in

  10. Approaches for parallel data loading and data querying

    Directory of Open Access Journals (Sweden)

    Vlad DIACONITA

    2015-07-01

    Full Text Available This paper aims to contribute to data loading and data querying using products from the Apache Hadoop ecosystem. Currently, we talk about Big Data at up to zettabyte scale (10^21 bytes). Research in this area is usually interdisciplinary, combining elements from statistics, system integration, parallel processing and cloud computing.

  11. Spatio-temporal keyword queries in social networks

    DEFF Research Database (Denmark)

    Cozza, V.; Messina, Alessandro; Montesi, D.

    2013-01-01

    Due to the large amount of social network data produced at an ever-growing speed and their complex nature, recent works have addressed the problem of efficiently querying such data according to social, temporal or spatial dimensions. In this work we propose a data model that takes into account all...

  12. NoDB: efficient query execution on raw data files

    NARCIS (Netherlands)

    I. Alagiannis; R Borovica; M. Branco; S. Idreos (Stratos); A. Ailamaki

    2012-01-01

    As data collections become larger and larger, data loading evolves to a major bottleneck. Many applications already avoid using database systems, e.g., scientific data analysis and social networks, due to the complexity and the increased data-to-query time. For such applications data

  13. Extracting Rankings for Spatial Keyword Queries from GPS Data

    DEFF Research Database (Denmark)

    Keles, Ilkcan; Jensen, Christian Søndergaard; Saltenis, Simonas

    2018-01-01

    a model that synthesizes a ranking of points of interest (PoI) for a given query using historical trips extracted from GPS data. To extract trips, we propose a novel PoI assignment method that makes use of distances and temporal information. We also propose a PageRank-based smoothing method to be able...

  14. Learning from the History of Distributed Query Processing

    DEFF Research Database (Denmark)

    Betz, Heiko; Gropengießer, Francis; Hose, Katja

    2012-01-01

    The vision of the Semantic Web has triggered the development of various new applications and opened up new directions in research. Recently, much effort has been put into the development of techniques for query processing over Linked Data. Being based upon techniques originally developed for dist...

  15. Ad-Hoc Queries over Document Collections - A Case Study

    Science.gov (United States)

    Löser, Alexander; Lutter, Steffen; Düssel, Patrick; Markl, Volker

    We discuss the novel problem of supporting analytical business intelligence queries over web-based textual content, e.g., BI-style reports based on 100.000's of documents from an ad-hoc web search result. Neither conventional search engines nor conventional Business Intelligence and ETL tools address this problem, which lies at the intersection of their capabilities. "Google Squared" or our system GOOLAP.info, are examples of these kinds of systems. They execute information extraction methods over one or several document collections at query time and integrate extracted records into a common view or tabular structure. Frequent extraction and object resolution failures cause incomplete records which could not be joined into a record answering the query. Our focus is the identification of join-reordering heuristics maximizing the size of complete records answering a structured query. With respect to given costs for document extraction we propose two novel join-operations: The multi-way CJ-operator joins records from multiple relationships extracted from a single document. The two-way join-operator DJ ensures data density by removing incomplete records from results. In a preliminary case study we observe that our join-reordering heuristics positively impact result size, record density and lower execution costs.

  16. MRA Based Efficient Database Storing and Fast Querying Technique

    Directory of Open Access Journals (Sweden)

    Mitko Kostov

    2017-02-01

    Full Text Available In this paper we consider a specific way of organizing 1D signal or 2D image databases such that more efficient storage and faster querying are achieved. A multiresolution data processing technique is used in order to save the most significant processed data.

  17. An empirical study on SAJQ (Sorting Algorithm for Join Queries)

    Directory of Open Access Journals (Sweden)

    Hassan I. Mathkour

    2010-06-01

    Full Text Available Most queries applied to database management systems (DBMS) depend heavily on the performance of the sorting algorithm used. In addition to efficiency as a primary feature, stability of the sorting algorithm is a major feature needed in performing DBMS queries. In this paper, we study a new Sorting Algorithm for Join Queries (SAJQ) that has the advantages of being both efficient and stable. The proposed algorithm uses the m-way-merge algorithm to improve its time complexity. SAJQ performs the sorting operation in a time complexity of O(n log m), where n is the length of the input array and m is the number of sub-arrays used in sorting. An unsorted input array of length n is arranged into m sorted sub-arrays. The m-way-merge algorithm merges the m sorted sub-arrays into the final sorted output array. The proposed algorithm keeps the stability of the keys intact. An analytical proof shows that, in the worst case, the proposed algorithm has a complexity of O(n log m). Also, a set of experiments has been performed to investigate the performance of the proposed algorithm. The experimental results show that the proposed algorithm outperforms other stable sorting algorithms designed for join-based queries.
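The split-then-merge structure described in the abstract above can be sketched as follows. This is not the SAJQ algorithm itself, whose details the abstract does not give: the sketch simply splits the input into at most m contiguous runs, sorts each run with Python's built-in stable sort (an assumption standing in for whatever SAJQ does), and combines them with a heap-based m-way merge. The function name `m_way_merge_sort` and the record layout are hypothetical.

```python
import heapq
from operator import itemgetter

def m_way_merge_sort(records, m, key=itemgetter(0)):
    """Stable sort: split into at most m runs, sort each, then m-way merge.

    Illustrative sketch only; not the published SAJQ implementation.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    n = len(records)
    # Ceiling division gives the run length; runs are contiguous slices,
    # so earlier runs hold records that appeared earlier in the input.
    size = -(-n // m) if n else 1
    runs = [sorted(records[i:i + size], key=key) for i in range(0, n, size)]
    # heapq.merge breaks ties by the position of the source iterable,
    # so equal keys keep their original relative order.
    return list(heapq.merge(*runs, key=key))
```

For example, `m_way_merge_sort([(2, "a"), (1, "b"), (2, "c"), (1, "d"), (3, "e")], m=3)` returns `[(1, "b"), (1, "d"), (2, "a"), (2, "c"), (3, "e")]`: records with equal join keys stay in input order, which is the stability property the abstract emphasizes. In this sketch only the merge step costs O(n log m) comparisons; sorting the individual runs adds its own cost.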

  18. Four queries concerning the metaphysics of early human embryogenesis.

    Science.gov (United States)

    Howsepian, A A

    2008-04-01

    In this essay, I attempt to provide answers to the following four queries concerning the metaphysics of early human embryogenesis. (1) Following its first cellular fission, is it coherent to claim that one and only one of two "blastomeric" twins of a human zygote is identical with that zygote? (2) Following the fusion of two human pre-embryos, is it coherent to claim that one and only one pre-fusion pre-embryo is identical with that postfusion pre-embryo? (3) Does a live human being come into existence only when its brain comes into existence? (4) At implantation, does a pre-embryo become a mere part of its mother? I argue that either if things have quidditative properties or if criterialism is false, then queries (1) and (2) can be answered in the affirmative; that in light of recent developments in theories of human death and in light of a more "functional" theory of brains, query (3) can be answered in the negative; and that plausible mereological principles require a negative answer to query (4).

  19. Secure quantum private information retrieval using phase-encoded queries

    International Nuclear Information System (INIS)

    Olejnik, Lukasz

    2011-01-01

    We propose a quantum solution to the classical private information retrieval (PIR) problem, which allows one to query a database in a private manner. The protocol offers privacy thresholds and allows the user to obtain information from a database in a way that offers the potential adversary, in this model the database owner, no possibility of deterministically establishing the query contents. This protocol may also be viewed as a solution to the symmetrically private information retrieval problem in that it can offer database security (inability for a querying user to steal its contents). Compared to classical solutions, the protocol offers substantial improvement in terms of communication complexity. In comparison with the recent quantum private queries [Phys. Rev. Lett. 100, 230502 (2008)] protocol, it is more efficient in terms of communication complexity and the number of rounds, while offering a clear privacy parameter. We discuss the security of the protocol and analyze its strengths and conclude that using this technique makes it challenging to obtain the unconditional (in the information-theoretic sense) privacy degree; nevertheless, in addition to being simple, the protocol still offers a privacy level. The oracle used in the protocol is inspired both by the classical computational PIR solutions as well as the Deutsch-Jozsa oracle.

  20. Dictionary Writing System (DWS) + Corpus Query Package (CQP):

    African Journals Online (AJOL)

    R.B. Ruthven

    Most professional dictionary houses that compile dictionaries in-house use (at least) two sets of tools: a dictionary writing system (DWS) on the one hand, and some sort of corpus query package (CQP) on the other. A team of local IT gurus will then typically ensure the transition of data between these two systems. It is very ...

  1. Integrating Non-Spatial Preferences into Spatial Location Queries

    DEFF Research Database (Denmark)

    Qu, Qiang; Liu, Siyuan; Yang, Bin

    2014-01-01

    and upper bounds that enable search-space pruning and thus improve performance. Finally, we provide a generalization of the above query and also extend the algorithms to support the generalization. We report on an experimental evaluation of the proposed algorithms using real point of interest data from...

  2. In-route skyline querying for location-based services

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Kristian S.

    2005-01-01

    With the emergence of an infrastructure for location-aware mobile services, the processing of advanced, location-based queries that are expected to underlie such services is gaining in relevance. While much work has assumed that users move in Euclidean space, this paper assumes that movement is c...

  3. Multidimensional Data Model and Query Language for Informetrics.

    Science.gov (United States)

    Niemi, Timo; Hirvonen, Lasse; Jarvelin, Kalervo

    2003-01-01

    Discusses multidimensional data analysis, or online analytical processing (OLAP), which offers a single subject-oriented source for analyzing summary data based on various dimensions. Develops a conceptual/logical multidimensional model for supporting the needs of informetrics, including a multidimensional query language whose basic idea is to…

  4. Adaptive query parallelization in multi-core column stores

    NARCIS (Netherlands)

    M.M. Gawade (Mrunal); M.L. Kersten (Martin)

    2016-01-01

    With the rise of multi-core CPU platforms, their optimal utilization for in-memory OLAP workloads using column store databases has become one of the biggest challenges. Some of the inherent limitations in the achievable query parallelism are due to the degree of parallelism

  5. Translating OSQL-Queries into Efficient Set Expressions

    NARCIS (Netherlands)

    Steenhagen, H.J.; de By, R.A.; Blanken, Henk

    Efficient query processing is one of the key promises of database technology. With the evolution of supported data models—from relational via nested relational to object-oriented—the need for such efficiency has not diminished, and the general problem has increased in complexity. In this paper, we

  6. Query Recommendation in the Domain of Information for Children

    NARCIS (Netherlands)

    Duarte Torres, Sergio; Hiemstra, Djoerd; Weber, Ingmar; Serdyukov, Pavel

    Children represent an increasing part of web users. One of the key problems that hamper their search experience is their limited vocabulary, their difficulty to use the right keywords, and the inappropriateness of general-purpose query suggestions. In this work, we propose a method that uses tags

  7. Using Clinicians’ Search Query Data to Monitor Influenza Epidemics

    Science.gov (United States)

    Santillana, Mauricio; Nsoesie, Elaine O.; Mekaru, Sumiko R.; Scales, David; Brownstein, John S.

    2014-01-01

    Search query information from a clinician's database, UpToDate, is shown to predict influenza epidemics in the United States in a timely manner. Our results show that digital disease surveillance tools based on experts' databases may be able to provide an alternative, reliable, and stable signal for accurate predictions of influenza outbreaks. PMID:25115873

  8. Medical Query Language: Improved Access to MUMPS Databases

    Science.gov (United States)

    Webster, Sally; Morgan, Mary; Barnett, G. Octo

    1987-01-01

    The Medical Query Language (MQL) is a tool which enables medical staff, administrators, and system managers to generate ANSI Standard MUMPS programs to add flexibility to an existing database management system. This paper describes the features of MQL as it is used by a number of diverse sites for report generation, ad-hoc searches, and quality assurance checks.

  9. Joint Top-K Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Wu, Dinming; Yiu, Man Lung; Cong, Gao

    2012-01-01

    keyword queries. Empirical studies show that the proposed solution is efficient on real data sets. We also offer analytical studies on synthetic data sets to demonstrate the efficiency of the proposed solution.

  10. Generic multiset programming for language-integrated querying

    DEFF Research Database (Denmark)

    Henglein, Fritz; Larsen, Ken Friis

    2010-01-01

    This paper demonstrates how relational algebraic programming based on efficient symbolic representations of multisets and operations on them can be applied to the query sublanguage of SQL in a type-safe fashion. In essence, it provides a library for naïve programming with multisets in a generalized...

  11. VIGOR: Interactive Visual Exploration of Graph Query Results.

    Science.gov (United States)

    Pienta, Robert; Hohman, Fred; Endert, Alex; Tamersoy, Acar; Roundy, Kevin; Gates, Chris; Navathe, Shamkant; Chau, Duen Horng

    2018-01-01

    Finding patterns in graphs has become a vital challenge in many domains, from biological systems and network security to finance (e.g., finding money laundering rings of bankers and business owners). While there is significant interest in graph databases and querying techniques, less research has focused on helping analysts make sense of underlying patterns within a group of subgraph results. Visualizing graph query results is challenging, requiring effective summarization of a large number of subgraphs, each having potentially shared node-values, rich node features, and flexible structure across queries. We present VIGOR, a novel interactive visual analytics system, for exploring and making sense of query results. VIGOR uses multiple coordinated views, leveraging different data representations and organizations to streamline analysts' sensemaking process. VIGOR contributes: (1) an exemplar-based interaction technique, where an analyst starts with a specific result and relaxes constraints to find other similar results or starts with only the structure (i.e., without node value constraints), and adds constraints to narrow in on specific results; and (2) a novel feature-aware subgraph result summarization. Through a collaboration with Symantec, we demonstrate how VIGOR helps tackle real-world problems through the discovery of security blindspots in a cybersecurity dataset with over 11,000 incidents. We also evaluate VIGOR with a within-subjects study, demonstrating VIGOR's ease of use over a leading graph database management system, and its ability to help analysts understand their results at higher speed and make fewer errors.

  12. The encoding complexity of two dimensional range minimum data structures

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Brodnik, Andrej; Davoodi, Pooya

    2013-01-01

    In the two-dimensional range minimum query problem an input matrix A of dimension m × n, m ≤ n, has to be preprocessed into a data structure such that given a query rectangle within the matrix, the position of a minimum element within the query range can be reported. We consider the space complexity...... of the encoding variant of the problem where queries have access to the constructed data structure but cannot access the input matrix A, i.e. all information must be encoded in the data structure. Previously it was known how to solve the problem with space O(mn min{m, log n}) bits (and with constant query time...
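To make the problem statement above concrete, the following sketch answers two-dimensional range minimum queries in the simpler indexing model, where the query may still read the input matrix A. It is not the encoding data structure studied in the paper, and it does not achieve constant query time: it builds one sparse table per row and scans the rows of the query rectangle. The class name `RowwiseRMQ2D` and its interface are assumptions made for illustration.

```python
class RowwiseRMQ2D:
    """2D range-minimum queries via one sparse table per matrix row.

    Illustrative baseline only: it keeps the matrix (indexing model),
    and a query costs time proportional to the number of rows queried.
    """

    def __init__(self, matrix):
        self.matrix = matrix
        self.tables = [self._build(row) for row in matrix]

    @staticmethod
    def _build(row):
        n = len(row)
        # Level k stores, for each start j, the column index of a
        # minimum in row[j : j + 2**k].
        table = [list(range(n))]
        span = 1
        while 2 * span <= n:
            prev = table[-1]
            level = []
            for j in range(n - 2 * span + 1):
                left, right = prev[j], prev[j + span]
                level.append(left if row[left] <= row[right] else right)
            table.append(level)
            span *= 2
        return table

    def query(self, r1, r2, c1, c2):
        """Return (row, col) of a minimum in rows r1..r2, cols c1..c2 (inclusive)."""
        k = (c2 - c1 + 1).bit_length() - 1  # largest power of two fitting the width
        best = None
        for r in range(r1, r2 + 1):
            row, table = self.matrix[r], self.tables[r]
            # Two overlapping windows of width 2**k cover columns c1..c2.
            left, right = table[k][c1], table[k][c2 - (1 << k) + 1]
            col = left if row[left] <= row[right] else right
            if best is None or row[col] < self.matrix[best[0]][best[1]]:
                best = (r, col)
        return best
```

With `A = [[5, 3, 8, 6], [7, 2, 9, 4], [1, 6, 0, 5]]`, the call `RowwiseRMQ2D(A).query(0, 1, 1, 3)` returns `(1, 1)`, the position of the value 2. Because the final comparison reads `self.matrix`, this structure would not qualify as an encoding in the sense of the abstract, where the matrix is unavailable at query time.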

  13. IJA: An Efficient Algorithm for Query Processing in Sensor Networks

    Directory of Open Access Journals (Sweden)

    Dong Hwa Kim

    2011-01-01

    Full Text Available One of the main features of sensor networks is the processing of real-time state information after gathering the needed data from many domains. The component technologies of each sensor node, including physical sensors, processors, actuators and power, have advanced significantly over the last decade. Thanks to these advances, sensor networks have been adopted across industry for sensing physical phenomena. However, sensor nodes are considerably constrained: with their limited energy and memory resources, they have a very limited ability to process information compared to conventional computer systems. Query processing over the nodes is therefore constrained as well. Because of these limitations, join operations in sensor networks are typically processed in a distributed manner over a set of nodes. While simple queries, such as select and aggregate queries, have been addressed in the literature, the processing of join queries in sensor networks remains to be investigated. Therefore, in this paper, we propose and describe an Incremental Join Algorithm (IJA) in Sensor Networks to reduce the overhead caused by moving a join pair to the final join node or to minimize the communication cost, which is the main consumer of the battery when processing distributed queries in sensor network environments. The simulation results show that the proposed IJA algorithm significantly reduces the number of bytes to be moved to join nodes compared to the popular synopsis join algorithm.

  14. Persistent Identifiers for Improved Accessibility for Linked Data Querying

    Science.gov (United States)

    Shepherd, A.; Chandler, C. L.; Arko, R. A.; Fils, D.; Jones, M. B.; Krisnadhi, A.; Mecum, B.

    2016-12-01

    The adoption of linked open data principles within the geosciences has increased the amount of accessible information available on the Web. However, this data is difficult to consume for those who are unfamiliar with Semantic Web technologies such as Web Ontology Language (OWL), Resource Description Framework (RDF) and SPARQL, the RDF query language. Consumers would need to understand the structure of the data and how to efficiently query it. Furthermore, understanding how to query doesn't solve problems of poor precision and recall in search results. For consumers unfamiliar with the data, full-text searches are most accessible, but not ideal as they arrest the advantages of data disambiguation and co-reference resolution efforts. Conversely, URI searches across linked data can deliver improved search results, but knowledge of these exact URIs may remain difficult to obtain. The increased adoption of Persistent Identifiers (PIDs) can lead to improved linked data querying by a wide variety of consumers. Because PIDs resolve to a single entity, they are an excellent data point for disambiguating content. At the same time, PIDs are more accessible and prominent than a single data provider's linked data URI. When present in linked open datasets, PIDs provide balance between the technical and social hurdles of linked data querying as evidenced by the NSF EarthCube GeoLink project. The GeoLink project, funded by NSF's EarthCube initiative, has brought together data repositories that include content from field expeditions, laboratory analyses, journal publications, conference presentations, theses/reports, and funding awards that span scientific studies from marine geology to marine ecosystems and biogeochemistry to paleoclimatology.

  15. An organizational framework and strategic implementation for system-level change to enhance research-based practice: QUERI Series.

    Science.gov (United States)

    Stetler, Cheryl B; McQueen, Lynn; Demakis, John; Mittman, Brian S

    2008-05-29

    The continuing gap between available evidence and current practice in health care reinforces the need for more effective solutions, in particular related to organizational context. Considerable advances have been made within the U.S. Veterans Health Administration (VA) in systematically implementing evidence into practice. These advances have been achieved through a system-level program focused on collaboration and partnerships among policy makers, clinicians, and researchers. The Quality Enhancement Research Initiative (QUERI) was created to generate research-driven initiatives that directly enhance health care quality within the VA and, simultaneously, contribute to the field of implementation science. This paradigm-shifting effort provided a natural laboratory for exploring organizational change processes. This article describes the underlying change framework and implementation strategy used to operationalize QUERI. QUERI used an evidence-based organizational framework focused on three contextual elements: 1) cultural norms and values, in this case related to the role of health services researchers in evidence-based quality improvement; 2) capacity, in this case among researchers and key partners to engage in implementation research; 3) and supportive infrastructures to reinforce expectations for change and to sustain new behaviors as part of the norm. As part of a QUERI Series in Implementation Science, this article describes the framework's application in an innovative integration of health services research, policy, and clinical care delivery. QUERI's experience and success provide a case study in organizational change. It demonstrates that progress requires a strategic, systems-based effort. QUERI's evidence-based initiative involved a deliberate cultural shift, requiring ongoing commitment in multiple forms and at multiple levels. 
    VA's commitment to QUERI came in the form of visionary leadership, targeted allocation of resources, infrastructure refinements

  16. An organizational framework and strategic implementation for system-level change to enhance research-based practice: QUERI Series

    Directory of Open Access Journals (Sweden)

    Mittman Brian S

    2008-05-01

    Full Text Available Abstract. Background: The continuing gap between available evidence and current practice in health care reinforces the need for more effective solutions, in particular related to organizational context. Considerable advances have been made within the U.S. Veterans Health Administration (VA) in systematically implementing evidence into practice. These advances have been achieved through a system-level program focused on collaboration and partnerships among policy makers, clinicians, and researchers. The Quality Enhancement Research Initiative (QUERI) was created to generate research-driven initiatives that directly enhance health care quality within the VA and, simultaneously, contribute to the field of implementation science. This paradigm-shifting effort provided a natural laboratory for exploring organizational change processes. This article describes the underlying change framework and implementation strategy used to operationalize QUERI. Strategic approach to organizational change: QUERI used an evidence-based organizational framework focused on three contextual elements: 1) cultural norms and values, in this case related to the role of health services researchers in evidence-based quality improvement; 2) capacity, in this case among researchers and key partners to engage in implementation research; 3) and supportive infrastructures to reinforce expectations for change and to sustain new behaviors as part of the norm. As part of a QUERI Series in Implementation Science, this article describes the framework's application in an innovative integration of health services research, policy, and clinical care delivery. Conclusion: QUERI's experience and success provide a case study in organizational change. It demonstrates that progress requires a strategic, systems-based effort. QUERI's evidence-based initiative involved a deliberate cultural shift, requiring ongoing commitment in multiple forms and at multiple levels. VA's commitment to QUERI came in the

  17. RESEARCH ON EXTENSION OF SPARQL ONTOLOGY QUERY LANGUAGE CONSIDERING THE COMPUTATION OF INDOOR SPATIAL RELATIONS

    Directory of Open Access Journals (Sweden)

    C. Li

    2015-05-01

    Full Text Available A method suitable for complex indoor semantic queries that considers the computation of indoor spatial relations is provided, according to the characteristics of indoor space. This paper designs an ontology model describing the space-related information of humans, events and indoor space objects (e.g. Storey and Room) as well as their relations to support indoor semantic queries. The ontology concepts are used in the IndoorSPARQL query language, which extends SPARQL syntax for representing and querying indoor space. Four types of specific primitives for indoor queries, "Adjacent", "Opposite", "Vertical" and "Contain", are defined as query functions in IndoorSPARQL to support quantitative spatial computations. A method is also proposed to analyze the query language. Finally, this paper adopts this method to realize indoor semantic queries on the study area by constructing the ontology model for the study building. The experimental results show that the proposed method can effectively support complex indoor space semantic queries.

  18. Dreamweaver CS6 HTML5, CSS3, responsive design, and jQuery

    CERN Document Server

    Karlins, David

    2013-01-01

    This book combines accessible, clear, engaging, and candid reference material, advice, and shortcuts with substantial step-by-step instructions for creating a wide range of HTML5 and CSS3 designs and page content in Dreamweaver. This book is geared towards experienced Dreamweaver web designers migrating to HTML5 and jQuery. It also targets web designers new to Dreamweaver who want to jump with two feet into the most current web design tools and features. While focused primarily on Dreamweaver CS5.5, the book includes content of value to readers using older versions of Dreamweaver with directions

  19. Professional XMPP Programming with JavaScript and jQuery

    CERN Document Server

    Moffitt, Jack

    2010-01-01

    Create real-time, highly interactive apps quickly with the powerful XMPP protocol. XMPP is a robust protocol used for a wide range of applications, including instant messaging, multi-user chat, voice and video conferencing, collaborative spaces, real-time gaming, data synchronization, and search. This book teaches you how to harness the power of XMPP in your own apps and presents you with all the tools you need to build the next generation of apps using XMPP or add new features to your current apps. Featuring the JavaScript language throughout and making use of the jQuery library, the book con

  20. New data structures for orthogonal range searching

    DEFF Research Database (Denmark)

    Alstrup, Stephen; Brodal, Gerth Stølting; Rauhe, Theis

    2000-01-01

    We present new general techniques for static orthogonal range searching problems in two and higher dimensions. For the general range reporting problem in ℝ^3, we achieve query time O(log n + k) using space O(n log^{1+ε} n), where n denotes the number of stored points and k the number of points to be re...

  1. Knowledge and theme discovery across very large biological data sets using distributed queries: a prototype combining unstructured and structured data.

    Directory of Open Access Journals (Sweden)

    Uma S Mudunuri

    Full Text Available As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.

  2. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  3. Towards Automatic Improvement of Patient Queries in Health Retrieval Systems

    Directory of Open Access Journals (Sweden)

    Nesrine KSENTINI

    2016-07-01

    Full Text Available With the adoption of health information technology for clinical health, e-health is becoming common practice today. Users of this technology find it difficult to seek information relevant to their needs due to the increasing amount of clinical and medical data on the web and their lack of knowledge of medical jargon. In this regard, a method is described that better serves users' needs by automatically adding to their queries new related terms that appear in the same context as the original query, in order to improve the final search results. This method is based on the assessment of semantic relationships, defined by a proposed statistical method, between a set of terms or keywords. Experiments were performed on the CLEF-eHealth-2015 database and the obtained results show the effectiveness of our proposed method.

  4. Managing and querying image annotation and markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-03-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  5. Query optimization for graph analytics on linked data using SPARQL

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Seokyong [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lim, Seung -Hwan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sukumar, Sreenivas R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Vatsavai, Ranga Raju [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.

  6. Query-Adaptive Reciprocal Hash Tables for Nearest Neighbor Search.

    Science.gov (United States)

    Liu, Xianglong; Deng, Cheng; Lang, Bo; Tao, Dacheng; Li, Xuelong

    2016-02-01

    Recent years have witnessed the success of binary hashing techniques in approximate nearest neighbor search. In practice, multiple hash tables are usually built using hashing to cover more desired results in the hit buckets of each table. However, little work has studied a unified approach to constructing multiple informative hash tables using any type of hashing algorithm. Meanwhile, multiple table search also lacks a generic query-adaptive and fine-grained ranking scheme that can alleviate the binary quantization loss suffered in the standard hashing techniques. To solve the above problems, in this paper, we first regard the table construction as a selection problem over a set of candidate hash functions. With the graph representation of the function set, we propose an efficient solution that sequentially applies normalized dominant set to finding the most informative and independent hash functions for each table. To further reduce the redundancy between tables, we explore the reciprocal hash tables in a boosting manner, where the hash function graph is updated with high weights emphasized on the misclassified neighbor pairs of previous hash tables. To refine the ranking of the retrieved buckets within a certain Hamming radius from the query, we propose a query-adaptive bitwise weighting scheme to enable fine-grained bucket ranking in each hash table, exploiting the discriminative power of its hash functions and their complement for nearest neighbor search. Moreover, we integrate such a scheme into the multiple table search using a fast, yet reciprocal table lookup algorithm within the adaptive weighted Hamming radius. In this paper, both the construction method and the query-adaptive search method are general and compatible with different types of hashing algorithms using different feature spaces and/or parameter settings. Our extensive experiments on several large-scale benchmarks demonstrate that the proposed techniques can significantly outperform both

  7. Most Recent Match Queries in On-Line Suffix Trees

    DEFF Research Database (Denmark)

    Larsson, N. Jesper

    2014-01-01

    A suffix tree is able to efficiently locate a pattern in an indexed string, but not in general the most recent copy of the pattern in an online stream, which is desirable in some applications. We study the most general version of the problem of locating a most recent match: supporting queries......-window indexing, and sketch a possible optimization for use in the special case of Lempel-Ziv compression....

  8. Federated query services provided by the Seamless SAR Archive project

    Science.gov (United States)

    Baker, S.; Bryson, G.; Buechler, B.; Meertens, C. M.; Crosby, C. J.; Fielding, E. J.; Nicoll, J.; Youn, C.; Baru, C.

    2013-12-01

    The NASA Advancing Collaborative Connections for Earth System Science (ACCESS) seamless synthetic aperture radar (SAR) archive (SSARA) project is a 2-year collaboration between UNAVCO, the Alaska Satellite Facility (ASF), the Jet Propulsion Laboratory (JPL), and OpenTopography at the San Diego Supercomputer Center (SDSC) to design and implement a seamless distributed access system for SAR data and derived data products (i.e. interferograms). A major milestone for the first year of the SSARA project was a unified application programming interface (API) for SAR data search and results at ASF and UNAVCO (WInSAR and EarthScope data archives) through the use of simple web services. A federated query service was developed using the unified APIs, providing users a single search interface for both archives (http://www.unavco.org/ws/brokered/ssara/sar/search). A command line client that utilizes this new service is provided as an open source utility for the community on GitHub (https://github.com/bakerunavco/SSARA). Further API development and enhancements added more InSAR-specific keywords and quality control parameters (Doppler centroid, Faraday rotation, InSAR stack size, and perpendicular baselines). To facilitate InSAR processing, the federated query service incorporated URLs for DEM (from OpenTopography) and tropospheric corrections (from the JPL OSCAR service) in addition to the URLs for SAR data. This federated query service will provide relevant QC metadata for selecting pairs of SAR data for InSAR processing and all the URLs necessary for interferogram generation. Interest from the international community has prompted an effort to incorporate other SAR data archives (the ESA Virtual Archive 4 and the DLR TerraSAR-X_SSC Geohazard Supersites and Natural Laboratories collections) into the federated query service which provide data for researchers outside the US and North America.

  9. A Simple Linear Ranking Algorithm Using Query Dependent Intercept Variables

    OpenAIRE

    Ailon, Nir

    2008-01-01

    The LETOR website contains three information retrieval datasets used as a benchmark for testing machine learning ideas for ranking. Algorithms participating in the challenge are required to assign score values to search results for a collection of queries, and are measured using standard IR ranking measures (NDCG, precision, MAP) that depend only on the relative score-induced order of the results. Similarly to many of the ideas proposed in the participating algorithms, we train a linear classifi...

  10. Capturing Ridge Functions in High Dimensions from Point Queries

    KAUST Repository

    Cohen, Albert

    2011-12-21

    Constructing a good approximation to a function of many variables suffers from the "curse of dimensionality". Namely, functions on ℝ^N with smoothness of order s can in general be captured with accuracy at most O(n^{-s/N}) using linear spaces or nonlinear manifolds of dimension n. If N is large and s is not, then n has to be chosen inordinately large for good accuracy. The large value of N often precludes reasonable numerical procedures. On the other hand, there is the common belief that real world problems in high dimensions have as their solution, functions which are more amenable to numerical recovery. This has led to the introduction of models for these functions that do not depend on smoothness alone but also involve some form of variable reduction. In these models it is assumed that, although the function depends on N variables, only a small number of them are significant. Another variant of this principle is that the function lives on a low dimensional manifold. Since the dominant variables (respectively the manifold) are unknown, this leads to new problems of how to organize point queries to capture such functions. The present paper studies where to query the values of a ridge function f(x) = g(a · x) when both a ∈ ℝ^N and g ∈ C[0,1] are unknown. We establish estimates on how well f can be approximated using these point queries under the assumptions that g ∈ C^s[0,1]. We also study the role of sparsity or compressibility of a in such query problems. © 2011 Springer Science+Business Media, LLC.

  11. Can Google Trends search queries contribute to risk diversification?

    Czech Academy of Sciences Publication Activity Database

    Krištoufek, Ladislav

    2013-01-01

    Vol. 3, No. 2713 (2013), pp. 1-5 ISSN 2045-2322 R&D Projects: GA ČR GA402/09/0965 Institutional support: RVO:67985556 Keywords: Google Trends * diversification * portfolio Subject RIV: AH - Economics Impact factor: 5.078, year: 2013 http://library.utia.cas.cz/separaty/2013/E/kristoufek-can google trends search queries contribute to risk diversification.pdf

  12. Ensemble learned vaccination uptake prediction using web search queries

    OpenAIRE

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields...

  13. Modeling and Querying Business Data with Artifact Lifecycle

    Directory of Open Access Journals (Sweden)

    Danfeng Zhao

    2015-01-01

    Full Text Available Business data is one of the current and future research frontiers, with big data characteristics such as high volume, high velocity, high privacy, and so forth. Most corporations view their business data as a valuable asset and invest in developing and making optimal use of these data. Unfortunately, present data management technology lags behind the requirements of the business big data era. Based on previous business process knowledge, a lifecycle of business data is modeled to achieve a consistent description between the data and processes. On this basis, a business data partition method based on user interest is proposed, which aims to minimize the number of interferential tuples. Then, to balance data privacy and data transmission cost, our strategy is to explore techniques to execute SQL queries over encrypted business data, split the computations of queries across the server and the client, and optimize the queries with a syntax tree. Finally, an instance is provided to verify the usefulness and availability of the proposed method.

  14. BAMQL: a query language for extracting reads from BAM files.

    Science.gov (United States)

    Masella, Andre P; Lalansingh, Christopher M; Sivasundaram, Pragash; Fraser, Michael; Bristow, Robert G; Boutros, Paul C

    2016-08-11

    It is extremely common to need to select a subset of reads from a BAM file based on their specific properties. Typically, a user unpacks the BAM file to a text stream using SAMtools, parses and filters the lines using AWK, then repacks them using SAMtools. This process is tedious and error-prone. In particular, when working with many columns of data, mix-ups are common and the bit field containing the flags is unintuitive. There are several libraries for reading BAM files, such as Bio-SamTools for Perl and pysam for Python. Both allow access to the BAM's read information and can filter reads, but require substantial boilerplate code; this is high overhead for mostly ad hoc filtering. We have created a query language that gathers reads using a collection of predicates and common logical connectives. Queries run faster than equivalents and can be compiled to native code for embedding in larger programs. BAMQL provides a user-friendly, powerful and performant way to extract subsets of BAM files for ad hoc analyses or integration into applications. The query language provides a collection of predicates beyond those in SAMtools, and more flexible connectives.
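
    The idea of gathering reads with predicates joined by logical connectives can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not BAMQL's syntax or implementation: the Read record and the helper names (has_flag, on_chrom, min_mapq, all_of, any_of, negate, select) are hypothetical, and only the flag bit values are taken from the SAM format.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Flag bits as defined by the SAM format.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10


@dataclass(frozen=True)
class Read:
    """Hypothetical minimal read record; real BAM records carry many more fields."""
    name: str
    flag: int
    chrom: str
    mapq: int


Predicate = Callable[[Read], bool]


def has_flag(bit: int) -> Predicate:
    """True when the given bit is set in the read's flag field."""
    return lambda r: bool(r.flag & bit)


def on_chrom(chrom: str) -> Predicate:
    """True when the read is placed on the named reference sequence."""
    return lambda r: r.chrom == chrom


def min_mapq(threshold: int) -> Predicate:
    """True when the mapping quality is at least the threshold."""
    return lambda r: r.mapq >= threshold


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND over predicates."""
    return lambda r: all(p(r) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR over predicates."""
    return lambda r: any(p(r) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda r: not pred(r)


def select(reads: Iterable[Read], pred: Predicate) -> List[Read]:
    """Keep the reads that satisfy the predicate, preserving input order."""
    return [r for r in reads if pred(r)]
```

    A query such as all_of(on_chrom("chr1"), negate(has_flag(FLAG_UNMAPPED)), min_mapq(30)) then reads close to its plain-language meaning, which is the readability gain over hand-written bit masks that the abstract points to.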

  15. Query-Biased Preview over Outsourced and Encrypted Data

    Directory of Open Access Journals (Sweden)

    Ningduo Peng

    2013-01-01

    document to check if it contains the desired content. An informative query-biased preview feature, as applied in modern search engines, could help the users to learn about the content without downloading the entire document. However, when the data are encrypted, securely extracting a keyword-in-context snippet from the data as a preview becomes a challenge. Based on private information retrieval protocol and the core concept of searchable encryption, we propose a single-server and two-round solution to securely obtain a query-biased snippet over the encrypted data from the server. We achieve this novel result by making a document (plaintext) previewable under any cryptosystem and constructing a secure index to support dynamic computation for a best matched snippet when queried by some keywords. For each document, the scheme has O(d) storage complexity and O(log(d/s) + s + d/s) communication complexity, where d is the document size and s is the snippet length.

  16. On the query reformulation technique for effective MEDLINE document retrieval.

    Science.gov (United States)

    Yoo, Sooyoung; Choi, Jinwook

    2010-10-01

    Improving the retrieval accuracy of MEDLINE documents is still a challenging issue due to low retrieval precision. Focusing on a query expansion technique based on pseudo-relevance feedback (PRF), this paper addresses the problem by systematically examining the effects of expansion term selection and adjustment of the term weights of the expanded query using a set of MEDLINE test documents called OHSUMED. Implementing a baseline information retrieval system based on the Okapi BM25 retrieval model, we compared six well-known term ranking algorithms for useful expansion term selection and then compared traditional term reweighting algorithms with our new variant of the standard Rocchio's feedback formula, which adopts a group-based weighting scheme. Our experimental results on the OHSUMED test collection showed a maximum improvement of 20.2% and 20.4% for mean average precision and recall measures over unexpanded queries when terms were expanded using a co-occurrence analysis-based term ranking algorithm in conjunction with our term reweighting algorithm (p-valueretrieval.
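
    For reference, the standard Rocchio feedback formula that the abstract mentions is usually written as below. This is the textbook form only; the paper's group-based weighting variant is not described in the abstract and is not reproduced here. The symbols are the conventional ones and are assumptions of this note: q_0 is the original query vector, D_r and D_nr are the sets of relevant and non-relevant document vectors, and α, β, γ are tuning weights.

```latex
\vec{q}_{\mathrm{new}}
  = \alpha\,\vec{q}_{0}
  + \beta\,\frac{1}{|D_r|}\sum_{\vec{d}_j \in D_r}\vec{d}_j
  - \gamma\,\frac{1}{|D_{nr}|}\sum_{\vec{d}_j \in D_{nr}}\vec{d}_j
```

    In pseudo-relevance feedback there are no user judgments, so D_r is taken to be the top-ranked documents of the initial retrieval, and the negative term is commonly dropped by setting γ = 0.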

  17. Searchable Data Vault: Encrypted Queries in Secure Distributed Cloud Storage

    Directory of Open Access Journals (Sweden)

    Geong Sen Poh

    2017-05-01

    Full Text Available Cloud storage services allow users to efficiently outsource their documents anytime and anywhere. Such convenience, however, leads to privacy concerns. While storage providers may not read users’ documents, attackers may possibly gain access by exploiting vulnerabilities in the storage system. Documents may also be leaked by curious administrators. A simple solution is for the user to encrypt all documents before submitting them. This method, however, makes it impossible to efficiently search for documents as they are all encrypted. To resolve this problem, we propose a multi-server searchable symmetric encryption (SSE) scheme and construct a system called the searchable data vault (SDV). A unique feature of the scheme is that it allows an encrypted document to be divided into blocks and distributed to different storage servers so that no single storage provider has a complete document. By incorporating the scheme, the SDV protects the privacy of documents while allowing for efficient private queries. It utilizes a web interface and a controller that manages user credentials, query indexes and submission of encrypted documents to cloud storage services. It is also the first system that enables a user to simultaneously outsource and privately query documents from a few cloud storage services. Our preliminary performance evaluation shows that this feature introduces acceptable computation overheads when compared to submitting documents directly to a cloud storage service.

  18. FastQuery: A Parallel Indexing System for Scientific Data

    Energy Technology Data Exchange (ETDEWEB)

    Chou, Jerry; Wu, Kesheng; Prabhat,

    2011-07-29

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies such as FastBit can significantly improve accesses to these datasets by augmenting the user data with indexes and other secondary information. However, a challenge is that the indexes assume the relational data model but the scientific data generally follows the array data model. To match the two data models, we design a generic mapping mechanism and implement an efficient input and output interface for reading and writing the data and their corresponding indexes. To take advantage of the emerging many-core architectures, we also develop a parallel strategy for indexing using threading technology. This approach complements our on-going MPI-based parallelization efforts. We demonstrate the flexibility of our software by applying it to two of the most commonly used scientific data formats, HDF5 and NetCDF. We present two case studies using data from a particle accelerator model and a global climate model. We also conducted a detailed performance study using these scientific datasets. The results show that FastQuery speeds up the query time by a factor of 2.5x to 50x, and it reduces the indexing time by a factor of 16 on 24 cores.

  19. Private and Efficient Query Processing on Outsourced Genomic Databases.

    Science.gov (United States)

    Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

    2017-09-01

    Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing genomic sequences is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations, and thus, not available for public usage. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records takes around 100 and 150 s, respectively.

  20. Federated querying architecture with clinical & translational health IT application.

    Science.gov (United States)

    Livne, Oren E; Schultz, N Dustin; Narus, Scott P

    2011-10-01

    We present a software architecture that federates data from multiple heterogeneous health informatics data sources owned by multiple organizations. The architecture builds upon state-of-the-art open-source Java and XML frameworks in innovative ways. It consists of (a) federated query engine, which manages federated queries and result set aggregation via a patient identification service; and (b) data source facades, which translate the physical data models into a common model on-the-fly and handle large result set streaming. System modules are connected via reusable Apache Camel integration routes and deployed to an OSGi enterprise service bus. We present an application of our architecture that allows users to construct queries via the i2b2 web front-end, and federates patient data from the University of Utah Enterprise Data Warehouse and the Utah Population database. Our system can be easily adopted, extended and integrated with existing SOA Healthcare and HL7 frameworks such as i2b2 and caGrid.

  1. GEOYASGUI: THE GEOSPARQL QUERY EDITOR AND RESULT SET VISUALIZER

    Directory of Open Access Journals (Sweden)

    W. Beek

    2017-07-01

    Full Text Available The Netherlands' Cadastre, Land Registry and Mapping Agency – in short Kadaster – collects and registers administrative and spatial data on property and the rights involved. This includes ships, aircraft and telecommunications networks. In doing so, Kadaster protects legal certainty. Kadaster publishes many large authoritative datasets including several key registers of the Dutch Government (Topography, Addresses and Buildings). Furthermore, Kadaster is also developing and maintaining the PDOK shared service, in which about 100 spatial datasets are being published in several formats, including an incredible amount of detailed geospatial objects. Geospatial objects include all plots of land, all buildings, all roads and all lampposts. These objects are spatially and/or conceptually related, but are maintained by different data curators. As a result these datasets are syntactically and architecturally disjoint, and using them together currently requires non-trivial human labor. In response to this, Kadaster is currently publishing its geo-spatial data assets as Linked Open Data. The standardized query language for Linked Open Geodata is GeoSPARQL. Unfortunately, current tooling does not support writing and evaluating GeoSPARQL queries. This paper presents GeoYASGUI, a GeoSPARQL editor and result-set viewer with IDE capabilities. GeoYASGUI is not a new software product, but an integration of and a collection of updates to existing Open Source libraries. With GeoYASGUI it becomes possible to query the rich Open Data assets of the Kadaster.

  2. Development of Semantic Web - Markup Languages, Web Services, Rules, Explanation, Querying, Proof and Reasoning

    National Research Council Canada - National Science Library

    McGuinness, Deborah

    2008-01-01

    ...-S), the Web Ontology Query Language (OWL-QL) and Semantic Web Rule Language (SWRL) W3C submissions. This report contains the evolution of these markup languages as well as a discussion of semantic query languages, proof and explanation...

  3. Ad-hoc Content-based Queries and Data Analysis for Virtual Observatories, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Aquilent, Inc. proposes to support ad-hoc, content-based query and data retrieval from virtual observatories (VxO) by developing 1) Higher Order Query Services that...

  4. Evaluation of DNA extraction methods and their clinical application for direct detection of causative bacteria in continuous ambulatory peritoneal dialysis culture fluids from patients with peritonitis by using broad-range PCR.

    Science.gov (United States)

    Kim, Si Hyun; Jeong, Haeng Soon; Kim, Yeong Hoon; Song, Sae Am; Lee, Ja Young; Oh, Seung Hwan; Kim, Hye Ran; Lee, Jeong Nyeo; Kho, Weon-Gyu; Shin, Jeong Hwan

    2012-03-01

    The aims of this study were to compare several DNA extraction methods and 16S rDNA primers and to evaluate the clinical utility of broad-range PCR in continuous ambulatory peritoneal dialysis (CAPD) culture fluids. Six type strains were used as model organisms in dilutions from 10^8 to 10^0 colony-forming units (CFU)/mL for the evaluation of 5 DNA extraction methods and 5 PCR primer pairs. Broad-range PCR was applied to 100 CAPD culture fluids, and the results were compared with conventional culture results. There were some differences between the various DNA extraction methods and primer sets with regard to the detection limits. The InstaGene Matrix (Bio-Rad Laboratories, USA) and Exgene Clinic SV kits (GeneAll Biotechnology Co. Ltd, Korea) seem to have higher sensitivities than the others. The results of broad-range PCR were concordant with the results from culture in 97% of all cases (97/100). Two culture-positive cases that were broad-range PCR-negative were identified as Candida albicans, and 1 PCR-positive but culture-negative sample was identified as Bacillus circulans by sequencing. Two samples among 54 broad-range PCR-positive products could not be sequenced. There were differences in the analytical sensitivity of various DNA extraction methods and primers for broad-range PCR. The broad-range PCR assay can be used to detect bacterial pathogens in CAPD culture fluid as a supplement to culture methods.

  5. Continuous sampling from distributed streams

    DEFF Research Database (Denmark)

    Graham, Cormode; Muthukrishnan, S.; Yi, Ke

    2012-01-01

    A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple...... distributed sites. The main challenge is to ensure that a sample is drawn uniformly across the union of the data while minimizing the communication needed to run the protocol on the evolving data. At the same time, it is also necessary to make the protocol lightweight, by keeping the space and time costs low...... for each participant. In this article, we present communication-efficient protocols for continuously maintaining a sample (both with and without replacement) from k distributed streams. These apply to the case when we want a sample from the full streams, and to the sliding window cases of only the W most...
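    The idea of maintaining one uniform sample over the union of several streams can be illustrated with a minimal sketch. It is not the protocol from the article: it assumes a well-known random-tag scheme in which every item receives a uniform random tag and a coordinator keeps the s items with the smallest tags. The names Coordinator, Site and run_demo are hypothetical, and sites read the coordinator's threshold directly instead of receiving it in messages.

```python
import heapq
import random


class Coordinator:
    """Keeps the sample_size items with the smallest random tags seen so far."""

    def __init__(self, sample_size):
        self.sample_size = sample_size
        self._heap = []  # max-heap on tag, stored as (-tag, item)

    @property
    def threshold(self):
        # Until the sample is full every item qualifies (tags lie in [0, 1)).
        if len(self._heap) < self.sample_size:
            return 1.0
        return -self._heap[0][0]

    def offer(self, tag, item):
        if len(self._heap) < self.sample_size:
            heapq.heappush(self._heap, (-tag, item))
        elif tag < self.threshold:
            # Evict the item with the largest tag and keep the new one.
            heapq.heapreplace(self._heap, (-tag, item))

    def sample(self):
        return [item for _, item in self._heap]


class Site:
    """One stream site; forwards an item only if its tag beats the threshold."""

    def __init__(self, coordinator, rng):
        self.coordinator = coordinator
        self.rng = rng
        self.messages_sent = 0

    def observe(self, item):
        tag = self.rng.random()
        # Simplification: a real protocol would use a locally cached threshold.
        if tag < self.coordinator.threshold:
            self.messages_sent += 1
            self.coordinator.offer(tag, item)


def run_demo(num_sites=4, items_per_site=1000, sample_size=10, seed=0):
    rng = random.Random(seed)
    coordinator = Coordinator(sample_size)
    sites = [Site(coordinator, rng) for _ in range(num_sites)]
    for i in range(items_per_site):
        for site_id, site in enumerate(sites):
            site.observe((site_id, i))
    return coordinator.sample(), sum(s.messages_sent for s in sites)
```

    Because the tags are independent and uniform, the s smallest tags identify a uniform sample without replacement over all items seen so far. Most items are filtered at the sites once the threshold has dropped, which is where the communication saving comes from.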

  6. A study of medical and health queries to web search engines.

    Science.gov (United States)

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i) comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii) comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii) medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i) a small percentage of web queries are medical or health related, (ii) the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii) over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggest some implications for the use of general web search engines when seeking medical/health information.

  7. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion

    DEFF Research Database (Denmark)

    Sordoni, Alessandro; Bengio, Yoshua; Vahabi, Hossein

    2015-01-01

    Users may strive to formulate an adequate textual query for their information need. Search engines assist the users by presenting query suggestions. To preserve the original search intent, suggestions should be context-aware and account for the previous queries issued by the user. Achieving conte...

  8. Query optimization through the looking glass, and what we found running the Join Order Benchmark

    NARCIS (Netherlands)

    V. Leis (Viktor); B. Radke (Bernhard); A. Gubichev (Andrey); A. Mirchev (Atanas); P.A. Boncz (Peter); A. Kemper (Alfons); T. Neumann (Thomas)

    2018-01-01

    Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark that works on real-life data riddled with correlations and introduces 113 complex join queries. We experimentally revisit the main components in the classic query optimizer

  9. Audio Query by Example Using Similarity Measures between Probability Density Functions of Features

    Directory of Open Access Journals (Sweden)

    Marko Helén

    2010-01-01

    Full Text Available This paper proposes a query by example system for generic audio. We estimate the similarity of the example signal and the samples in the queried database by calculating the distance between the probability density functions (pdfs) of their frame-wise acoustic features. Since the features are continuous valued, we propose to model them using Gaussian mixture models (GMMs) or hidden Markov models (HMMs). The models parametrize each sample efficiently and retain sufficient information for similarity measurement. To measure the distance between the models, we apply a novel Euclidean distance, approximations of Kullback-Leibler divergence, and a cross-likelihood ratio test. The performance of the measures was tested in simulations where audio samples are automatically retrieved from a general audio database, based on the estimated similarity to a user-provided example. The simulations show that the distance between probability density functions is an accurate measure for similarity. Measures based on GMMs or HMMs are shown to produce better results than that of the existing methods based on simpler statistics or histograms of the features. A good performance with low computational cost is obtained with the proposed Euclidean distance.
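    A Euclidean (L2) distance between two Gaussian mixture densities has a closed form, which may help explain the low computational cost mentioned above. The sketch below is an illustration under stated assumptions, not the paper's implementation: it uses one-dimensional mixtures given as lists of (weight, mean, variance) tuples, whereas real acoustic features are multivariate, and the function names are hypothetical. Whether the paper's measure matches this formula exactly is not confirmed by the abstract.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def cross_term(gmm_p, gmm_q):
    """Integral of p(x) * q(x) dx for two 1-D GMMs.

    Uses the identity: the integral of N(x; m1, v1) * N(x; m2, v2) over x
    equals N(m1; m2, v1 + v2).
    """
    return sum(
        wp * wq * gauss_pdf(mp, mq, vp + vq)
        for wp, mp, vp in gmm_p
        for wq, mq, vq in gmm_q
    )


def squared_l2_distance(gmm_p, gmm_q):
    """Integral of (p(x) - q(x))**2 dx, expanded into three cross terms."""
    return (
        cross_term(gmm_p, gmm_p)
        - 2.0 * cross_term(gmm_p, gmm_q)
        + cross_term(gmm_q, gmm_q)
    )
```

    For retrieval, each database sample would be ranked by its distance to the example's model. No sampling or numerical integration is needed, since every term is a single Gaussian evaluation.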

  10. AQUAdexIM: highly efficient in-memory indexing and querying of astronomy time series images

    Science.gov (United States)

    Hong, Zhi; Yu, Ce; Wang, Jie; Xiao, Jian; Cui, Chenzhou; Sun, Jizhou

    2016-12-01

    Astronomy has always been, and will continue to be, a data-based science, and astronomers nowadays are faced with increasingly massive datasets, one key problem of which is to efficiently retrieve the desired cup of data from the ocean. AQUAdexIM, an innovative spatial indexing and querying method, performs highly efficient on-the-fly queries under users' request to search for Time Series Images from existing observation data on the server side and only return the desired FITS images to users, so users no longer need to download entire datasets to their local machines, which will only become more and more impractical as the data size keeps increasing. Moreover, AQUAdexIM manages to keep a very low storage space overhead and its specially designed in-memory index structure enables it to search for Time Series Images of a given area of the sky 10 times faster than using Redis, a state-of-the-art in-memory database.

  11. Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea.

    Science.gov (United States)

    Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

    2016-07-04

    As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P... search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data.

  12. QCS : a system for querying, clustering, and summarizing documents.

    Energy Technology Data Exchange (ETDEWEB)

    Dunlavy, Daniel M.

    2006-08-01

    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence ''trimming'', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of

  13. QCS: a system for querying, clustering and summarizing documents.

    Energy Technology Data Exchange (ETDEWEB)

    Dunlavy, Daniel M.; Schlesinger, Judith D. (Center for Computing Sciences, Bowie, MD); O' Leary, Dianne P. (University of Maryland, College Park, MD); Conroy, John M. (Center for Computing Sciences, Bowie, MD)

    2006-10-01

    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design

  14. On Space Efficient Two Dimensional Range Minimum Data Structures

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Davoodi, Pooya; Rao, S. Srinivasa

    2012-01-01

    The two dimensional range minimum query problem is to preprocess a static m by n matrix (two dimensional array) A of size N=m⋅n, such that subsequent queries, asking for the position of the minimum element in a rectangular range within A, can be answered efficiently. We study the trade-off between the space and query time of the problem. We show that every algorithm enabled to access A during the query and using a data structure of size O(N/c) bits requires Ω(c) query time, for any c where 1≤c≤N. This lower bound holds for arrays of any dimension. In particular, for the one dimensional version of the problem, the lower bound is tight up to a constant factor. In two dimensions, we complement the lower bound with an indexing data structure of size O(N/c) bits which can be preprocessed in O(N) time to support O(c log^2 c) query time. For c=O(1), this is the first O(1) query time algorithm using a data...
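    To make the query itself concrete, here is a minimal sketch of a two dimensional range minimum query. It is deliberately not the space-efficient structure from the paper: it assumes a simple per-row sparse table, which uses O(N log n) words and answers a query in time proportional to the number of rows in the range. The function names are hypothetical.

```python
def build_row_tables(matrix):
    """For each row, build a sparse table of column indices of range minima.

    Level k of a row's table stores, for each start column j, the column of
    the minimum among columns j .. j + 2**k - 1 of that row.
    """
    tables = []
    for row in matrix:
        n = len(row)
        levels = [list(range(n))]  # level 0: every cell is its own minimum
        span = 1
        while 2 * span <= n:
            prev = levels[-1]
            cur = []
            for j in range(n - 2 * span + 1):
                left, right = prev[j], prev[j + span]
                cur.append(left if row[left] <= row[right] else right)
            levels.append(cur)
            span *= 2
        tables.append(levels)
    return tables


def range_min_position(matrix, tables, r1, c1, r2, c2):
    """Position (row, col) of the minimum in rows r1..r2, cols c1..c2, inclusive."""
    width = c2 - c1 + 1
    k = width.bit_length() - 1  # largest k with 2**k <= width
    best = None
    for r in range(r1, r2 + 1):
        row, level = matrix[r], tables[r][k]
        # Two overlapping power-of-two windows cover columns c1..c2.
        left, right = level[c1], level[c2 - (1 << k) + 1]
        c = left if row[left] <= row[right] else right
        if best is None or row[c] < matrix[best[0]][best[1]]:
            best = (r, c)
    return best
```

    The contrast with the abstract is the point of the example: this table is much larger than the input, while the paper studies how little extra space (O(N/c) bits) suffices when the query may still read A.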

  15. JavaScript & jQuery The Missing Manual

    CERN Document Server

    McFarland, David

    2011-01-01

    JavaScript lets you supercharge your HTML with animation, interactivity, and visual effects, but many web designers find the language hard to learn. This jargon-free guide covers JavaScript basics and shows you how to save time and effort with the jQuery library of prewritten JavaScript code. You'll soon be building web pages that feel and act like desktop programs, without having to do much programming. The important stuff you need to know: Make your pages interactive. Create JavaScript events that react to visitor actions. Use animations and effects. Build drop-down navigation menus, pop-ups

  16. jQuery 2.0 development cookbook

    CERN Document Server

    Revill, Leon

    2014-01-01

    Taking a recipe-based approach, this book presents numerous practical examples that you can use directly in your applications. The book covers the essential issues you will face while developing your web applications and gives you solutions to them. The recipes in this book are written in a manner that rapidly takes you from beginner to expert level. This book is for web developers of all skill levels. Although some knowledge of JavaScript, HTML, and CSS is required, this Cookbook will teach jQuery newcomers all the basics required to move on to the more complex examples of this book, which wil

  17. DiffTool: building, visualizing and querying protein clusters.

    Science.gov (United States)

    Chetouani, Farid; Glaser, Philippe; Kunst, Frank

    2002-08-01

    DiffTool is a resource to build and visualize protein clusters computed from a sequence database. The package provides a clustering tool to construct protein families according to sequence similarities and a web interface to query the corresponding clusters. A subtractive genome analysis tool selects protein families specific for a genome or a group of genomes. For each protein cluster, DiffTool includes access to sequences, coloured multiple alignments and phylogenetic trees. A cluster database built from yeast and complete prokaryotic genomes is queryable at http://bioweb.pasteur.fr/seqanal/difftool. All the Perl sources are freely available to non-profit organizations upon request.

  18. Data management and query processing in semantic web databases

    CERN Document Server

    Groppe, Sven

    2011-01-01

    The Semantic Web, which is intended to establish a machine-understandable Web, is currently changing from being an emerging trend to a technology used in complex real-world applications. A number of standards and techniques have been developed by the World Wide Web Consortium (W3C), e.g., the Resource Description Framework (RDF), which provides a general method for conceptual descriptions for Web resources, and SPARQL, an RDF querying language. Recent examples of large RDF data with billions of facts include the UniProt comprehensive catalog of protein sequence, function and annotation data, t

  19. Securing Document Warehouses against Brute Force Query Attacks

    Directory of Open Access Journals (Sweden)

    Sergey Vladimirovich Zapechnikov

    2017-04-01

    Full Text Available The paper presents a data management scheme and protocols for securing a document collection against adversarial users who try to abuse their access rights to find out the full content of confidential documents. The configuration of a secure document retrieval system is described, and a suite of protocols among the clients, warehouse server, audit server and database management server is specified. The scheme makes it infeasible for clients to establish correspondence between the documents relevant to different search queries until a moderator grants access to these documents. The proposed solution ensures a higher security level for document warehouses.

  20. Query Processing and Interlinking of Fuzzy Object-Oriented Database

    OpenAIRE

    Shweta Dwivedi; Santosh Kumar

    2017-01-01

    Because of the many limitations and poor data handling of existing relational databases, software professionals and researchers have moved towards object-oriented databases, which are better able to handle complex real-world data, i.e., clear and crisp data, and can also perform large and complex queries effectively. On the other hand, a new database approach, the Fuzzy Object-Oriented Database (FOOD), has been introduced; it has all the ...

  1. Ensemble learned vaccination uptake prediction using web search queries

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official...... vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields comparative performance. To our knowledge, this is the first study to predict vaccination uptake...

  2. Technologies for conceptual modelling and intelligent query formulation

    CSIR Research Space (South Africa)

    Alberts, R

    2008-11-01

    Full Text Available modelling and intelligent query formulation R ALBERTS1, K BRITZ1, A GERBER1, K HALLAND1,2, T MEYER1, L PRETORIUS1,2 (1) Knowledge Systems Group, Meraka Institute, CSIR, Pretoria, Gauteng, South Africa (2) School of Computing, University of South... this problem still more pressing: in this case an excess of information can be equivalent to an absence of information It is therefore necessary to use tools that organise data into intelligible and easily-accessible structures and return answers...

  3. Real-Time Spatial Queries for Moving Objects Using Storm Topology

    Directory of Open Access Journals (Sweden)

    Feng Zhang

    2016-09-01

    Full Text Available With the rapid development of mobile data acquisition technology, the volume of available spatial data is growing at an increasingly fast pace. The real-time processing of big spatial data has become a research frontier in the field of Geographic Information Systems (GIS). To cope with these highly dynamic data, we aim to reduce the time complexity of data updating by modifying the traditional spatial index. However, existing algorithms and data structures are based on single work nodes, which are incapable of handling the required high numbers and update rates of moving objects. In this paper, we present a distributed spatial index based on Apache Storm, an open-source distributed real-time computation system. Using this approach, we compare the range and K-nearest neighbor (KNN) query efficiency of four spatial indexes on a single dataset and introduce a method of performing spatial joins between two moving datasets. In particular, we build a secondary distributed index for spatial join queries based on the grid-partition index. Finally, a series of experiments are presented to explore the factors that affect the performance of the distributed index and to demonstrate the feasibility of the proposed distributed index based on Storm. As a real-world application, this approach has been integrated into an information system that provides real-time traffic decision support.
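    Range and KNN queries over moving objects can be illustrated with a minimal single-node sketch of a uniform grid index. It is an assumption-laden illustration only: it does not use Apache Storm, it is not the paper's distributed index, and the class name GridIndex and its methods are hypothetical.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over the plane; each cell holds the ids of objects in it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (cx, cy) -> set of object ids
        self.positions = {}            # object id -> (x, y)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def update(self, obj_id, x, y):
        """Insert a new object or move an existing one (constant-time update)."""
        old = self.positions.get(obj_id)
        if old is not None:
            self.cells[self._cell(*old)].discard(obj_id)
        self.positions[obj_id] = (x, y)
        self.cells[self._cell(x, y)].add(obj_id)

    def range_query(self, xmin, ymin, xmax, ymax):
        """Ids of objects inside the axis-aligned rectangle, sorted."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for obj_id in self.cells.get((cx, cy), ()):
                    x, y = self.positions[obj_id]
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(obj_id)
        return sorted(hits)

    def knn(self, x, y, k):
        """Ids of the k nearest objects, found by scanning rings of cells."""
        qx, qy = self._cell(x, y)
        found = []  # (distance, obj_id)
        seen = 0
        ring = 0
        while seen < len(self.positions):
            for cx in range(qx - ring, qx + ring + 1):
                for cy in range(qy - ring, qy + ring + 1):
                    if max(abs(cx - qx), abs(cy - qy)) != ring:
                        continue  # only cells on the current ring
                    for obj_id in self.cells.get((cx, cy), ()):
                        ox, oy = self.positions[obj_id]
                        found.append((math.hypot(ox - x, oy - y), obj_id))
                        seen += 1
            found.sort()
            # Objects beyond this ring are farther than ring * cell_size.
            if len(found) >= k and found[k - 1][0] <= ring * self.cell_size:
                break
            ring += 1
        return [obj_id for _, obj_id in found[:k]]
```

    A grid is a common choice for frequently updated positions because moving an object only touches two cells. In a distributed setting one would presumably assign cells or groups of cells to worker nodes, but how the paper does this is not stated in the abstract.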

  4. Framing memories: How the retrieval query format shapes the neural bases of remembering.

    Science.gov (United States)

    Raposo, Ana; Frade, Sofia; Alves, Mara

    2016-08-01

    The way memory questions are framed influences the information that is searched, retrieved, and monitored during remembering. This fMRI study aimed at clarifying how the format of the retrieval query shapes the neural basis of source recollection. During encoding, participants made semantic (pleasantness) or perceptual (number of letters) judgments about words. Subsequently, in a source memory test, the retrieval query was manipulated such that for half of the items from each encoding task, the retrieval query emphasized the semantic source (i.e., semantic query format: "Is this word from the pleasantness task?"), whereas for the other half the retrieval query emphasized the alternate, perceptual source (i.e., perceptual query format: "Is this word from the letter task?"). The results showed that the semantic query format was associated with higher source recognition than the perceptual query format. This behavioral advantage was accompanied by increased activation in several regions associated with controlled semantic elaboration and monitoring of internally-generated features about the past event. In particular, for items semantically encoded, the semantic query, relative to the perceptual query, induced activation in medial prefrontal cortex (PFC), hippocampal, parahippocampal and middle temporal cortex. Conversely, for items perceptually encoded, the semantic query recruited the lateral PFC and occipital-fusiform areas. Interestingly, the semantic format also influenced the processing of new items, eliciting greater L lateral and medial PFC activation. In contrast, the perceptual query format (versus the semantic format) only prompted greater activation in R orbitofrontal cortex and the R inferior parietal lobe, for items encoded in a perceptual manner and for new items, respectively. The results highlight the role of the retrieval query format in source remembering, showing that the retrieval query that emphasizes the semantic source promotes the use of semantic

  5. Symbolic representation and visual querying of left ventricular image sequences.

    Science.gov (United States)

    Baroni, M; Del Bimbo, A; Evangelist, A; Vicario, E

    1999-01-01

    In the evaluation of regional left ventricular function, relevant cardiac disorders manifest themselves not only in static features, such as shape descriptors and motion excursion in end-diastolic and end-systolic frames, but also in their temporal evolution. In common diagnostic practice, such dynamic patterns are analysed by direct inspection of frame sequences through the use of a moviola. This permits only a subjective and poorly defined evaluation of functional parameters, and definitely prevents a systematic and reproducible analysis of large sets of reports. Retrieval by contents techniques may overcome this limitation by permitting the automatic comparison of the reports in a database against queries expressing descriptive properties related to significant pathological conditions. A system is presented which is aimed at investigating the potential of this approach by supporting retrieval by contents from a database of cineangiographic or echocardiographic images. The system relies on a symbolic description of both geometrical and temporal properties of left ventricular contours. This is derived automatically by an image processing and interpretation module and associated with the report at its storage time. In the retrieval stage, queries are expressed by means of an iconic visual language which describes searched content properties over a computer screen. The system automatically interprets iconic statements and compares them against concrete descriptions in the database. This enables medical users to interact with the system to search for motion and shape abnormalities on a regional basis, in single or homogeneous groups of reports, so as to enable both prospective and retrospective diagnosis.

  6. Efficient Partitioning of Large Databases without Query Statistics

    Directory of Open Access Journals (Sweden)

    Shahidul Islam KHAN

    2016-11-01

    Full Text Available An efficient way of improving the performance of a database management system is distributed processing. Distribution of data involves fragmentation or partitioning, replication, and allocation process. Previous research works provided partitioning based on empirical data about the type and frequency of the queries. These solutions are not suitable at the initial stage of a distributed database as query statistics are not available then. In this paper, I have presented a fragmentation technique, Matrix based Fragmentation (MMF), which can be applied at the initial stage as well as at later stages of distributed databases. Instead of using empirical data, I have developed a matrix, Modified Create, Read, Update and Delete (MCRUD), to partition a large database properly. Allocation of fragments is done simultaneously in my proposed technique. So using MMF, no additional complexity is added for allocating the fragments to the sites of a distributed database as fragmentation is synchronized with allocation. The performance of a DDBMS can be improved significantly by avoiding frequent remote access and high data transfer among the sites. Results show that proposed technique can solve the initial partitioning problem of large distributed databases.

  7. Representing and querying now-relative relational medical data.

    Science.gov (United States)

    Anselma, Luca; Piovesan, Luca; Stantic, Bela; Terenziani, Paolo

    2018-03-01

    Temporal information plays a crucial role in medicine. Patients' clinical records are intrinsically temporal. Thus, in Medical Informatics there is an increasing need to store, support and query temporal data (particularly in relational databases), in order, for instance, to supplement decision-support systems. In this paper, we show that current approaches to relational data have remarkable limitations in the treatment of "now-relative" data (i.e., data holding true at the current time). This can severely compromise their applicability in general, and specifically in the medical context, where "now-relative" data are essential to assess the current status of the patients. We propose a theoretically grounded and application-independent relational approach to cope with now-relative data (which can be paired, e.g., with different decision support systems) overcoming such limitations. We propose a new temporal relational representation, which is the first relational model coping with the temporal indeterminacy intrinsic in now-relative data. We also propose new temporal algebraic operators to query them, supporting the distinction between possible and necessary time, and Allen's temporal relations between data. We exemplify the impact of our approach, and study the theoretical and computational properties of the new representation and algebra. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Cardinal Direction Relations Query Modeling Based on Geo-Ontology

    Science.gov (United States)

    Zhu, X.; Chen, D.; Zhou, C.; Li, M.; Xiao, W.

    2012-08-01

    Direction relations, an important class of spatial relationships, are expressed simply as object properties in traditional geo-ontology. The lack of explicit specifications and reasoning rules for direction relations in geo-ontology makes spatial reasoning difficult or inflexible. Digital gazetteers also provide information on named features, linking a feature's name with its location and type. Although the location information is incomplete and inexact, implicit spatial information, for example spatial relationships and spatial scale, can be extracted using appropriate models based on geo-ontology. In this paper, we propose a novel conceptual framework of direction relations in order to formalize their semantics and implicit information, and present an algorithm, based on previous research, for extracting implicit information, which produces a complete query instance of direction relations. Finally, the most suitable physical direction model is recommended to the calculation module according to relevant rules. The experimental results show that this direction query model not only extracts the implicit information effectively but also gives a reasonable interpretation of the user's intention.

  9. From Provenance Standards and Tools to Queries and Actionable Provenance

    Science.gov (United States)

    Ludaescher, B.

    2017-12-01

    The W3C PROV standard provides a minimal core for sharing retrospective provenance information for scientific workflows and scripts. PROV extensions such as DataONE's ProvONE model are necessary for linking runtime observables in retrospective provenance records with conceptual-level prospective provenance information, i.e., workflow (or dataflow) graphs. Runtime provenance recorders, such as DataONE's RunManager for R, or noWorkflow for Python capture retrospective provenance automatically. YesWorkflow (YW) is a toolkit that allows researchers to declare high-level prospective provenance models of scripts via simple inline comments (YW-annotations), revealing the computational modules and dataflow dependencies in the script. By combining and linking both forms of provenance, important queries and use cases can be supported that neither provenance model can afford on its own. We present existing and emerging provenance tools developed for the DataONE and SKOPE (Synthesizing Knowledge of Past Environments) projects. We show how the different tools can be used individually and in combination to model, capture, share, query, and visualize provenance information. We also present challenges and opportunities for making provenance information more immediately actionable for the researchers who create it in the first place. We argue that such a shift towards "provenance-for-self" is necessary to accelerate the creation, sharing, and use of provenance in support of transparent, reproducible computational and data science.

  10. Query Migration from Object Oriented World to Semantic World

    Directory of Open Access Journals (Sweden)

    Nassima Soussi

    2016-06-01

    Over the last decades, the object-oriented approach has taken a large share of the database market, aiming to design and implement structured, reusable software through the composition of independent elements in order to obtain high-performance programs. Meanwhile, the mass of information stored on the web grows day after day at a dizzying speed, confronting the current web with the problem of building a bridge that facilitates access to data between different applications and systems and helps users find relevant and exact information. In addition, all existing approaches to rewriting object-oriented languages into SPARQL rely on a model transformation process to guarantee this mapping. These reasons prompted us to write this paper in order to bridge an important gap between these two heterogeneous worlds (the object-oriented world and the semantic web world) by proposing the first provably semantics-preserving OQL-to-SPARQL translation algorithm for each element of an OQL query (SELECT clause, FROM clause, FILTER constraint, implicit/explicit join, and union/intersection SELECT queries).

  11. Enabling complex queries to drug information sources through functional composition.

    Science.gov (United States)

    Peters, Lee; Mortensen, Jonathan; Nguyen, Thang; Bodenreider, Olivier

    2013-01-01

    Our objective was to enable an end-user to create complex queries to drug information sources through functional composition, by creating sequences of functions from application program interfaces (API) to drug terminologies. The development of a functional composition model seeks to link functions from two distinct APIs. An ontology was developed using Protégé to model the functions of the RxNorm and NDF-RT APIs by describing the semantics of their input and output. A set of rules were developed to define the interoperable conditions for functional composition. The operational definition of interoperability between function pairs is established by executing the rules on the ontology. We illustrate that the functional composition model supports common use cases, including checking interactions for RxNorm drugs and deploying allergy lists defined in reference to drug properties in NDF-RT. This model supports the RxMix application (http://mor.nlm.nih.gov/RxMix/), an application we developed for enabling complex queries to the RxNorm and NDF-RT APIs.

  12. Modeling and Querying Moving Objects with Social Relationships

    Directory of Open Access Journals (Sweden)

    Hengcai Zhang

    2016-07-01

    Current moving-object database (MOD) systems focus on the management of movement data, but pay less attention to modelling social relationships between moving objects and spatial-temporal trajectories in an integrated manner. This paper combines moving-object database and social network systems and presents a novel data model called Geo-Social-Moving (GSM) that enables the unified management of trajectories, underlying geographical space and social relationships for mass moving objects. A number of user-defined data types and corresponding operators are also proposed to facilitate geo-social queries on moving objects. An implementation framework for the GSM model is proposed, and a prototype system based on native Neo4J is then developed with two real-world data sets from location-based social network systems. Compared with solutions based on traditional extended relational database management systems characterized by time-consuming table join operations, the proposed GSM model characterized by graph traversal is argued to be more powerful in representing mass moving objects with social relationships, and more efficient and stable for geo-social querying.

  13. XQOWL: An Extension of XQuery for OWL Querying and Reasoning

    Directory of Open Access Journals (Sweden)

    Jesús M. Almendros-Jiménez

    2015-01-01

    One of the main aims of the so-called Web of Data is to be able to handle heterogeneous resources where data can be expressed in either XML or RDF. The design of programming languages able to handle both XML and RDF data is a key target in this context. In this paper we present a framework called XQOWL that makes it possible to handle XML and RDF/OWL data with XQuery. XQOWL can be considered an extension of the XQuery language that connects XQuery with SPARQL and OWL reasoners. XQOWL embeds SPARQL queries (via the Jena SPARQL engine) in XQuery and enables calls to OWL reasoners (HermiT, Pellet and FaCT++) from XQuery. It makes it possible to combine queries against XML and RDF/OWL resources as well as to reason with RDF/OWL data. Therefore input data can be either XML or RDF/OWL, and output data can be formatted in XML (also using the RDF/OWL XML serialization).

  14. Aspects of data modeling and query processing for complex multidimensional data

    DEFF Research Database (Denmark)

    Pedersen, Torben Bach

    This thesis is about data modeling and query processing for complex multidimensional data. Multidimensional data has become the subject of much attention in both academia and industry in recent years, fueled by the popularity of data warehousing and On-Line Analytical Processing (OLAP) applications....... One application area where complex multidimensional data is common is within medical informatics, an area that may benefit significantly from the functionality offered by data warehousing and OLAP. However, the special nature of clinical applications poses different and new requirements to data...... support, advanced classification structures, continuously valued data, dimensionally reduced data, and the integration of complex data. OLAP systems typically employ multidimensional data models to structure their data. This thesis identifies eleven modeling requirements for multidimensional data models...

  15. A Study of Library Databases by Translating Those SQL Queries into Relational Algebra and Generating Query Trees

    OpenAIRE

    Santhi Lasya; Sreekar Tanuku

    2011-01-01

    Even in this World Wide Web era, where there is unrestricted access to a lot of articles and books at a mouse click, the role of an organized library is immense. It is vital to have effective software to manage the various functions of a library, and the foundation of effective software is the underlying database access and the queries used. Hence library databases become our use case for this study. This paper starts off by considering a basic ER model of a typical library relational data...

  16. PropBase Query Layer: a single portal to UK subsurface physical property databases

    Science.gov (United States)

    Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham

    2013-04-01

    Until recently, the delivery of geological information for industry and public was achieved by geological mapping. Now pervasively available computers mean that 3D geological models can deliver realistic representations of the geometric location of geological units, represented as shells or volumes. The next phase of this process is to populate these with physical properties data that describe subsurface heterogeneity and its associated uncertainty. Achieving this requires capture and serving of physical, hydrological and other property information from diverse sources to populate these models. The British Geological Survey (BGS) holds large volumes of subsurface property data, derived both from their own research data collection and also other, often commercially derived data sources. This can be voxelated to incorporate this data into the models to demonstrate property variation within the subsurface geometry. All property data held by BGS has for many years been stored in relational databases to ensure their long-term continuity. However these have, by necessity, complex structures; each database contains positional reference data and model information, and also metadata such as sample identification information and attributes that define the source and processing. Whilst this is critical to assessing these analyses, it also hugely complicates the understanding of variability of the property under assessment and requires multiple queries to study related datasets making extracting physical properties from these databases difficult. Therefore the PropBase Query Layer has been created to allow simplified aggregation and extraction of all related data and its presentation of complex data in simple, mostly denormalized, tables which combine information from multiple databases into a single system. The structure from each relational database is denormalized in a generalised structure, so that each dataset can be viewed together in a common format using a simple

  17. Incentives for Delay-Constrained Data Query and Feedback in Mobile Opportunistic Crowdsensing.

    Science.gov (United States)

    Liu, Yang; Li, Fan; Wang, Yu

    2016-07-21

    In this paper, we propose effective data collection schemes that stimulate cooperation between selfish users in mobile opportunistic crowdsensing. A query issuer generates a query and requests replies within a given delay budget. When a data provider receives the query for the first time from an intermediate user, the former replies to it and authorizes the latter as the owner of the reply. Different data providers can reply to the same query. When a user that owns a reply meets the query issuer that generates the query, it requests the query issuer to pay credits. The query issuer pays credits and provides feedback to the data provider, which gives the reply. When a user that carries a feedback meets the data provider, the data provider pays credits to the user in order to adjust its claimed expertise. Queries, replies and feedbacks can be traded between mobile users. We propose an effective mechanism to define rewards for queries, replies and feedbacks. We formulate the bargain process as a two-person cooperative game, whose solution is found by using the Nash theorem. To improve the credit circulation, we design an online auction process, in which the wealthy user can buy replies and feedbacks from the starving one using credits. We have carried out extensive simulations based on real-world traces to evaluate the proposed schemes.

  18. GeoSpark SQL: An Effective Framework Enabling Spatial Queries on Spark

    Directory of Open Access Journals (Sweden)

    Zhou Huang

    2017-09-01

    In the era of big data, Internet-based geospatial information services such as various LBS apps are deployed everywhere, followed by an increasing number of queries against massive spatial data. As a result, traditional relational spatial databases (e.g., PostgreSQL with PostGIS, and Oracle Spatial) cannot adapt well to the needs of large-scale spatial query processing. Spark is an emerging, outstanding distributed computing framework in the Hadoop ecosystem. This paper aims to address the increasingly large-scale spatial query-processing requirement in the era of big data, and proposes an effective framework, GeoSpark SQL, which enables spatial queries on Spark. On the one hand, GeoSpark SQL provides a convenient SQL interface; on the other hand, GeoSpark SQL achieves both efficient storage management and high-performance parallel computing by integrating Hive and Spark. In this study, the following key issues are discussed and addressed: (1) storage management methods under the GeoSpark SQL framework, (2) the spatial operator implementation approach in the Spark environment, and (3) spatial query optimization methods under Spark. Experimental evaluation is also performed, and the results show that GeoSpark SQL is able to achieve real-time query processing. It should be noted that Spark is not a panacea: the traditional spatial database PostGIS/PostgreSQL is observed to perform better than GeoSpark SQL in some query scenarios, especially for spatial queries with high selectivity, such as the point query and the window query. In general, GeoSpark SQL performs better when dealing with compute-intensive spatial queries such as the kNN query and the spatial join query.

  19. Building and Querying RDF/OWL Database of Semantically Annotated Nuclear Medicine Images.

    Science.gov (United States)

    Hwang, Kyung Hoon; Lee, Haejun; Koh, Geon; Willrett, Debra; Rubin, Daniel L

    2017-02-01

    As the use of positron emission tomography-computed tomography (PET-CT) has increased rapidly, there is a need to retrieve relevant medical images that can assist image interpretation. However, the images themselves lack the explicit information needed for query. We constructed a semantically structured database of nuclear medicine images using the Annotation and Image Markup (AIM) format and evaluated the ability of the AIM annotations to improve image search. We created AIM annotation templates specific to the nuclear medicine domain and used them to annotate 100 nuclear medicine PET-CT studies in AIM format using controlled vocabulary. We evaluated image retrieval from 20 specific clinical queries. As the gold standard, two nuclear medicine physicians manually retrieved the relevant images from the image database using free text search of radiology reports for the same queries. We compared query results with the manually retrieved results obtained by the physicians. The query performance indicated a 98% recall for simple queries and an 89% recall for complex queries. In total, the queries provided 95% (75 of 79 images) recall, 100% precision, and an F1 score of 0.97 for the 20 clinical queries. Three of the four images missed by the queries required reasoning for successful retrieval. Nuclear medicine images augmented using semantic annotations in AIM enabled high recall and precision for simple queries, helping physicians to retrieve the relevant images. Further study using a larger data set and the implementation of an inference engine may improve query results for more complex queries.
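
    The reported totals can be checked with the standard definitions of precision, recall and F1. The sketch below assumes that 100% precision means zero false positives and that the four missed images are the only false negatives; the function name is illustrative.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Standard retrieval metrics from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 75 of 79 relevant images retrieved, no irrelevant images returned (assumed).
p, r, f1 = precision_recall_f1(true_positives=75, false_positives=0, false_negatives=4)
print(round(p, 2), round(r, 2), round(f1, 2))  # prints: 1.0 0.95 0.97
```

    With these counts, recall is 75/79 ≈ 0.95 and F1 is 150/154 ≈ 0.97, matching the figures in the abstract.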

  20. LHCb: Optimising query execution time in LHCb Bookkeeping System using partition pruning and partition wise joins

    CERN Multimedia

    Mathe, Z

    2013-01-01

    The LHCb experiment produces a huge amount of data which has associated metadata such as run number, data taking condition (detector status when the data was taken), simulation condition, etc. The data are stored in files, replicated on the Computing Grid around the world. The LHCb Bookkeeping System provides methods for retrieving datasets based on their metadata. The metadata is stored in a hybrid database model, which is a mixture of Relational and Hierarchical database models and is based on the Oracle Relational Database Management System (RDBMS). The database access has to be reliable and fast. In order to achieve a high timing performance, the tables are partitioned and the queries are executed in parallel. When we store large amounts of data the partition pruning is essential for database performance, because it reduces the amount of data retrieved from the disk and optimises the resource utilisation. This research presented here is focusing on the extended composite partitioning strategy such as rang...

  1. Preventing SQL Injection through Automatic Query Sanitization with ASSIST

    Directory of Open Access Journals (Sweden)

    Raymond Mui

    2010-09-01

    Web applications are becoming an essential part of our everyday lives. Many of our activities depend on the functionality and security of these applications. As the scale of these applications grows, injection vulnerabilities such as SQL injection are major security challenges for developers today. This paper presents the technique of automatic query sanitization to automatically remove SQL injection vulnerabilities in code. In our technique, a combination of static analysis and program transformation is used to automatically instrument web applications with sanitization code. We have implemented this technique in a tool named ASSIST (Automatic and Static SQL Injection Sanitization Tool) for protecting Java-based web applications. Our experimental evaluation showed that our technique is effective against SQL injection vulnerabilities and has a low overhead.
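
    To illustrate the class of vulnerability being targeted, the sketch below contrasts a query built by string concatenation with one that uses parameter binding. This is not the ASSIST technique (which instruments Java code with sanitization calls); parameter binding with Python's `sqlite3` module is swapped in here as a related, well-known defence, and the table and function names are hypothetical.

```python
import sqlite3

def make_db():
    """Create a small in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("alice", "a-secret"), ("bob", "b-secret")])
    return conn

def lookup_unsafe(conn, name):
    # User input is spliced into the SQL text, so quotes in `name`
    # can change the structure of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(conn, name):
    # The placeholder keeps `name` as data; it is never parsed as SQL.
    return conn.execute("SELECT secret FROM users WHERE name = ?",
                        (name,)).fetchall()

conn = make_db()
attack = "x' OR '1'='1"
print(len(lookup_unsafe(conn, attack)))  # every row leaks
print(lookup_safe(conn, attack))         # no row matches the literal string
```

    The unsafe version returns both rows for the crafted input, while the bound version treats the same input as an ordinary (non-matching) name.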

  2. Big Data Analytics with Datalog Queries on Spark.

    Science.gov (United States)

    Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

    2016-01-01

    There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics.

  3. A System for Conceptual Pathway Finding and Deductive Querying

    DEFF Research Database (Denmark)

    Andreasen, Troels; Styltsvig, Henrik Bulskov; Fischer Nilsson, Jørgen

    We describe principles and design of a system for knowledge bases applying a natural logic. Natural logics are forms of logic which appear as stylized fragments of natural language sentences. Accordingly, such knowledge base sentences can be read and understood directly by a domain expert...... The system applies a graph form computed from the input natural logic sentences. The graph form generalizes the usual partial-order ontological sub-class structures by accommodation of affirmative sentences comprising recursive phrase structures. In this paper we focus on the logical inference rules...... for extending the concept graph form enabling deductive querying as well as computation of pathways between the concepts mentioned in the sentences....

  4. A System for Conceptual Pathway Finding and Deductive Querying

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer

    2015-01-01

    We describe principles and design of a system for knowledge bases applying a natural logic. Natural logics are forms of logic which appear as stylized fragments of natural language sentences. Accordingly, such knowledge base sentences can be read and understood directly by a domain expert...... The system applies a graph form computed from the input natural logic sentences. The graph form generalizes the usual partial-order ontological sub-class structures by accommodation of affirmative sentences comprising recursive phrase structures. In this paper we focus on the logical inference rules...... for extending the concept graph form enabling deductive querying as well as computation of pathways between the concepts mentioned in the sentences....

  5. Ontology based heterogeneous materials database integration and semantic query

    Science.gov (United States)

    Zhao, Shuai; Qian, Quan

    2017-10-01

    Materials digital data, high-throughput experiments and high-throughput computations are regarded as the three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data has become urgent and has gradually become a hot topic in materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply at the semantic level when adopting conventional heterogeneous database integration approaches such as federated databases or data warehouses. In this paper, a semantic integration method is proposed that creates the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated into the ontology by means of relational algebra and the rooted graph. Based on the integrated ontology, semantic queries can be run using SPARQL. In the experiments, two world-famous first-principles computational databases, OQMD and Materials Project, are used as the integration targets, which shows the availability and effectiveness of our method.

  6. Pattern Discovery and Change Detection of Online Music Query Streams

    Science.gov (United States)

    Li, Hua-Fu

    In this paper, an efficient stream mining algorithm, called FTP-stream (Frequent Temporal Pattern mining of streams), is proposed to find the frequent temporal patterns over melody sequence streams. In the framework of our proposed algorithm, an effective bit-sequence representation is used to reduce the time and memory needed to slide the windows. The FTP-stream algorithm can calculate the support threshold in only a single pass based on the concept of bit-sequence representation. It takes advantage of the "left" and "and" operations of the representation. Experiments show that the proposed algorithm scans the music query stream only once, and runs significantly faster and consumes less memory than existing algorithms, such as SWFI-stream and Moment.
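
    The idea of a bit-sequence sliding window can be sketched as follows. This is an illustrative reading of the abstract, not the FTP-stream algorithm itself: it assumes that each item keeps one bit per transaction in the window, that sliding is a left shift, and that the support of an itemset is the number of set bits in the bitwise AND of its items' sequences. The class and method names are hypothetical.

```python
class BitSequenceWindow:
    """Sliding window over transactions, one bit sequence per item."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.mask = (1 << window_size) - 1  # keeps only the newest w bits
        self.bits = {}                      # item -> integer bit sequence

    def slide(self, transaction):
        """Append one transaction; the oldest one falls out of the window."""
        items = set(transaction)
        for item in items:
            self.bits.setdefault(item, 0)
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask  # "left" operation
            if item in items:
                shifted |= 1
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]  # item no longer occurs in the window

    def support(self, itemset):
        """Count window transactions that contain every item in itemset."""
        combined = self.mask
        for item in itemset:
            combined &= self.bits.get(item, 0)  # "and" operation
        return bin(combined).count("1")


window = BitSequenceWindow(window_size=3)
for notes in (["C", "E"], ["C", "G"], ["C", "E", "G"], ["E"]):
    window.slide(notes)
print(window.support(["C", "G"]))  # prints: 2
```

    After the fourth transaction the window holds only the last three, so `C` and `G` co-occur in two of them; no transaction is ever rescanned.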

  7. Native Language Integrated Queries with CppLINQ in C++

    Science.gov (United States)

    Vassilev, V.

    2015-05-01

    The evolution of programming languages brought us domain-specific languages (DSLs). They have proved to be very useful for expressing specific concepts, turning into a vital ingredient even for general-purpose frameworks. Supporting declarative DSLs (such as SQL) in imperative languages (such as C++) can happen in the manner of language integrated query (LINQ). We investigate approaches to integrating a LINQ programming language native to C++. We review its usability in the context of high energy physics. We present examples using CppLINQ for a few types of data analysis workflows carried out by end-users. We discuss evidence of how this DSL technology can simplify massively parallel grid systems such as PROOF.

  8. Web-based topology queries on a BIM model

    DEFF Research Database (Denmark)

    Rasmussen, Mads Holten; Hviid, Christian Anker; Karlshøj, Jan

    Building Information Modeling (BIM) is in the industry often confused with 3D-modeling regardless that the potential of modeling information goes way beyond performing clash detections on geometrical objects occupying the same physical space. Lately, several research projects have tried to change...... that by extending BIM with information using linked data technologies. However, when showing information alone the strong communication benefits of 3D are neglected, and a practical way of connecting the two worlds is currently missing. In this paper, we present a prototype of a visual query interface running...... is to establish a baseline for discussion of the general design choices that have been considered, and the developed application further serves as a proof of concept for combining BIM model data with a knowledge graph and potentially other sources of Linked Open Data, in a simple web interface....

  9. Towards Optimal Multi-Dimensional Query Processing with Bitmap Indices

    Energy Technology Data Exchange (ETDEWEB)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2005-09-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. This paper studies strategies for minimizing the access costs for processing multi-dimensional queries using bitmap indices with binning. Innovative features of our algorithm include (a) optimally placing the bin boundaries and (b) dynamically reordering the evaluation of the query terms. In addition, we derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
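
    A minimal sketch of a binned bitmap index and a range query over it is given below. It assumes fixed, caller-supplied bin boundaries rather than the optimal placement studied in the paper, and it uses Python integers as bitmaps; the function names are hypothetical. Bins that lie fully inside the query range are answered from the bitmaps alone, while partially covered edge bins require a candidate check against the raw values, which is the access cost the paper aims to minimize.

```python
import bisect

def build_binned_index(values, boundaries):
    """One bitmap per bin; bin i covers boundaries[i] <= v < boundaries[i+1]."""
    bitmaps = [0] * (len(boundaries) - 1)
    for row, v in enumerate(values):
        b = bisect.bisect_right(boundaries, v) - 1
        bitmaps[b] |= 1 << row
    return bitmaps

def range_query(values, boundaries, bitmaps, lo, hi):
    """Rows with lo <= value < hi, plus the number of raw values inspected."""
    hits = 0
    candidates_checked = 0
    for b, bitmap in enumerate(bitmaps):
        b_lo, b_hi = boundaries[b], boundaries[b + 1]
        if b_hi <= lo or b_lo >= hi:
            continue                 # bin lies entirely outside the range
        if lo <= b_lo and b_hi <= hi:
            hits |= bitmap           # bin fully covered: bitmap is exact
            continue
        row, bits = 0, bitmap        # edge bin: check each candidate row
        while bits:
            if bits & 1:
                candidates_checked += 1
                if lo <= values[row] < hi:
                    hits |= 1 << row
            bits >>= 1
            row += 1
    rows = [r for r in range(len(values)) if (hits >> r) & 1]
    return rows, candidates_checked

values = [0.5, 2.5, 4.1, 7.9, 3.0, 9.2]
boundaries = [0, 2, 4, 6, 8, 10]
bitmaps = build_binned_index(values, boundaries)
print(range_query(values, boundaries, bitmaps, 2, 5))  # prints: ([1, 2, 4], 1)
```

    In this example only one raw value (row 2, in the edge bin covering 4 to 6) has to be inspected; moving a bin boundary to 5 would remove that check entirely, which hints at why boundary placement matters.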

  10. Entropy Based Analysis of DNS Query Traffic in the Campus Network

    Directory of Open Access Journals (Sweden)

    Dennis Arturo Ludeña Romaña

    2008-10-01

    We carried out an entropy-based study of the DNS query traffic from the campus network of a university from January 1st, 2006 to March 31st, 2007. The results are summarized as follows: (1) The source IP address-based and query keyword-based entropies change symmetrically in the DNS query traffic from outside the campus network when spam bot activity is detected on the campus network. (2) On the other hand, the source IP address-based and query keyword-based entropies change similarly to each other when heavy DNS query traffic caused by prescanning or a distributed denial of service (DDoS) attack from the campus network is detected. Therefore, we can detect spam bots and/or DDoS attack bots just by watching DNS query access traffic.
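
    The two quantities being compared can be sketched as Shannon entropies over the source IP addresses and over the query keywords seen in a time window. The sketch assumes base-2 logarithms and a hypothetical log format of `(source_ip, query_name)` pairs; the abstract does not specify either, and the sample data is invented.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy in bits of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical window of DNS queries: one host asking for many names.
window = [
    ("10.0.0.7", "mx.example.org"),
    ("10.0.0.7", "mx.example.net"),
    ("10.0.0.7", "mx.example.com"),
    ("10.0.0.7", "mx.example.edu"),
]
ip_entropy = shannon_entropy(ip for ip, _ in window)
keyword_entropy = shannon_entropy(name for _, name in window)
print(ip_entropy, keyword_entropy)  # prints: 0.0 2.0
```

    Tracking both values per time window gives the two series whose joint movement the study uses as a detection signal.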

  11. Query-by-Example Music Information Retrieval by Score-Informed Source Separation and Remixing Technologies

    OpenAIRE

    Goto Masataka; Itoyama Katsutoshi; Komatani Kazunori; Ogata Tetsuya; Okuno Hiroshi G

    2010-01-01

    We describe a novel query-by-example (QBE) approach in music information retrieval that allows a user to customize query examples by directly modifying the volume of different instrument parts. The underlying hypothesis of this approach is that the musical mood of retrieved results changes in relation to the volume balance of different instruments. On the basis of this hypothesis, we aim to clarify the relationship between the change in the volume balance of a query and the genre of the retr...

  12. An Adaptive Genetic Algorithm with Dynamic Population Size for Optimizing Join Queries

    OpenAIRE

    Vellev, Stoyan

    2008-01-01

    The problem of finding the optimal join ordering executing a query to a relational database management system is a combinatorial optimization problem, which makes deterministic exhaustive solution search unacceptable for queries with a great number of joined relations. In this work an adaptive genetic algorithm with dynamic population size is proposed for optimizing large join queries. The performance of the algorithm is compared with that of several classical non-determinis...

  13. THE FORECASTING POWER OF INTERNET SEARCH QUERIES IN THE BRAZILIAN FINANCIAL MARKET

    OpenAIRE

    RAMOS, HENRIQUE PINTO; RIBEIRO, KADJA KATHERINE MENDES; PERLIN, MARCELO SCHERER

    2017-01-01

    ABSTRACT Purpose: To analyze the predictability of Google's search queries in the Brazilian financial market. Originality/gap/relevance/implications: Despite a growing foreign literature using Google's search query data, there is no acknowledgement of work on this area in Brazil. An application to the Brazilian financial market shows new sources of information about market movements and may contribute to researchers and practitioners to understand how changes in specific search queries af...

  14. Performance evaluation of unified medical language system®'s synonyms expansion to query PubMed

    Directory of Open Access Journals (Sweden)

    Griffon Nicolas

    2012-02-01

    Background: PubMed is the main access point to medical literature on the Internet. In order to enhance the performance of its information retrieval tools, primarily for non-indexed citations, the authors propose a method: expanding users' queries using Unified Medical Language System® (UMLS) synonyms, i.e., all the terms gathered under one unique Concept Unique Identifier. Methods: This method was evaluated using queries constructed to emphasize the differences between this new method and the current PubMed automatic term mapping. Four experts assessed citation relevance. Results: Using UMLS, we were able to retrieve new citations in 45.5% of queries, which implies a small increase in recall. The new strategy led to a heterogeneous 23.7% mean increase in non-indexed citations retrieved. Of these, 82% had been published less than 4 months earlier. The overall mean precision was 48.4% but differed according to the evaluators, ranging from 36.7% to 88.1% (inter-rater agreement was poor: kappa = 0.34). Conclusions: This study highlights the need for specific search tools for each type of user and use case. The proposed strategy may be useful for retrieving recent scientific advances.

  15. Query transformations and their role in Web searching by the members of the general public

    Directory of Open Access Journals (Sweden)

    Martin Whittle

    2006-01-01

    Introduction. This paper reports preliminary research in a primarily experimental study of how the general public search for information on the Web. The focus is on the query transformation patterns that characterise searching. Method. In this work, we have used transaction logs from the Excite search engine to develop methods for analysing query transformations that should aid the analysis of our ongoing experimental work. Our methods involve the use of similarity techniques to link queries with the most similar previous query in a train. The resulting query transformations are represented as a list of codes representing a whole search. Analysis. It is shown how query transformation sequences can be represented as graphical networks and some basic statistical results are shown. A correlation analysis is performed to examine the co-occurrence of Boolean and quotation mark changes with the syntactic changes. Results. A frequency analysis of the occurrence of query transformation codes is presented. The connectivity of graphs obtained from the query transformation is investigated and found to follow an exponential scaling law. The correlation analysis reveals a number of patterns that provide some interesting insights into Web searching by the general public. Conclusion. We have developed analytical methods based on query similarity that can be applied to our current experimental work with volunteer subjects. The results of these will form part of a database with the aim of developing an improved understanding of how the public search the Web.

  16. Improving the Usability of OCL as an Ad-hoc Model Querying Language

    DEFF Research Database (Denmark)

    Störrle, Harald

    2013-01-01

    The OCL is often perceived as difficult to learn and use. In previous research, we have defined experimental query languages exhibiting higher levels of usability than OCL. However, none of these alternatives can rival OCL in terms of adoption and support. In an attempt to leverage the lessons learned...... from our research and make it accessible to the OCL community, we propose the OCL Query API (OQAPI), a library of query-predicates to improve the user-friendliness of OCL for ad-hoc querying. The usability of OQAPI is studied using controlled experiments. We find considerable evidence to support our...

  17. Using Common Table Expressions to Build a Scalable Boolean Query Generator for Clinical Data Warehouses

    Science.gov (United States)

    Harris, Daniel R.; Henderson, Darren W.; Kavuluru, Ramakanth; Stromberg, Arnold J.; Johnson, Todd R.

    2015-01-01

    We present a custom, Boolean query generator utilizing common-table expressions (CTEs) that is capable of scaling with big datasets. The generator maps user-defined Boolean queries, such as those interactively created in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate the effectiveness of this generator by integrating our work into the Informatics for Integrating Biology and the Bedside (i2b2) query tool and show that it is capable of scaling. Our custom generator replaces and outperforms the default query generator found within the Clinical Research Chart (CRC) cell of i2b2. In our experiments, sixteen different types of i2b2 queries were identified by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated non-trivial, random Boolean queries based on these 16 types; the corresponding SQL queries produced by both generators were compared by execution times. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M=2.03, SD=6.64 vs. M=75.82, SD=238.88 seconds). Without costly hardware upgrades, we provide a scalable solution based on CTEs with very promising empirical results centered on performance gains. The evaluation methodology used for this provides a means of profiling clinical data warehouse performance. PMID:25192572
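
    The general shape of a CTE-based Boolean query can be sketched as follows: each criterion becomes one common table expression that selects patient identifiers, and the Boolean structure is expressed with set operators over those CTEs. This is an illustrative sketch, not the generator described in the paper; the `observation(patient_id, concept)` schema, the function name and the sample rows are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

def build_cte_query(include_all, exclude_any):
    """Patients having every concept in include_all and none in exclude_any.

    include_all must contain at least one concept (hypothetical schema).
    """
    ctes, params = [], []
    for i, concept in enumerate(include_all + exclude_any):
        ctes.append(
            f"c{i} AS (SELECT DISTINCT patient_id "
            f"FROM observation WHERE concept = ?)"
        )
        params.append(concept)
    n_inc = len(include_all)
    parts = [f"SELECT patient_id FROM c{i}" for i in range(n_inc)]
    sql = "WITH " + ", ".join(ctes) + " " + " INTERSECT ".join(parts)
    for j in range(len(exclude_any)):
        sql += f" EXCEPT SELECT patient_id FROM c{n_inc + j}"
    return sql + " ORDER BY patient_id", params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (patient_id INTEGER, concept TEXT)")
conn.executemany("INSERT INTO observation VALUES (?, ?)", [
    (1, "diabetes"), (1, "hypertension"),
    (2, "diabetes"),
    (3, "diabetes"), (3, "hypertension"), (3, "smoker"),
])
sql, params = build_cte_query(["diabetes", "hypertension"], ["smoker"])
result = conn.execute(sql, params).fetchall()
print(result)  # prints: [(1,)]
```

    Because every criterion is a named CTE, the generated statement grows linearly with the number of criteria and leaves the evaluation order to the database optimizer. Concept values are passed as bound parameters rather than spliced into the SQL text.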

  18. Geospatial-Enabled RuleML in a Study on Querying Respiratory Disease Information

    DEFF Research Database (Denmark)

    Gao, Sheng; Boley, Harold; Mioc, Darka

    2009-01-01

    health data query and representation framework is proposed through the formalization of spatial information. We include the geometric representation in RuleML deduction, and apply ontologies and rules for querying and representing health information. Corresponding geospatial built-ins were implemented...... as an extension to OO jDREW. Case studies were carried out using geospatial-enabled RuleML queries for respiratory disease information. The paper thus demonstrates the use of RuleML for geospatial-semantic querying and representing of health information....

  19. Querying Two Boundary Points for Shortest Paths in a Polygonal Domain

    OpenAIRE

    Bae, Sang Won; Okamoto, Yoshio

    2009-01-01

    We consider a variant of two-point Euclidean shortest path query problem: given a polygonal domain, build a data structure for two-point shortest path query, provided that query points always lie on the boundary of the domain. As a main result, we show that a logarithmic-time query for shortest paths between boundary points can be performed using O~ (n^5) preprocessing time and O(n^5) space where n is the number of corners of the polygonal domain and the O~ notation suppresses the polylogarit...

  20. LAILAPS-QSM: A RESTful API and JAVA library for semantic query suggestions.

    Science.gov (United States)

    Chen, Jinbo; Scholz, Uwe; Zhou, Ruonan; Lange, Matthias

    2018-03-12

    In order to access and filter content of life-science databases, full text search is a widely applied query interface. But its high flexibility and intuitiveness is paid for with potentially imprecise and incomplete query results. To reduce this drawback, query assistance systems suggest those combinations of keywords with the highest potential to match most of the relevant data records. Widespread approaches are syntactic query corrections that avoid misspelling and support expansion of words by suffixes and prefixes. Synonym expansion approaches apply thesauri, ontologies, and query logs. All need laborious curation and maintenance. Furthermore, access to query logs is in general restricted. Approaches that infer related queries by their query profile like research field, geographic location, co-authorship, affiliation etc. require user's registration and its public accessibility that contradict privacy concerns. To overcome these drawbacks, we implemented LAILAPS-QSM, a machine learning approach that reconstructs possible linguistic contexts of a given keyword query. The context is inferred from the text records that are stored in the databases that are going to be queried or extracted for a general purpose query suggestion from PubMed abstracts and UniProt data. The supplied tool suite enables the pre-processing of these text records and the further computation of customized distributed word vectors. The latter are used to suggest alternative keyword queries. An evaluation of the query suggestion quality was done for plant science use cases. Locally present experts enable a cost-efficient quality assessment in the categories trait, biological entity, taxonomy, affiliation, and metabolic function which has been performed using ontology term similarities. LAILAPS-QSM mean information content similarity for 15 representative queries is 0.70, whereas 34% have a score above 0.80. In comparison, the information content similarity for human expert made query suggestions...

  1. Towards Hybrid Online On-Demand Querying of Realtime Data with Stateful Complex Event Processing

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Qunzhi; Simmhan, Yogesh; Prasanna, Viktor K.

    2013-10-09

    Emerging Big Data applications in areas like e-commerce and energy industry require both online and on-demand queries to be performed over vast and fast data arriving as streams. These present novel challenges to Big Data management systems. Complex Event Processing (CEP) is recognized as a high performance online query scheme which in particular deals with the velocity aspect of the 3-V’s of Big Data. However, traditional CEP systems do not consider data variety and lack the capability to embed ad hoc queries over the volume of data streams. In this paper, we propose H2O, a stateful complex event processing framework, to support hybrid online and on-demand queries over realtime data. We propose a semantically enriched event and query model to address data variety. A formal query algebra is developed to precisely capture the stateful and containment semantics of online and on-demand queries. We describe techniques to achieve the interactive query processing over realtime data featured by efficient online querying, dynamic stream data persistence and on-demand access. The system architecture is presented and the current implementation status reported.

  2. ConnectomeExplorer: Query-guided visual analysis of large volumetric neuroscience data

    KAUST Repository

    Beyer, Johanna

    2013-12-01

    This paper presents ConnectomeExplorer, an application for the interactive exploration and query-guided visual analysis of large volumetric electron microscopy (EM) data sets in connectomics research. Our system incorporates a knowledge-based query algebra that supports the interactive specification of dynamically evaluated queries, which enable neuroscientists to pose and answer domain-specific questions in an intuitive manner. Queries are built step by step in a visual query builder, building more complex queries from combinations of simpler queries. Our application is based on a scalable volume visualization framework that scales to multiple volumes of several teravoxels each, enabling the concurrent visualization and querying of the original EM volume, additional segmentation volumes, neuronal connectivity, and additional meta data comprising a variety of neuronal data attributes. We evaluate our application on a data set of roughly one terabyte of EM data and 750 GB of segmentation data, containing over 4,000 segmented structures and 1,000 synapses. We demonstrate typical use-case scenarios of our collaborators in neuroscience, where our system has enabled them to answer specific scientific questions using interactive querying and analysis on the full-size data for the first time. © 1995-2012 IEEE.

  3. Approximate Range Emptiness in Constant Time and Optimal Space

    DEFF Research Database (Denmark)

    Goswami, Mayank; Jørgensen, Allan Grønlund; Larsen, Kasper Green

    2015-01-01

    {Bloom filters} from single point queries to any interval length L. Setting the false positive rate to ε/L and performing L queries, Bloom filters yield a solution to this problem with space O(nlg(L/ε)) bits, false positive probability bounded by ε for intervals of length up to L, using query time O......(Llg(L/ε)). Our first contribution is to show that the space/error trade-off cannot be improved asymptotically: Any data structure for answering approximate range emptiness queries on intervals of length up to L with false positive probability ε, must use space Ω(nlg(L/ε))−O(n) bits. On the positive side we show...... that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower order additive term. This result is achieved through a succinct data structure for (non-approximate 1d) range emptiness/reporting queries, which may be of independent interest....
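
    The baseline this record describes (set the Bloom filter's false positive rate to ε/L, then answer a range query with up to L point queries) can be sketched as follows. This shows only that baseline, not the constant-time structure the paper contributes. The class `BloomFilter`, the function `range_maybe_nonempty`, and the SHA-256 based hashing are assumptions made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter sized for n_items keys and a target false positive rate."""

    def __init__(self, n_items, fp_rate):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.m = max(8, math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, x):
        # k bit positions derived from salted SHA-256 digests (illustrative choice).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, x):
        for p in self._positions(x):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, x):
        return all((self.bits[p // 8] >> (p % 8)) & 1 for p in self._positions(x))


def range_maybe_nonempty(bf, lo, hi):
    """False means no stored key lies in [lo, hi]; True may be a false positive."""
    return any(x in bf for x in range(lo, hi + 1))


keys = [10, 57, 300]
L, eps = 16, 0.01  # maximum interval length and overall error budget

# Per-point rate eps / L, so a union bound over L point queries stays below eps.
bf = BloomFilter(len(keys), eps / L)
for key in keys:
    bf.add(key)

print(range_maybe_nonempty(bf, 50, 60))   # contains the key 57
print(range_maybe_nonempty(bf, 100, 115)) # no key here; True would be a false positive
```

    The query loop makes the O(L) dependence on interval length visible, which is the cost the paper's constant-time result removes.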

  4. On Space Efficient Two Dimensional Range Minimum Data Structures

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Rao, S. Srinivasa

    2010-01-01

    , the lower bound is tight up to a constant factor. In two dimensions, we complement the lower bound with an indexing data structure of size O(N/c) bits additional space which can be preprocessed in O(N) time and achieves O(c log^2 c) query time. For c = O(1), this is the first O(1) query time algorithm using......The two dimensional range minimum query problem is to preprocess a static two dimensional m by n array A of size N = m · n, such that subsequent queries, asking for the position of the minimum element in a rectangular range within A, can be answered efficiently. We study the trade-off between...... optimal O(N) bits additional space. For the case where queries cannot probe A, we give a data structure of size O(N · min{m, log n}) bits with O(1) query time, assuming m ≤ n. This leaves a gap to the lower bound of Ω(N log m) bits for this version of the problem....

  5. LibKiSAO: a Java library for Querying KiSAO

    Directory of Open Access Journals (Sweden)

    Zhukova Anna

    2012-09-01

    Full Text Available Abstract Background The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of Systems Biology models, their characteristics, parameters and inter-relationships. KiSAO enables the unambiguous identification of algorithms from simulation descriptions. Information about analogous methods having similar characteristics and about algorithm parameters incorporated into KiSAO is desirable for simulation tools. To retrieve this information programmatically an application programming interface (API) for KiSAO is needed. Findings We developed libKiSAO, a Java library to enable querying of the KiSA Ontology. It implements methods to retrieve information about simulation algorithms stored in KiSAO, their characteristics and parameters, and methods to query the algorithm hierarchy and search for similar algorithms providing comparable results for the same simulation set-up. Using libKiSAO, simulation tools can make logical inferences based on this knowledge and choose the most appropriate algorithm to perform a simulation. LibKiSAO also enables simulation tools to handle a wider range of simulation descriptions by determining which of the available methods are similar and can be used instead of the one indicated in the simulation description if that one is not implemented. Conclusions LibKiSAO enables Java applications to easily access information about simulation algorithms, their characteristics and parameters stored in the OWL-encoded Kinetic Simulation Algorithm Ontology. LibKiSAO can be used by simulation description editors and simulation tools to improve reproducibility of computational simulation tasks and facilitate model re-use.

  6. Linearity of network proximity measures: implications for set-based queries and significance testing.

    Science.gov (United States)

    Maxwell, Sean; Chance, Mark R; Koyutürk, Mehmet

    2017-05-01

    In recent years, various network proximity measures have been proposed to facilitate the use of biomolecular interaction data in a broad range of applications. These applications include functional annotation, disease gene prioritization, comparative analysis of biological systems and prediction of new interactions. In such applications, a major task is the scoring or ranking of the nodes in the network in terms of their proximity to a given set of 'seed' nodes (e.g. a group of proteins that are identified to be associated with a disease, or are differentially expressed in a certain condition). Many different network proximity measures are utilized for this purpose, and these measures are quite diverse in terms of the benefits they offer. We propose a unifying framework for characterizing network proximity measures for set-based queries. We observe that many existing measures are linear, in that the proximity of a node to a set of nodes can be represented as an aggregation of its proximity to the individual nodes in the set. Based on this observation, we propose methods for processing of set-based proximity queries that take advantage of sparse local proximity information. In addition, we provide an analytical framework for characterizing the distribution of proximity scores based on reference models that accurately capture the characteristics of the seed set (e.g. degree distribution and biological function). The resulting framework facilitates computation of exact figures for the statistical significance of network proximity scores, enabling assessment of the accuracy of Monte Carlo simulation based estimation methods. Implementations of the methods in this paper are available at https://bioengine.case.edu/crosstalker which includes a robust visualization for results viewing. Contact: stm@case.edu or mxk331@case.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions
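
    The linearity property stated in this record can be checked numerically on a toy graph. The sketch below assumes random walk with restart as the proximity measure (an illustrative choice for this example, not taken from the record) and compares the score for a two-node seed set against the average of the two single-seed scores. The graph, the function `rwr`, and the parameter `alpha` are all hypothetical.

```python
def rwr(adj, restart, alpha=0.3, iters=200):
    """Random walk with restart, computed by power iteration.

    adj: dict mapping node -> list of neighbours (undirected, no isolated nodes)
    restart: dict mapping node -> restart probability (values sum to 1)
    alpha: probability of jumping back to the restart distribution
    """
    nodes = list(adj)
    p = {v: restart.get(v, 0.0) for v in nodes}
    for _ in range(iters):
        nxt = {v: alpha * restart.get(v, 0.0) for v in nodes}
        for u in nodes:
            # Spread the walking mass of u evenly over its neighbours.
            share = (1 - alpha) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
seeds = ["a", "d"]

# Proximity to the seed set: restart uniformly over the seeds.
set_scores = rwr(adj, {s: 1.0 / len(seeds) for s in seeds})

# Proximity to each seed alone, then aggregated by averaging.
single_scores = [rwr(adj, {s: 1.0}) for s in seeds]
combined = {v: sum(s[v] for s in single_scores) / len(seeds) for v in adj}

for v in adj:
    print(v, round(set_scores[v], 6), round(combined[v], 6))
```

    Because each iteration is a linear map of the restart vector, the two columns agree up to floating-point error, so per-seed scores can be precomputed once and aggregated for any seed set.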

  7. A Geospatial Semantic Enrichment and Query Service for Geotagged Photographs

    Science.gov (United States)

    Ennis, Andrew; Nugent, Chris; Morrow, Philip; Chen, Liming; Ioannidis, George; Stan, Alexandru; Rachev, Preslav

    2015-01-01

    With the increasing abundance of technologies and smart devices, equipped with a multitude of sensors for sensing the environment around them, information creation and consumption has now become effortless. This, in particular, is the case for photographs with vast amounts being created and shared every day. For example, at the time of this writing, Instagram users upload 70 million photographs a day. Nevertheless, it still remains a challenge to discover the “right” information for the appropriate purpose. This paper describes an approach to create semantic geospatial metadata for photographs, which can facilitate photograph search and discovery. To achieve this we have developed and implemented a semantic geospatial data model by which a photograph can be enriched with geospatial metadata extracted from several geospatial data sources based on the raw low-level geo-metadata from a smartphone photograph. We present the details of our method and implementation for searching and querying the semantic geospatial metadata repository to enable a user or third party system to find the information they are looking for. PMID:26205265

  8. An Effective Framework for Distributed Geospatial Query Processing in Grids

    Directory of Open Access Journals (Sweden)

    CHEN, B.

    2010-08-01

    Full Text Available The emergence of the Internet has greatly revolutionized the way that geospatial information is collected, managed, processed and integrated. There are several important research issues to be addressed for distributed geospatial applications. First, the performance of geospatial applications needs to be considered in the Internet environment. In this regard, the Grid, as an effective distributed computing paradigm, is a good choice. The Grid uses a series of middleware to interconnect and merge various distributed resources into a super-computer capable of high-performance computation. Secondly, it is necessary to ensure the secure use of independent geospatial applications in the Internet environment. The Grid provides secure access to distributed geospatial resources. Additionally, it makes good sense to overcome the heterogeneity between individual geospatial information systems on the Internet. The Open Geospatial Consortium (OGC) proposes a number of generalized geospatial standards, e.g. OGC Web Services (OWS), to achieve interoperable access to geospatial applications. The OWS solution is feasible and widely adopted by both the academic and industry communities. Therefore, we propose an integrated framework that incorporates OWS standards into Grids. Using this framework, distributed geospatial queries can be performed in an interoperable, high-performance and secure Grid environment.

  9. Neural network for intelligent query of an FBI forensic database

    Science.gov (United States)

    Uvanni, Lee A.; Rainey, Timothy G.; Balasubramanian, Uma; Brettle, Dean W.; Weingard, Fred; Sibert, Robert W.; Birnbaum, Eric

    1997-02-01

    Examiner is an automated fired cartridge case identification system utilizing a dual-use neural network pattern recognition technology, called the statistical-multiple object detection and location system (S-MODALS) developed by Booz·Allen & Hamilton, Inc. in conjunction with Rome Laboratory. S-MODALS was originally designed for automatic target recognition (ATR) of tactical and strategic military targets using multisensor fusion [electro-optical (EO), infrared (IR), and synthetic aperture radar (SAR)] sensors. Since S-MODALS is a learning system readily adaptable to problem domains other than automatic target recognition, the pattern matching problem of microscopic marks for firearms evidence was analyzed using S-MODALS. The physics; phenomenology; discrimination and search strategies; robustness requirements; error level and confidence level propagation that apply to the pattern matching problem of military targets were found to be applicable to the ballistic domain as well. The Examiner system uses S-MODALS to rank a set of queried cartridge case images from the most similar to the least similar image in reference to an investigative fired cartridge case image. The paper presents three independent tests and evaluation studies of the Examiner system utilizing the S-MODALS technology for the Federal Bureau of Investigation.

  10. An Association-Oriented Partitioning Approach for Streaming Graph Query

    Directory of Open Access Journals (Sweden)

    Yun Hao

    2017-01-01

    Full Text Available The volumes of real-world graphs like knowledge graph are increasing rapidly, which makes streaming graph processing a hot research area. Processing graphs in streaming setting poses significant challenges from different perspectives, among which graph partitioning method plays a key role. Regarding graph query, a well-designed partitioning method is essential for achieving better performance. Existing offline graph partitioning methods often require full knowledge of the graph, which is not possible during streaming graph processing. In order to handle this problem, we propose an association-oriented streaming graph partitioning method named Assc. This approach first computes the rank values of vertices with a hybrid approximate PageRank algorithm. After splitting these vertices with an adapted variant affinity propagation algorithm, the process order on vertices in the sliding window can be determined. Finally, according to the level of these vertices and their association, the partition where the vertices should be distributed is decided. We compare its performance with a set of streaming graph partition methods and METIS, a widely adopted offline approach. The results show that our solution can partition graphs with hundreds of millions of vertices in streaming setting on a large collection of graph datasets and our approach outperforms other graph partitioning methods.

  11. A Geospatial Semantic Enrichment and Query Service for Geotagged Photographs

    Directory of Open Access Journals (Sweden)

    Andrew Ennis

    2015-07-01

    Full Text Available With the increasing abundance of technologies and smart devices, equipped with a multitude of sensors for sensing the environment around them, information creation and consumption has now become effortless. This, in particular, is the case for photographs with vast amounts being created and shared every day. For example, at the time of this writing, Instagram users upload 70 million photographs a day. Nevertheless, it still remains a challenge to discover the “right” information for the appropriate purpose. This paper describes an approach to create semantic geospatial metadata for photographs, which can facilitate photograph search and discovery. To achieve this we have developed and implemented a semantic geospatial data model by which a photograph can be enriched with geospatial metadata extracted from several geospatial data sources based on the raw low-level geo-metadata from a smartphone photograph. We present the details of our method and implementation for searching and querying the semantic geospatial metadata repository to enable a user or third party system to find the information they are looking for.

  12. An efficient plane-grating monochromator based on conical diffraction for continuous tuning in the entire soft X-ray range including tender X-rays (2-8 keV).

    Science.gov (United States)

    Jark, Werner

    2016-01-01

    Recently it was verified that the diffraction efficiency of reflection gratings with rectangular profile, when illuminated at grazing angles of incidence with the beam trajectory along the grooves and not perpendicular to them, remains very high for tender X-rays of several keV photon energy. This very efficient operation of a reflection grating in the extreme off-plane orientation, i.e. in conical diffraction, offers the possibility of designing a conical diffraction monochromator scheme that provides efficient continuous photon energy tuning over rather large tuning ranges. For example, the tuning could cover photon energies from below 1000 eV up to 8 keV. The expected transmission of the entire instrument is high as all components are always operated below the critical angle for total reflection. In the simplest version of the instrument a plane grating is preceded by a plane mirror rotating simultaneously with it. The photon energy selection will then be made using the combination of a focusing mirror and exit slit. As is common for grating monochromators for soft X-ray radiation, the minimum spectral bandwidth is source-size-limited, while the bandwidth can be adjusted freely to any larger value. As far as tender X-rays (2-8 keV) are concerned, the minimum bandwidth is at least one and up to two orders of magnitude larger than the bandwidth provided by Si(111) double-crystal monochromators in a collimated beam. Therefore the instrument will provide more flux, which can even be increased at the expense of a bandwidth increase. On the other hand, for softer X-rays with photon energies below 1 keV, competitive relative spectral resolving powers of the order of 10000 are possible.

  13. Consciousness as a process of queries and answers in architectures based on in situ representations

    NARCIS (Netherlands)

    van der Velde, F.; van der Velde, Frank

    2013-01-01

    Functional or access consciousness can be described as an ongoing dynamic process of queries and answers. Whenever we have an awareness of an object or its surroundings, it consists of the dynamic process that answers (implicit) queries like "What is the color or shape of the object?" or "What

  14. Query representation by structured concept threads with application to interactive video retrieval

    NARCIS (Netherlands)

    Wang, D.; Wang, Z.; Li, J.; Zhang, B.; Li, X.

    2009-01-01

    In this paper, we provide a new formulation for video queries as structured combination of concept threads, contributing to the general query-by-concept paradigm. Occupying a low-dimensional region in the concept space, concept thread defines a ranked list of video documents ordered by their

  15. Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records.

    Science.gov (United States)

    Luo, Yuan; Szolovits, Peter

    2016-01-01

    In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when being applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretic lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions.
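
    To make the interval-query formulation concrete, the sketch below stores stand-off annotations as closed character-offset intervals in a static centered interval tree and answers a stabbing query (which annotations cover a given offset). This corresponds to the basic interval tree the record analyses, not to the authors' proposed algorithm. The class `IntervalNode` and the sample annotations are hypothetical.

```python
class IntervalNode:
    """Static centered interval tree over closed intervals (start, end, label)."""

    def __init__(self, intervals):
        # Use the median endpoint as the center, so at least one interval
        # contains it and both subtrees are strictly smaller.
        points = sorted(p for s, e, _ in intervals for p in (s, e))
        self.center = points[len(points) // 2]
        here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        # Intervals covering the center, kept in two sort orders for early exit.
        self.by_start = sorted(here, key=lambda iv: iv[0])
        self.by_end = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalNode(left) if left else None
        self.right = IntervalNode(right) if right else None

    def stab(self, q):
        """Return every stored interval that contains position q."""
        out = []
        if q < self.center:
            # Intervals here end at or after the center, so only starts matter.
            for iv in self.by_start:
                if iv[0] > q:
                    break
                out.append(iv)
            if self.left:
                out.extend(self.left.stab(q))
        elif q > self.center:
            # Intervals here start at or before the center, so only ends matter.
            for iv in self.by_end:
                if iv[1] < q:
                    break
                out.append(iv)
            if self.right:
                out.extend(self.right.stab(q))
        else:
            out.extend(self.by_start)
        return out


# Stand-off annotations: offsets into a note, content kept apart from the text.
annotations = [
    (0, 4, "Problem"),
    (0, 20, "Sentence"),
    (10, 19, "Medication"),
    (25, 40, "Sentence"),
]
tree = IntervalNode(annotations)

covering_12 = sorted(label for _, _, label in tree.stab(12))
covering_22 = tree.stab(22)
print(covering_12)
print(covering_22)
```

    A stabbing query is only one position constraint; queries for the other Allen relations (before, meets, overlaps, and so on) need additional orderings or augmentations, which is the gap the record's analysis addresses.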

  16. Approximate distance oracles for planar graphs with improved query time-space tradeoff

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    2016-01-01

    We consider approximate distance oracles for edge-weighted n-vertex undirected planar graphs. Given fixed ϵ > 0, we present a (1 + ϵ)-approximate distance oracle with O(n (log log n)^2) space and O((log log n)^3) query time. This improves the previous best product of query time and space...

  17. Bat-Inspired Algorithm Based Query Expansion for Medical Web Information Retrieval.

    Science.gov (United States)

    Khennak, Ilyes; Drias, Habiba

    2017-02-01

    With the increasing amount of medical data available on the Web, looking for health information has become one of the most widely searched topics on the Internet. Patients and people of several backgrounds are now using Web search engines to acquire medical information, including information about a specific disease, medical treatment or professional advice. Nonetheless, due to a lack of medical knowledge, many laypeople have difficulties in forming appropriate queries to articulate their inquiries, which renders their search queries imprecise due to the use of unclear keywords. The use of these ambiguous and vague queries to describe the patients' needs has resulted in a failure of Web search engines to retrieve accurate and relevant information. One of the most natural and promising methods to overcome this drawback is Query Expansion. In this paper, an original approach based on Bat Algorithm is proposed to improve the retrieval effectiveness of query expansion in the medical field. In contrast to the existing literature, the proposed approach uses Bat Algorithm to find the best expanded query among a set of expanded query candidates, while maintaining low computational complexity. Moreover, this new approach allows the determination of the length of the expanded query empirically. Numerical results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient compared to the baseline.

  18. An XML-Enabled Data Mining Query Language XML-DMQL

    NARCIS (Netherlands)

    Feng, L.; Dillon, T.

    2005-01-01

    Inspired by the good work of Han et al. (1996) and Elfeky et al. (2001) on the design of data mining query languages for relational and object-oriented databases, in this paper, we develop an expressive XML-enabled data mining query language by extension of XQuery. We first describe some

  19. UMass at TREC WEB 2014: Entity Query Feature Expansion using Knowledge Base Links

    Science.gov (United States)

    2014-11-01

    task on the category A subset and demonstrate the benefit of entity-centric approaches even for non-entity queries like “dark chocolate health benefits”.

  20. Practical querying of temporal data via OWL 2 QL and SQL:2011

    CSIR Research Space (South Africa)

    Klarman, S

    2013-12-01

    Full Text Available We develop a practical approach to querying temporal data stored in temporal SQL:2011 databases through the semantic layer of OWL 2 QL ontologies. An interval-based temporal query language (TQL), which we propose for this task, is defined via...

  1. DirQ: A Directed Query Dissemination Scheme for Wireless Sensor Networks

    NARCIS (Netherlands)

    Chatterjea, Supriyo; De Luigi, Simone; Havinga, Paul J.M.; Kaminska, B

    This paper describes a Directed Query Dissemination Scheme, DirQ that routes queries to the appropriate source nodes based on both constant and dynamic-valued attributes such as sensor types and sensor values. Location information is not essential for the operation of DirQ. DirQ only uses locally

  2. Robust Runtime Optimization and Skew-Resistant Execution of Analytical SPARQL Queries on Pig

    NARCIS (Netherlands)

    S Kotoulas; J. Urbani; P.A. Boncz (Peter); P. Mika

    2012-01-01

    We describe a system that incrementally translates SPARQL queries to Pig Latin and executes them on a Hadoop cluster. This system is designed to work efficiently on complex queries with many self-joins over huge datasets, avoiding job failures even in the case of joins with unexpected

  3. A Database Query Processing Model in Peer-To-Peer Network ...

    African Journals Online (AJOL)

    Peer-to-peer databases are becoming more prevalent on the internet for sharing and distributing applications, documents, files, and other digital media. The problem associated with answering large-scale ad hoc analysis queries, aggregation queries, on these databases poses unique challenges. This paper presents an ...

  4. Memory-Aware Query Routing in Interactive Web-based Information Systems

    NARCIS (Netherlands)

    F. Waas; M.L. Kersten (Martin)

    2001-01-01

    Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications:

  5. A probabilistic approach for mapping free-text queries to complex web forms

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    Web applications with complex interfaces consisting of multiple input fields should understand free-text queries. We propose a probabilistic approach to map parts of a free-text query to the fields of a complex web form. Our method uses token models rather than only static dictionaries to create

  6. Counting, Enumerating and Sampling of Execution Plans in a Cost-Based Query Optimizer

    NARCIS (Netherlands)

    F. Waas; C.A. Galindo-Legaria

    2000-01-01

    Testing an SQL database system by running large sets of deterministic or stochastic SQL statements is common practice in commercial database development. However, code defects often remain undetected as the query optimizer's choice of an execution plan is not only depending on the query

  7. An Overview of Data Models and Query Languages for Content-based Video Retrieval

    NARCIS (Netherlands)

    Petkovic, M.; Jonker, Willem

    As a large amount of video data becomes publicly available, the need to model and query this data efficiently becomes significant. Consequently, content-based retrieval of video data turns out to be a challenging and important problem addressing areas such as video modelling, indexing, querying,

  8. Stemming Methodologies Over Individual Query Words for an Arabic Information Retrieval System.

    Science.gov (United States)

    Abu-Salem, Hani; Al-Omari, Mahmoud; Evens, Martha W.

    1999-01-01

    Investigates how to improve the performance of an Arabic Information Retrieval System (Arabic-IRS) by imposing the retrieval method, Mixed Stemming, over individual words of a query depending on the importance of the word, the stem or the root of the query terms in the database. This method computes term importance using a Term Frequency and…

  9. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    In today’s OLAP systems, physically integrating fast-changing data, e.g., stock quotes, into a cube is complex and time-consuming. This data is likely to be available in XML format on the WWW; thus, instead of physical integration, making XML data logically federated with OLAP systems is desirable....... In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine as well as the query evaluation techniques. Performance experiments with a prototypical implementation suggest...... that the performance for OLAP-XML federations is comparable to queries on physically integrated data....

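  10. Review: Implementasi Holap Untuk Optimasi Query Sistem Basis Data Terdistribusi Dengan Pendekatan Algoritma Genetik [Review: HOLAP Implementation for Query Optimization in Distributed Database Systems Using a Genetic Algorithm Approach]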

    Directory of Open Access Journals (Sweden)

    Rahmad Syaifudin

    2016-01-01

    A distributed database is a database under the control of a Database Management System (DBMS) in which the storage devices are separated from one another. Query optimization in a distributed database system cannot be separated from the data processing methods used, so fast query optimization requires methods suited to such a database. Hybrid online analytical processing (HOLAP), often called Hybrid-OLAP, is one technology for query optimization in distributed databases. A genetic algorithm is a heuristic search algorithm based on the mechanisms of biological evolution; it combines a selection process with crossover and mutation operators to find the best solution. This review of HOLAP implementation with a genetic algorithm approach is expected to serve as a basis for research on HOLAP implementation for query optimization in distributed databases using genetic algorithms. Keywords: Query Optimization; Distributed database; HOLAP; OLAP; Genetic algorithm.
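    The selection, crossover and mutation loop described in this abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the reviewed HOLAP method: the bit-string encoding, the function names `tournament` and `evolve`, the parameter defaults, and the toy fitness function (counting 1-bits) are all hypothetical. A real query optimizer would encode candidate plans and score them with a cost model.

```python
import random


def tournament(pop, fitness, rng):
    """Selection: pick two random individuals and keep the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve(fitness, n_bits=20, pop_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy genetic algorithm over bit strings (hypothetical encoding)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        # Remember the best individual seen in any generation.
        best = max(pop + [best], key=fitness)
    return best


if __name__ == "__main__":
    # Stand-in objective: maximise the number of 1-bits.
    print(sum(evolve(sum)))
```

    Tournament selection and single-point crossover are used here only because they are short to write; the abstract does not say which operators the reviewed work uses.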

  11. Continuous auditing & continuous monitoring : Continuous value?

    NARCIS (Netherlands)

    van Hillo, Rutger; Weigand, Hans; Espana, S; Ralyte, J; Souveyet, C

    2016-01-01

    Advancements in information technology, new laws and regulations and rapidly changing business conditions have led to a need for more timely and ongoing assurance with effectively working controls. Continuous Auditing (CA) and Continuous Monitoring (CM) technologies have made this possible by

  12. Cyber Graph Queries for Geographically Distributed Data Centers

    Energy Technology Data Exchange (ETDEWEB)

    Berry, Jonathan W. [Mail Stop, Albuquerque, NM (United States); Collins, Michael [Christopher Newport Univ., VA (United States); Kearns, Aaron [Univ. of New Mexico, Albuquerque, NM (United States); Phillips, Cynthia A. [Mail Stop, Albuquerque, NM (United States); Saia, Jared [Univ. of New Mexico, Albuquerque, NM (United States)

    2015-05-01

    We present new algorithms for a distributed model for graph computations motivated by limited information sharing we first discussed in [20]. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem, finding a subgraph that is larger and denser than expected. For both the low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.
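    To make the "bounded number of triangles per node" property concrete, the following Python sketch counts, for every node of an undirected graph, how many triangles it takes part in. It is an illustration only: the function name `triangles_per_node` and the edge-list input format are assumptions, and the abstract does not describe how the authors measured this property.

```python
from itertools import combinations


def triangles_per_node(edges):
    """Return {node: number of triangles containing that node}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # A triangle at u is a pair of u's neighbours that are themselves adjacent.
    return {
        u: sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
        for u, nbrs in adj.items()
    }


if __name__ == "__main__":
    # One triangle (a, b, c) plus a pendant node d attached to c.
    counts = triangles_per_node([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
    print(max(counts.values()))  # largest per-node triangle count
```

    A graph would satisfy a bounded-triangles condition for a chosen bound if the maximum of these counts stays below that bound; hub-like automated accounts of the kind the abstract mentions would show up as nodes with unusually large counts.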

  13. QueryArch3D: Querying and Visualising 3D Models of a Maya Archaeological Site in a Web-Based Interface

    Directory of Open Access Journals (Sweden)

    Giorgio Agugiaro

    2011-12-01

    Constant improvements in the field of surveying, computing and distribution of digital content are reshaping the way Cultural Heritage can be digitised and virtually accessed, even remotely via the web. A traditional 2D approach for data access, retrieval and exploration may generally suffice; however, more complex analyses concerning spatial and temporal features require 3D tools, which, in some cases, have not yet been implemented or are not yet generally commercially available. Efficient organisation and integration strategies applicable to the wide array of heterogeneous data in the field of Cultural Heritage represent a hot research topic nowadays. This article presents a visualisation and query tool (QueryArch3D) conceived to deal with multi-resolution 3D models. Geometric data are organised in successive levels of detail (LoD), provided with geometric and semantic hierarchies and enriched with attributes coming from external data sources. The visualisation and query front-end enables the 3D navigation of the models in a virtual environment, as well as the interaction with the objects by means of queries based on attributes or on geometries. The tool can be used as a standalone application, or served through the web. The characteristics of the research work, along with some implementation issues and the developed QueryArch3D tool, will be discussed and presented.

  14. CSA: A Credibility Search Algorithm Based on Different Query in Unstructured Peer-to-Peer Networks

    Directory of Open Access Journals (Sweden)

    Hongyan Mei

    2014-01-01

    Searching efficiently for resources with low network bandwidth consumption has become a challenging task in unstructured peer-to-peer (P2P) networks. Heuristic search is an effective mechanism that relies on previous searches to guide future ones. With such methods, searching for high-repetition resources is more effective. However, the performance of searches for non-repetition, low-repetition or rare resources needs to be improved. To address this problem, and considering the similarity between social networks and unstructured P2P networks, we present a credibility search algorithm based on different queries, following the trust production principle in sociology and psychology. In this method, queries are divided into familiar queries and unfamiliar queries. For different queries, we adopt different ways to obtain the credibility that a node assigns to each of its neighbors. Queries are then forwarded by the neighbor nodes with higher credibility. Experimental results show that our method can improve query hit rate and reduce search delay with low bandwidth consumption in three different network topologies under static and dynamic network environments.

  15. Federated queries of clinical data repositories: the sum of the parts does not equal the whole.

    Science.gov (United States)

    Weber, Griffin M

    2013-06-01

    In 2008 we developed a shared health research information network (SHRINE), which for the first time enabled research queries across the full patient populations of four Boston hospitals. It uses a federated architecture, where each hospital returns only the aggregate count of the number of patients who match a query. This allows hospitals to retain control over their local databases and comply with federal and state privacy laws. However, because patients may receive care from multiple hospitals, the result of a federated query might differ from what the result would be if the query were run against a single central repository. This paper describes the situations when this happens and presents a technique for correcting these errors. We use a one-time process of identifying which patients have data in multiple repositories by comparing one-way hash values of patient demographics. This enables us to partition the local databases such that all patients within a given partition have data at the same subset of hospitals. Federated queries are then run independently on each partition, and the combined results are presented to the user. Using theoretical bounds and simulated hospital networks, we demonstrate that once the partitions are made, SHRINE can produce more precise estimates of the number of patients matching a query. Uncertainty in the overlap of patient populations across hospitals limits the effectiveness of SHRINE and other federated query tools. Our technique reduces this uncertainty while retaining an aggregate federated architecture.
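    The effect of partitioning on aggregate counts can be illustrated with a small Python sketch. It is not SHRINE's actual estimator, only a hedged illustration of the general idea: when overlap between sites is unknown, the number of distinct matching patients lies between the largest single count and the sum of all counts, and applying that bound separately to each partition narrows the range. The function names, the dictionary layout and the example counts are all hypothetical.

```python
def naive_bounds(counts):
    """Bounds on distinct matching patients when site overlap is unknown.

    Lower bound: every match at a smaller site is also counted at the largest.
    Upper bound: no patient is counted at more than one site.
    """
    counts = list(counts)
    return max(counts, default=0), sum(counts)


def partitioned_bounds(partition_counts):
    """Combine per-partition bounds.

    partition_counts maps a frozenset of hospital names (the sites at which
    every patient of that partition has data) to {hospital: matching count}.
    Partitions are disjoint, so their bounds simply add up.
    """
    low = high = 0
    for counts in partition_counts.values():
        lo, hi = naive_bounds(counts.values())
        low += lo
        high += hi
    return low, high


if __name__ == "__main__":
    # Hypothetical query: hospital A reports 100 matches, hospital B reports 80.
    print(naive_bounds([100, 80]))  # (100, 180)
    parts = {
        frozenset({"A"}): {"A": 70},                 # patients seen only at A
        frozenset({"B"}): {"B": 50},                 # patients seen only at B
        frozenset({"A", "B"}): {"A": 30, "B": 30},   # patients seen at both
    }
    print(partitioned_bounds(parts))  # (150, 180)
```

    In this made-up example the single-site partitions contribute exact counts, so all remaining uncertainty is confined to the partition of patients seen at both hospitals.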

  16. Fragger: a protein fragment picker for structural queries [version 2; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Francois Berenger

    2018-04-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
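    The kind of query described here (a distance threshold, ranking by distance, and a regular-expression constraint on the sequence) can be sketched in Python as follows. This is not Fragger's code or interface: the function names, the tuple layout of the toy database and the coordinates are hypothetical, and the distance is a plain coordinate RMSD without structural superposition or gap handling, which is a simplification.

```python
import math
import re


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of 3D points."""
    if len(a) != len(b):
        raise ValueError("fragments must have the same number of points")
    total = sum((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
                for (x1, y1, z1), (x2, y2, z2) in zip(a, b))
    return math.sqrt(total / len(a))


def search(database, query_coords, max_rmsd, seq_pattern=".*"):
    """Return (distance, name) pairs within max_rmsd, closest first.

    database is an iterable of (name, sequence, coords) tuples; seq_pattern
    is a regular expression over one-letter amino acid codes.
    """
    pattern = re.compile(seq_pattern)
    hits = []
    for name, seq, coords in database:
        if len(coords) != len(query_coords) or not pattern.fullmatch(seq):
            continue
        d = rmsd(query_coords, coords)
        if d <= max_rmsd:
            hits.append((d, name))
    return sorted(hits)


if __name__ == "__main__":
    query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    db = [
        ("frag_a", "GAV", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]),
        ("frag_b", "GPV", [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0), (2.0, 0.5, 0.0)]),
        ("frag_c", "GAV", [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0), (2.0, 3.0, 0.0)]),
    ]
    print(search(db, query, max_rmsd=1.0))                        # frag_a, then frag_b
    print(search(db, query, max_rmsd=1.0, seq_pattern="G[AS]V"))  # frag_a only
```

    A linear scan is used for brevity; the abstract states that Fragger searches efficiently but does not say how, so no indexing scheme is assumed here.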

  17. An Energy-Efficient Skyline Query for Massively Multidimensional Sensing Data.

    Science.gov (United States)

    Wang, Yan; Wei, Wei; Deng, Qingxu; Liu, Wei; Song, Houbing

    2016-01-09

    Cyber physical systems (CPS) sense the environment based on wireless sensor networks. The sensing data of such systems present the characteristics of massiveness and multi-dimensionality. As one of the major monitoring methods used in safe production monitoring and disaster early-warning applications, skyline query algorithms are extensively adopted for multiple-objective decision analysis of these sensing data. With the expansion of network sizes, the amount of sensing data increases sharply. Improving the query efficiency of skyline query algorithms and reducing the transmission energy consumption have therefore become pressing and difficult issues. This paper proposes a new energy-efficient skyline query method for massively multidimensional sensing data. First, the method uses a node cut strategy to dynamically generate filtering tuples with little computational overhead when collecting query results instead of issuing queries with filters. It can judge the domination relationship among different nodes, remove the detected data sets of dominated nodes that are irrelevant to the query, modify the query path dynamically, and reduce the data comparison and computational overhead. The efficient dynamic filter generated by this strategy uses little non-skyline data transmission in the network, and the transmission distance is very short. Second, our method also employs the tuple-cutting strategy inside the node and generates the local cutting tuples by the sub-tree with the node itself as the root node, which will be used to cut the detected data within the nodes of the sub-tree. Therefore, it can further control the non-skyline data uploading. A large number of experimental results show that our method can quickly return an overview of the monitored area and reduce the communication overhead. Additionally, it can shorten the response time and improve the efficiency of the query.
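    The domination relationship and the use of filtering tuples to suppress non-skyline data can be illustrated with a short Python sketch. It shows only the generic skyline definition and a simple local pruning step, not the paper's node cut or tuple-cutting algorithms. The function names are hypothetical, and smaller values are assumed to be better in every dimension.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and better in one.

    Assumes smaller values are preferable in all dimensions.
    """
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def prune_with_filter(local_points, filter_tuples):
    """Drop local readings dominated by a received filtering tuple.

    A sensor node can apply this before uploading, so that readings which
    cannot be part of the global skyline are never transmitted.
    """
    return [p for p in local_points
            if not any(dominates(f, p) for f in filter_tuples)]


if __name__ == "__main__":
    readings = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
    print(skyline(readings))  # [(1, 5), (2, 2), (5, 1)]
    print(prune_with_filter([(3, 3), (1, 6), (6, 0)], [(2, 2)]))  # [(1, 6), (6, 0)]
```

    The quadratic comparison in `skyline` is kept for clarity; the pruning step is where the transmission saving would come from, since a dominated reading is discarded at the node instead of being sent upstream.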

  18. An Energy-Efficient Skyline Query for Massively Multidimensional Sensing Data

    Directory of Open Access Journals (Sweden)

    Yan Wang

    2016-01-01

    Cyber physical systems (CPS) sense the environment based on wireless sensor networks. The sensing data of such systems present the characteristics of massiveness and multi-dimensionality. As one of the major monitoring methods used in safe production monitoring and disaster early-warning applications, skyline query algorithms are extensively adopted for multiple-objective decision analysis of these sensing data. With the expansion of network sizes, the amount of sensing data increases sharply. Improving the query efficiency of skyline query algorithms and reducing the transmission energy consumption have therefore become pressing and difficult issues. This paper proposes a new energy-efficient skyline query method for massively multidimensional sensing data. First, the method uses a node cut strategy to dynamically generate filtering tuples with little computational overhead when collecting query results instead of issuing queries with filters. It can judge the domination relationship among different nodes, remove the detected data sets of dominated nodes that are irrelevant to the query, modify the query path dynamically, and reduce the data comparison and computational overhead. The efficient dynamic filter generated by this strategy uses little non-skyline data transmission in the network, and the transmission distance is very short. Second, our method also employs the tuple-cutting strategy inside the node and generates the local cutting tuples by the sub-tree with the node itself as the root node, which will be used to cut the detected data within the nodes of the sub-tree. Therefore, it can further control the non-skyline data uploading. A large number of experimental results show that our method can quickly return an overview of the monitored area and reduce the communication overhead. Additionally, it can shorten the response time and improve the efficiency of the query.

  19. BioFed: federated query processing over life sciences linked open data.

    Science.gov (United States)

    Hasnain, Ali; Mehmood, Qaiser; Sana E Zainab, Syeda; Saleem, Muhammad; Warren, Claude; Zehra, Durre; Decker, Stefan; Rebholz-Schuhmann, Dietrich

    2017-03-15

    Biomedical data, e.g. from knowledge bases and ontologies, is increasingly made available following open linked data principles, at best as RDF triple data. This is a necessary step towards unified access to biological data sets, but this still requires solutions to query multiple endpoints for their heterogeneous data to eventually retrieve all the meaningful information. Suggested solutions are based on query federation approaches, which require the submission of SPARQL queries to endpoints. Due to the size and complexity of available data, these solutions have to be optimised for efficient retrieval times and for users in life sciences research. Last but not least, over time, the reliability of data resources in terms of access and quality has to be monitored. Our solution (BioFed) federates data over 130 SPARQL endpoints in life sciences and tailors query submission according to the provenance information. BioFed has been evaluated against the state-of-the-art solution FedX and forms an important benchmark for the life science domain. The efficient cataloguing approach of the federated query processing system 'BioFed', the triple-pattern-wise source selection and the semantic source normalisation form the core of our solution. It gathers and integrates data from newly identified public endpoints for federated access. Basic provenance information is linked to the retrieved data. Last but not least, BioFed makes use of the latest SPARQL standard (i.e., 1.1) to leverage the full benefits for query federation. The evaluation is based on 10 simple and 10 complex queries, which address data in 10 major and very popular data sources (e.g., Drugbank, Sider). BioFed is a solution for a single point of access for a large number of SPARQL endpoints providing life science data. It facilitates efficient query generation for data access and provides basic provenance information in combination with the retrieved data. BioFed fully supports SPARQL 1.1 and gives access to the

  20. Sharing-Aware Horizontal Partitioning for Exploiting Correlations during Query Processing

    DEFF Research Database (Denmark)

    Tzoumas, Kostas; Deshpande, Amol; Jensen, Christian Søndergaard

    2010-01-01

    Optimization of join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each partition having substantially different statistical characteristics. It is very compelling to discover such data...... partitions during query optimization and create multiple plans for a given query, one plan being optimal for a particular combination of data partitions. This scenario calls for the sharing of state among plans, so that common intermediate results are not recomputed. We study this problem in a setting...