WorldWideScience

Sample records for graphs-at-a-time query language

  1. Implementing GraphQL as a Query Language for Deductive Databases in SWI-Prolog Using DCGs, Quasi Quotations, and Dicts

    Directory of Open Access Journals (Sweden)

    Falco Nogatz

    2017-01-01

    Full Text Available The methods to access large relational databases in a distributed system are well established: the relational query language SQL often serves as a language for data access and manipulation, and in addition public interfaces are exposed using communication protocols like REST. Similarly to REST, GraphQL is the query protocol of an application layer developed by Facebook. It provides a unified interface between the client and the server for data fetching and manipulation. Using GraphQL's type system, it is possible to specify data handling of various sources and to combine, e.g., relational with NoSQL databases. In contrast to REST, GraphQL provides a single API endpoint and supports flexible queries over linked data. GraphQL can also be used as an interface for deductive databases. In this paper, we give an introduction to GraphQL and a comparison to REST. Using language features recently added to SWI-Prolog 7, we have developed the Prolog library GraphQL.pl, which implements the GraphQL type system and query syntax as a domain-specific language with the help of definite clause grammars (DCGs), quasi quotations, and dicts. Using our library, the type system created for a deductive database can be validated, while the query system provides a unified interface for data access and introspection.
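
    As a concrete illustration of the single-endpoint, flexible-query style described above, the following Python sketch posts a GraphQL query over HTTP. The endpoint URL and the schema (person, friends, name, age) are hypothetical; this is a minimal sketch, not the GraphQL.pl library itself.

        # Minimal sketch of querying a GraphQL endpoint from Python.
        # The endpoint and the schema used in QUERY are assumptions for illustration.
        import requests

        QUERY = """
        {
          person(id: "p42") {
            name
            friends { name age }
          }
        }
        """

        def run_query(endpoint: str, query: str) -> dict:
            # GraphQL conventionally accepts a JSON body with a "query" field on a single endpoint.
            response = requests.post(endpoint, json={"query": query}, timeout=10)
            response.raise_for_status()
            return response.json()["data"]

        if __name__ == "__main__":
            print(run_query("http://localhost:8000/graphql", QUERY))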

  2. Labeling RDF Graphs for Linear Time and Space Querying

    Science.gov (United States)

    Furche, Tim; Weinzierl, Antonius; Bry, François

    Indices and data structures for web querying have mostly considered tree-shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML) the data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph data, with a focus on support for efficient adjacency and reachability queries. For XML, labeling schemes are an important part of the widespread adoption of XML, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes, the constant time (and per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.
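
    The constant-time adjacency and reachability tests mentioned above are easiest to see in the classic interval labeling scheme for trees, which the chapter's scheme extends to a larger class of RDF graphs. Below is a minimal Python sketch of the tree case only (illustrative, not the authors' RDF labeling).

        # Interval (pre/post) labeling of a tree: each node gets [start, end];
        # v is a descendant of u iff start_u <= start_v <= end_u, an O(1) test
        # using O(1) space per node.
        def label_tree(children, root):
            labels, counter = {}, 0
            stack = [(root, False)]
            while stack:
                node, closing = stack.pop()
                if closing:
                    labels[node] = (labels[node], counter)  # close the interval
                    continue
                labels[node] = counter                      # open the interval
                counter += 1
                stack.append((node, True))
                for child in children.get(node, []):
                    stack.append((child, False))
            return labels

        def reaches(labels, u, v):
            start_u, end_u = labels[u]
            start_v, _ = labels[v]
            return start_u <= start_v <= end_u

        tree = {"a": ["b", "c"], "b": ["d"]}
        labels = label_tree(tree, "a")
        print(reaches(labels, "a", "d"), reaches(labels, "b", "c"))  # True False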

  3. A distributed query execution engine of big attributed graphs.

    Science.gov (United States)

    Batarfi, Omar; Elshawi, Radwa; Fayoumi, Ayman; Barnawi, Ahmed; Sakr, Sherif

    2016-01-01

    A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations including reachability, pattern matching and shortest path, where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation, and performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine for G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores, while other parts of the query plan are evaluated, as necessary, via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets, as well as its ability to outperform Apache Giraph, a popular distributed graph processing system, by orders of magnitude.
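
    The hybrid evaluation strategy (attributes in a relational store, structure traversed in memory) can be sketched as follows; the table, attributes, and query are made up for illustration and do not reproduce DG-SPARQL itself.

        # Sketch of hybrid evaluation: a value predicate is pushed to SQL,
        # the structural (reachability) part is answered by in-memory BFS.
        # Schema and data are hypothetical.
        import sqlite3
        from collections import deque

        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE vertex (id TEXT PRIMARY KEY, age INTEGER)")
        db.executemany("INSERT INTO vertex VALUES (?, ?)",
                       [("a", 25), ("b", 40), ("c", 31)])

        adj = {"a": ["b"], "b": ["c"], "c": []}  # graph topology kept in memory

        def reachable(src, dst):
            seen, frontier = {src}, deque([src])
            while frontier:
                node = frontier.popleft()
                if node == dst:
                    return True
                for nxt in adj.get(node, []):
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
            return False

        # "Vertices with age > 30 reachable from 'a'": SQL filters, traversal checks.
        candidates = [row[0] for row in db.execute("SELECT id FROM vertex WHERE age > 30")]
        print([v for v in candidates if reachable("a", v)])  # ['b', 'c']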

  4. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  5. Constant time distance queries in planar unweighted graphs with subquadratic preprocessing time

    DEFF Research Database (Denmark)

    Wulff-Nilsen, C.

    2013-01-01

    Let G be an n-vertex planar, undirected, and unweighted graph. It was stated as open problems whether the Wiener index, defined as the sum of all-pairs shortest path distances, and the diameter of G can be computed in o(n^2) time. We show that both problems can be solved in O(n^2 log log n / log n) time with O(n) space. The techniques that we apply allow us to build, within the same time bound, an oracle for exact distance queries in G. More generally, for any parameter S in [(log n / log log n)^2, n^(2/5)], distance queries can be answered in O(sqrt(S) log S / log n) time per query with O(n^2 / sqrt(S)) preprocessing time and space requirement. With respect to running time, this is better than previous algorithms when log S = o(log n). All algorithms have linear space requirement. Our results generalize to a larger class of graphs including those with a fixed excluded minor.

  6. The CMS DBS query language

    International Nuclear Information System (INIS)

    Kuznetsov, Valentin; Riley, Daniel; Afaq, Anzar; Sekhri, Vijay; Guo Yuyi; Lueking, Lee

    2010-01-01

    The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to the underlying database. We describe the design of the query system, provide details of the language components, and give an overview of how this component fits into the overall data discovery system architecture.
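
    The central trick of discovering join conditions from a graph of the schema can be sketched as a breadth-first search over a table graph followed by SQL generation; the tables and keys below are hypothetical and do not reflect the real DBS schema.

        # Sketch: find a join path between two tables in a schema graph, then
        # emit the SQL join. Tables and join keys are made-up placeholders.
        from collections import deque

        JOINS = {
            ("dataset", "block"): "dataset.id = block.dataset_id",
            ("block", "file"):    "block.id = file.block_id",
        }
        NEIGHBORS = {}
        for a, b in JOINS:
            NEIGHBORS.setdefault(a, []).append(b)
            NEIGHBORS.setdefault(b, []).append(a)

        def join_path(start, goal):
            queue, seen = deque([[start]]), {start}
            while queue:
                path = queue.popleft()
                if path[-1] == goal:
                    return path
                for nxt in NEIGHBORS.get(path[-1], []):
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(path + [nxt])
            return None

        def build_sql(select_table, where_table, predicate):
            path = join_path(select_table, where_table)
            conds = [JOINS.get((a, b)) or JOINS[(b, a)] for a, b in zip(path, path[1:])]
            return (f"SELECT {select_table}.* FROM {', '.join(path)} "
                    f"WHERE {' AND '.join(conds + [predicate])}")

        print(build_sql("dataset", "file", "file.name = '/store/x.root'"))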

  7. Approximate distance oracles for planar graphs with improved query time-space tradeoff

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    2016-01-01

    We consider approximate distance oracles for edge-weighted n-vertex undirected planar graphs. Given fixed ϵ > 0, we present a (1 + ϵ)-approximate distance oracle with O(n (log log n)^2) space and O((log log n)^3) query time. This improves the previous best product of query time and space of the oracles of Thorup (FOCS 2001, J. ACM 2004) and Klein (SODA 2002) from O(n log n) to O(n (log log n)^5).

  8. On a Fuzzy Algebra for Querying Graph Databases

    OpenAIRE

    Pivert , Olivier; Thion , Virginie; Jaudoin , Hélène; Smits , Grégory

    2014-01-01

    This paper proposes a notion of fuzzy graph database and describes a fuzzy query algebra that makes it possible to handle such databases, which may be fuzzy or not, in a flexible way. The algebra, based on fuzzy set theory and the concept of a fuzzy graph, is composed of a set of operators that can be used to express preference queries on fuzzy graph databases. The preferences concern i) the content of the vertices of the graph and ii) the structure of the graph. In a s...

  9. Query optimization for graph analytics on linked data using SPARQL

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Seokyong [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Sangkeun [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lim, Seung -Hwan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sukumar, Sreenivas R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Vatsavai, Ranga Raju [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-07-01

    Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.
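
    The idea of expressing a graph-mining primitive as a SPARQL query can be illustrated with triangle counting; the sketch below runs the query locally with rdflib over a toy graph, whereas the paper issues comparable queries against a triplestore's SPARQL endpoint. The predicate name is made up.

        # Triangle counting expressed as a SPARQL query (run locally via rdflib).
        from rdflib import Graph

        DATA = """
        @prefix : <http://example.org/> .
        :a :linksTo :b . :b :linksTo :c . :c :linksTo :a .
        :c :linksTo :d .
        """

        TRIANGLES = """
        PREFIX : <http://example.org/>
        SELECT (COUNT(*) AS ?n) WHERE {
          ?x :linksTo ?y . ?y :linksTo ?z . ?z :linksTo ?x .
        }
        """

        g = Graph()
        g.parse(data=DATA, format="turtle")
        # Each directed triangle matches once per rotation of (?x, ?y, ?z), so divide by 3.
        count = next(iter(g.query(TRIANGLES)))[0].toPython() // 3
        print("triangles:", count)  # 1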

  10. Manchester visual query language

    Science.gov (United States)

    Oakley, John P.; Davis, Darryl N.; Shann, Richard T.

    1993-04-01

    We report a database language for visual retrieval which allows queries on image feature information which has been computed and stored along with images. The language is novel in that it provides facilities for dealing with feature data which has actually been obtained from image analysis. Each line in the Manchester Visual Query Language (MVQL) takes a set of objects as input and produces another, usually smaller, set as output. The MVQL constructs are mainly based on proven operators from the field of digital image analysis. An example is the Hough-group operator which takes as input a specification for the objects to be grouped, a specification for the relevant Hough space, and a definition of the voting rule. The output is a ranked list of high scoring bins. The query could be directed towards one particular image or an entire image database; in the latter case, the bins in the output list would in general be associated with different images. We have implemented MVQL in two layers. The command interpreter is a Lisp program which maps each MVQL line to a sequence of commands which are used to control a specialized database engine. The latter is a hybrid graph/relational system which provides low-level support for inheritance and schema evolution. In the paper we outline the language and provide examples of useful queries. We also describe our solution to the engineering problems associated with the implementation of MVQL.

  11. LogiQL a query language for smart databases

    CERN Document Server

    Halpin, Terry

    2014-01-01

    LogiQL is a new state-of-the-art programming language based on Datalog. It can be used to build applications that combine transactional, analytical, graph, probabilistic, and mathematical programming. LogiQL makes it possible to build hybrid applications that previously required multiple programming languages and databases. In this first book to cover LogiQL, the authors explain how to design, implement, and query deductive databases using this new programming language. LogiQL's declarative approach enables complex data structures and business rules to be simply specified and then automaticall

  12. Relative expressive power of navigational querying on graphs using transitive closure

    OpenAIRE

    Surinx, Dimitri; Fletcher, George H. L.; Gyssens, Marc; Leinders, Dirk; Van den Bussche, Jan; Van Gucht, Dirk; Vansummeren, Stijn; Wu, Yuqing

    2015-01-01

    Motivated by both established and new applications, we study navigational query languages for graphs (binary relations). The simplest language has only the two operators union and composition, together with the identity relation. We make more powerful languages by adding any of the following operators: intersection; set difference; projection; coprojection; converse; transitive closure; and the diversity relation. All these operators map binary relations to binary relations. We compare the ex...
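
    To make the algebra concrete, here is a small Python sketch of a few of these operators (composition, converse, transitive closure) on binary relations represented as sets of pairs; the example relation is made up.

        # Navigational operators on binary relations as sets of pairs.
        def compose(r, s):
            return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

        def converse(r):
            return {(b, a) for (a, b) in r}

        def transitive_closure(r):
            closure = set(r)
            while True:
                bigger = closure | compose(closure, r)
                if bigger == closure:
                    return closure
                closure = bigger

        knows = {("ann", "bob"), ("bob", "cid")}
        print(compose(knows, knows))       # {('ann', 'cid')}
        print(converse(knows))             # edges reversed
        print(transitive_closure(knows))   # knows plus ('ann', 'cid')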

  13. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim; Mansour, Essam; Ouzzani, Mourad; Aboulnaga, Ashraf; Kalnis, Panos

    2017-01-01

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query

  14. VIGOR: Interactive Visual Exploration of Graph Query Results.

    Science.gov (United States)

    Pienta, Robert; Hohman, Fred; Endert, Alex; Tamersoy, Acar; Roundy, Kevin; Gates, Chris; Navathe, Shamkant; Chau, Duen Horng

    2018-01-01

    Finding patterns in graphs has become a vital challenge in many domains from biological systems, network security, to finance (e.g., finding money laundering rings of bankers and business owners). While there is significant interest in graph databases and querying techniques, less research has focused on helping analysts make sense of underlying patterns within a group of subgraph results. Visualizing graph query results is challenging, requiring effective summarization of a large number of subgraphs, each having potentially shared node-values, rich node features, and flexible structure across queries. We present VIGOR, a novel interactive visual analytics system, for exploring and making sense of query results. VIGOR uses multiple coordinated views, leveraging different data representations and organizations to streamline analysts' sensemaking process. VIGOR contributes: (1) an exemplar-based interaction technique, where an analyst starts with a specific result and relaxes constraints to find other similar results or starts with only the structure (i.e., without node value constraints), and adds constraints to narrow in on specific results; and (2) a novel feature-aware subgraph result summarization. Through a collaboration with Symantec, we demonstrate how VIGOR helps tackle real-world problems through the discovery of security blindspots in a cybersecurity dataset with over 11,000 incidents. We also evaluate VIGOR with a within-subjects study, demonstrating VIGOR's ease of use over a leading graph database management system, and its ability to help analysts understand their results at higher speed and make fewer errors.

  15. Parasol: An Architecture for Cross-Cloud Federated Graph Querying

    Energy Technology Data Exchange (ETDEWEB)

    Lieberman, Michael; Choudhury, Sutanay; Hughes, Marisa; Patrone, Dennis; Hider, Sandy; Piatko, Christine; Chapman, Matthew; Marple, JP; Silberberg, David

    2014-06-22

    Large scale data fusion of multiple datasets can often provide insights that examining datasets individually cannot. However, when these datasets reside in different data centers and cannot be collocated due to technical, administrative, or policy barriers, a unique set of problems arise that hamper querying and data fusion. To address these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol’s design is flexible and requires only minimal assumptions for participant clouds. Query optimization techniques are also described that are compatible with Parasol’s lightweight architecture. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.

  16. A graph rewriting programming language for graph drawing

    OpenAIRE

    Rodgers, Peter

    1998-01-01

    This paper describes Grrr, a prototype visual graph drawing tool. Previously there were no visual languages for programming graph drawing algorithms despite the inherently visual nature of the process. The languages which gave a diagrammatic view of graphs were not computationally complete and so could not be used to implement complex graph drawing algorithms. Hence current graph drawing tools are all text based. Recent developments in graph rewriting systems have produced computationally com...

  17. EmptyHeaded: A Relational Engine for Graph Processing.

    Science.gov (United States)

    Aberger, Christopher R; Tu, Susan; Olukotun, Kunle; Ré, Christopher

    2016-01-01

    There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded's design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3× worse performance on SSSP.
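
    The attribute-at-a-time ("generic") join underlying worst-case optimal join algorithms can be sketched on the canonical triangle query; this shows the idea only and is not EmptyHeaded's SIMD-optimized engine.

        # Generic-join-style triangle listing: bind one variable at a time and
        # intersect adjacency sets. Toy undirected graph, made up for illustration.
        from collections import defaultdict

        edges = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
        adj = defaultdict(set)
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)

        def triangles(adj):
            found = set()
            for a in adj:                      # bind a
                for b in adj[a]:               # bind b from a's neighbors
                    for c in adj[a] & adj[b]:  # bind c by set intersection (the key step)
                        found.add(frozenset((a, b, c)))
            return found

        print(triangles(adj))  # {frozenset({'a', 'b', 'c'})}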

  18. Reactome graph database: Efficient access to complex pathway data

    Science.gov (United States)

    Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902

  19. Reactome graph database: Efficient access to complex pathway data.

    Directory of Open Access Journals (Sweden)

    Antonio Fabregat

    2018-01-01

    Full Text Available Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

  20. Reactome graph database: Efficient access to complex pathway data.

    Science.gov (United States)

    Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
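
    Programmatic access to a Neo4j database with Cypher, as described above, typically looks like the Python sketch below; the connection details are placeholders, and the label, relationship, and property names only approximate the Reactome model rather than reproduce its exact schema.

        # Sketch of querying a Neo4j graph database with Cypher from Python.
        # URI, credentials, and the Pathway/hasEvent/displayName names are assumptions.
        from neo4j import GraphDatabase

        CYPHER = """
        MATCH (p:Pathway {speciesName: $species})-[:hasEvent]->(e)
        RETURN p.displayName AS pathway, count(e) AS events
        ORDER BY events DESC LIMIT 5
        """

        driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
        with driver.session() as session:
            for record in session.run(CYPHER, species="Homo sapiens"):
                print(record["pathway"], record["events"])
        driver.close()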

  1. Graph Mining Meets the Semantic Web

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkeun (Matt) [ORNL]; Sukumar, Sreenivas R [ORNL]; Lim, Seung-Hwan [ORNL]

    2015-01-01

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need through implementation of three popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, and PageRank). We implement these algorithms as SPARQL queries, wrapped within Python scripts. We evaluate the performance of our implementation on 6 real-world data sets and show that graph mining algorithms (that have a linear-algebra formulation) can indeed be unleashed on data represented as RDF graphs using the SPARQL query interface.

  2. SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

    Science.gov (United States)

    Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

    2014-08-15

    The Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.

  3. RDF-GL : a SPARQL-based graphical query language for RDF

    NARCIS (Netherlands)

    Hogenboom, F.P.; Milea, D.V.; Frasincar, F.; Kaymak, U.; Chbeir, R.; Badr, Y.; Abraham, A.; Hassanien, A.-E.

    2010-01-01

    This chapter presents RDF-GL, a graphical query language (GQL) for RDF. The GQL is based on the textual query language SPARQL and mainly focuses on SPARQL SELECT queries. The advantage of a GQL over textual query languages is that complexity is hidden through the use of graphical symbols. RDF-GL is

  4. RDF-GL: A SPARQL-Based Graphical Query Language for RDF

    Science.gov (United States)

    Hogenboom, Frederik; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    This chapter presents RDF-GL, a graphical query language (GQL) for RDF. The GQL is based on the textual query language SPARQL and mainly focuses on SPARQL SELECT queries. The advantage of a GQL over textual query languages is that complexity is hidden through the use of graphical symbols. RDF-GL is supported by a Java-based editor, SPARQLinG, which is presented as well. The editor not only allows for RDF-GL query creation, but also converts RDF-GL queries to SPARQL queries and is able to subsequently execute them. Experiments show that using the GQL in combination with the editor makes RDF querying more accessible for end users.

  5. Ontology Based Queries - Investigating a Natural Language Interface

    NARCIS (Netherlands)

    van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.

    2010-01-01

    In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.

  6. Complex analyses on clinical information systems using restricted natural language querying to resolve time-event dependencies.

    Science.gov (United States)

    Safari, Leila; Patrick, Jon D

    2018-06-01

    This paper reports on a generic framework to provide clinicians with the ability to conduct complex analyses on elaborate research topics using cascaded queries to resolve internal time-event dependencies in the research questions, as an extension to the proposed Clinical Data Analytics Language (CliniDAL). A cascaded query model is proposed to resolve internal time-event dependencies in the queries, which can have up to five levels of criteria, starting with a query to define subjects to be admitted into a study, followed by a query to define the time span of the experiment. Three more cascaded queries can be required to define control groups, control variables and output variables, which all together simulate a real scientific experiment. According to the complexity of the research questions, the cascaded query model has the flexibility of merging some lower level queries for simple research questions or adding a nested query to each level to compose more complex queries. Three different scenarios (one of them containing two studies) are described and used for evaluation of the proposed solution. CliniDAL's complex analyses solution enables answering complex queries with time-event dependencies in at most a few hours, which would take many days to do manually. An evaluation of the results of the research studies based on a comparison between the CliniDAL and SQL solutions reveals the high usability and efficiency of CliniDAL's solution. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    A new language is introduced for describing hypotheses about fluctuations of measurable properties in streams of timestamped data, and as a prime example, we consider trends of emotions in the constantly flowing stream of Twitter messages. The language, called EmoEpisodes, has a precise semantics that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return instantiations for topics and emotions as well as time intervals that provide the largest deflections in this measurement. Experiments are performed on a selection of Twitter data to demonstrate the usefulness of the approach.

  8. Fuzzy Graph Language Recognizability

    OpenAIRE

    Kalampakas , Antonios; Spartalis , Stefanos; Iliadis , Lazaros

    2012-01-01

    Fuzzy graph language recognizability is introduced along the lines of the established theory of syntactic graph language recognizability by virtue of the algebraic structure of magmoids. The main closure properties of the corresponding class are investigated and several interesting examples of fuzzy graph languages are examined.

  9. Macromolecular query language (MMQL): prototype data model and implementation.

    Science.gov (United States)

    Shindyalov, I N; Chang, W; Pu, C; Bourne, P E

    1994-11-01

    Macromolecular query language (MMQL) is an extensible interpretive language in which to pose questions concerning the experimental or derived features of the 3-D structure of biological macromolecules. MMQL is intended to be intuitive with a simple syntax, so that from a user's perspective complex queries are easily written. A number of basic queries and a more complex query--determination of structures containing a five-strand Greek key motif--are presented to illustrate the strengths and weaknesses of the language. The predominant features of MMQL are a filter and pattern grammar, which are combined to express a wide range of interesting biological queries. Filters permit the selection of object attributes, for example, compound name and resolution, whereas the patterns currently implemented query primary sequence, close contacts, hydrogen bonding, secondary structure, conformation and amino acid properties (volume, polarity, isoelectric point, hydrophobicity and different forms of exposure). MMQL queries are processed by MMQLlib, a C++ class library, to which new query methods and pattern types are easily added. The prototype implementation described uses PDBlib, another C++-based class library, for representing the features of biological macromolecules at the level of detail parsable from a PDB file. Since PDBlib can represent data stored in relational and object-oriented databases, as well as PDB files, once these data are loaded they too can be queried by MMQL. Performance metrics are given for queries of PDB files for which all derived data are calculated at run time and compared to a preliminary version of OOPDB, a prototype object-oriented database with a schema based on a persistent version of PDBlib which offers more efficient data access and the potential to maintain derived information. MMQLlib, PDBlib and associated software are available via anonymous ftp from cuhhca.hhmi.columbia.edu.

  10. An Association-Oriented Partitioning Approach for Streaming Graph Query

    Directory of Open Access Journals (Sweden)

    Yun Hao

    2017-01-01

    Full Text Available The volumes of real-world graphs like knowledge graphs are increasing rapidly, which makes streaming graph processing a hot research area. Processing graphs in a streaming setting poses significant challenges from different perspectives, among which the graph partitioning method plays a key role. Regarding graph query, a well-designed partitioning method is essential for achieving better performance. Existing offline graph partitioning methods often require full knowledge of the graph, which is not possible during streaming graph processing. In order to handle this problem, we propose an association-oriented streaming graph partitioning method named Assc. This approach first computes the rank values of vertices with a hybrid approximate PageRank algorithm. After splitting these vertices with an adapted variant of the affinity propagation algorithm, the processing order of the vertices in the sliding window can be determined. Finally, according to the level of these vertices and their associations, the partition to which the vertices should be distributed is decided. We compare its performance with a set of streaming graph partitioning methods and with METIS, a widely adopted offline approach. The results show that our solution can partition graphs with hundreds of millions of vertices in a streaming setting on a large collection of graph datasets, and that our approach outperforms other graph partitioning methods.

  11. Graphical modeling and query language for hospitals.

    Science.gov (United States)

    Barzdins, Janis; Barzdins, Juris; Rencis, Edgars; Sostaks, Agris

    2013-01-01

    So far there has been little evidence that implementation of health information technologies (HIT) is leading to health care cost savings. One of the reasons for this lack of impact by HIT likely lies in the complexity of business process ownership in hospitals. The goal of our research is to develop a business model-based method for hospital use which would allow doctors to retrieve ad-hoc information directly from various hospital databases. We have developed a special domain-specific process modelling language called MedMod. Formally, we define the MedMod language as a profile on UML Class diagrams, but we also demonstrate it on examples, where we explain the semantics of all its elements informally. Moreover, we have developed the Process Query Language (PQL), which is based on the MedMod process definition language. The purpose of PQL is to allow a doctor to query (filter) runtime data of a hospital's processes described using MedMod. The MedMod language tries to overcome deficiencies in existing process modeling languages by allowing the loosely-defined sequence of steps to be performed in the clinical process to be specified. The main advantages of PQL lie in two areas - usability and efficiency. They are: 1) the view on data through the "glasses" of a familiar process, 2) the simple and easy-to-perceive means of setting filtering conditions, which require no more expertise than using spreadsheet applications, 3) the dynamic response to each step in the construction of the complete query, which shortens the learning curve greatly and reduces the error rate, and 4) the selected means of filtering and data retrieval, which allow queries to be executed in O(n) time with respect to the size of the dataset. We are about to continue developing this project with three further steps. First, we are planning to develop user-friendly graphical editors for the MedMod process modeling and query languages. The second step is to evaluate the usability of the proposed language and tool

  12. EquiX-A Search and Query Language for XML.

    Science.gov (United States)

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  13. A Random Walk Approach to Query Informative Constraints for Clustering.

    Science.gov (United States)

    Abin, Ahmad Ali

    2017-08-09

    This paper presents a random walk approach to the problem of querying informative constraints for clustering. The proposed method is based on the properties of the commute time, that is, the expected time taken for a random walk to travel between two nodes and return, on the adjacency graph of the data. Commute time has the nice property that the more short paths connect two given nodes in a graph, the more similar those nodes are. Since computing the commute time takes the Laplacian eigenspectrum into account, we use this property in a recursive fashion to query informative constraints for clustering. At each recursion, the proposed method constructs the adjacency graph of the data and utilizes the spectral properties of the commute time matrix to bipartition the adjacency graph. Thereafter, the proposed method benefits from the commute time distance on the graph to query informative constraints between partitions. This process iterates for each partition until the stop condition becomes true. Experiments on real-world data show the efficiency of the proposed method for constraint selection.
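
    The commute time the method builds on can be computed from the Moore-Penrose pseudoinverse L+ of the graph Laplacian as C(i, j) = vol(G) * (L+[i, i] + L+[j, j] - 2 * L+[i, j]); below is a small numpy sketch on a toy path graph (illustrative only, not the authors' constraint-selection code).

        # Commute time from the Laplacian pseudoinverse on a 4-node path graph.
        import numpy as np

        A = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
        degrees = A.sum(axis=1)
        L = np.diag(degrees) - A
        L_pinv = np.linalg.pinv(L)
        volume = degrees.sum()  # equals twice the number of edges

        def commute_time(i, j):
            return volume * (L_pinv[i, i] + L_pinv[j, j] - 2 * L_pinv[i, j])

        # Nodes connected by more short paths get smaller commute times.
        print(commute_time(0, 1), commute_time(0, 3))  # 6.0 and 18.0 on this path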

  14. Instant Cassandra query language

    CERN Document Server

    Singh, Amresh

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. It's an Instant Starter guide.Instant Cassandra Query Language is great for those who are working with Cassandra databases and who want to either learn CQL to check data from the console or build serious applications using CQL. If you're looking for something that helps you get started with CQL in record time and you hate the idea of learning a new language syntax, then this book is for you.

  15. Path Index Based Keywords to SPARQL Query Transformation for Semantic Data Federations

    Directory of Open Access Journals (Sweden)

    Thilini Cooray

    2016-06-01

    Full Text Available The Semantic Web is a highly emerging research domain. Enhancing the ability of keyword query processing on Semantic Web data provides a huge support for familiarizing the usefulness of the Semantic Web to the general public. Most of the existing approaches focus on just matching user keywords to RDF graphs and outputting the connecting elements as results. The Semantic Web also offers the SPARQL query language, which can process queries more accurately and efficiently than general keyword matching. There are only about a couple of approaches available for transforming keyword queries to SPARQL. They basically rely on real-time graph traversals for identifying subgraphs which can connect user keywords. Those approaches are either limited to query processing on a single data store or a set of interlinked data sets. They have not focused on query processing on a federation of independent data sets which belong to the same domain. This research proposes a Path Index based approach that eliminates real-time graph traversal for transforming keyword queries to SPARQL. We have introduced an ontology alignment based approach for keyword query transformation on a federation of RDF data stored using multiple heterogeneous vocabularies. Evaluation shows that the proposed approach has the ability to generate SPARQL queries which can provide highly relevant results for user keyword queries. The Path Index based query transformation approach has also achieved high efficiency compared to the existing approach.

  16. A model of language inflection graphs

    Science.gov (United States)

    Fukś, Henryk; Farzad, Babak; Cao, Yi

    2014-01-01

    Inflection graphs are highly complex networks representing relationships between inflectional forms of words in human languages. For so-called synthetic languages, such as Latin or Polish, they have particularly interesting structure due to the abundance of inflectional forms. We construct the simplest form of inflection graphs, namely a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. We then study the projection of this graph on the set of headwords. The projection decomposes into a large number of connected components, to be called word groups. The distribution of word group sizes exhibits some remarkable properties, resembling the cluster distribution in a lattice percolation near the critical point. We propose a simple model which produces graphs of this type, reproducing the desired component distribution and other topological features.
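
    The construction described above (a bipartite headword/inflected-form graph projected onto headwords, whose connected components are the word groups) can be sketched with networkx; the tiny Latin-like word list is made up for illustration.

        # Bipartite inflection graph and its projection onto headwords.
        import networkx as nx
        from networkx.algorithms import bipartite

        pairs = [  # (headword, inflected form observed in a text) -- illustrative data
            ("cano", "canis"), ("cano", "canit"),
            ("canis", "canis"), ("canis", "canem"),
            ("rosa", "rosam"), ("rosa", "rosae"),
        ]

        B = nx.Graph()
        headwords = {"hw/" + h for h, _ in pairs}  # prefix nodes so headwords and forms
        forms = {"form/" + f for _, f in pairs}    # stay disjoint even when spelled alike
        B.add_nodes_from(headwords, bipartite=0)
        B.add_nodes_from(forms, bipartite=1)
        B.add_edges_from(("hw/" + h, "form/" + f) for h, f in pairs)

        # Two headwords fall in the same word group if they share an inflected form.
        P = bipartite.projected_graph(B, headwords)
        print(sorted(len(c) for c in nx.connected_components(P)))  # [1, 2]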

  17. A Typed Text Retrieval Query Language for XML Documents.

    Science.gov (United States)

    Colazzo, Dario; Sartiani, Carlo; Albano, Antonio; Manghi, Paolo; Ghelli, Giorgio; Lini, Luca; Paoli, Michele

    2002-01-01

    Discussion of XML focuses on a description of Tequyla-TX, a typed text retrieval query language for XML documents that can search on both content and structures. Highlights include motivations; numerous examples; word-based and char-based searches; tag-dependent full-text searches; text normalization; query algebra; data models and term language;…

  18. SPARQL Assist language-neutral query composer

    Science.gov (United States)

    2012-01-01

    Background: SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. Results: We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. Conclusions: To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources. PMID:22373327

  19. SPARQL assist language-neutral query composer.

    Science.gov (United States)

    McCarthy, Luke; Vandervalk, Ben; Wilkinson, Mark

    2012-01-25

    SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources.

  20. Cyclone: java-based querying and computing with Pathway/Genome databases.

    Science.gov (United States)

    Le Fèvre, François; Smidtas, Serge; Schächter, Vincent

    2007-05-15

    Cyclone aims at facilitating the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software including the Eclipse IDE for HQL edition, the Jung API for graph algorithms or Cytoscape for graph visualization. Cyclone is freely available under an open source license at: http://sourceforge.net/projects/nemo-cyclone. For download and installation instructions, tutorials, use cases and examples, see http://nemo-cyclone.sourceforge.net.

  1. A Relational Algebra Query Language for Programming Relational Databases

    Science.gov (United States)

    McMaster, Kirby; Sambasivam, Samuel; Anderson, Nicole

    2011-01-01

    In this paper, we describe a Relational Algebra Query Language (RAQL) and Relational Algebra Query (RAQ) software product we have developed that allows database instructors to teach relational algebra through programming. Instead of defining query operations using mathematical notation (the approach commonly taken in database textbooks), students…
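
    A minimal sketch of the relational-algebra-as-programming idea, with relations as sets of attribute/value tuples; the operators and toy relations below are illustrative only and are not the RAQL/RAQ product described in the paper.

        # Relational algebra operators over sets of frozenset-of-(attribute, value) tuples.
        def select(relation, predicate):
            return {t for t in relation if predicate(dict(t))}

        def project(relation, attributes):
            return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}

        def natural_join(r, s):
            out = set()
            for t1 in r:
                for t2 in s:
                    d1, d2 = dict(t1), dict(t2)
                    if all(d1[k] == d2[k] for k in d1.keys() & d2.keys()):
                        out.add(frozenset({**d1, **d2}.items()))
            return out

        emp = {frozenset({("name", "ann"), ("dept", 1)}),
               frozenset({("name", "bob"), ("dept", 2)})}
        dept = {frozenset({("dept", 1), ("dname", "hr")}),
                frozenset({("dept", 2), ("dname", "it")})}

        print(select(emp, lambda t: t["dept"] == 1))
        print(project(natural_join(emp, dept), {"name", "dname"}))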

  2. Extending the Query Language of a Data Warehouse for Patient Recruitment.

    Science.gov (United States)

    Dietrich, Georg; Ertl, Maximilian; Fette, Georg; Kaspar, Mathias; Krebs, Jonathan; Mackenrodt, Daniel; Störk, Stefan; Puppe, Frank

    2017-01-01

    Patient recruitment for clinical trials is a laborious task, as many texts have to be screened. Usually, this work is done manually and takes a lot of time. We have developed a system that automates the screening process. Besides standard keyword queries, the query language supports extraction of numbers, time-spans and negations. In a feasibility study for patient recruitment from a stroke unit with 40 patients, we achieved encouraging extraction rates above 95% for numbers and negations and ca. 86% for time spans.

  3. Knowledge Query Language (KQL)

    Science.gov (United States)

    2016-02-12

    ...described as a sparse, distributed, multidimensional sorted map. Unlike a relational database, BigTable has no multicolumn primary keys or constraints. [...] in query languages such as SQL. Figure 3 shows address expression-based querying; each circled step in Figure 3 is described in the report. The implementation we describe in later sections stores the instance of the registry ontology in JSON files. Throughout the rest of this report, we use the

  4. Answering SPARQL queries modulo RDF Schema with paths

    OpenAIRE

    Alkhateeb, Faisal; Euzenat, Jérôme

    2013-01-01

    SPARQL is the standard query language for RDF graphs. In its strict instantiation, it only offers querying according to the RDF semantics and would thus ignore the semantics of data expressed with respect to (RDF) schemas or (OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or considering external ontologies. We introduce a general framework which allows for expressing query ans...

  5. Generating and Executing Complex Natural Language Queries across Linked Data.

    Science.gov (United States)

    Hamon, Thierry; Mougin, Fleur; Grabar, Natalia

    2015-01-01

    With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, the SPARQL language, and interfaces in Natural Language question-answering provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method is designed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as a Perl module available at http://search.cpan.org/~thhamon/RDF-NLP-SPARQLQuery.

  6. The SQL++ Query Language: Configurable, Unifying and Semi-structured

    OpenAIRE

    Ong, Kian Win; Papakonstantinou, Yannis; Vernoux, Romain

    2014-01-01

    NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, the many variations, and the lack of formal semantics inhibit deep understanding of the query languages, and also impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON native stores and SQL databases. The SQL++ sem...

  7. A natural language interface plug-in for cooperative query answering in biological databases.

    Science.gov (United States)

    Jamil, Hasan M

    2012-06-11

    One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a

  8. EAGLE: 'EAGLE' Is an Algorithmic Graph Library for Exploration

    Energy Technology Data Exchange (ETDEWEB)

    2015-01-16

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. Today there are no tools to conduct "graph mining" on RDF standard data sets. We address that need through implementation of popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, degree distribution, diversity degree, PageRank, etc.). We implement these algorithms as SPARQL queries, wrapped within Python scripts, and call our software tool EAGLE. In RDF style, EAGLE stands for "EAGLE 'Is an' Algorithmic Graph Library for Exploration." EAGLE is like 'MATLAB' for 'Linked Data.'

  9. Concept-based query language approach to enterprise information systems

    Science.gov (United States)

    Niemi, Timo; Junkkari, Marko; Järvelin, Kalervo

    2014-01-01

    In enterprise information systems (EISs) it is necessary to model, integrate and compute very diverse data. In advanced EISs the stored data often are based both on structured (e.g. relational) and semi-structured (e.g. XML) data models. In addition, the ad hoc information needs of end-users may require the manipulation of data-oriented (structural), behavioural and deductive aspects of data. Contemporary languages capable of treating this kind of diversity suit only persons with good programming skills. In this paper we present a concept-oriented query language approach to manipulate this diversity so that the programming skill requirements are considerably reduced. In our query language, the features which need technical knowledge are hidden in application-specific concepts and structures. Therefore, users need not be aware of the underlying technology. Application-specific concepts and structures are represented by the modelling primitives of the extended RDOOM (relational deductive object-oriented modelling) which contains primitives for all crucial real world relationships (is-a relationship, part-of relationship, association), XML documents and views. Our query language also supports intensional and extensional-intensional queries, in addition to conventional extensional queries. In its query formulation, the end-user combines available application-specific concepts and structures through shared variables.

  10. Dynamic Representations of Sparse Graphs

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Fagerberg, Rolf

    1999-01-01

    We present a linear space data structure for maintaining graphs with bounded arboricity—a large class of sparse graphs containing e.g. planar graphs and graphs of bounded treewidth—under edge insertions, edge deletions, and adjacency queries. The data structure supports adjacency queries in worst case O(c) time, and edge insertions and edge deletions in amortized O(1) and O(c+log n) time, respectively, where n is the number of nodes in the graph, and c is the bound on the arboricity.

  11. Cyber Graph Queries for Geographically Distributed Data Centers

    Energy Technology Data Exchange (ETDEWEB)

    Berry, Jonathan W. [Mail Stop, Albuquerque, NM (United States); Collins, Michael [Christopher Newport Univ., VA (United States); Kearns, Aaron [Univ. of New Mexico, Albuquerque, NM (United States); Phillips, Cynthia A. [Mail Stop, Albuquerque, NM (United States); Saia, Jared [Univ. of New Mexico, Albuquerque, NM (United States)

    2015-05-01

    We present new algorithms for a distributed model for graph computations motivated by limited information sharing, which we first discussed in [20]. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the entities can share only a polylogarithmic-size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We give algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem, finding a subgraph that is larger and denser than expected. For both low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: a bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.

  12. SPARK: Adapting Keyword Query to Semantic Search

    Science.gov (United States)

    Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong

    Semantic search promises to provide more accurate results than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.

  13. MOCQL: A Declarative Language for Ad-Hoc Model Querying

    DEFF Research Database (Denmark)

    Störrle, Harald

    2013-01-01

    Language (MOCQL), an experimental declarative textual language to express queries (and constraints) on models. We introduce MOCQL by examples and its grammar, evaluate its usability by means of controlled experiments, and find that modelers perform better and experience less cognitive load when working...

  14. Object-Oriented Query Language For Events Detection From Images Sequences

    Science.gov (United States)

    Ganea, Ion Eugen

    2015-09-01

    In this paper, a method is presented for representing the events extracted from image sequences, together with the query language used for event detection. Using an object-oriented model, the spatial and temporal relationships between salient objects, and also between events, are stored and queried. This work aims to unify the storing and querying phases of video event processing. The object-oriented language syntax used for event processing allows the instantiation of index classes in order to improve the accuracy of the query results. The experiments were performed on image sequences from the sports domain and show the reliability and robustness of the proposed language. To extend the language, a specific syntax will be added for constructing templates for abnormal events and for detecting incidents, which is the final goal of the research.

  15. Query Language for Location-Based Services: A Model Checking Approach

    Science.gov (United States)

    Hoareau, Christian; Satoh, Ichiro

    We present a model checking approach to the rationale, implementation, and applications of a query language for location-based services. Such query mechanisms are necessary so that users, objects, and/or services can effectively benefit from the location-awareness of their surrounding environment. The underlying data model is founded on a symbolic model of space organized in a tree structure. Once extended to a semantic model for modal logic, we regard location query processing as a model checking problem, and thus define location queries as hybrid logic-based formulas. Our approach is unique among existing research because it explores the connection between location models and query processing in ubiquitous computing systems, relies on a sound theoretical basis, and provides modal logic-based query mechanisms for expressive searches over a decentralized data structure. A prototype implementation is also presented and discussed.

  16. Compilation of functional languages using flow graph analysis

    NARCIS (Netherlands)

    Hartel, Pieter H.; Glaser, Hugh; Wild, John M.

    A system based on the notion of a flow graph is used to specify formally and to implement a compiler for a lazy functional language. The compiler takes a simple functional language as input and generates C. The generated C program can then be compiled, and loaded with an extensive run-time system to

  17. A Demonstration of the Grrr Graph Rewriting Programming Language

    OpenAIRE

    Rodgers, Peter; Vidal, Natalia

    2000-01-01

    This paper overviews the graph rewriting programming language, Grrr. The serial graph rewriting strategy is detailed, and key elements of the user interface are described. The system is illustrated by a simple example.

  18. An XML-Enabled Data Mining Query Language XML-DMQL

    NARCIS (Netherlands)

    Feng, L.; Dillon, T.

    2005-01-01

    Inspired by the good work of Han et al. (1996) and Elfeky et al. (2001) on the design of data mining query languages for relational and object-oriented databases, in this paper, we develop an expressive XML-enabled data mining query language by extension of XQuery. We first describe some

  19. CIRQuL: Complex Information Retrieval Query Language

    NARCIS (Netherlands)

    Mihajlovic, V.; Hiemstra, Djoerd; Apers, Peter M.G.

    In this paper we will present a new framework for the retrieval of XML documents. We will describe the extension for existing query languages (XPath and XQuery) geared toward ranked information retrieval and full-text search in XML documents. Furthermore we will present language models for ranked

  20. Comparing relational model transformation technologies: implementing Query/View/Transformation with Triple Graph Grammars

    DEFF Research Database (Denmark)

    Greenyer, Joel; Kindler, Ekkart

    2010-01-01

    and for model-based software engineering approaches in general. QVT (Query/View/Transformation) is the transformation technology recently proposed for this purpose by the OMG. TGGs (Triple Graph Grammars) are another transformation technology proposed in the mid-nineties, used for example in the FUJABA CASE...

  1. Graph Query Portal

    OpenAIRE

    Dayal, Amit; Brock, David

    2018-01-01

    Prashant Chandrasekar, a lead developer for the Social Interactome project, has tasked the team with creating a graph representation of the data collected from the social networks involved in that project. The data is currently stored in a MySQL database. The client requested that the graph database be Cayley, but after a literature review, Neo4j was chosen. The reasons for this shift will be explained in the design section. Secondarily, the team was tasked with coming up with three scena...

  2. An algorithm to transform natural language into SQL queries for relational databases

    Directory of Open Access Journals (Sweden)

    Garima Singh

    2016-09-01

    Full Text Available An intelligent interface that supports efficient interaction between users and databases is a real need for database applications, since databases must make their content fast and easy to access. However, not every user is familiar with Structured Query Language (SQL) queries, as they may not be aware of the structure of the database and would thus have to learn SQL. Non-expert users therefore need a system that lets them interact with relational databases in their natural language, such as English, which requires the Database Management System (DBMS) to be able to understand Natural Language (NL). In this research, an intelligent interface is developed using a semantic matching technique that translates a natural language query to SQL using a set of production rules and a data dictionary. The data dictionary consists of semantic sets for relations and attributes. A series of steps, such as lower-case conversion, tokenization, part-of-speech tagging, and extraction of database and SQL elements, is used to convert a Natural Language Query (NLQ) to an SQL query. The transformed query is executed and the results are returned to the user.
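
    For intuition, the following minimal Python sketch mimics the kind of pipeline described above (it is not the authors' system): it lower-cases and tokenizes the question, looks tokens up in a hypothetical data dictionary of relation/attribute synonyms, and assembles a simple SELECT statement. All table, attribute, and synonym names are invented for the example.

```python
import re

# Hypothetical data dictionary: natural-language terms -> schema elements.
DATA_DICTIONARY = {
    "students": ("student", None),      # relation only
    "names":    ("student", "name"),    # relation.attribute
    "marks":    ("student", "mark"),
    "teachers": ("teacher", None),
}

def nlq_to_sql(question: str) -> str:
    """Very small rule-based translation of a natural language query to SQL."""
    tokens = re.findall(r"[a-z]+", question.lower())     # lower-case + tokenize
    tables, columns = set(), []
    for tok in tokens:
        if tok in DATA_DICTIONARY:                       # database element extraction
            table, column = DATA_DICTIONARY[tok]
            tables.add(table)
            if column:
                columns.append(f"{table}.{column}")
    select_list = ", ".join(columns) if columns else "*"
    from_list = ", ".join(sorted(tables)) if tables else "student"
    return f"SELECT {select_list} FROM {from_list};"

if __name__ == "__main__":
    print(nlq_to_sql("Show the names and marks of all students"))
    # -> SELECT student.name, student.mark FROM student;
```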

  3. DBPQL: A view-oriented query language for the Intel Data Base Processor

    Science.gov (United States)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (DBPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  4. References and arrow notation instead of join operation in query languages

    Directory of Open Access Journals (Sweden)

    Alexandr Savinov

    2012-10-01

    Full Text Available We study properties of the join operation in query languages and describe some of its major drawbacks. We provide strong arguments against using joins as a main construct for retrieving related data elements in general purpose query languages and argue for using references instead. Since conventional references are quite restrictive when applied to data modeling and query languages, we propose to use generalized references as they are defined in the concept-oriented model (COM). These references are used by two new operations, called projection and de-projection, which are denoted by right and left arrows and therefore this access method is referred to as arrow notation. We demonstrate advantages of the arrow notation in comparison to joins and argue that it makes queries simpler, more natural, easier to understand, and the whole query writing process more productive and less error-prone.
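
    The concrete COM syntax is not reproduced in this record; purely as a toy analogy, the Python sketch below contrasts a join-style pairing of two collections with navigation along stored references, where following a reference forward plays the role of projection and collecting the records that reference a given target plays the role of de-projection.

```python
# Toy data: orders hold a reference to their customer instead of being joined by key.
customers = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
orders = [
    {"id": 10, "customer": 1, "total": 99.0},
    {"id": 11, "customer": 1, "total": 12.5},
    {"id": 12, "customer": 2, "total": 30.0},
]

# Join-style: explicitly pair every order with its matching customer row.
join = [(o, customers[o["customer"]]) for o in orders]

def customer_of(order):
    """'Projection' analogue: follow the stored reference forward."""
    return customers[order["customer"]]

def orders_of(customer_id):
    """'De-projection' analogue: collect the records referencing a given target."""
    return [o for o in orders if o["customer"] == customer_id]

print(customer_of(orders[0])["name"])     # Ada
print([o["id"] for o in orders_of(1)])    # [10, 11]
```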

  5. Concepts and implementations of natural language query systems

    Science.gov (United States)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1984-01-01

    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfies the nature and information needs of casual users by allowing them to communicate with the system in their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  6. Using Query Languages and Mobile Code to Reduce Service Invocation Costs

    National Research Council Canada - National Science Library

    Szymanski, R; Palmer, N; Chase, T

    2006-01-01

    .... As a concrete example, this paper describes work being performed by CERDEC C2D (Communications Electronics Research, Development, and Engineering Center - Command and Control Directorate) at Fort Monmouth, NJ to expose a military mission plan as a web service through the use of simple SQL-like (Structured Query Language) statements optimized for mission data query.

  7. Simplifying Scalable Graph Processing with a Domain-Specific Language

    KAUST Repository

    Hong, Sungpack; Salihoglu, Semih; Widom, Jennifer; Olukotun, Kunle

    2014-01-01

    Large-scale graph processing, with its massive data sets, requires distributed processing. However, conventional frameworks for distributed graph processing, such as Pregel, use non-traditional programming models that are well-suited for parallelism and scalability but inconvenient for implementing non-trivial graph algorithms. In this paper, we use Green-Marl, a Domain-Specific Language for graph analysis, to intuitively describe graph algorithms and extend its compiler to generate equivalent Pregel implementations. Using the semantic information captured by Green-Marl, the compiler applies a set of transformation rules that convert imperative graph algorithms into Pregel's programming model. Our experiments show that the Pregel programs generated by the Green-Marl compiler perform similarly to manually coded Pregel implementations of the same algorithms. The compiler is even able to generate a Pregel implementation of a complicated graph algorithm for which a manual Pregel implementation is very challenging.

  8. Simplifying Scalable Graph Processing with a Domain-Specific Language

    KAUST Repository

    Hong, Sungpack

    2014-01-01

    Large-scale graph processing, with its massive data sets, requires distributed processing. However, conventional frameworks for distributed graph processing, such as Pregel, use non-traditional programming models that are well-suited for parallelism and scalability but inconvenient for implementing non-trivial graph algorithms. In this paper, we use Green-Marl, a Domain-Specific Language for graph analysis, to intuitively describe graph algorithms and extend its compiler to generate equivalent Pregel implementations. Using the semantic information captured by Green-Marl, the compiler applies a set of transformation rules that convert imperative graph algorithms into Pregel's programming model. Our experiments show that the Pregel programs generated by the Green-Marl compiler perform similarly to manually coded Pregel implementations of the same algorithms. The compiler is even able to generate a Pregel implementation of a complicated graph algorithm for which a manual Pregel implementation is very challenging.

  9. A natural language query system for Hubble Space Telescope proposal selection

    Science.gov (United States)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English-subset language sentence regardless of the order of the keyword phrases, allowing the user greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.

  10. Graph reconstruction with a betweenness oracle

    DEFF Research Database (Denmark)

    Abrahamsen, Mikkel; Bodwin, Greg; Rotenberg, Eva

    2016-01-01

    Graph reconstruction algorithms seek to learn a hidden graph by repeatedly querying a blackbox oracle for information about the graph structure. Perhaps the most well studied and applied version of the problem uses a distance oracle, which can report the shortest path distance between any pair of nodes. We introduce and study the betweenness oracle, where bet(a, m, z) is true iff m lies on a shortest path between a and z. This oracle is strictly weaker than a distance oracle, in the sense that a betweenness query can be simulated by a constant number of distance queries, but not vice versa.

  11. A searching and reporting system for relational databases using a graph-based metadata representation.

    Science.gov (United States)

    Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling

    2005-01-01

    Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.

  12. Fast Inbound Top-K Query for Random Walk with Restart.

    Science.gov (United States)

    Zhang, Chao; Jiang, Shan; Chen, Yucheng; Sun, Yidan; Han, Jiawei

    2015-09-01

    Random walk with restart (RWR) is widely recognized as one of the most important node proximity measures for graphs, as it captures the holistic graph structure and is robust to noise in the graph. In this paper, we study a novel query based on the RWR measure, called the inbound top-k (Ink) query. Given a query node q and a number k, the Ink query aims at retrieving the k nodes in the graph that have the largest weighted RWR scores to q. Ink queries can be highly useful for various applications such as traffic scheduling, disease treatment, and targeted advertising. Nevertheless, none of the existing RWR computation techniques can accurately and efficiently process the Ink query in large graphs. We propose two algorithms, namely Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner. To identify the top-k nodes, Squeeze iteratively performs matrix-vector multiplication and estimates the lower and upper bounds for all the nodes in the graph. Ripple employs a more aggressive strategy by only estimating the RWR scores for the nodes falling in the vicinity of q; the nodes outside the vicinity do not need to be evaluated because their RWR scores are propagated from the boundary of the vicinity and are thus upper bounded. Ripple incrementally expands the vicinity until the top-k result set can be obtained. Our extensive experiments on real-life graph data sets show that Ink queries can retrieve interesting results, and the proposed algorithms are orders of magnitude faster than the state-of-the-art method.
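
    For background intuition only, the sketch below computes exact RWR scores by naive power iteration on a small adjacency matrix and then selects the top-k inbound nodes; it is a brute-force baseline rather than the Squeeze or Ripple algorithm, and the restart probability and example graph are arbitrary.

```python
import numpy as np

def rwr_scores(adj: np.ndarray, q: int, c: float = 0.15, iters: int = 100) -> np.ndarray:
    """Random walk with restart from node q, computed by simple power iteration."""
    n = adj.shape[0]
    # Column-normalize the adjacency matrix into a transition matrix.
    col_sums = adj.sum(axis=0, keepdims=True)
    P = np.divide(adj, col_sums, out=np.zeros_like(adj, dtype=float), where=col_sums != 0)
    e = np.zeros(n)
    e[q] = 1.0                                   # restart vector
    r = e.copy()
    for _ in range(iters):
        r = (1 - c) * P @ r + c * e              # one RWR iteration
    return r

def inbound_top_k(adj: np.ndarray, q: int, k: int) -> list:
    """Naive inbound top-k: nodes with the largest RWR score *to* q."""
    n = adj.shape[0]
    scores = [rwr_scores(adj, v)[q] for v in range(n)]   # RWR from every node v to q
    return sorted(range(n), key=lambda v: scores[v], reverse=True)[:k]

if __name__ == "__main__":
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    print(inbound_top_k(A, q=3, k=2))
```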

  13. Time series patterns and language support in DBMS

    Science.gov (United States)

    Telnarova, Zdenka

    2017-07-01

    This contribution focuses on the Time Series pattern type as a semantically rich representation of data. Some examples of implementing this pattern type in traditional Database Management Systems are briefly presented. There are many approaches to manipulating and querying patterns. The crucial issue is a systematic approach to pattern management and a specific pattern query language that takes the semantics of patterns into consideration. The query language SQL-TS for manipulating patterns is shown on Time Series data.

  14. Dynamic planar embeddings of dynamic graphs

    DEFF Research Database (Denmark)

    Holm, Jacob; Rotenberg, Eva

    2017-01-01

    query, one-flip-linkable(u,v) providing a suggestion for a flip that will make them linkable if one exists. We support all updates and queries in O(log^2 n) time. Our time bounds match those of Italiano et al. for a static (flipless) embedding of a dynamic graph. Our new algorithm is simpler, exploiting that the complement of a spanning tree of a connected plane graph is a spanning tree of the dual graph. The primal and dual trees are interpreted as having the same Euler tour, and a main idea of the new algorithm is an elegant interaction between top trees over the two trees via their common...

  15. Querying and Serving N-gram Language Models with Python

    Directory of Open Access Journals (Sweden)

    2009-06-01

    Full Text Available Statistical n-gram language modeling is a very important technique in Natural Language Processing (NLP) and Computational Linguistics used to assess the fluency of an utterance in any given language. It is widely employed in several important NLP applications such as Machine Translation and Automatic Speech Recognition. However, the most commonly used toolkit (SRILM) to build such language models on a large scale is written entirely in C++, which presents a challenge to an NLP developer or researcher whose primary language of choice is Python. This article first provides a gentle introduction to statistical language modeling. It then describes how to build a native and efficient Python interface (using SWIG) to the SRILM toolkit such that language models can be queried and used directly in Python code. Finally, it also demonstrates an effective use case of this interface by showing how to leverage it to build a Python language model server. Such a server can prove to be extremely useful when the language model needs to be queried by multiple clients over a network: the language model must only be loaded into memory once by the server and can then satisfy multiple requests. This article includes only those listings of source code that are most salient. To conserve space, some are only presented in excerpted form. The complete set of full source code listings may be found in Volume 1 of The Python Papers Source Codes Journal.
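
    The article's own listings wrap SRILM via SWIG; as a purely illustrative stand-in that needs no external toolkit, the following self-contained Python sketch scores sentences with a toy add-one-smoothed bigram model, which is the kind of query such a language model (or a language model server) answers.

```python
import math
from collections import Counter

class BigramModel:
    """Toy add-one-smoothed bigram language model (illustrative, not SRILM)."""

    def __init__(self, corpus):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sent in corpus:
            tokens = ["<s>"] + sent + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def logprob(self, sentence):
        """Log10 probability of a sentence under the bigram model."""
        tokens = ["<s>"] + sentence + ["</s>"]
        lp = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            num = self.bigrams[(prev, word)] + 1            # add-one smoothing
            den = self.unigrams[prev] + self.vocab_size
            lp += math.log10(num / den)
        return lp

if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
    lm = BigramModel(corpus)
    print(lm.logprob(["the", "cat", "ran"]))   # more fluent -> higher log-probability
    print(lm.logprob(["ran", "the", "cat"]))
```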

  16. On Describing Human White Matter Anatomy: The White Matter Query Language

    OpenAIRE

    Wassermann, Demian; Makris, Nikos; Rathi, Yogesh; Shenton, Martha; Kikinis, Ron; Kubicki, Marek; Westin, Carl-Fredrik

    2013-01-01

    The main contribution of this work is the careful syntactical definition of major white matter tracts in the human brain based on a neuroanatomist’s expert knowledge. We present a technique to formally describe white matter tracts and to automatically extract them from diffusion MRI data. The framework is based on a novel query language with a near-to-English textual syntax. This query language allows us to construct a dictionary of anatomical definitions describing white matter tracts. The d...

  17. Efficient Approximate OLAP Querying Over Time Series

    DEFF Research Database (Denmark)

    Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang

    2016-01-01

    The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume of data grows. This is a particular problem when querying time series data, which generally contains multiple measures recorded at fine time granularities. Usually, this issue is addressed either by scaling up hardware or by employing workload based query optimization techniques. However, these solutions...

  18. Enhancing Recall in Semantic Querying

    DEFF Research Database (Denmark)

    Rouces, Jacobo

    2013-01-01

    lexically and structurally different, which we will introduce in the next section. As RDF graphs from different sources are expected to be linked, the modeling heterogeneities will make the federated graph become sparser and inconsistent. This is detrimental to the recall of SPARQL queries, as the query...

  19. Min st-cut oracle for planar graphs with near-linear preprocessing time

    DEFF Research Database (Denmark)

    Borradaile, Glencora; Sankowski, Piotr; Wulff-Nilsen, Christian

    2010-01-01

    For an undirected n-vertex planar graph G with non-negative edge-weights, we consider the following type of query: given two vertices s and t in G, what is the weight of a min st-cut in G? We show how to answer such queries in constant time with O(n log^5 n) preprocessing time and O(n log n) space. We use a Gomory-Hu tree to represent all the pairwise min st-cuts implicitly. Previously, no subquadratic time algorithm was known for this problem. Our oracle can be extended to report the min st-cuts in time proportional to their size. Since all-pairs min st-cut and the minimum cycle basis are dual problems in planar graphs, we also obtain an implicit representation of a minimum cycle basis in O(n log^5 n) time and O(n log n) space and an explicit representation with additional O(C) time and space, where C is the size of the basis. To obtain our results, we require that shortest paths be unique...

  20. Mathematical Formula Search using Natural Language Queries

    Directory of Open Access Journals (Sweden)

    YANG, S.

    2014-11-01

    Full Text Available This paper presents how to search mathematical formulae written in MathML when given plain words as a query. Since the proposed method allows natural language queries, as in traditional Information Retrieval, for mathematical formula search, users do not need to enter complicated math symbols or use a formula input tool. For this, formula data is converted into plain text, and features are extracted from the converted text. In our experiments, we achieve an outstanding performance, an MRR of 0.659. In addition, we introduce how to utilize formula classification for formula search. By using class information, we finally achieve an improved performance, an MRR of 0.690.

  1. An XML-Based Manipulation and Query Language for Rule-Based Information

    Science.gov (United States)

    Mansour, Essam; Höpfner, Hagen

    Rules are utilized to assist in the monitoring process that is required in activities such as disease management and customer relationship management. These rules are specified according to the application's best practices. Most research efforts emphasize the specification and execution of these rules. Few research efforts focus on managing these rules as one object that has a management life-cycle. This paper presents our manipulation and query language, which is developed to facilitate the maintenance of this object during its life-cycle and to query the information contained in this object. This language is based on an XML-based model. Furthermore, we evaluate the model and language using a prototype system applied to a clinical case study.

  2. Multidimensional Data Model and Query Language for Informetrics.

    Science.gov (United States)

    Niemi, Timo; Hirvonen, Lasse; Jarvelin, Kalervo

    2003-01-01

    Discusses multidimensional data analysis, or online analytical processing (OLAP), which offers a single subject-oriented source for analyzing summary data based on various dimensions. Develops a conceptual/logical multidimensional model for supporting the needs of informetrics, including a multidimensional query language whose basic idea is to…

  3. Incremental View Maintenance for Deductive Graph Databases Using Generalized Discrimination Networks

    Directory of Open Access Journals (Sweden)

    Thomas Beyhl

    2016-12-01

    Full Text Available Nowadays, graph databases are employed when relationships between entities are in the scope of database queries, to avoid the performance-critical join operations of relational databases. Graph queries are used to query and modify graphs stored in graph databases. Graph queries employ graph pattern matching, which is NP-complete for subgraph isomorphism. Graph database views that keep ready answers in the form of precalculated graph pattern matches for frequently stated and complex graph queries can be employed to increase query performance. However, such graph database views must be kept consistent with the graphs stored in the graph database. In this paper, we describe how to use incremental graph pattern matching as a technique for maintaining graph database views. We present an incremental maintenance algorithm for graph database views, which works for imperatively and declaratively specified graph queries. The evaluation shows that our maintenance algorithm scales when the number of nodes and edges stored in the graph database increases. Furthermore, our evaluation shows that our approach can outperform existing approaches for the incremental maintenance of graph query results.
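
    As a toy illustration of the incremental idea (not the paper's generalized discrimination networks), the sketch below materializes all matches of a fixed two-edge path pattern and updates that view on each edge insertion or deletion, touching only the matches that can involve the changed edge.

```python
from collections import defaultdict

class PathViewMaintainer:
    """Incrementally maintains all matches of the path pattern a -> b -> c."""

    def __init__(self):
        self.out_edges = defaultdict(set)   # u -> {v}
        self.in_edges = defaultdict(set)    # v -> {u}
        self.matches = set()                # materialized view: {(a, b, c)}

    def insert_edge(self, u, v):
        self.out_edges[u].add(v)
        self.in_edges[v].add(u)
        # The new edge may play the role of (a, b): extend forward from v.
        for w in self.out_edges[v]:
            self.matches.add((u, v, w))
        # The new edge may play the role of (b, c): extend backward from u.
        for a in self.in_edges[u]:
            self.matches.add((a, u, v))

    def delete_edge(self, u, v):
        self.out_edges[u].discard(v)
        self.in_edges[v].discard(u)
        # Drop every match that used the deleted edge in either position.
        self.matches = {m for m in self.matches
                        if (m[0], m[1]) != (u, v) and (m[1], m[2]) != (u, v)}

if __name__ == "__main__":
    view = PathViewMaintainer()
    for edge in [("a", "b"), ("b", "c"), ("b", "d")]:
        view.insert_edge(*edge)
    print(view.matches)   # {('a', 'b', 'c'), ('a', 'b', 'd')}
```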

  4. FTree query construction for virtual screening: a statistical analysis.

    Science.gov (United States)

    Gerlach, Christof; Broughton, Howard; Zaliani, Andrea

    2008-02-01

    FTrees (FT) is a known chemoinformatic tool able to condense molecular descriptions into a graph object and to search for actives in large databases using graph similarity. The query graph is classically derived from a known active molecule, or a set of actives, for which a similar compound has to be found. Recently, FT similarity has been extended to fragment space, widening its capabilities. If a user were able to build a knowledge-based FT query from information other than a known active structure, the similarity search could be combined with other, normally separate, fields like de-novo design or pharmacophore searches. With this aim in mind, we performed a comprehensive analysis of several databases in terms of FT description and provide a basic statistical analysis of the FT spaces at hand. Vendors' catalogue collections and the MDDR, as sources of potential and known "actives", respectively, have been used. With the results reported herein, a set of ranges, mean values and standard deviations for several query parameters is presented in order to set a reference guide for users. Applications showing how to use this information in FT query building are also provided, using a newly built 3D pharmacophore from 57 5HT-1F agonists and a published one which was used for virtual screening for tRNA-guanine transglycosylase (TGT) inhibitors.

  5. G-Bean: an ontology-graph based web tool for biomedical literature retrieval.

    Science.gov (United States)

    Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S

    2014-01-01

    query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user.

  6. Language model: Extension to solve inconsistency, incompleteness, and short query in cultural heritage collection

    Science.gov (United States)

    Tan, Kian Lam; Lim, Chen Kim

    2017-10-01

    With the explosive growth of online information such as email messages, news articles, and scientific literature, many institutions and museums are converting their cultural collections from physical data to digital format. However, this conversion has resulted in issues of inconsistency and incompleteness. Besides, the usage of inaccurate keywords also results in the short query problem. Most of the time, the inconsistency and incompleteness are caused by aggregation faults in annotating a document itself, while the short query problem is caused by naive users who lack prior knowledge and experience in the cultural heritage domain. In this paper, we present an approach to solve the problems of inconsistency, incompleteness and short queries by incorporating a Term Similarity Matrix into the Language Model. Our approach is tested on the Cultural Heritage in CLEF (CHiC) collection, which consists of short queries and documents. The results show that the proposed approach is effective and improves retrieval accuracy.
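
    One generic way to realize such a combination (shown here only as a hedged sketch under our own simplifying assumptions, not the authors' exact formulation) is to let a term similarity matrix contribute probability mass to the document language model, so that a short query term can be matched by similar rather than only identical document terms.

```python
import math

# Hypothetical symmetric term-similarity scores (anything absent scores 0.0).
SIM = {
    frozenset(["vase", "porcelain"]): 0.6,
    frozenset(["vase"]): 1.0,
    frozenset(["porcelain"]): 1.0,
    frozenset(["painting"]): 1.0,
}

def sim(a, b):
    return SIM.get(frozenset([a, b]), 0.0)

def score(query_terms, doc_terms, vocab, lam=0.5):
    """Query log-likelihood: each term's probability mixes similarity-based mass
    from the document with a uniform background model."""
    logp = 0.0
    for q in query_terms:
        translated = sum(sim(q, w) for w in doc_terms) / len(doc_terms)
        background = 1.0 / len(vocab)
        logp += math.log((1 - lam) * translated + lam * background)
    return logp

vocab = {"vase", "porcelain", "painting"}
print(score(["vase"], ["porcelain", "painting", "painting"], vocab))  # helped by vase~porcelain
print(score(["vase"], ["painting", "painting", "painting"], vocab))   # only background mass
```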

  7. Transformation Strategies between Block-Oriented and Graph-Oriented Process Modelling Languages

    DEFF Research Database (Denmark)

    Mendling, Jan; Lassen, Kristian Bisgaard; Zdun, Uwe

    2006-01-01

    Much recent research work discusses the transformation between different process modelling languages. This work, however, is mainly focussed on specific process modelling languages, and thus the general reusability of the applied transformation concepts is rather limited. In this paper, we aim to abstract from concrete transformation strategies by distinguishing two major paradigms for representing control flow in process modelling languages: block-oriented languages (such as BPEL and BPML) and graph-oriented languages (such as EPCs and YAWL). The contributions of this paper are generic strategies for transforming from block-oriented process languages to graph-oriented languages, and vice versa.

  8. Transformation Strategies between Block-Oriented and Graph-Oriented Process Modelling Languages

    DEFF Research Database (Denmark)

    Mendling, Jan; Lassen, Kristian Bisgaard; Zdun, Uwe

    Much recent research work discusses the transformation between different process modelling languages. This work, however, is mainly focussed on specific process modelling languages, and thus the general reusability of the applied transformation concepts is rather limited. In this paper, we aim to abstract from concrete transformation strategies by distinguishing two major paradigms for process modelling languages: block-oriented languages (such as BPEL and BPML) and graph-oriented languages (such as EPCs and YAWL). The contributions of this paper are generic strategies for transforming from block-oriented process languages to graph-oriented languages, and vice versa. We also present two case studies of applying our strategies.

  9. Native Language Integrated Queries with CppLINQ in C++

    Science.gov (United States)

    Vassilev, V.

    2015-05-01

    Programming language evolution brought us domain-specific languages (DSLs). They have proved very useful for expressing specific concepts and have become a vital ingredient even for general-purpose frameworks. Supporting declarative DSLs (such as SQL) in imperative languages (such as C++) can be done in the manner of language integrated query (LINQ). We investigate approaches to integrating a LINQ query language natively in C++ and review its usability in the context of high energy physics. We present examples of using CppLINQ for a few types of data analysis workflows carried out by end-users. We discuss evidence of how this DSL technology can simplify massively parallel grid systems such as PROOF.

  10. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from the selected keyword candidates so that misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the "query relevance." Combining these two ideas, we can alleviate the bad effect of misrecognized keywords by decreasing the number of downloaded Web documents for queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29%) was improved to 60.38%, which was 1.13 points better than the result of the conventional unsupervised adaptation method (59.25%).

  11. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Akinori Ito

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from the selected keyword candidates so that misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the “query relevance.” Combining these two ideas, we can alleviate the bad effect of misrecognized keywords by decreasing the number of downloaded Web documents for queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29%) was improved to 60.38%, which was 1.13 points better than the result of the conventional unsupervised adaptation method (59.25%).

  12. PathSys: integrating molecular interaction graphs for systems biology

    Directory of Open Access Journals (Sweden)

    Raval Alpan

    2006-02-01

    Full Text Available Abstract Background The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with the individual information sources considered separately. Results Here we present PathSys, a graph-based system for creating a combined database of interaction networks and for generating an integrated view of biological mechanisms. We used PathSys to integrate over 14 curated and publicly contributed data sources for the budding yeast (S. cerevisiae) and the Gene Ontology. A number of exploratory questions were formulated as a combination of relational and graph-based queries to the integrated database. Thus, PathSys is a general-purpose, scalable, graph-data warehouse of biological information, complete with a graph manipulation and query language, a storage mechanism and a generic data-importing mechanism through schema-mapping. Conclusion Results from several test studies demonstrate the effectiveness of the approach in retrieving biologically interesting relations between genes and proteins, the networks connecting them, and the utility of PathSys as a scalable graph-based warehouse for interaction-network integration and as a hypothesis generator system. The PathSys client software, named BiologicalNetworks, developed for navigation and analyses of molecular networks, is available as a Java Web Start application at http://brak.sdsc.edu/pub/BiologicalNetworks.

  13. Chaotic Traversal (CHAT): Very Large Graphs Traversal Using Chaotic Dynamics

    Science.gov (United States)

    Changaival, Boonyarit; Rosalie, Martin; Danoy, Grégoire; Lavangnananda, Kittichai; Bouvry, Pascal

    2017-12-01

    Graph traversal algorithms find their applications in various fields such as routing problems, natural language processing or even database querying. The exploration can be considered a first stepping stone towards knowledge extraction from a graph, which is now a popular topic. Classical solutions such as Breadth First Search (BFS) and Depth First Search (DFS) require huge amounts of memory for exploring very large graphs. In this research, we present a novel memoryless graph traversal algorithm, Chaotic Traversal (CHAT), which integrates chaotic dynamics to traverse large unknown graphs via the Lozi map and the Rössler system. To compare the effects of various dynamics on our algorithm, we present an original way to explore a parameter space using a bifurcation diagram with respect to the topological structure of attractors. The resulting algorithm is efficient and undemanding of resources, and is therefore very suitable for partial traversal of very large and/or unknown environment graphs. CHAT's performance using the Lozi map is shown to be superior to the commonly known Random Walk, in terms of the number of nodes visited (coverage percentage) and computation time, where the environment is unknown and memory usage is restricted.

  14. In-context query reformulation for failing SPARQL queries

    Science.gov (United States)

    Viswanathan, Amar; Michaelis, James R.; Cassidy, Taylor; de Mel, Geeth; Hendler, James

    2017-05-01

    Knowledge bases for decision support systems are growing increasingly complex, through continued advances in data ingest and management approaches. However, humans do not possess the cognitive capabilities to retain a bird's-eye view of such knowledge bases, and may end up issuing unsatisfiable queries to such systems. This work focuses on the implementation of a query reformulation approach for graph-based knowledge bases, specifically designed to support the Resource Description Framework (RDF). The reformulation approach presented is instance- and schema-aware. Thus, in contrast to relaxation techniques found in the state of the art, the presented approach produces in-context query reformulations.

  15. Lost in translation? A multilingual Query Builder improves the quality of PubMed queries: a randomised controlled trial.

    Science.gov (United States)

    Schuers, Matthieu; Joulakian, Mher; Kerdelhué, Gaetan; Segas, Léa; Grosjean, Julien; Darmoni, Stéfan J; Griffon, Nicolas

    2017-07-03

    MEDLINE is the most widely used medical bibliographic database in the world. Most of its citations are in English and this can be an obstacle for some researchers to access the information the database contains. We created a multilingual query builder to facilitate access to the PubMed subset using a language other than English. The aim of our study was to assess the impact of this multilingual query builder on the quality of PubMed queries for non-native English speaking physicians and medical researchers. A randomised controlled study was conducted among French-speaking general practice residents. We designed a multilingual query builder to facilitate information retrieval, based on available MeSH translations and providing users with both an interface and a controlled vocabulary in their own language. Participating residents were randomly allocated either the French or the English version of the query builder. They were asked to translate 12 short medical questions into MeSH queries. The main outcome was the quality of the query. Two librarians, blind to the arm, independently evaluated each query, using a modified published classification that differentiated eight types of errors. Twenty residents used the French version of the query builder and 22 used the English version. 492 queries were analysed. There were significantly more perfect queries in the French group than in the English group (37.9% vs. 17.9%, respectively). The multilingual query builder thus improved the quality of PubMed queries, in particular for researchers whose first language is not English.

  16. Adding Conflict Resolution Features to a Query Language for Database Federations

    Directory of Open Access Journals (Sweden)

    Kai-Uwe Sattler

    2000-11-01

    Full Text Available A main problem of data integration is the treatment of conflicts caused by different modelling of real-world entities, different data models or simply by different representations of one and the same object. During the integration phase these conflicts have to be identified and resolved as part of the mapping between local and global schemata. Therefore, conflict resolution affects the definition of the integrated view as well as query transformation and evaluation. In this paper we present an SQL extension for defining and querying database federations. This language addresses in particular the resolution of integration conflicts by providing mechanisms for mapping attributes, restructuring relations as well as extended integration operations. Finally, the application of these resolution strategies is briefly explained by presenting a simple conflict resolution method.

  17. Graph-Based Semantic Web Service Composition for Healthcare Data Integration.

    Science.gov (United States)

    Arch-Int, Ngamnij; Arch-Int, Somjit; Sonsilphong, Suphachoke; Wanchai, Paweena

    2017-01-01

    Among the numerous and heterogeneous web services offered through different sources, automatic web service composition is the most convenient method for building complex business processes that permit the invocation of multiple existing atomic services. Current solutions for functional web service composition lack autonomous querying of semantic matches between the parameters of web services, which is necessary for composing large sets of related services. In this paper, we propose a graph-based Semantic Web Services composition system consisting of two subsystems: management time and run time. The management-time subsystem is responsible for dependency graph preparation, in which a dependency graph of related services is generated automatically according to the proposed semantic matchmaking rules. The run-time subsystem is responsible for discovering the potential web services and a non-redundant web service composition for a user's query using a graph-based searching algorithm. The proposed approach was applied to healthcare data integration across different health organizations and was evaluated according to two aspects: execution time measurement and correctness measurement.
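
    To make the dependency-graph idea concrete, here is a small hedged sketch in which the service names, inputs, and outputs are entirely invented: services whose inputs are satisfiable are treated as edges in a dependency graph, and a breadth-first search finds a composition chain from the user's available data to the requested output.

```python
from collections import deque

# Hypothetical atomic services: name -> (required inputs, produced outputs).
SERVICES = {
    "PatientLookup":   ({"patient_id"}, {"patient_record"}),
    "LabResultsFetch": ({"patient_record"}, {"lab_results"}),
    "RiskScoring":     ({"lab_results"}, {"risk_score"}),
}

def compose(available, goal):
    """BFS over the service dependency graph for a chain producing `goal`."""
    queue = deque([(frozenset(available), [])])
    seen = {frozenset(available)}
    while queue:
        known, plan = queue.popleft()
        if goal in known:
            return plan
        for name, (inputs, outputs) in SERVICES.items():
            if inputs <= known and not outputs <= known:   # invocable and useful
                nxt = frozenset(known | outputs)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [name]))
    return []

if __name__ == "__main__":
    print(compose({"patient_id"}, "risk_score"))
    # -> ['PatientLookup', 'LabResultsFetch', 'RiskScoring']
```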

  18. GDM: A New Graph Based Data Model Using Functional Abstraction

    Institute of Scientific and Technical Information of China (English)

    Sankhayan Choudhury; Nabendu Chaki; Swapan Bhattacharya

    2006-01-01

    In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the needs of computing infrastructure to organize information in some particular manner for efficient storage and retrieval. The Graph Data Model (GDM) has been proposed as an alternative data model to combine the advantages of the relational model with the positive features of semantic data models. The proposed GDM offers a structural representation for interacting with the designer, making it easy to comprehend the complex relations amongst basic data items. GDM allows an entire database to be viewed as a Graph (V, E) in a layered organization. Here, a graph is created in a bottom-up fashion, where V represents the basic instances of data or a functionally abstracted module, called a primary semantic group (PSG) or secondary semantic group (SSG). An edge in the model implies a relationship among secondary semantic groups. The contents of the lowest layer are the semantically grouped data values in the form of primary semantic groups. The SSGs are higher-level abstractions, created by encapsulating various PSGs, SSGs and basic data elements. This encapsulation methodology for providing higher-level abstraction continues to generate secondary semantic groups until the designer decides that the actual problem domain has been sufficiently described. GDM thus uses the standard abstractions available in a semantic data model together with a structural representation in terms of a graph. The operations on the data model are formalized in the proposed graph algebra. A Graph Query Language (GQL) is also developed, maintaining similarity with the widely accepted user-friendly SQL. Finally, the paper also presents the methodology to make this GDM compatible with a distributed environment, and a corresponding query processing technique for the distributed environment is also

  19. GSMNet: A Hierarchical Graph Model for Moving Objects in Networks

    Directory of Open Access Journals (Sweden)

    Hengcai Zhang

    2017-03-01

    Full Text Available Existing data models for moving objects in networks are often limited in how flexibly they control the granularity of network representation and the cost of location updates, and they do not encompass semantic information such as traffic states, traffic restrictions and social relationships. In this paper, we aim to fill the gap left by traditional network-constrained models and propose a hierarchical graph model called the Geo-Social-Moving model for moving objects in Networks (GSMNet) that adopts four graph structures, RouteGraph, SegmentGraph, ObjectGraph and MoveGraph, to represent the underlying networks, trajectories and semantic information in an integrated manner. A set of user-defined data types and corresponding operators is proposed to handle moving objects and answer a new class of queries supporting three kinds of conditions: spatial, temporal and semantic information. We then develop a prototype system with the native graph database system Neo4j to implement the proposed GSMNet model. In the experiment, we conduct a performance evaluation using simulated trajectories generated from the BerlinMOD (Berlin Moving Objects Database) benchmark and compare with the mature MOD system Secondo. The results of 17 benchmark queries demonstrate that our proposed GSMNet model has strong potential to reduce time-consuming table join operations and shows remarkable advantages with regard to representing semantic information and controlling the cost of location updates.

  20. High-performance analysis of filtered semantic graphs

    OpenAIRE

    Buluç, A; Fox, A; Gilbert, JR; Kamil, S; Lugowski, A; Oliker, L; Williams, S

    2012-01-01

    High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry "attributes" of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must either view the graph through a filter that passes only those individual vertices and edges of interest, or else must first materialize a subgraph or subgraphs consisting of only the vertices...

  1. Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-06-27

    Modern applications, such as drug repositioning, require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that seamlessly combine generic graph algorithms (e.g., PageRank, shortest paths) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios where our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as the state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
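
    To illustrate the gap Spartex closes, the sketch below performs the two halves separately with off-the-shelf Python libraries (rdflib for the SPARQL part, networkx for the generic graph algorithm) on a tiny invented RDF graph; Spartex itself would express both steps inside a single extended SPARQL program.

```python
import networkx as nx
from rdflib import Graph

TTL = """
@prefix ex: <http://example.org/> .
ex:alice ex:knows ex:bob .
ex:bob   ex:knows ex:carol .
ex:carol ex:knows ex:alice .
ex:carol ex:knows ex:bob .
"""

rdf = Graph()
rdf.parse(data=TTL, format="turtle")

# Step 1: a SPARQL query selects the edges of interest.
rows = rdf.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?s ?o WHERE { ?s ex:knows ?o }
""")

# Step 2: a generic graph algorithm (PageRank) runs on the selected subgraph.
g = nx.DiGraph()
g.add_edges_from((str(s), str(o)) for s, o in rows)
for node, score in sorted(nx.pagerank(g).items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```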

  2. Model-Based Requirements Management in Gear Systems Design Based On Graph-Based Design Languages

    Directory of Open Access Journals (Sweden)

    Kevin Holder

    2017-10-01

    Full Text Available For several decades, a wide-spread consensus concerning the enormous importance of an in-depth clarification of the specifications of a product has been observed. A weak clarification of specifications is repeatedly listed as a main cause for the failure of product development projects. Requirements, which can be defined as the purpose, goals, constraints, and criteria associated with a product development project, play a central role in the clarification of specifications. The collection of activities which ensure that requirements are identified, documented, maintained, communicated, and traced throughout the life cycle of a system, product, or service can be referred to as "requirements engineering". These activities can be supported by a collection and combination of strategies, methods, and tools which are appropriate for the clarification of specifications. Numerous publications describe the strategy and the components of requirements management. Furthermore, recent research investigates its industrial application. Simultaneously, promising developments of graph-based design languages for a holistic digital representation of the product life cycle are presented. Current developments realize graph-based languages by means of the diagrams of the Unified Modelling Language (UML) and allow the automatic generation and evaluation of multiple product variants. The research presented in this paper seeks to combine the advantages of a conscious requirements management process and graph-based design languages. Consequently, the main objective of this paper is the investigation of a model-based integration of requirements in a product development process by means of graph-based design languages. The research method is based on an in-depth analysis of an exemplary industrial product development, a gear system for so-called "Electrical Multiple Units" (EMU). Important requirements were abstracted from a gear system

  3. Optimizing Temporal Queries

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2003-01-01

    Recent research in the area of temporal databases has proposed a number of query languages that vary in their expressive power and the semantics they provide to users. These query languages represent a spectrum of solutions to the tension between clean semantics and efficient evaluation. Often, t...

  4. A New Publicly Available Chemical Query Language, CSRML, to support Chemotype Representations for Application to Data-Mining and Modeling

    Science.gov (United States)

    A new XML-based query language, CSRML, has been developed for representing chemical substructures, molecules, reaction rules, and reactions. CSRML queries are capable of integrating additional forms of information beyond the simple substructure (e.g., SMARTS) or reaction transfor...

  5. ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.

    Science.gov (United States)

    Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D

    2017-08-10

    Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user-interface for configuring the data import and for querying the database. Queries can also be run from the command-line and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or

  6. A structural query system for Han characters

    DEFF Research Database (Denmark)

    Skala, Matthew

    2016-01-01

    The IDSgrep structural query system for Han character dictionaries is presented. This dictionary search system represents the spatial structure of Han characters using Extended Ideographic Description Sequences (EIDSes), a data model and syntax based on the Unicode IDS concept. It includes a query...... language for EIDS databases, with a freely available implementation and format translation from popular third-party IDS and XML character databases. The system is designed to suit the needs of font developers and foreign language learners. The search algorithm includes a bit vector index inspired by Bloom...... filters to support faster query operations. Experimental results are presented, evaluating the effect of the indexing on query performance....

  7. Location-Dependent Query Processing Under Soft Real-Time Constraints

    Directory of Open Access Journals (Sweden)

    Zoubir Mammeri

    2009-01-01

    Full Text Available In recent years, mobile devices and applications have developed rapidly. In the database field, this development has required methods that handle new query types such as location-dependent queries (i.e., the query results depend on the location of the query issuer). Although several studies have addressed problems related to location-dependent query processing, few works have considered the timing requirements that may be associated with queries (i.e., the query results must be delivered to mobile clients on time). The main objective of this paper is to propose a solution for location-dependent query processing under soft real-time constraints. Hence, we propose methods to take client location-dependency into account and to maximize the percentage of queries respecting their deadlines. We validate our proposal by implementing a prototype based on the Oracle DBMS. Performance evaluation results show that the proposed solution optimizes both the percentage of queries meeting their deadlines and the communication cost.

  8. Towards Verbalizing SPARQL Queries in Arabic

    Directory of Open Access Journals (Sweden)

    I. Al Agha

    2016-04-01

    Full Text Available With the wide spread of Open Linked Data and Semantic Web technologies, a larger amount of data has been published on the Web in the RDF and OWL formats. This data can be queried using SPARQL, the Semantic Web Query Language. SPARQL cannot be understood by ordinary users and is not directly accessible to humans, and thus they will not be able to check whether the retrieved answers truly correspond to the intended information need. Driven by this challenge, natural language generation from SPARQL data has recently attracted a considerable attention. However, most existing solutions to verbalize SPARQL in natural language focused on English and Latin-based languages. Little effort has been made on the Arabic language which has different characteristics and morphology. This work aims to particularly help Arab users to perceive SPARQL queries on the Semantic Web by translating SPARQL to Arabic. It proposes an approach that gets a SPARQL query as an input and generates a query expressed in Arabic as an output. The translation process combines both morpho-syntactic analysis and language dependencies to generate a legible and understandable Arabic query. The approach was preliminary assessed with a sample query set, and results indicated that 75% of the queries were correctly translated into Arabic.

  9. Error Checking for Chinese Query by Mining Web Log

    Directory of Open Access Journals (Sweden)

    Jianyong Duan

    2015-01-01

    Full Text Available For search engines, erroneously entered queries are a common phenomenon. This paper uses a web log as the training set for query error checking. The queries are analyzed and checked with an n-gram language model trained on the web log. Some features, including the query words and their number, are introduced into the model. At the same time, a data-smoothing algorithm is used to address the data sparseness problem, which improves the overall accuracy of the n-gram model. The experimental results show that the approach is effective.
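
    A minimal sketch of the kind of n-gram scoring the abstract describes, assuming a toy query log and add-one smoothing (the paper's actual features and smoothing method are not reproduced here):

      from collections import Counter

      def train_bigram_model(log_queries):
          # Count unigram contexts and bigrams over tokenized queries from a web log.
          unigrams, bigrams = Counter(), Counter()
          for q in log_queries:
              tokens = ["<s>"] + q.split() + ["</s>"]
              unigrams.update(tokens[:-1])
              bigrams.update(zip(tokens, tokens[1:]))
          return unigrams, bigrams

      def query_score(query, unigrams, bigrams):
          # Average add-one-smoothed bigram probability; low scores flag likely erroneous queries.
          tokens = ["<s>"] + query.split() + ["</s>"]
          vocab = len(unigrams) + 1
          probs = [(bigrams[(a, b)] + 1) / (unigrams[a] + vocab)
                   for a, b in zip(tokens, tokens[1:])]
          return sum(probs) / len(probs)

      # hypothetical log; a real system would train on millions of logged queries
      log = ["graph query language", "graph database", "query language"]
      uni, bi = train_bigram_model(log)
      print(query_score("graph query language", uni, bi) >
            query_score("grap quer language", uni, bi))   # True: the misspelled query scores lower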

  10. On the power propagation time of a graph

    OpenAIRE

    Bozeman, Chassidy

    2016-01-01

    In this paper, we give Nordhaus-Gaddum upper and lower bounds on the sum of the power propagation time of a graph and its complement, and we consider the effects of edge subdivisions and edge contractions on the power propagation time of a graph. We also study a generalization of power propagation time, known as $k$-power propagation time, by characterizing all simple graphs on $n$ vertices whose $k$-power propagation time is $n-1$ or $n-2$ (for $k\geq 1$) and $n-3$ (for $k\geq 2$). We determ...

  11. Information Retrieval and Graph Analysis Approaches for Book Recommendation

    OpenAIRE

    Chahinez Benkoussas; Patrice Bellot

    2015-01-01

    A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models: a probabilistic model, InL2 (a Divergence from Randomness model), and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval ...

  12. Efficient Processing of Multiple DTW Queries in Time Series Databases

    DEFF Research Database (Denmark)

    Kremer, Hardy; Günnemann, Stephan; Ivanescu, Anca-Maria

    2011-01-01

    . In many of today’s applications, however, large numbers of queries arise at any given time. Existing DTW techniques do not process multiple DTW queries simultaneously, a serious limitation which slows down overall processing. In this paper, we propose an efficient processing approach for multiple DTW...... for multiple DTW queries....

  13. A Qualitative Analysis Framework Using Natural Language Processing and Graph Theory

    Science.gov (United States)

    Tierney, Patrick J.

    2012-01-01

    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

  14. Processing SPARQL queries with regular expressions in RDF databases

    Science.gov (United States)

    2011-01-01

    Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns. PMID:21489225
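
    For readers unfamiliar with the feature being optimized, the following self-contained sketch (using the rdflib Python library and placeholder data, not the paper's benchmark) shows a SPARQL query whose FILTER clause applies a regular-expression pattern to literal values:

      from rdflib import Graph

      # placeholder triples; any RDF dataset with string literals would do
      g = Graph()
      g.parse(data="""
      @prefix ex: <http://example.org/> .
      ex:p1 ex:label "kinase inhibitor" .
      ex:p2 ex:label "membrane protein" .
      """, format="turtle")

      # SPARQL query with a regular-expression filter on the literal value
      q = """
      PREFIX ex: <http://example.org/>
      SELECT ?s ?label
      WHERE {
        ?s ex:label ?label .
        FILTER regex(?label, "^kin", "i")
      }
      """
      for row in g.query(q):
          print(row.s, row.label)   # matches ex:p1 only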

  15. Processing SPARQL queries with regular expressions in RDF databases.

    Science.gov (United States)

    Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

    2011-03-29

    As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users' requests for extracting information from the RDF data as well as the lack of users' knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  16. Similarity Measure of Graphs

    Directory of Open Access Journals (Sweden)

    Amine Labriji

    2017-07-01

    Full Text Available Identifying the similarity of graphs is considered an important research field in the Semantic Web, artificial intelligence, shape recognition, and information retrieval. One of the fundamental problems of graph databases is finding the graphs most similar to a query graph. Existing approaches to this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not identified as being part of the similarity of two graphs in cases such as two graphs without common concepts, measures of similarity based on the union of the two graphs, measures based on the notion of the maximum common sub-graph (SCM), or the graph edit distance. This leads to an inadequate situation in the context of information retrieval. To overcome this problem, we suggest a new measure of similarity between graphs, based on the similarity measure of Wu and Palmer. We have shown that this new measure satisfies the properties of a similarity measure, and we have applied it to examples. The results show that our measure runs faster than existing approaches. In addition, we compared the relevance of the similarity values obtained; this new graph measure appears advantageous and offers a contribution to solving the problem mentioned above.
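
    For reference, the standard form of the Wu and Palmer similarity measure on which the proposed graph measure builds is (writing lcs for the least common subsumer of the two concepts in the taxonomy and depth for the distance from the root):

      \mathrm{sim}_{WP}(c_1, c_2) = \frac{2\,\operatorname{depth}(\operatorname{lcs}(c_1, c_2))}{\operatorname{depth}(c_1) + \operatorname{depth}(c_2)}

    Its values lie in (0, 1] and equal 1 when the two concepts coincide; how the abstract lifts this concept-level measure to whole graphs is specific to the paper and not reproduced here.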

  17. Dynamic planar embeddings of dynamic graphs

    DEFF Research Database (Denmark)

    Holm, Jacob; Rotenberg, Eva

    2015-01-01

    -flip-linkable(u, v) providing a suggestion for a flip that will make them linkable if one exists. We will support all updates and queries in O(log^2 n) time. Our time bounds match those of Italiano et al. for a static (flipless) embedding of a dynamic graph. Our new algorithm is simpler, exploiting...... that the complement of a spanning tree of a connected plane graph is a spanning tree of the dual graph. The primal and dual trees are interpreted as having the same Euler tour, and a main idea of the new algorithm is an elegant interaction between top trees over the two trees via their common Euler tour....

  18. Design of a compiler working out a real time BASIC language for CAMAC monitoring

    International Nuclear Information System (INIS)

    Barlerin, Antoine.

    1978-06-01

    After explaining why a real-time system is required in order to perform interactive measurements and controls, the various units of this system are described and it is shown how to build them. Each process is managed and controlled through the real-time monitor. The required input-output system and the monitor are linked together. Through the control system the operator can interact at any moment with the process in progress. Based on the systems described above, a method for language elaboration based on graph theory is outlined and applied to a small language. The last chapter describes the BASIC-like language to which the above methods have been applied and reports the actual performance of the machine [fr

  19. Real-time community detection in full social networks on a laptop

    Science.gov (United States)

    Chamberlain, Benjamin Paul; Levy-Kramer, Josh; Humby, Clive

    2018-01-01

    For a broad range of research and practical applications it is important to understand the allegiances, communities and structure of key players in society. One promising direction towards extracting this information is to exploit the rich relational data in digital social networks (the social graph). As global social networks (e.g., Facebook and Twitter) are very large, most approaches make use of distributed computing systems for this purpose. Distributing graph processing requires solving many difficult engineering problems, which has led some researchers to look at single-machine solutions that are faster and easier to maintain. In this article, we present an approach for analyzing full social networks on a standard laptop, allowing for interactive exploration of the communities in the locality of a set of user specified query vertices. The key idea is that the aggregate actions of large numbers of users can be compressed into a data structure that encapsulates the edge weights between vertices in a derived graph. Local communities can be constructed by selecting vertices that are connected to the query vertices with high edge weights in the derived graph. This compression is robust to noise and allows for interactive queries of local communities in real-time, which we define to be less than the average human reaction time of 0.25s. We achieve single-machine real-time performance by compressing the neighborhood of each vertex using minhash signatures and facilitate rapid queries through Locality Sensitive Hashing. These techniques reduce query times from hours using industrial desktop machines operating on the full graph to milliseconds on standard laptops. Our method allows exploration of strongly associated regions (i.e., communities) of large graphs in real-time on a laptop. It has been deployed in software that is actively used by social network analysts and offers another channel for media owners to monetize their data, helping them to continue to provide
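
    A minimal sketch of the neighborhood-compression step described above, assuming simple salted hashes in place of the authors' production hash functions and omitting the Locality Sensitive Hashing index:

      import hashlib

      def minhash_signature(neighbors, num_hashes=64):
          # Compress a vertex's (possibly huge) neighbor set into a fixed-size signature.
          return [min(int(hashlib.md5(f"{i}:{n}".encode()).hexdigest(), 16)
                      for n in neighbors)
                  for i in range(num_hashes)]

      def estimated_similarity(sig_a, sig_b):
          # The fraction of agreeing minhash values estimates the Jaccard overlap of the
          # two neighborhoods, i.e. the edge weight between the vertices in the derived graph.
          return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

      # toy usage: two users with overlapping follower sets
      sig_u = minhash_signature({"a", "b", "c", "d"})
      sig_v = minhash_signature({"b", "c", "d", "e"})
      print(round(estimated_similarity(sig_u, sig_v), 2))  # close to the true Jaccard of 0.6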

  20. Real-time community detection in full social networks on a laptop.

    Directory of Open Access Journals (Sweden)

    Benjamin Paul Chamberlain

    Full Text Available For a broad range of research and practical applications it is important to understand the allegiances, communities and structure of key players in society. One promising direction towards extracting this information is to exploit the rich relational data in digital social networks (the social graph). As global social networks (e.g., Facebook and Twitter) are very large, most approaches make use of distributed computing systems for this purpose. Distributing graph processing requires solving many difficult engineering problems, which has led some researchers to look at single-machine solutions that are faster and easier to maintain. In this article, we present an approach for analyzing full social networks on a standard laptop, allowing for interactive exploration of the communities in the locality of a set of user specified query vertices. The key idea is that the aggregate actions of large numbers of users can be compressed into a data structure that encapsulates the edge weights between vertices in a derived graph. Local communities can be constructed by selecting vertices that are connected to the query vertices with high edge weights in the derived graph. This compression is robust to noise and allows for interactive queries of local communities in real-time, which we define to be less than the average human reaction time of 0.25s. We achieve single-machine real-time performance by compressing the neighborhood of each vertex using minhash signatures and facilitate rapid queries through Locality Sensitive Hashing. These techniques reduce query times from hours using industrial desktop machines operating on the full graph to milliseconds on standard laptops. Our method allows exploration of strongly associated regions (i.e., communities) of large graphs in real-time on a laptop. It has been deployed in software that is actively used by social network analysts and offers another channel for media owners to monetize their data, helping them to

  1. Real-time community detection in full social networks on a laptop.

    Science.gov (United States)

    Chamberlain, Benjamin Paul; Levy-Kramer, Josh; Humby, Clive; Deisenroth, Marc Peter

    2018-01-01

    For a broad range of research and practical applications it is important to understand the allegiances, communities and structure of key players in society. One promising direction towards extracting this information is to exploit the rich relational data in digital social networks (the social graph). As global social networks (e.g., Facebook and Twitter) are very large, most approaches make use of distributed computing systems for this purpose. Distributing graph processing requires solving many difficult engineering problems, which has led some researchers to look at single-machine solutions that are faster and easier to maintain. In this article, we present an approach for analyzing full social networks on a standard laptop, allowing for interactive exploration of the communities in the locality of a set of user specified query vertices. The key idea is that the aggregate actions of large numbers of users can be compressed into a data structure that encapsulates the edge weights between vertices in a derived graph. Local communities can be constructed by selecting vertices that are connected to the query vertices with high edge weights in the derived graph. This compression is robust to noise and allows for interactive queries of local communities in real-time, which we define to be less than the average human reaction time of 0.25s. We achieve single-machine real-time performance by compressing the neighborhood of each vertex using minhash signatures and facilitate rapid queries through Locality Sensitive Hashing. These techniques reduce query times from hours using industrial desktop machines operating on the full graph to milliseconds on standard laptops. Our method allows exploration of strongly associated regions (i.e., communities) of large graphs in real-time on a laptop. It has been deployed in software that is actively used by social network analysts and offers another channel for media owners to monetize their data, helping them to continue to provide

  2. Enabling Graph Appliance for Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Singh, Rina [ORNL; Graves, Jeffrey A [ORNL; Lee, Sangkeun (Matt) [ORNL; Sukumar, Sreenivas R [ORNL; Shankar, Mallikarjun [ORNL

    2015-01-01

    In recent years, there has been a huge growth in the amount of genomic data available as reads generated from various genome sequencers. The number of reads generated can be huge, ranging from hundreds to billions of nucleotides, each varying in size. Assembling such large amounts of data is one of the challenging computational problems for both biomedical and data scientists. Most of the genome assemblers developed have used de Bruijn graph techniques. A de Bruijn graph represents a collection of read sequences by billions of vertices and edges, which require large amounts of memory and computational power to store and process. This is the major drawback to de Bruijn graph assembly. Massively parallel, multi-threaded, shared memory systems can be leveraged to overcome some of these issues. The objective of our research is to investigate the feasibility and scalability issues of de Bruijn graph assembly on Cray's Urika-GD system; Urika-GD is a high performance graph appliance with a large shared memory and massively multithreaded custom processor designed for executing SPARQL queries over large-scale RDF data sets. However, to the best of our knowledge, there is no research on representing a de Bruijn graph as an RDF graph or finding Eulerian paths in RDF graphs using SPARQL for potential genome discovery. In this paper, we address the issues involved in representing de Bruijn graphs as RDF graphs and propose an iterative querying approach for finding Eulerian paths in large RDF graphs. We evaluate the performance of our implementation on real-world Ebola genome datasets and illustrate how genome assembly can be accomplished with Urika-GD using iterative SPARQL queries.
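
    A toy sketch of the representation idea, assuming a hypothetical vocabulary (the paper's actual RDF schema and its iterative Eulerian-path queries are not reproduced): each k-mer contributes a triple linking its (k-1)-mer prefix and suffix, and a SPARQL aggregate then reads off out-degrees, one ingredient of an Eulerian-path check.

      from rdflib import Graph, Namespace

      DBG = Namespace("http://example.org/debruijn/")   # hypothetical vocabulary

      def debruijn_rdf(reads, k=4):
          # Encode the (k-1)-mer overlap structure of the reads as RDF triples.
          # Note: an rdflib Graph is a set of triples, so repeated k-mers collapse;
          # a faithful encoding would also record multiplicities.
          g = Graph()
          for read in reads:
              for i in range(len(read) - k + 1):
                  kmer = read[i:i + k]
                  g.add((DBG[kmer[:-1]], DBG.connectsTo, DBG[kmer[1:]]))
          return g

      g = debruijn_rdf(["ACGTACGA", "CGTACGAT"])
      q = """
      PREFIX dbg: <http://example.org/debruijn/>
      SELECT ?node (COUNT(?next) AS ?outdeg)
      WHERE { ?node dbg:connectsTo ?next }
      GROUP BY ?node
      """
      for row in g.query(q):
          print(row.node, row.outdeg)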

  3. Top-k Keyword Search Over Graphs Based On Backward Search

    Directory of Open Access Journals (Sweden)

    Zeng Jia-Hui

    2017-01-01

    Full Text Available Keyword search is one of the most user-friendly and intuitive information retrieval methods. Using keyword search to retrieve connected subgraphs has many applications in graph-based cognitive computation and is a basic technology. This paper focuses on top-k keyword search over graphs. We implemented a keyword search algorithm which applies the backward search idea. The algorithm first locates the keyword vertices and then applies backward search to find rooted trees that contain the query keywords. The experiments show that query time is affected by the number of iterations of the algorithm.
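
    The following simplified sketch (an assumed reduction of the backward search idea to its core, without the top-k ranking of rooted trees) expands backwards from the vertices containing each keyword along incoming edges and reports the roots that reach every keyword:

      from collections import deque

      def backward_roots(adj_in, keyword_vertices):
          # adj_in: dict vertex -> list of predecessor vertices
          # keyword_vertices: one set of vertices per query keyword
          reached = []                       # reached[i] = vertices that can reach keyword i
          for seeds in keyword_vertices:
              visited, queue = set(seeds), deque(seeds)
              while queue:
                  v = queue.popleft()
                  for u in adj_in.get(v, []):
                      if u not in visited:   # expand backwards along incoming edges
                          visited.add(u)
                          queue.append(u)
              reached.append(visited)
          # roots of candidate answer trees reach at least one vertex of every keyword
          return set.intersection(*reached) if reached else set()

      # toy graph: 1 -> 2 -> 3, 1 -> 4; keywords located at {3} and {4}
      adj_in = {2: [1], 3: [2], 4: [1]}
      print(backward_roots(adj_in, [{3}, {4}]))   # {1}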

  4. Continuous-time quantum walks on star graphs

    International Nuclear Information System (INIS)

    Salimi, S.

    2009-01-01

    In this paper, we investigate continuous-time quantum walks on star graphs. It is shown that, by the quantum central limit theorem, the continuous-time quantum walk on the N-fold star power graph, which is invariant under the quantum component of the adjacency matrix, converges to the continuous-time quantum walk on the K_2 graph (the complete graph with two vertices), and the probability of observing the walk tends to the uniform distribution.
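
    For context, a continuous-time quantum walk on a graph with adjacency matrix A evolves an initial state by the Schrödinger equation; in the common convention that takes A itself as the Hamiltonian,

      |\psi(t)\rangle = e^{-iAt}\,|\psi(0)\rangle, \qquad P_j(t) = \left| \langle j | e^{-iAt} | \psi(0) \rangle \right|^2,

    where P_j(t) is the probability of finding the walker at vertex j at time t. The abstract's statement concerns the limiting behavior of these probabilities on N-fold star power graphs.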

  5. Querying Workflow Logs

    Directory of Open Access Journals (Sweden)

    Yan Tang

    2018-01-01

    Full Text Available A business process or workflow is an assembly of tasks that accomplishes a business goal. Business process management is the study of the design, configuration/implementation, enactment and monitoring, analysis, and re-design of workflows. The traditional methodology for the re-design and improvement of workflows relies on the well-known sequence of extract, transform, and load (ETL), data/process warehousing, and online analytical processing (OLAP) tools. In this paper, we study the ad hoc querying of process enactments for (data-centric) business processes, bypassing the traditional methodology for more flexibility in querying. We develop an algebraic query language based on “incident patterns” with four operators inspired by the Business Process Model and Notation (BPMN) representation, allowing the user to formulate ad hoc queries directly over workflow logs. A formal semantics of this query language, a preliminary query evaluation algorithm, and a group of elementary properties of the operators are provided.

  6. A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Choudhury, Sutanay; Holder, Larry; Chin, George; Agarwal, Khushbu; Feo, John T.

    2015-05-27

    Cyber security is one of the most significant technical challenges in current times. Detecting adversarial activities, prevention of theft of intellectual properties and customer data is a high priority for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many of the cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a "Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named "Relative Selectivity" that is used to select between different query processing strategies. Our experiments performed on real online news, network traffic stream and a synthetic social network benchmark demonstrate 10-100x speedups over non-incremental, selectivity agnostic approaches.

  7. Processing SPARQL queries with regular expressions in RDF databases

    Directory of Open Access Journals (Sweden)

    Cho Hune

    2011-03-01

    Full Text Available Abstract Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  8. Linked data querying through FCA-based schema indexing

    OpenAIRE

    Brosius, Dominik; Staab, Steffen

    2016-01-01

    The efficiency of SPARQL query evaluation against Linked Open Data may benefit from schema-based indexing. However, many data items come with incomplete schema information or lack schema descriptions entirely. In this position paper, we outline an approach to an indexing of linked data graphs based on schemata induced through Formal Concept Analysis. We show how to map queries onto RDF graphs based on such derived schema information. We sketch next steps for realizing and optimizing the sugges...

  9. External Data Structures for Shortest Path Queries on Planar Digraphs

    DEFF Research Database (Denmark)

    Arge, Lars; Toma, Laura

    2005-01-01

    In this paper we present space-query trade-offs for external memory data structures that answer shortest path queries on planar directed graphs. For any S = Ω(N^(1+ε)) and S = O(N^2/B), our main result is a family of structures that use S space and answer queries in O(N^2/(S·B)) I/Os, thus obtaining...... optimal space-query product O(N^2/B). An S-space structure can be constructed in O(√S · sort(N)) I/Os, where sort(N) is the number of I/Os needed to sort N elements, B is the disk block size, and N is the size of the graph....

  10. SPARQL Query Re-writing Using Partonomy Based Transformation Rules

    Science.gov (United States)

    Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.

    Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of the SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and the knowledge base. Our experiments were performed on completely third-party datasets and queries. Evaluations were performed on the Geonames dataset using questions from the National Geographic Bee serialized into SPARQL and the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.

  11. Real-Time MENTAT programming language and architecture

    Science.gov (United States)

    Grimshaw, Andrew S.; Silberman, Ami; Liu, Jane W. S.

    1989-01-01

    Real-time MENTAT, a programming environment designed to simplify the task of programming real-time applications in distributed and parallel environments, is described. It is based on the same data-driven computation model and object-oriented programming paradigm as MENTAT. It provides an easy-to-use mechanism to exploit parallelism, language constructs for the expression and enforcement of timing constraints, and run-time support for scheduling and executing real-time programs. The real-time MENTAT programming language is an extended C++. The extensions are added to facilitate automatic detection of data flow and generation of data flow graphs, to express the timing constraints of individual granules of computation, and to provide scheduling directives for the runtime system. A high-level view of the real-time MENTAT system architecture and programming language constructs is provided.

  12. Clone Detection for Graph-Based Model Transformation Languages

    DEFF Research Database (Denmark)

    Strüber, Daniel; Plöger, Jennifer; Acretoaie, Vlad

    2016-01-01

    and analytical quality assurance. From these use cases, we derive a set of key requirements. We describe our customization of existing model clone detection techniques allowing us to address these requirements. Finally, we provide an experimental evaluation, indicating that our customization of ConQAT, one......Cloning is a convenient mechanism to enable reuse across and within software artifacts. On the downside, it is also a practice related to significant long-term maintainability impediments, thus generating a need to identify clones in affected artifacts. A large variety of clone detection techniques...... has been proposed for programming and modeling languages; yet no specific ones have emerged for model transformation languages. In this paper, we explore clone detection for graph-based model transformation languages. We introduce potential use cases for such techniques in the context of constructive...

  13. Design of an On-Line Query Language for Full Text Patent Search.

    Science.gov (United States)

    Glantz, Richard S.

    The design of an English-like query language and an interactive computer environment for searching the full text of the U.S. patent collection are discussed. Special attention is paid to achieving a transparent user interface, to providing extremely broad search capabilities (including nested substitution classes, Kleene star events, and domain…

  14. On tractable query evaluation for SPARQL

    OpenAIRE

    Mengel, Stefan; Skritek, Sebastian

    2017-01-01

    Despite much work within the last decade on foundational properties of SPARQL - the standard query language for RDF data - rather little is known about the exact limits of tractability for this language. In particular, this is the case for SPARQL queries that contain the OPTIONAL-operator, even though it is one of the most intensively studied features of SPARQL. The aim of our work is to provide a more thorough picture of tractable classes of SPARQL queries. In general, SPARQL query evaluatio...

  15. Development of a user friendly interface for database querying in natural language by using concepts and means related to artificial intelligence

    International Nuclear Information System (INIS)

    Pujo, Pascal

    1989-01-01

    This research thesis reports the development of a user-friendly interface in natural language for querying a relational database. The developed system differs from usual approaches in its integrated architecture, as the relational model management is totally controlled by the interface. The author first addresses the way to store data in order to make them accessible through an interface in natural language, and more precisely to store data with an organisation which results in the fewest possible constraints on query formulation. The author then briefly presents techniques related to automatic natural language processing, and discusses the implications for better user-friendliness and for error processing. The next part reports the study of the developed interface: selection of data processing tools, interface development, data management at the interface level, and information input by the user. The last chapter proposes an overview of possible evolutions of the interface: use of deductive functionalities, use of an extensional base and an intensional base to deduce facts from the knowledge stored in the extensional base, and handling of complex objects [fr

  16. QUERY RESPONSE TIME COMPARISON NOSQLDB MONGODB WITH SQLDB ORACLE

    Directory of Open Access Journals (Sweden)

    Humasak T. A. Simanjuntak

    2015-01-01

    Full Text Available Data storage today comes in two forms: relational databases and non-relational databases. These two kinds of DBMS (Database Management System) differ in several aspects, such as query execution performance, scalability, reliability, and data storage structure. This study aims to compare the DBMS performance of Oracle, as a relational database, and MongoDB, as a non-relational database, when processing structured data. Experiments were carried out to compare the performance of the two DBMSs for insert, select, update, and delete operations, using both simple and complex queries on the Northwind database. To achieve the goal of the experiment, 18 queries were executed, consisting of 2 insert queries, 10 select queries, 2 update queries, and 2 delete queries. The queries were executed through a .NET application built as an intermediary between the user and the databases. Experiments were run on tables with and without relations in Oracle, and on embedded and non-embedded documents in MongoDB. The response time of each query execution was compared using statistical methods. The experiments show that query response times for select, insert, and update operations are faster on MongoDB than on Oracle: MongoDB is 64.8% faster for select queries, 72.8% faster for insert queries, and 33.9% faster for update queries. For delete queries, Oracle is 96.8% faster than MongoDB on tables with relations, but MongoDB is 83.8% faster than Oracle on tables without relations. For complex queries, Map Reduce on MongoDB is 97.6% slower than complex queries with aggregate functions on Oracle.
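
    To make the embedded versus non-embedded distinction concrete, here is a small illustration with pymongo; the collection and field names are invented, not those of the Northwind-based experiment, and a local MongoDB server is assumed. The embedded design nests order lines inside each order document so a single find retrieves them, while the referenced design keeps them in a separate collection, akin to a relational join.

      from pymongo import MongoClient

      client = MongoClient("mongodb://localhost:27017")   # assumes a local server
      db = client["shop"]

      # embedded design: order lines nested inside the order document
      db.orders_embedded.insert_one({
          "_id": 1,
          "customer": "ALFKI",
          "lines": [{"product": "Chai", "qty": 5}, {"product": "Tofu", "qty": 2}],
      })
      order = db.orders_embedded.find_one({"_id": 1})      # one round trip, no join

      # non-embedded (referenced) design: lines live in a separate collection, joined by order_id
      db.orders.insert_one({"_id": 1, "customer": "ALFKI"})
      db.order_lines.insert_many([
          {"order_id": 1, "product": "Chai", "qty": 5},
          {"order_id": 1, "product": "Tofu", "qty": 2},
      ])
      lines = list(db.order_lines.find({"order_id": 1}))   # extra query, analogous to an SQL join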

  17. Graph Transformation Semantics for a QVT Language

    NARCIS (Netherlands)

    Rensink, Arend; Nederpel, Ronald; Bruni, Roberto; Varró, Dániel

    It has been claimed by many in the graph transformation community that model transformation, as understood in the context of Model Driven Architecture, can be seen as an application of graph transformation. In this paper we substantiate this claim by giving a graph transformation-based semantics to

  18. Optimizing Temporal Queries: Efficient Handling of Duplicates

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2001-01-01

    , these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the- art relational products. This paper presents an optimization technique that produces more efficient...... translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages....

  19. Temporal Representation in Semantic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Levandoski, J J; Abdulla, G M

    2007-08-07

    A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.

  20. Querying and Mining Strings Made Easy

    KAUST Repository

    Sahli, Majed

    2017-10-13

    With the advent of large string datasets in several scientific and business applications, there is a growing need to perform ad-hoc analysis on strings. Currently, strings are stored, managed, and queried using procedural codes. This limits users to certain operations supported by existing procedural applications and requires manual query planning with limited tuning opportunities. This paper presents StarQL, a generic and declarative query language for strings. StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization. String analytic queries are too intricate to be solved on one machine. Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures. Our evaluation shows that StarQL is able to express workloads of application-specific tools, such as BLAST and KAT in bioinformatics, and to mine Wikipedia text for interesting patterns using declarative queries. Furthermore, the StarQL query optimizer shows an order of magnitude reduction in query execution time.

  1. Building Scalable Knowledge Graphs for Earth Science

    Science.gov (United States)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.

  2. Time-dependent Networks as Models to Achieve Fast Exact Time-table Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Jacob, Rico

    2001-01-01

    We consider efficient algorithms for exact time-table queries, i.e. algorithms that find optimal itineraries. We propose to use time-dependent networks as a model and show advantages of this approach over space-time networks as models.

  3. Pragmatic Graph Rewriting Modifications

    OpenAIRE

    Rodgers, Peter; Vidal, Natalia

    1999-01-01

    We present new pragmatic constructs for easing programming in visual graph rewriting programming languages. The first is a modification to the rewriting process for nodes in the host graph, where nodes specified as 'Once Only' in the LHS of a rewrite match at most once with a corresponding node in the host graph. This reduces the previously common use of tags to indicate the progress of matching in the graph. The second modification controls the application of LHS graphs, where those specified a...

  4. Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data

    Science.gov (United States)

    Zhang, Y.; Scheers, B.; Kersten, M.; Ivanova, M.; Nes, N.

    2012-09-01

    SciQL (pronounced as ‘cycle’) is a novel SQL-based array query language for scientific applications with both tables and arrays as first class citizens. SciQL lowers the entrance fee of adopting relational DBMS (RDBMS) in scientific domains, because it includes functionality often only found in mathematics software packages. In this paper, we demonstrate the usefulness of SciQL for astronomical data processing using examples from the Transient Key Project of the LOFAR radio telescope. In particular, how the LOFAR light-curve database of all detected sources can be constructed, by correlating sources across the spatial, frequency, time and polarisation domains.

  5. Quick Mining of Isomorphic Exact Large Patterns from Large Graphs

    KAUST Repository

    Almasri, Islam

    2014-12-01

    The applications of the sub graph isomorphism search are growing with the growing number of areas that model their systems using graphs or networks. Specifically, many biological systems, such as protein interaction networks, molecular structures and protein contact maps, are modeled as graphs. The sub graph isomorphism search is concerned with finding all sub graphs that are isomorphic to a relevant query graph, the existence of such sub graphs can reflect on the characteristics of the modeled system. The most computationally expensive step in the search for isomorphic sub graphs is the backtracking algorithm that traverses the nodes of the target graph. In this paper, we propose a pruning approach that is inspired by the minimum remaining value heuristic that achieves greater scalability over large query and target graphs. Our testing on various biological networks shows that performance enhancement of our approach over existing state-of-the-art approaches varies between 6x and 53x. © 2014 IEEE.
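
    A compact sketch of backtracking subgraph matching with a minimum-remaining-value-style vertex ordering, the general idea the pruning approach builds on (this is not the authors' optimized algorithm):

      def subgraph_matches(query_adj, target_adj):
          # Non-induced subgraph isomorphism: adjacency given as dict vertex -> set of neighbours.
          results = []

          def candidates(qv, mapping):
              # Target vertices not yet used and adjacent to the images of qv's mapped neighbours.
              cands = set(target_adj) - set(mapping.values())
              for qn in query_adj[qv]:
                  if qn in mapping:
                      cands &= target_adj[mapping[qn]]
              return cands

          def backtrack(mapping):
              if len(mapping) == len(query_adj):
                  results.append(dict(mapping))
                  return
              unmapped = [v for v in query_adj if v not in mapping]
              # MRV-style heuristic: extend the query vertex with the fewest remaining candidates
              qv = min(unmapped, key=lambda v: len(candidates(v, mapping)))
              for tv in candidates(qv, mapping):
                  mapping[qv] = tv
                  backtrack(mapping)
                  del mapping[qv]

          backtrack({})
          return results

      # triangle query against a 4-clique: 4 * 3 * 2 = 24 embeddings
      target = {i: {j for j in range(4) if j != i} for i in range(4)}
      query = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
      print(len(subgraph_matches(query, target)))   # 24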

  6. Quick Mining of Isomorphic Exact Large Patterns from Large Graphs

    KAUST Repository

    Almasri, Islam; Gao, Xin; Fedoroff, Nina V.

    2014-01-01

    The applications of the sub graph isomorphism search are growing with the growing number of areas that model their systems using graphs or networks. Specifically, many biological systems, such as protein interaction networks, molecular structures and protein contact maps, are modeled as graphs. The sub graph isomorphism search is concerned with finding all sub graphs that are isomorphic to a relevant query graph, the existence of such sub graphs can reflect on the characteristics of the modeled system. The most computationally expensive step in the search for isomorphic sub graphs is the backtracking algorithm that traverses the nodes of the target graph. In this paper, we propose a pruning approach that is inspired by the minimum remaining value heuristic that achieves greater scalability over large query and target graphs. Our testing on various biological networks shows that performance enhancement of our approach over existing state-of-the-art approaches varies between 6x and 53x. © 2014 IEEE.

  7. Analyzing locomotion synthesis with feature-based motion graphs.

    Science.gov (United States)

    Mahmudi, Mentar; Kallmann, Marcelo

    2013-05-01

    We propose feature-based motion graphs for realistic locomotion synthesis among obstacles. Among several advantages, feature-based motion graphs achieve improved results in search queries, eliminate the need of postprocessing for foot skating removal, and reduce the computational requirements in comparison to traditional motion graphs. Our contributions are threefold. First, we show that choosing transitions based on relevant features significantly reduces graph construction time and leads to improved search performances. Second, we employ a fast channel search method that confines the motion graph search to a free channel with guaranteed clearance among obstacles, achieving faster and improved results that avoid expensive collision checking. Lastly, we present a motion deformation model based on Inverse Kinematics applied over the transitions of a solution branch. Each transition is assigned a continuous deformation range that does not exceed the original transition cost threshold specified by the user for the graph construction. The obtained deformation improves the reachability of the feature-based motion graph and in turn also reduces the time spent during search. The results obtained by the proposed methods are evaluated and quantified, and they demonstrate significant improvements in comparison to traditional motion graph techniques.

  8. Percolator: Scalable Pattern Discovery in Dynamic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Choudhury, Sutanay; Purohit, Sumit; Lin, Peng; Wu, Yinghui; Holder, Lawrence B.; Agarwal, Khushbu

    2018-02-06

    We demonstrate Percolator, a distributed system for graph pattern discovery in dynamic graphs. In contrast to conventional mining systems, Percolator advocates efficient pattern mining schemes that (1) support pattern detection with keywords; (2) integrate incremental and parallel pattern mining; and (3) support analytical queries such as trend analysis. The core idea of Percolator is to dynamically decide and verify a small fraction of patterns and their instances that must be inspected in response to buffered updates in dynamic graphs, with a total mining cost independent of graph size. We demonstrate a) the feasibility of incremental pattern mining by walking through each component of Percolator, b) the efficiency and scalability of Percolator over the sheer size of real-world dynamic graphs, and c) how the user-friendly GUI of Percolator interacts with users to support keyword-based queries that detect, browse and inspect trending patterns. We also demonstrate two use cases of Percolator, in social media trend analysis and academic collaboration analysis, respectively.

  9. A semantic-based approach for querying linked data using natural language

    KAUST Repository

    Paredes-Valverde, Mario Andrés; Valencia-García, Rafael; Rodriguez-Garcia, Miguel Angel; Colomo-Palacios, Ricardo; Alor-Hernández, Giner

    2016-01-01

    The semantic Web aims to provide to Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be only used by expert users. In this work, we propose a natural language interface that allows non-expert users the access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  10. A semantic-based approach for querying linked data using natural language

    KAUST Repository

    Paredes-Valverde, Mario Andrés

    2016-01-11

    The semantic Web aims to provide to Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be only used by expert users. In this work, we propose a natural language interface that allows non-expert users the access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  11. Wiener index and Diameter of a Planar Graph in Subquadratic Time

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    2009-01-01

    Consider the problem of computing the sum of distances between each pair of vertices of an unweighted graph. This sum is also known as the Wiener index of the graph, a generalization of a definition given by H. Wiener in 1947. A molecular topological index is a value obtained from the graph...... structure of a molecule such that this value (hopefully) correlates with physical and/or chemical properties of the molecule. The Wiener index is perhaps the most studied molecular topological index with more than a thousand publications. It is open whether the Wiener index of a planar graph can be obtained...... in subquadratic time. In my talk, I will solve this open problem by exhibiting an O(n^2 log log n / log n) time algorithm, where n is the size of the graph. A simple modification yields an algorithm with the same time bound that computes the diameter (maximum distance between any vertex pair) of a planar graph. I...
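
    For orientation, the quantity itself is easy to compute in O(n·m) time by running a BFS from every vertex; the talk's contribution is beating the quadratic barrier for planar graphs, which the naive baseline below does not attempt:

      from collections import deque

      def wiener_index(adj):
          # Sum of pairwise distances via BFS from every vertex (unweighted, connected graph).
          total = 0
          for src in adj:
              dist = {src: 0}
              queue = deque([src])
              while queue:
                  u = queue.popleft()
                  for v in adj[u]:
                      if v not in dist:
                          dist[v] = dist[u] + 1
                          queue.append(v)
              total += sum(dist.values())
          return total // 2   # each unordered pair is counted twice

      # path graph a-b-c: distances 1 + 1 + 2 = 4
      print(wiener_index({"a": ["b"], "b": ["a", "c"], "c": ["b"]}))   # 4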

  12. CytoscapeRPC: a plugin to create, modify and query Cytoscape networks from scripting languages.

    Science.gov (United States)

    Bot, Jan J; Reinders, Marcel J T

    2011-09-01

    CytoscapeRPC is a plugin for Cytoscape which allows users to create, query and modify Cytoscape networks from any programming language which supports XML-RPC. This enables them to access Cytoscape functionality and visualize their data interactively without leaving the programming environment with which they are familiar. Install through the Cytoscape plugin manager or visit the web page: http://wiki.nbic.nl/index.php/CytoscapeRPC for the user tutorial and download. j.j.bot@tudelft.nl; j.j.bot@tudelft.nl.

  13. Time-Dependent Networks as Models to Achieve Fast Exact Time-Table Queries

    DEFF Research Database (Denmark)

    Brodal, Gert Stølting; Jacob, Rico

    2003-01-01

    We consider efficient algorithms for exact time-table queries, i.e. algorithms that find optimal itineraries for travelers using a train system. We propose to use time-dependent networks as a model and show advantages of this approach over space-time networks as models.
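
    A minimal sketch of what a time-dependent network buys for timetable queries, assuming a toy timetable and an earliest-arrival objective (a simplification of the model the abstract advocates, not the authors' algorithm): the cost of traversing an edge is not fixed but depends on the time at which its tail is reached.

      import heapq

      def earliest_arrival(timetable, source, target, start_time):
          # timetable: dict u -> dict v -> list of (departure, arrival) connections on edge (u, v)
          best = {source: start_time}
          heap = [(start_time, source)]
          while heap:
              t, u = heapq.heappop(heap)
              if u == target:
                  return t
              if t > best.get(u, float("inf")):
                  continue                      # stale heap entry
              for v, connections in timetable.get(u, {}).items():
                  # earliest arrival among connections we can still catch at time t
                  arrivals = [arr for dep, arr in connections if dep >= t]
                  if arrivals and min(arrivals) < best.get(v, float("inf")):
                      best[v] = min(arrivals)
                      heapq.heappush(heap, (best[v], v))
          return None

      timetable = {"A": {"B": [(5, 10), (20, 25)]}, "B": {"C": [(12, 30), (26, 40)]}}
      print(earliest_arrival(timetable, "A", "C", 0))   # 30: take 5->10 to B, then 12->30 to C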

  14. EpiGeNet: A Graph Database of Interdependencies Between Genetic and Epigenetic Events in Colorectal Cancer.

    Science.gov (United States)

    Balaur, Irina; Saqi, Mansoor; Barat, Ana; Lysenko, Artem; Mazein, Alexander; Rawlings, Christopher J; Ruskin, Heather J; Auffray, Charles

    2017-10-01

    The development of colorectal cancer (CRC)-the third most common cancer type-has been associated with deregulations of cellular mechanisms stimulated by both genetic and epigenetic events. StatEpigen is a manually curated and annotated database, containing information on interdependencies between genetic and epigenetic signals, and specialized currently for CRC research. Although StatEpigen provides a well-developed graphical user interface for information retrieval, advanced queries involving associations between multiple concepts can benefit from more detailed graph representation of the integrated data. This can be achieved by using a graph database (NoSQL) approach. Data were extracted from StatEpigen and imported to our newly developed EpiGeNet, a graph database for storage and querying of conditional relationships between molecular (genetic and epigenetic) events observed at different stages of colorectal oncogenesis. We illustrate the enhanced capability of EpiGeNet for exploration of different queries related to colorectal tumor progression; specifically, we demonstrate the query process for (i) stage-specific molecular events, (ii) most frequently observed genetic and epigenetic interdependencies in colon adenoma, and (iii) paths connecting key genes reported in CRC and associated events. The EpiGeNet framework offers improved capability for management and visualization of data on molecular events specific to CRC initiation and progression.

  15. GELLO: an object-oriented query and expression language for clinical decision support.

    Science.gov (United States)

    Sordo, Margarita; Ogunyemi, Omolola; Boxwala, Aziz A; Greenes, Robert A

    2003-01-01

    GELLO is a purpose-specific, object-oriented (OO) query and expression language. GELLO is the result of a concerted effort of the Decision Systems Group (DSG) working with the HL7 Clinical Decision Support Technical Committee (CDSTC) to provide the HL7 community with a common format for data encoding and manipulation. GELLO will soon be submitted for ballot to the HL7 CDSTC for consideration as a standard.

  16. Signal Processing for Time-Series Functions on a Graph

    Science.gov (United States)

    2018-02-01

    US Army Research Laboratory technical report ARL-TR-8276 (February 2018) by Humberto Muñoz-Barona, Jean Vettel, and colleagues. Only fragments of the report survive extraction: the figure caption "Fig. 1: Time-series function on a fixed graph" and the remark "Instead, we simply recover the average of f over time."

  17. Visibility Graph Based Time Series Analysis.

    Science.gov (United States)

    Stephen, Mutua; Gu, Changgui; Yang, Huijie

    2015-01-01

    Network based time series analysis has made considerable achievements in the recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide us rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
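
    A minimal sketch of the standard natural visibility criterion that underlies this kind of mapping (illustrative only, not the authors' implementation): two samples are connected if every sample between them lies strictly below the straight line joining them.

        # Natural visibility graph construction: each time point becomes a node, and
        # two points are connected if all samples between them lie strictly below the
        # line joining them. Simple O(n^2) pairwise check for illustration.
        def visibility_graph(series):
            """Return the edge set of the natural visibility graph of a 1-D series."""
            edges = set()
            n = len(series)
            for a in range(n):
                for b in range(a + 1, n):
                    visible = True
                    for c in range(a + 1, b):
                        # Height of the line through points a and b at position c.
                        line = series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                        if series[c] >= line:
                            visible = False
                            break
                    if visible:
                        edges.add((a, b))
            return edges

        if __name__ == "__main__":
            print(visibility_graph([3.0, 1.0, 2.0, 4.0]))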

  18. Visibility Graph Based Time Series Analysis.

    Directory of Open Access Journals (Sweden)

    Mutua Stephen

    Full Text Available Network based time series analysis has made considerable achievements in the recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide us rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.

  19. Menopause and big data: Word Adjacency Graph modeling of menopause-related ChaCha data.

    Science.gov (United States)

    Carpenter, Janet S; Groves, Doyle; Chen, Chen X; Otte, Julie L; Miller, Wendy R

    2017-07-01

    To detect and visualize salient queries about menopause using Big Data from ChaCha. We used Word Adjacency Graph (WAG) modeling to detect clusters and visualize the range of menopause-related topics and their mutual proximity. The subset of relevant queries was fully modeled. We split each query into token words (ie, meaningful words and phrases) and removed stopwords (ie, not meaningful functional words). The remaining words were considered in sequence to build summary tables of words and two and three-word phrases. Phrases occurring at least 10 times were used to build a network graph model that was iteratively refined by observing and removing clusters of unrelated content. We identified two menopause-related subsets of queries by searching for questions containing menopause and menopause-related terms (eg, climacteric, hot flashes, night sweats, hormone replacement). The first contained 263,363 queries from individuals aged 13 and older and the second contained 5,892 queries from women aged 40 to 62 years. In the first set, we identified 12 topic clusters: 6 relevant to menopause and 6 less relevant. In the second set, we identified 15 topic clusters: 11 relevant to menopause and 4 less relevant. Queries about hormones were pervasive within both WAG models. Many of the queries reflected low literacy levels and/or feelings of embarrassment. We modeled menopause-related queries posed by ChaCha users between 2009 and 2012. ChaCha data may be used on its own or in combination with other Big Data sources to identify patient-driven educational needs and create patient-centered interventions.
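
    The pipeline described above (tokenize each query, drop stopwords, count adjacent word pairs, and keep pairs above a frequency threshold as graph edges) can be illustrated with a small Python sketch; the example queries, stopword list, and threshold are hypothetical, and the published pipeline may differ in detail.

        # A minimal sketch of a word adjacency graph (WAG): tokenize each query, drop
        # stopwords, count adjacent word pairs, and keep pairs occurring at least
        # `min_count` times as graph edges. Queries and stopword list are hypothetical.
        import re
        from collections import Counter

        STOPWORDS = {"is", "the", "a", "an", "of", "do", "i", "what", "are", "my"}

        def word_adjacency_graph(queries, min_count=2):
            pair_counts = Counter()
            for query in queries:
                tokens = [t for t in re.findall(r"[a-z']+", query.lower())
                          if t not in STOPWORDS]
                pair_counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs
            # Edges of the graph: frequent adjacent pairs with their counts.
            return {pair: n for pair, n in pair_counts.items() if n >= min_count}

        if __name__ == "__main__":
            queries = ["what are hot flashes",
                       "are hot flashes a sign of menopause",
                       "do hot flashes stop after menopause"]
            print(word_adjacency_graph(queries))  # {('hot', 'flashes'): 3}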

  20. ESIP's Earth Science Knowledge Graph (ESKG) Testbed Project: An Automatic Approach to Building Interdisciplinary Earth Science Knowledge Graphs to Improve Data Discovery

    Science.gov (United States)

    McGibbney, L. J.; Jiang, Y.; Burgess, A. B.

    2017-12-01

    Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information from across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent existing knowledge present within and across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources, e.g. Web pages, Web Services, etc., via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project enables the advancement of ESIP collaboration areas including both Discovery and Semantic Technologies by putting graph information right at our fingertips in an interactive, modern manner and reducing the effort of constructing ontologies. To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS

  1. A graph-based approach to detect spatiotemporal dynamics in satellite image time series

    Science.gov (United States)

    Guttler, Fabio; Ienco, Dino; Nin, Jordi; Teisseire, Maguelonne; Poncelet, Pascal

    2017-08-01

    Enhancing the frequency of satellite acquisitions represents a key issue for the Earth Observation community nowadays. Repeated observations are crucial for monitoring purposes, particularly when intra-annual processes should be taken into account. Time series of images constitute a valuable source of information in these cases. The goal of this paper is to propose a new methodological framework to automatically detect and extract spatiotemporal information from satellite image time series (SITS). Existing methods dealing with such kind of data are usually classification-oriented and cannot provide information about evolutions and temporal behaviors. In this paper we propose a graph-based strategy that combines object-based image analysis (OBIA) with data mining techniques. Image objects computed at each individual timestamp are connected across the time series, generating a set of evolution graphs. Each evolution graph is associated with a particular area within the study site and stores information about its temporal evolution. Such information can be deeply explored at the evolution graph scale or used to compare the graphs and supply a general picture at the study site scale. We validated our framework on two study sites located in the South of France and involving different types of natural, semi-natural and agricultural areas. The results obtained from a Landsat SITS support the quality of the methodological approach and illustrate how the framework can be employed to extract and characterize spatiotemporal dynamics.

  2. Querying Business Process Models with VMQL

    DEFF Research Database (Denmark)

    Störrle, Harald; Acretoaie, Vlad

    2013-01-01

    The Visual Model Query Language (VMQL) has been invented with the objectives (1) to make it easier for modelers to query models effectively, and (2) to be universally applicable to all modeling languages. In previous work, we have applied VMQL to UML, and validated the first of these two claims. ...

  3. A Query System Implementation Case Study.

    Science.gov (United States)

    Hiser, Judith N.; Neil, M. Elizabeth

    1985-01-01

    The Department of Administrative Programming Services of Clemson University investigated products available in user-friendly retrieval systems. The test of INTELLECT, a natural language query system written by Artificial Intelligence Corporation, is described. (Author/MLW)

  4. Use of Graph Database for the Integration of Heterogeneous Biological Data.

    Science.gov (United States)

    Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young

    2017-03-01

    Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
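
    A hypothetical sketch of querying such a graph database from Python with the official neo4j driver is shown below; the connection details, node labels, and relationship type are illustrative assumptions rather than the schema used in the study.

        # Hypothetical sketch of a relationship query against a Neo4j instance using
        # the official `neo4j` Python driver. The credentials, node labels (Gene,
        # Disease) and the relationship type ASSOCIATED_WITH are assumptions for
        # illustration, not the schema from the paper.
        from neo4j import GraphDatabase

        driver = GraphDatabase.driver("bolt://localhost:7687",
                                      auth=("neo4j", "password"))

        query = """
        MATCH (g:Gene {symbol: $symbol})-[:ASSOCIATED_WITH]->(d:Disease)
        RETURN d.name AS disease
        """

        with driver.session() as session:
            for record in session.run(query, symbol="TP53"):
                print(record["disease"])

        driver.close()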

  5. Approximating terminological queries

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Van Harmelen, Frank

    2002-01-01

    Current proposals for languages to encode terminological knowledge in intelligent systems support logical reasoning for answering user queries about objects and classes. An application of these languages on the World Wide Web, however, is hampered by the limitations of logical reasoning in terms

  6. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung

    2013-08-26

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model, query language, and query execution engine presented earlier. Deco's objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco's query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco's query optimizer include a cost model distinguishing between "free" existing data versus paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco's query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

  7. Visually defining and querying consistent multi-granular clinical temporal abstractions.

    Science.gov (United States)

    Combi, Carlo; Oliboni, Barbara

    2012-02-01

    The main goal of this work is to propose a framework for the visual specification and query of consistent multi-granular clinical temporal abstractions. We focus on the issue of querying patient clinical information by visually defining and composing temporal abstractions, i.e., high level patterns derived from several time-stamped raw data. In particular, we focus on the visual specification of consistent temporal abstractions with different granularities and on the visual composition of different temporal abstractions for querying clinical databases. Temporal abstractions on clinical data provide a concise and high-level description of temporal raw data, and a suitable way to support decision making. Granularities define partitions on the time line and allow one to represent time and, thus, temporal clinical information at different levels of detail, according to the requirements coming from the represented clinical domain. The visual representation of temporal information has been considered since several years in clinical domains. Proposed visualization techniques must be easy and quick to understand, and could benefit from visual metaphors that do not lead to ambiguous interpretations. Recently, physical metaphors such as strips, springs, weights, and wires have been proposed and evaluated on clinical users for the specification of temporal clinical abstractions. Visual approaches to boolean queries have been considered in the last years and confirmed that the visual support to the specification of complex boolean queries is both an important and difficult research topic. We propose and describe a visual language for the definition of temporal abstractions based on a set of intuitive metaphors (striped wall, plastered wall, brick wall), allowing the clinician to use different granularities. A new algorithm, underlying the visual language, allows the physician to specify only consistent abstractions, i.e., abstractions not containing contradictory conditions on

  8. A Query Cache Tool for Optimizing Repeatable and Parallel OLAP Queries

    Science.gov (United States)

    Santos, Ricardo Jorge; Bernardino, Jorge

    On-line analytical processing against data warehouse databases is a common form of getting decision making information for almost every business field. Decision support information often concerns periodic values based on regular attributes, such as sales amounts, percentages, most transactioned items, etc. This means that many similar OLAP instructions are periodically repeated, and executed simultaneously by several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries simultaneously executed by different users, putting the most recent ones on hold until the first finishes and returning the results for all of them. The stored query results are held until they are considered outdated, then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.
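
    The caching idea sketched above (store each query's result together with the timestamp of the newest data it covered, then fetch only the delta on re-execution and merge) can be illustrated with a toy Python sketch; it is not the Query Cache Tool itself, and the data-access function is hypothetical.

        # Toy illustration of the caching idea: keep each query's cached aggregate and
        # the timestamp of the newest row it covered; later runs fetch only rows added
        # since then and merge them. `fetch_sales_since` is a hypothetical stand-in
        # for a warehouse query over rows newer than a given timestamp.
        _cache = {}  # query key -> (cached_total, last_seen_timestamp)

        def fetch_sales_since(ts):
            """Hypothetical data-access function returning (timestamp, amount) rows."""
            return []

        def total_sales(query_key="total_sales"):
            cached_total, last_ts = _cache.get(query_key, (0.0, 0.0))
            new_rows = fetch_sales_since(last_ts)        # only the delta since last run
            cached_total += sum(amount for _, amount in new_rows)
            if new_rows:
                last_ts = max(ts for ts, _ in new_rows)
            _cache[query_key] = (cached_total, last_ts)  # refresh the cache entry
            return cached_total

        if __name__ == "__main__":
            print(total_sales())  # first call covers all data; later calls only deltas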

  9. Supporting Fourth Graders' Ability to Interpret Graphs through Real-Time Graphing Technology: A Preliminary Study

    Science.gov (United States)

    Deniz, Hasan; Dulger, Mehmet F.

    2012-01-01

    This study examined to what extent inquiry-based instruction supported with real-time graphing technology improves fourth graders' ability to interpret graphs as representations of physical science concepts such as motion and temperature. This study also examined whether there is any difference between inquiry-based instruction supported with…

  10. Querying Natural Logic Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker

    2017-01-01

    This paper describes the principles of a system applying natural logic as a knowledge base language. Natural logics are regimented fragments of natural language employing high level inference rules. We advocate the use of natural logic for knowledge bases dealing with querying of classes in ontologies and class-relationships such as are common in life-science descriptions. The paper adopts a version of natural logic with recursive restrictive clauses such as relative clauses and adnominal prepositional phrases. It includes passive as well as active voice sentences. We outline a prototype for partial translation of natural language into natural logic, featuring further querying and conceptual path finding in natural logic knowledge bases.

  11. A Type Graph Model for Java Programs

    NARCIS (Netherlands)

    Rensink, Arend; Zambon, Eduardo

    2009-01-01

    In this report we present a type graph that models all executable constructs of the Java programming language. Such a model is useful for any graph-based technique that relies on a representation of Java programs as graphs. The model can be regarded as a common representation to which all Java

  12. Graph Processing in Main-Memory Column Stores

    OpenAIRE

    Paradies, Marcus

    2017-01-01

    More and more, novel and traditional business applications leverage the advantages of a graph data model, such as the offered schema flexibility and an explicit representation of relationships between entities. As a consequence, companies are confronted with the challenge of storing, manipulating, and querying terabytes of graph data for enterprise-critical applications. Although these business applications operate on graph-structured data, they still require direct access to the relational data...

  13. Knowledge Graphs as Context Models: Improving the Detection of Cross-Language Plagiarism with Paraphrasing

    OpenAIRE

    Franco-Salvador, Marc; Gupta, Parth; Rosso, Paolo

    2013-01-01

    Cross-language plagiarism detection attempts to automatically identify and extract plagiarism among documents in different languages. Plagiarized fragments can be verbatim translated copies, or they may be altered in structure to hide the copying, which is known as paraphrasing and is more difficult to detect. In order to improve paraphrasing detection, we use a knowledge graph-based approach to obtain and compare context models of document fragments in different languages. Experimental results...

  14. A Type Graph Model for Java Programs

    NARCIS (Netherlands)

    Rensink, Arend; Zambon, Eduardo; Lee, D.; Lopes, A.; Poetzsch-Heffter, A.

    2009-01-01

    In this work we present a type graph that models all executable constructs of the Java programming language. Such a model is useful for any graph-based technique that relies on a representation of Java programs as graphs. The model can be regarded as a common representation to which all Java syntax

  15. Engineering Object-Oriented Semantics Using Graph Transformations

    NARCIS (Netherlands)

    Kastenberg, H.; Kleppe, A.G.; Rensink, Arend

    In this paper we describe the application of the theory of graph transformations to the practise of language design. We have defined the semantics of a small but realistic object-oriented language (called TAAL) by mapping the language constructs to graphs and their operational semantics to graph

  16. X-Graphs: Language and Algorithms for Heterogeneous Graph Streams

    Science.gov (United States)

    2017-09-01

    Only fragments of this report survive extraction (DTIC report form fields and scattered body text). Subject terms: Data Analytics, Graph Analytics, High-Performance Computing. The surviving text states that the goal of the X-Graphs project was to develop computational techniques (for heterogeneous graph streams, per the title), mentions the DeepDive Knowledge Construction System, and notes that Ringo is based on Snap.py and SNAP, uses Python on a shared-memory multicore machine, and now allows the integration of the Delite DSL Framework Graph [truncated].

  17. Query2Question: Translating Visualization Interaction into Natural Language.

    Science.gov (United States)

    Nafari, Maryam; Weaver, Chris

    2015-06-01

    Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Existing systems and techniques for recording provenance of interaction focus either on comprehensive automated recording of low-level interaction events or on idiosyncratic manual transcription of high-level analysis activities. In this paper, we present the architecture and translation design of a query-to-question (Q2Q) system that automatically records user interactions and presents them semantically using natural language (written English). Q2Q takes advantage of domain knowledge and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a visual log of styled text that complements and effectively extends the functionality of visualization tools. We present Q2Q as a means to support a cross-examination process in which questions rather than interactions are the focus of analytic reasoning and action. We describe the architecture and implementation of the Q2Q system, discuss key design factors and variations that effect question generation, and present several visualizations that incorporate Q2Q for analysis in a variety of knowledge domains.

  18. Neuro-symbolic representation learning on biological knowledge graphs

    KAUST Repository

    AlShahrani, Mona; Khan, Mohammed Asif; Maddouri, Omar; Kinjo, Akira R; Queralt-Rosinach, Nú ria; Hoehndorf, Robert

    2017-01-01

    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph

  19. Information Retrieval and Graph Analysis Approaches for Book Recommendation

    Directory of Open Access Journals (Sweden)

    Chahinez Benkoussas

    2015-01-01

    Full Text Available A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user's query. We used different theoretical retrieval models: probabilistic as InL2 (Divergence from Randomness model) and language model and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to related document network comprised of social links. We called Directed Graph of Documents (DGD) a network constructed with documents and social information provided from each one of them. Specifically, this work tackles the problem of book recommendation in the context of INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrate that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.

  20. Information Retrieval and Graph Analysis Approaches for Book Recommendation.

    Science.gov (United States)

    Benkoussas, Chahinez; Bellot, Patrice

    2015-01-01

    A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user's query. We used different theoretical retrieval models: probabilistic as InL2 (Divergence from Randomness model) and language model and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to related document network comprised of social links. We called Directed Graph of Documents (DGD) a network constructed with documents and social information provided from each one of them. Specifically, this work tackles the problem of book recommendation in the context of INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrate that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.
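
    A minimal sketch of the link-analysis step described above, using networkx to run PageRank over a small, hypothetical Directed Graph of Documents and to interpolate the scores with retrieval scores (the interpolation weights are assumptions, not the paper's settings):

        # Run PageRank over a hypothetical Directed Graph of Documents (DGD) with
        # networkx, then rerank an initial retrieval run by a simple linear
        # interpolation of retrieval score and PageRank score.
        import networkx as nx

        dgd = nx.DiGraph()
        dgd.add_edges_from([("bookA", "bookB"), ("bookC", "bookB"), ("bookB", "bookD")])

        pagerank = nx.pagerank(dgd, alpha=0.85)

        retrieval_scores = {"bookA": 0.91, "bookB": 0.75, "bookD": 0.40}
        reranked = sorted(
            retrieval_scores,
            key=lambda d: 0.7 * retrieval_scores[d] + 0.3 * pagerank.get(d, 0.0),
            reverse=True,
        )
        print(reranked)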

  1. Topological structure of dictionary graphs

    International Nuclear Information System (INIS)

    Fuks, Henryk; Krzeminski, Mark

    2009-01-01

    We investigate the topological structure of the subgraphs of dictionary graphs constructed from WordNet and Moby thesaurus data. In the process of learning a foreign language, the learner knows only a subset of all words of the language, corresponding to a subgraph of a dictionary graph. When this subgraph grows with time, its topological properties change. We introduce the notion of the pseudocore and argue that the growth of the vocabulary roughly follows decreasing pseudocore numbers; that is, one first learns words with a high pseudocore number followed by smaller pseudocores. We also propose an alternative strategy for vocabulary growth, involving decreasing core numbers as opposed to pseudocore numbers. We find that as the core or pseudocore grows in size, the clustering coefficient first decreases, then reaches a minimum and starts increasing again. The minimum occurs when the vocabulary reaches a size between 10^3 and 10^4. A simple model exhibiting similar behavior is proposed. The model is based on a generalized geometric random graph. Possible implications for language learning are discussed.
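
    A rough analogue of the measurements described above can be reproduced with networkx on a stand-in random graph: compute core numbers (the pseudocore is the authors' own notion and is not implemented here), grow the vocabulary from high-core to low-core words, and track the clustering coefficient of the induced subgraph.

        # Rough analogue of the vocabulary-growth measurement using networkx on a
        # stand-in random graph: order words by decreasing core number (the paper's
        # pseudocore is not implemented here) and track the clustering coefficient
        # of the subgraph induced by a growing vocabulary.
        import networkx as nx

        dictionary_graph = nx.erdos_renyi_graph(n=500, p=0.02, seed=1)  # stand-in data
        core = nx.core_number(dictionary_graph)

        words_by_core = sorted(dictionary_graph.nodes, key=lambda w: -core[w])
        for size in (50, 150, 300, 500):
            vocab = words_by_core[:size]
            subgraph = dictionary_graph.subgraph(vocab)
            print(size, round(nx.average_clustering(subgraph), 3))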

  2. Multi-Dimensional Path Queries

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

    We present the path-relationship model that supports multi-dimensional data modeling and querying. A path-relationship database is composed of sets of paths and sets of relationships. A path is a sequence of related elements (atoms, paths, and sets of paths). A relationship is a binary path ... to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments.

  3. Using genre pedagogy to promote student proficiency in the language required for interpreting line graphs

    Science.gov (United States)

    Smit, Jantien; Bakker, Arthur; van Eerde, Dolly; Kuijpers, Maggie

    2016-09-01

    The importance of language in mathematics learning has been widely acknowledged. However, little is known about how to make this insight productive in the design and enactment of language-oriented mathematics education. In a design-based research project, we explored how language-oriented mathematics education can be designed and enacted. We drew on genre pedagogy to promote student proficiency in the language required for interpreting line graphs. In the intervention, the teacher used scaffolding strategies to focus students' attention on the structure and linguistic features of the language involved in this particular domain. The research question addressed in this paper is how student proficiency in this language may be promoted. The study comprised nine lessons involving 22 students in grades 5 and 6 (aged 10-12); of these students, 19 had a migrant background. In light of the research aim, we first describe the rationale behind our design. Next, we illustrate how the design was enacted by means of a case study focusing on one student in the classroom practice of developing proficiency in the language required for interpreting line graphs. On the basis of pre- and posttest scores, we conclude that overall their proficiency has increased. Together, the results indicate that and how genre pedagogy may be used to help students become more proficient in the language required in a mathematical domain.

  4. A web-based data-querying tool based on ontology-driven methodology and flowchart-based model.

    Science.gov (United States)

    Ping, Xiao-Ou; Chung, Yufang; Tseng, Yi-Ju; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-10-08

    Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (ie, "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting

  5. A user-friendly, menu-driven, language-free laser characteristics curves graphing program for desk-top IBM PC compatible computers

    Science.gov (United States)

    Klutz, Glenn

    1989-01-01

    A facility was established that uses collected data and feeds it into mathematical models that generate improved data arrays by correcting for various losses, baseline drift, and conversion to unity scaling. These developed data arrays have headers and other identifying information affixed and are subsequently stored in a Laser Materials and Characteristics database which is accessible to various users. The two-part database (absorption/emission spectra and tabulated data) is developed around twelve laser models. The tabulated section of the database is divided into several parts: crystalline, optical, mechanical, and thermal properties; absorption and emission spectra information; chemical names and formulas; and miscellaneous. A menu-driven, language-free graphing program will reduce and/or remove the requirement that users become competent FORTRAN programmers and the concomitant requirement that they also spend several days to a few weeks becoming conversant with the GEOGRAF library and sequence of calls and the continual refreshers of both. The work included becoming thoroughly conversant with, or at least very familiar with, GEOGRAF by GEOCOMP Corp. The development of the graphing program involved trial runs of the various callable library routines on dummy data in order to become familiar with actual implementation and sequencing. This was followed by trial runs with actual database files and some additional data from current research that was not in the database but currently needed graphs. After successful runs with dummy and real data using actual FORTRAN instructions, steps were undertaken to develop the menu-driven, language-free implementation of a program which would require the user only to know how to use microcomputers. The user would simply be responding to items displayed on the video screen. To assist the user in arriving at the optimum values needed for a specific graph, a paper-and-pencil checklist was made available to use on the trial runs.

  6. Query Health: standards-based, cross-platform population health surveillance.

    Science.gov (United States)

    Klann, Jeffrey G; Buck, Michael D; Brown, Jeffrey; Hadley, Marc; Elmore, Richard; Weber, Griffin M; Murphy, Shawn N

    2014-01-01

    Understanding population-level health trends is essential to effectively monitor and improve public health. The Office of the National Coordinator for Health Information Technology (ONC) Query Health initiative is a collaboration to develop a national architecture for distributed, population-level health queries across diverse clinical systems with disparate data models. Here we review Query Health activities, including a standards-based methodology, an open-source reference implementation, and three pilot projects. Query Health defined a standards-based approach for distributed population health queries, using an ontology based on the Quality Data Model and Consolidated Clinical Document Architecture, Health Quality Measures Format (HQMF) as the query language, the Query Envelope as the secure transport layer, and the Quality Reporting Document Architecture as the result language. We implemented this approach using Informatics for Integrating Biology and the Bedside (i2b2) and hQuery for data analytics and PopMedNet for access control, secure query distribution, and response. We deployed the reference implementation at three pilot sites: two public health departments (New York City and Massachusetts) and one pilot designed to support Food and Drug Administration post-market safety surveillance activities. The pilots were successful, although improved cross-platform data normalization is needed. This initiative resulted in a standards-based methodology for population health queries, a reference implementation, and revision of the HQMF standard. It also informed future directions regarding interoperability and data access for ONC's Data Access Framework initiative. Query Health was a test of the learning health system that supplied a functional methodology and reference implementation for distributed population health queries that has been validated at three sites. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under

  7. SkyQuery - A Prototype Distributed Query and Cross-Matching Web Service for the Virtual Observatory

    Science.gov (United States)

    Thakar, A. R.; Budavari, T.; Malik, T.; Szalay, A. S.; Fekete, G.; Nieto-Santisteban, M.; Haridas, V.; Gray, J.

    2002-12-01

    We have developed a prototype distributed query and cross-matching service for the VO community, called SkyQuery, which is implemented with hierarchical Web Services. SkyQuery enables astronomers to run combined queries on existing distributed heterogeneous astronomy archives. SkyQuery provides a simple, user-friendly interface to run distributed queries over the federation of registered astronomical archives in the VO. The SkyQuery client connects to the portal Web Service, which farms the query out to the individual archives, which are also Web Services called SkyNodes. The cross-matching algorithm is run recursively on each SkyNode. Each archive is a relational DBMS with an HTM index for fast spatial lookups. The results of the distributed query are returned as an XML DataSet that is automatically rendered by the client. SkyQuery also returns the image cutout corresponding to the query result. SkyQuery finds not only matches between the various catalogs, but also dropouts - objects that exist in some of the catalogs but not in others. This is often as important as finding matches. We demonstrate the utility of SkyQuery with a brown-dwarf search between SDSS and 2MASS, and a search for radio-quiet quasars in SDSS, 2MASS and FIRST. The importance of a service like SkyQuery for the worldwide astronomical community cannot be overstated: data on the same objects in various archives is mapped in different wavelength ranges and looks very different due to different errors, instrument sensitivities and other peculiarities of each archive. Our cross-matching algorithm performs a fuzzy spatial join across multiple catalogs. This type of cross-matching is currently often done by eye, one object at a time. A static cross-identification table for a set of archives would become obsolete by the time it was built - the exponential growth of astronomical data means that a dynamic cross-identification mechanism like SkyQuery is the only viable option. SkyQuery was funded by a

  8. Graphs with branchwidth at most three

    NARCIS (Netherlands)

    Bodlaender, H.L.; Thilikos, D.M.

    1997-01-01

    In this paper we investigate both the structure of graphs with branchwidth at most three, as well as algorithms to recognise such graphs. We show that a graph has branchwidth at most three, if and only if it has treewidth at most three and does not contain the three-dimensional binary cube graph

  9. GBA manager: an online tool for querying low-complexity regions in proteins.

    Science.gov (United States)

    Bandyopadhyay, Nirmalya; Kahveci, Tamer

    2010-01-01

    We developed GBA Manager, an online software tool that facilitates the Graph-Based Algorithm (GBA) we proposed in our earlier work. GBA identifies the low-complexity regions (LCR) of protein sequences. GBA exploits a similarity matrix, such as BLOSUM62, to compute the complexity of the subsequences of the input protein sequence. It uses a graph-based algorithm to accurately compute the regions that have low complexities. GBA Manager is a user-friendly web service that enables online querying of protein sequences using GBA. In addition to the querying capabilities of the existing GBA algorithm, GBA Manager computes the p-values of the LCRs identified. The p-value gives an estimate of the possibility that the region appears by chance. GBA Manager presents the output in three different understandable formats. GBA Manager is freely accessible at http://bioinformatics.cise.ufl.edu/GBA/GBA.htm .

  10. Regular paths in SparQL: querying the NCI Thesaurus.

    Science.gov (United States)

    Detwiler, Landon T; Suciu, Dan; Brinkley, James F

    2008-11-06

    OWL, the Web Ontology Language, provides syntax and semantics for representing knowledge for the semantic web. Many of the constructs of OWL have a basis in the field of description logics. While the formal underpinnings of description logics have led to a highly computable language, it has come at a cognitive cost. OWL ontologies are often unintuitive to readers lacking a strong logic background. In this work we describe GLEEN, a regular path expression library, which extends the RDF query language SparQL to support complex path expressions over OWL and other RDF-based ontologies. We illustrate the utility of GLEEN by showing how it can be used in a query-based approach to defining simpler, more intuitive views of OWL ontologies. In particular we show how relatively simple GLEEN-enhanced SparQL queries can create views of the OWL version of the NCI Thesaurus that match the views generated by the web-based NCI browser.

  11. A Graph Library Extension of SVG

    DEFF Research Database (Denmark)

    Nørmark, Kurt

    2007-01-01

    be aggregated as a single node, and an entire graph can be embedded in a single node. In addition, a number of different graph animations are described. The starting point of the SVG extension is a library that provides an exact mirror of SVG 1.1 in the functional programming language Scheme. Each element...

  12. A Fuzzy Query Mechanism for Human Resource Websites

    Science.gov (United States)

    Lai, Lien-Fu; Wu, Chao-Chin; Huang, Liang-Tsung; Kuo, Jung-Chih

    Users' preferences often contain imprecision and uncertainty that are difficult for traditional human resource websites to deal with. In this paper, we apply the fuzzy logic theory to develop a fuzzy query mechanism for human resource websites. First, a storing mechanism is proposed to store fuzzy data into conventional database management systems without modifying DBMS models. Second, a fuzzy query language is proposed for users to make fuzzy queries on fuzzy databases. User's fuzzy requirement can be expressed by a fuzzy query which consists of a set of fuzzy conditions. Third, each fuzzy condition associates with a fuzzy importance to differentiate between fuzzy conditions according to their degrees of importance. Fourth, the fuzzy weighted average is utilized to aggregate all fuzzy conditions based on their degrees of importance and degrees of matching. Through the mutual compensation of all fuzzy conditions, the ordering of query results can be obtained according to user's preference.
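
    The aggregation step described above (each fuzzy condition yields a degree of matching in [0, 1], weighted by its degree of importance, and candidates are ranked by the fuzzy weighted average) can be illustrated with a small Python sketch; the membership function and example conditions are hypothetical.

        # Minimal sketch of ranking candidates by a fuzzy weighted average of their
        # matching degrees. The triangular membership function and the example
        # conditions (experience "about 5 years", salary "around 45k") are hypothetical.
        def triangular(x, low, peak, high):
            """Simple triangular membership function returning a degree in [0, 1]."""
            if x <= low or x >= high:
                return 0.0
            return (x - low) / (peak - low) if x <= peak else (high - x) / (high - peak)

        def fuzzy_score(candidate, conditions):
            """Weighted average of matching degrees; conditions = (membership, weight)."""
            weighted = sum(w * membership(candidate) for membership, w in conditions)
            return weighted / sum(w for _, w in conditions)

        conditions = [
            (lambda c: triangular(c["experience"], 2, 5, 10), 0.7),  # importance 0.7
            (lambda c: triangular(c["salary"], 30, 45, 60), 0.3),    # importance 0.3
        ]

        applicants = [{"name": "A", "experience": 4, "salary": 50},
                      {"name": "B", "experience": 9, "salary": 44}]
        for a in sorted(applicants, key=lambda c: fuzzy_score(c, conditions), reverse=True):
            print(a["name"], round(fuzzy_score(a, conditions), 3))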

  13. Recon2Neo4j: applying graph database technologies for managing comprehensive genome-scale networks.

    Science.gov (United States)

    Balaur, Irina; Mazein, Alexander; Saqi, Mansoor; Lysenko, Artem; Rawlings, Christopher J; Auffray, Charles

    2017-04-01

    The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest. A Java-based parser component has been developed to convert query results (available in the JSON format) into SBML and SIF formats in order to facilitate further results exploration, enhancement or network sharing. The Neo4j-based metabolic framework is freely available from: https://diseaseknowledgebase.etriks.org/metabolic/browser/ . The java code files developed for this work are available from the following url: https://github.com/ibalaur/MetabolicFramework . ibalaur@eisbm.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  14. VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans

    Science.gov (United States)

    Wang, Song; Gupta, Chetan; Mehta, Abhay

    There are data streams all around us that can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling of complex query plans with resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex DAG query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex directed acyclic graph (DAG) query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the "Chain" pipelining scheduling approach to complex DAG query plans.

  15. A Feynman graph selection tool in GRACE system

    International Nuclear Information System (INIS)

    Yuasa, Fukuko; Ishikawa, Tadashi; Kaneko, Toshiaki

    2001-01-01

    We present a Feynman graph selection tool grcsel, which is an interpreter written in C language. In the framework of GRACE, it enables us to get a subset of Feynman graphs according to given conditions

  16. Graph-based Operational Semantics of a Lazy Functional Languages

    DEFF Research Database (Denmark)

    Rose, Kristoffer Høgsbro

    1992-01-01

    Presents Graph Operational Semantics (GOS): a semantic specification formalism based on structural operational semantics and term graph rewriting. Demonstrates the method by specifying the dynamic ...

  17. Learning of Multimodal Representations With Random Walks on the Click Graph.

    Science.gov (United States)

    Wu, Fei; Lu, Xinyan; Song, Jun; Yan, Shuicheng; Zhang, Zhongfei Mark; Rui, Yong; Zhuang, Yueting

    2016-02-01

    In multimedia information retrieval, most classic approaches tend to represent different modalities of media in the same feature space. With the click data collected from the users' searching behavior, existing approaches take either one-to-one paired data (text-image pairs) or ranking examples (text-query-image and/or image-query-text ranking lists) as training examples, which do not make full use of the click data, particularly the implicit connections among the data objects. In this paper, we treat the click data as a large click graph, in which vertices are images/text queries and edges indicate the clicks between an image and a query. We consider learning a multimodal representation from the perspective of encoding the explicit/implicit relevance relationship between the vertices in the click graph. By minimizing both the truncated random walk loss as well as the distance between the learned representation of vertices and their corresponding deep neural network output, the proposed model which is named multimodal random walk neural network (MRW-NN) can be applied to not only learn robust representation of the existing multimodal data in the click graph, but also deal with the unseen queries and images to support cross-modal retrieval. We evaluate the latent representation learned by MRW-NN on a public large-scale click log data set Clickture and further show that MRW-NN achieves much better cross-modal retrieval performance on the unseen queries/images than the other state-of-the-art methods.

  18. Persistent Identifiers for Improved Accessibility for Linked Data Querying

    Science.gov (United States)

    Shepherd, A.; Chandler, C. L.; Arko, R. A.; Fils, D.; Jones, M. B.; Krisnadhi, A.; Mecum, B.

    2016-12-01

    The adoption of linked open data principles within the geosciences has increased the amount of accessible information available on the Web. However, this data is difficult to consume for those who are unfamiliar with Semantic Web technologies such as Web Ontology Language (OWL), Resource Description Framework (RDF) and SPARQL - the RDF query language. Consumers would need to understand the structure of the data and how to efficiently query it. Furthermore, understanding how to query doesn't solve problems of poor precision and recall in search results. For consumers unfamiliar with the data, full-text searches are most accessible, but not ideal as they arrest the advantages of data disambiguation and co-reference resolution efforts. Conversely, URI searches across linked data can deliver improved search results, but knowledge of these exact URIs may remain difficult to obtain. The increased adoption of Persistent Identifiers (PIDs) can lead to improved linked data querying by a wide variety of consumers. Because PIDs resolve to a single entity, they are an excellent data point for disambiguating content. At the same time, PIDs are more accessible and prominent than a single data provider's linked data URI. When present in linked open datasets, PIDs provide balance between the technical and social hurdles of linked data querying, as evidenced by the NSF EarthCube GeoLink project. The GeoLink project, funded by NSF's EarthCube initiative, has brought together data repositories that include content from field expeditions, laboratory analyses, journal publications, conference presentations, theses/reports, and funding awards that span scientific studies from marine geology to marine ecosystems and biogeochemistry to paleoclimatology.

  19. Conceptual graph grammar--a simple formalism for sublanguage.

    Science.gov (United States)

    Johnson, S B

    1998-11-01

    There are a wide variety of computer applications that deal with various aspects of medical language: concept representation, controlled vocabulary, natural language processing, and information retrieval. While technical and theoretical methods appear to differ, all approaches investigate different aspects of the same phenomenon: medical sublanguage. This paper surveys the properties of medical sublanguage from a formal perspective, based on detailed analyses cited in the literature. A review of several computer systems based on sublanguage approaches shows some of the difficulties in addressing the interaction between the syntactic and semantic aspects of sublanguage. A formalism called Conceptual Graph Grammar is presented that attempts to combine both syntax and semantics into a single notation by extending standard Conceptual Graph notation. Examples from the domain of pathology diagnoses are provided to illustrate the use of this formalism in medical language analysis. The strengths and weaknesses of the approach are then considered. Conceptual Graph Grammar is an attempt to synthesize the common properties of different approaches to sublanguage into a single formalism, and to begin to define a common foundation for language-related research in medical informatics.

  20. AN EFFECTIVE RECOMMENDATIONS BY DIFFUSION ALGORITHM FOR WEB GRAPH MINING

    Directory of Open Access Journals (Sweden)

    S. Vasukipriya

    2013-04-01

    Full Text Available The information on the World Wide Web grows at an explosive rate. Societies are relying more on the Web for their miscellaneous needs of information. Recommendation systems are active information filtering systems that attempt to present information items (movies, music, images, books, tag recommendations, query suggestions, etc.) to users. Various kinds of databases are used for the recommendations; fundamentally, these databases can be modeled in the form of many types of graphs. Aiming to provide a general framework for effective DR (Recommendations by Diffusion) algorithms for Web graph mining, we first introduce a novel graph diffusion model based on heat diffusion. This method can be applied to both undirected graphs and directed graphs. We then show how to convert different Web data sources into correct graphs in our models.
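
    A minimal numerical sketch of heat diffusion on a graph (not the paper's DR algorithm itself): diffuse an initial preference vector f0 through the graph Laplacian as f(t) = expm(-tL) f0 and recommend the unseen items that receive the most heat. The tiny item graph and the diffusion time are hypothetical.

        # Heat diffusion on a small, hypothetical item graph: build the combinatorial
        # Laplacian, diffuse the user's preference vector with the matrix exponential,
        # and rank unseen items by the heat they receive.
        import numpy as np
        from scipy.linalg import expm

        A = np.array([[0, 1, 1, 0, 0],
                      [1, 0, 1, 1, 0],
                      [1, 1, 0, 0, 0],
                      [0, 1, 0, 0, 1],
                      [0, 0, 0, 1, 0]], dtype=float)   # adjacency of 5 items
        L = np.diag(A.sum(axis=1)) - A                  # graph Laplacian

        f0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])        # user liked item 0
        heat = expm(-0.5 * L) @ f0                       # diffuse for t = 0.5

        ranking = np.argsort(-heat)                      # items ordered by received heat
        print([i for i in ranking if f0[i] == 0])        # recommend unseen items first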

  1. Handbook of graph grammars and computing by graph transformation

    CERN Document Server

    Engels, G; Kreowski, H J; Rozenberg, G

    1999-01-01

    Graph grammars originated in the late 60s, motivated by considerations about pattern recognition and compiler construction. Since then, the list of areas which have interacted with the development of graph grammars has grown quite impressively. Besides the aforementioned areas, it includes software specification and development, VLSI layout schemes, database design, modeling of concurrent systems, massively parallel computer architectures, logic programming, computer animation, developmental biology, music composition, visual languages, and many others.The area of graph grammars and graph tran

  2. A SQL-Database Based Meta-CASE System and its Query Subsystem

    Science.gov (United States)

    Eessaar, Erki; Sgirka, Rünno

    Meta-CASE systems simplify the creation of CASE (Computer Aided System Engineering) systems. In this paper, we present a meta-CASE system that provides a web-based user interface and uses an object-relational database system (ORDBMS) as its basis. The use of ORDBMSs allows us to integrate different parts of the system and simplify the creation of meta-CASE and CASE systems. ORDBMSs provide a powerful query mechanism. The proposed system allows developers to use queries to evaluate and gradually improve artifacts and calculate values of software measures. We illustrate the use of the system by using the SimpleM modeling language and discuss the use of SQL in the context of queries about artifacts. We have created a prototype of the meta-CASE system by using the PostgreSQL™ ORDBMS and the PHP scripting language.

  3. CloudLM: a Cloud-based Language Model for Machine Translation

    Directory of Open Access Journals (Sweden)

    Ferrández-Tordera Jorge

    2016-04-01

    Full Text Available Language models (LMs) are an essential element in statistical approaches to natural language processing for tasks such as speech recognition and machine translation (MT). The advent of big data leads to the availability of massive amounts of data to build LMs, and in fact, for the most prominent languages, using current techniques and hardware, it is not feasible to train LMs with all the data available nowadays. At the same time, it has been shown that the more data is used for a LM the better the performance, e.g. for MT, without any indication yet of reaching a plateau. This paper presents CloudLM, an open-source cloud-based LM intended for MT, which allows to query distributed LMs. CloudLM relies on Apache Solr and provides the functionality of state-of-the-art language modelling (it builds upon KenLM), while allowing to query massive LMs (as the use of local memory is drastically reduced), at the expense of slower decoding speed.

  4. QuerySpaces on Hadoop for the ATLAS EventIndex

    CERN Document Server

    Hrivnac, Julius; The ATLAS collaboration; Cranshaw, Jack; Favareto, Andrea; Prokoshin, Fedor; Glasman, Claudia; Toebbicke, Rainer

    2015-01-01

    This paper describes a Hadoop-based implementation of the adaptive query engine serving as the back-end for the ATLAS EventIndex. The QuerySpaces implementation handles both original data and search results, providing fast and efficient mechanisms for new user queries using already accumulated knowledge for optimization. Detailed descriptions and statistics about user requests are collected in HBase tables and HDFS files. Requests are associated with their results, and a graph of relations between them is created and used to find the most efficient way of answering new requests. The environment is completely transparent to users and is accessible over several command-line interfaces, a Web Service and a programming API.

  5. XML Graphs in Program Analysis

    DEFF Research Database (Denmark)

    Møller, Anders; Schwartzbach, Michael I.

    2011-01-01

    XML graphs have been shown to be a simple and effective formalism for representing sets of XML documents in program analysis. The formalism has evolved through a six year period with variants tailored for a range of applications. We present a unified definition, outline the key properties including validation of XML graphs against different XML schema languages, and provide a software package that enables others to make use of these ideas. We also survey the use of XML graphs for program analysis with four very different languages: XACT (XML in Java), Java Servlets (Web application programming), XSugar...

  6. UCLA at TREC 2014 Clinical Decision Support Track: Exploring Language Models, Query Expansion, and Boosting

    Science.gov (United States)

    2014-11-01

    qualitative benefit to the retrieved results compared to lower and higher boost factors. Table 1 describes the query terms for an example patient summary (terms such as right lower quadrant abdominal pain, pain for hours, decreased appetite, and enlarged appendix on abdominal ultrasound). ... the diagnostic process. For example, pediatric cases, women of child-bearing age, and geriatric populations each have a distinct set of possible

  7. Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.

    Science.gov (United States)

    De Sa, Christopher; Zhang, Ce; Olukotun, Kunle; Ré, Christopher

    2015-12-01

    Gibbs sampling on factor graphs is a widely used inference technique, which often produces good empirical results. Theoretical guarantees for its performance are weak: even for tree structured graphs, the mixing time of Gibbs may be exponential in the number of variables. To help understand the behavior of Gibbs sampling, we introduce a new (hyper)graph property, called hierarchy width. We show that under suitable conditions on the weights, bounded hierarchy width ensures polynomial mixing time. Our study of hierarchy width is in part motivated by a class of factor graph templates, hierarchical templates, which have bounded hierarchy width regardless of the data used to instantiate them. We demonstrate a rich application from natural language processing in which Gibbs sampling provably mixes rapidly and achieves accuracy that exceeds human volunteers.
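
    As a concrete, hedged illustration of the inference procedure being analysed, the sketch below runs Gibbs sampling on a tiny pairwise factor graph with Ising-style factors exp(w·x_u·x_v). The graph, the weights and the sweep count are invented for the example and are unrelated to the paper's hierarchical templates.

```python
import math
import random

# Hypothetical factor graph: binary variables (+1/-1) with pairwise weights.
edges = {("a", "b"): 0.8, ("b", "c"): 0.5}
variables = ["a", "b", "c"]

def neighbours(v):
    for (x, y), w in edges.items():
        if x == v:
            yield y, w
        elif y == v:
            yield x, w

def gibbs(n_sweeps=5000, seed=0):
    rng = random.Random(seed)
    state = {v: rng.choice([-1, 1]) for v in variables}
    counts = {v: 0 for v in variables}
    for _ in range(n_sweeps):
        for v in variables:
            # P(v = +1 | rest) for factors exp(w * x_v * x_u)
            field = sum(w * state[u] for u, w in neighbours(v))
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))
            state[v] = 1 if rng.random() < p_plus else -1
        for v in variables:
            counts[v] += state[v] == 1
    return {v: counts[v] / n_sweeps for v in variables}   # estimated marginals

print(gibbs())
```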

  8. Dynamical Languages

    Science.gov (United States)

    Xie, Huimin

    The following sections are included: * Definition of Dynamical Languages * Distinct Excluded Blocks * Definition and Properties * L and L″ in Chomsky Hierarchy * A Natural Equivalence Relation * Symbolic Flows * Symbolic Flows and Dynamical Languages * Subshifts of Finite Type * Sofic Systems * Graphs and Dynamical Languages * Graphs and Shannon-Graphs * Transitive Languages * Topological Entropy

  9. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    . In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized...

  10. Beginning SQL queries from novice to professional

    CERN Document Server

    Churcher, Clare

    2016-01-01

    Anyone who does any work at all with databases needs to know something of SQL. This is a friendly and easy-to-read guide to writing queries with the all-important - in the database world - SQL language. The author writes with exceptional clarity.

  11. Khovanov homology of graph-links

    Energy Technology Data Exchange (ETDEWEB)

    Nikonov, Igor M [M. V. Lomonosov Moscow State University, Faculty of Mechanics and Mathematics, Moscow (Russian Federation)

    2012-08-31

    Graph-links arise as the intersection graphs of turning chord diagrams of links. Speaking informally, graph-links provide a combinatorial description of links up to mutations. Many link invariants can be reformulated in the language of graph-links. Khovanov homology, a well-known and useful knot invariant, is defined for graph-links in this paper (in the case of the ground field of characteristic two). Bibliography: 14 titles.

  12. On the mixing time of geographical threshold graphs

    Energy Technology Data Exchange (ETDEWEB)

    Bradonjic, Milan [Los Alamos National Laboratory

    2009-01-01

    In this paper, we study the mixing time of random graphs generated by the geographical threshold graph (GTG) model, a generalization of random geometric graphs (RGG). In a GTG, nodes are distributed in a Euclidean space, and edges are assigned according to a threshold function involving the distance between nodes as well as randomly chosen node weights. The motivation for analyzing this model is that many real networks (e.g., wireless networks, the Internet, etc.) need to be studied by using a 'richer' stochastic model (which in this case includes both a distance between nodes and weights on the nodes). We specifically study the mixing times of random walks on 2-dimensional GTGs near the connectivity threshold. We provide a set of criteria on the distribution of vertex weights that guarantees that the mixing time is Θ(n log n).
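
    A geographical threshold graph can be generated directly from its definition: sample positions and weights, then connect pairs whose threshold function clears θ. The sketch below uses one common choice of threshold function, (w_u + w_v)/r², as an assumption; it does not attempt to reproduce the parameter regime near the connectivity threshold studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 200, 40.0
pos = rng.uniform(0.0, 1.0, size=(n, 2))   # points in the unit square
w = rng.exponential(scale=1.0, size=n)     # random node weights

edges = []
for u in range(n):
    for v in range(u + 1, n):
        r = np.linalg.norm(pos[u] - pos[v])
        # connect if (w_u + w_v) / r^2 >= theta  (one common GTG threshold form)
        if r > 0 and (w[u] + w[v]) / r**2 >= theta:
            edges.append((u, v))

print(len(edges), "edges among", n, "nodes")
```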

  13. Graph theoretical analysis of functional network for comprehension of sign language.

    Science.gov (United States)

    Liu, Lanfang; Yan, Xin; Liu, Jin; Xia, Mingrui; Lu, Chunming; Emmorey, Karen; Chu, Mingyuan; Ding, Guosheng

    2017-09-15

    Signed languages are natural human languages using the visual-motor modality. Previous neuroimaging studies based on univariate activation analysis show that a widely overlapped cortical network is recruited regardless of whether the sign language is comprehended (for signers) or not (for non-signers). Here we move beyond previous studies by examining whether the functional connectivity profiles and the underlying organizational structure of the overlapped neural network may differ between signers and non-signers when watching sign language. Using graph theoretical analysis (GTA) and fMRI, we compared the large-scale functional network organization in hearing signers with non-signers during the observation of sentences in Chinese Sign Language. We found that signed sentences elicited highly similar cortical activations in the two groups of participants, with slightly larger responses within the left frontal and left temporal gyrus in signers than in non-signers. Crucially, further GTA revealed substantial group differences in the topologies of this activation network. Globally, the network engaged by signers showed higher local efficiency (t(24) = 2.379, p = 0.026), small-worldness (t(24) = 2.604, p = 0.016) and modularity (t(24) = 3.513, p = 0.002), and exhibited different modular structures, compared to the network engaged by non-signers. Locally, the left ventral pars opercularis served as a network hub in the signer group but not in the non-signer group. These findings suggest that, despite overlap in cortical activation, the neural substrates underlying sign language comprehension are distinguishable at the network level from those for the processing of gestural action. Copyright © 2017 Elsevier B.V. All rights reserved.
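
    The global measures reported in the study (local efficiency, small-worldness components, modularity) are standard graph-theoretic quantities. The hedged sketch below computes local/global efficiency and modularity with networkx on a synthetic small-world graph, standing in for a real functional-connectivity network, which would instead be derived from thresholded fMRI connectivity matrices.

```python
import networkx as nx

# Toy stand-in for a brain functional network.
G = nx.watts_strogatz_graph(n=30, k=4, p=0.1, seed=42)

local_eff = nx.local_efficiency(G)
global_eff = nx.global_efficiency(G)

communities = nx.algorithms.community.greedy_modularity_communities(G)
modularity = nx.algorithms.community.modularity(G, communities)

print(f"local efficiency  = {local_eff:.3f}")
print(f"global efficiency = {global_eff:.3f}")
print(f"modularity        = {modularity:.3f} over {len(communities)} modules")
```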

  14. Investigation in Query System Framework for High Energy Physics

    CERN Document Server

    Jatuphattharachat, Thanat

    2017-01-01

    We summarize an investigation into a query system framework for HEP (High Energy Physics). Our work investigated the distributed server part of Femtocode, a query language that gives physicists the ability to make plots and other aggregations in real time. To make the system more robust and capable of processing large amounts of data quickly, it is necessary to deploy it on a redundant and distributed computing cluster. This project investigates third-party coordination and resource management frameworks which fit the design of a real-time distributed query system; ZooKeeper, Mesos and Marathon are the main frameworks considered. The results indicate that ZooKeeper is well suited for job coordination and job tracking, as it provides robust, fast, simple and transparent read and write operations for all connecting clients across the distributed ZooKeeper servers. Furthermore, it also supports highly available access and consistency guarantees within a specific time bound.

  15. Visual Querying in Chemical Databases using SMARTS Patterns

    OpenAIRE

    Šípek, Vojtěch

    2014-01-01

    The purpose of this thesis is to create a framework for visual querying in chemical databases, implemented as a web application. Using a graphical editor on the client side, the user creates queries which are translated into the chemical query language SMARTS. The query is parsed on the application server, which is connected to the chemical database. The framework also contains tooling for creating the database and the index structure above it.
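
    The target of the translation, SMARTS, can be exercised directly with a cheminformatics toolkit. The sketch below is a hedged illustration using RDKit (assumed to be installed); the pattern and the molecules are arbitrary examples, not part of the thesis.

```python
from rdkit import Chem

# SMARTS pattern for a benzoic-acid-like substructure (aromatic ring + carboxyl).
pattern = Chem.MolFromSmarts("c1ccccc1C(=O)O")

molecules = {
    "aspirin":      "CC(=O)Oc1ccccc1C(=O)O",
    "ethanol":      "CCO",
    "benzoic acid": "c1ccccc1C(=O)O",
}

for name, smiles in molecules.items():
    mol = Chem.MolFromSmiles(smiles)
    print(f"{name:12s} matches pattern: {mol.HasSubstructMatch(pattern)}")
```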

  16. Performance evaluation of unified medical language system®'s synonyms expansion to query PubMed

    Directory of Open Access Journals (Sweden)

    Griffon Nicolas

    2012-02-01

    Full Text Available Abstract Background PubMed is the main access to medical literature on the Internet. In order to enhance the performance of its information retrieval tools, primarily for non-indexed citations, the authors propose a method: expanding users' queries using Unified Medical Language System (UMLS) synonyms, i.e. all the terms gathered under one unique Concept Unique Identifier. Methods This method was evaluated using queries constructed to emphasize the differences between this new method and the current PubMed automatic term mapping. Four experts assessed citation relevance. Results Using UMLS, we were able to retrieve new citations in 45.5% of queries, which implies a small increase in recall. The new strategy led to a heterogeneous 23.7% mean increase in non-indexed citations retrieved. Of these, 82% had been published less than 4 months earlier. The overall mean precision was 48.4% but differed according to the evaluators, ranging from 36.7% to 88.1% (inter-rater agreement was poor: kappa = 0.34). Conclusions This study highlights the need for specific search tools for each type of user and use-case. The proposed strategy may be useful to retrieve recent scientific advances.
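
    The core of the strategy, OR-ing a concept's synonyms into one PubMed query, can be sketched as below. The synonym list is hand-written for illustration rather than fetched from UMLS (which requires a licence and API key), and the request goes to NCBI's public E-utilities esearch endpoint.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Illustrative synonym set for one concept; a real system would pull these
# from the UMLS Metathesaurus for the query's Concept Unique Identifiers.
synonyms = ["myocardial infarction", "heart attack", "MI"]
expanded = " OR ".join(f'"{term}"' for term in synonyms)

params = urlencode({"db": "pubmed", "term": expanded, "retmax": 20})
url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?{params}"

with urlopen(url) as resp:           # returns an XML list of matching PMIDs
    print(resp.read()[:300])
```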

  17. NetMatchStar: an enhanced Cytoscape network querying app [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Fabio Rinnone

    2015-08-01

    Full Text Available We present NetMatchStar, a Cytoscape app to find all the occurrences of a query graph in a network and check for its significance as a motif with respect to seven different random models. The query can be uploaded or built from scratch using Cytoscape facilities. The app significantly enhances the previous NetMatch in style, performance and functionality. Notably NetMatchStar allows queries with wildcards.

  18. Querying XML Data with SPARQL

    Science.gov (United States)

    Bikakis, Nikos; Gioldasis, Nektarios; Tsinaraki, Chrisa; Christodoulakis, Stavros

    SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries which are used to access the XML databases. We present the algorithms and the implementation of SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.

  19. Downloading Multiple Records Using Query Strings

    Directory of Open Access Journals (Sweden)

    Adam Crymble

    2012-11-01

    Full Text Available Downloading a single record from a website is easy, but downloading many records at a time – an increasingly frequent need for a historian – is much more efficient using a programming language such as Python. In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer. This process involves interpreting and manipulating URL Query Strings. In this case, the tutorial will seek to download sources that contain references to people of African descent that were published in the Old Bailey Proceedings between 1700 and 1750.
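
    The mechanics of the lesson reduce to building a URL query string programmatically and saving what comes back. The sketch below is hedged: the parameter names are illustrative placeholders, not the Old Bailey Online's actual search fields; the lesson derives the real ones by inspecting the URL of a search-results page.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

base = "https://www.oldbaileyonline.org/search.jsp"
params = {
    "fulltext": "negro* mulatto*",   # hypothetical full-text search field
    "fromYear": 1700,
    "toYear": 1750,
    "start": 0,                      # paging offset for successive downloads
}
url = base + "?" + urlencode(params)

with urlopen(url) as response:
    html = response.read().decode("utf-8", errors="replace")

with open("old_bailey_page_0.html", "w", encoding="utf-8") as f:
    f.write(html)
print("saved", len(html), "characters from", url)
```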

  20. Exformatics Declarative Case Management Workflows as DCR Graphs

    DEFF Research Database (Denmark)

    Slaats, Tijs; Mukkamala, Raghava Rao; Hildebrandt, Thomas

    2013-01-01

    Declarative workflow languages have been a growing research subject over the past ten years, but applications of the declarative approach in industry are still uncommon. Over the past two years Exformatics A/S, a Danish provider of Electronic Case Management systems, has been cooperating with researchers at IT University of Copenhagen (ITU) to create tools for the declarative workflow language Dynamic Condition Response Graphs (DCR Graphs) and incorporate them into their products and in teaching at ITU. In this paper we give a status report over the work. We start with an informal introduction...

  1. Fixation Time for Evolutionary Graphs

    Science.gov (United States)

    Nie, Pu-Yan; Zhang, Pei-Ai

    Evolutionary graph theory (EGT) was proposed by Lieberman et al. in 2005. EGT is successful in explaining biological evolution and some social phenomena. It is extremely important to consider the time of fixation for EGT in many practical problems, including evolutionary theory and the evolution of cooperation. This study characterizes the time to asymptotically reach fixation.

  2. Declarative Event-Based Workflow as Distributed Dynamic Condition Response Graphs

    DEFF Research Database (Denmark)

    Hildebrandt, Thomas; Mukkamala, Raghava Rao

    2010-01-01

    We present Dynamic Condition Response Graphs (DCR Graphs) as a declarative, event-based process model inspired by the workflow language employed by our industrial partner and conservatively generalizing prime event structures. A dynamic condition response graph is a directed graph with nodes representing events... We exemplify the use of distributed DCR Graphs on a simple workflow taken from a field study at a Danish hospital, pointing out their flexibility compared to imperative workflow models. Finally we provide a mapping from DCR Graphs to Büchi-automata.

  3. Development and validation of a structured query language implementation of the Elixhauser comorbidity index.

    Science.gov (United States)

    Epstein, Richard H; Dexter, Franklin

    2017-07-01

    Comorbidity adjustment is often performed during outcomes and health care resource utilization research. Our goal was to develop an efficient algorithm in structured query language (SQL) to determine the Elixhauser comorbidity index. We wrote an SQL algorithm to calculate the Elixhauser comorbidities from Diagnosis Related Group and International Classification of Diseases (ICD) codes. Validation was by comparison to expected comorbidities from combinations of these codes and to the 2013 Nationwide Readmissions Database (NRD). The SQL algorithm matched perfectly with expected comorbidities for all combinations of ICD-9 or ICD-10, and Diagnosis Related Groups. Of 13 585 859 evaluable NRD records, the algorithm matched 100% of the listed comorbidities. Processing time was ∼0.05 ms/record. The SQL Elixhauser code was efficient and computationally identical to the SAS algorithm used for the NRD. This algorithm may be useful where preprocessing of large datasets in a relational database environment and comorbidity determination is desired before statistical analysis. A validated SQL procedure to calculate Elixhauser comorbidities and the van Walraven index from ICD-9 or ICD-10 discharge diagnosis codes has been published. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
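
    The essence of such an SQL implementation is flagging comorbidity categories from ICD code prefixes with conditional aggregation. The sketch below is a heavily simplified, hedged illustration using SQLite from Python with only two example categories (ICD-9 428.x for congestive heart failure, 250.x for diabetes); the validated algorithm covers the full Elixhauser categories, much longer curated code lists, and the DRG screens described in the paper.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE diagnosis (encounter_id INTEGER, icd9 TEXT);
INSERT INTO diagnosis VALUES (1, '4280'), (1, '25000'), (2, 'V4581'), (3, '25040');
""")

# One row per encounter with 0/1 comorbidity flags, computed via conditional aggregation.
rows = con.execute("""
SELECT encounter_id,
       MAX(CASE WHEN icd9 LIKE '428%' THEN 1 ELSE 0 END) AS chf,
       MAX(CASE WHEN icd9 LIKE '250%' THEN 1 ELSE 0 END) AS diabetes
FROM diagnosis
GROUP BY encounter_id
ORDER BY encounter_id;
""").fetchall()

for encounter_id, chf, diabetes in rows:
    print(encounter_id, "chf =", chf, "diabetes =", diabetes)
```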

  4. Loose Graph Simulations

    DEFF Research Database (Denmark)

    Mansutti, Alessio; Miculan, Marino; Peressotti, Marco

    2017-01-01

    We introduce loose graph simulations (LGS), a new notion about labelled graphs which subsumes in an intuitive and natural way subgraph isomorphism (SGI), regular language pattern matching (RLPM) and graph simulation (GS). Being a unification of all these notions, LGS allows us to express directly also problems which are "mixed" instances of previous ones, and hence which would not fit easily in any of them. After the definition and some examples, we show that the problem of finding loose graph simulations is NP-complete, we provide formal translations of SGI, RLPM, and GS into LGSs, and we give...

  5. Delay-time distribution in the scattering of time-narrow wave packets (II)—quantum graphs

    Science.gov (United States)

    Smilansky, Uzy; Schanz, Holger

    2018-02-01

    We apply the framework developed in the preceding paper in this series (Smilansky 2017 J. Phys. A: Math. Theor. 50 215301) to compute the time-delay distribution in the scattering of ultra short radio frequency pulses on complex networks of transmission lines which are modeled by metric (quantum) graphs. We consider wave packets which are centered at high wave number and comprise many energy levels. In the limit of pulses of very short duration we compute upper and lower bounds to the actual time-delay distribution of the radiation emerging from the network using a simplified problem where time is replaced by the discrete count of vertex-scattering events. The classical limit of the time-delay distribution is also discussed and we show that for finite networks it decays exponentially, with a decay constant which depends on the graph connectivity and the distribution of its edge lengths. We illustrate and apply our theory to a simple model graph where an algebraic decay of the quantum time-delay distribution is established.

  6. An Experimental Investigation of Complexity in Database Query Formulation Tasks

    Science.gov (United States)

    Casterella, Gretchen Irwin; Vijayasarathy, Leo

    2013-01-01

    Information Technology professionals and other knowledge workers rely on their ability to extract data from organizational databases to respond to business questions and support decision making. Structured query language (SQL) is the standard programming language for querying data in relational databases, and SQL skills are in high demand and are…

  7. VMQL: A Visual Language for Ad-Hoc Model Querying

    DEFF Research Database (Denmark)

    Störrle, Harald

    2011-01-01

    In large scale model based development, analysis level models are more like knowledge bases than engineering artifacts. Their effectiveness depends, to a large degree, on the ability of domain experts to retrieve information from them ad hoc. For large scale models, however, existing query...

  8. XAL: An algebra for XML query optimization

    NARCIS (Netherlands)

    Frasincar, F.; Houben, G.J.P.M.; Pau, C.D.; Zhou, Xiaofang

    2002-01-01

    This paper proposes XAL, an XML ALgebra. Its novelty is based on the simplicity of its data model and its well-defined logical operators, which makes it suitable for composability, optimizability, and semantics definition of a query language for XML data. At the heart of the algebra resides the

  9. Tag cloud generation for results of multiple keywords queries

    DEFF Research Database (Denmark)

    Leginus, Martin; Dolog, Peter; Lage, Ricardo Gomes

    2013-01-01

    In this paper we study tag cloud generation for retrieved results of multiple-keyword queries. It is motivated by many real world scenarios such as personalization tasks, surveillance systems and information retrieval tasks defined with multiple keywords. We adjust the state-of-the-art tag cloud generation techniques for multiple-keyword query results. Consequently, we conduct an extensive evaluation on top of three distinct collaborative tagging systems. The graph-based methods perform significantly better for the Movielens and Bibsonomy datasets. Tag cloud generation based on maximal coverage...

  10. BioFed: federated query processing over life sciences linked open data.

    Science.gov (United States)

    Hasnain, Ali; Mehmood, Qaiser; Sana E Zainab, Syeda; Saleem, Muhammad; Warren, Claude; Zehra, Durre; Decker, Stefan; Rebholz-Schuhmann, Dietrich

    2017-03-15

    Biomedical data, e.g. from knowledge bases and ontologies, is increasingly made available following open linked data principles, at best as RDF triple data. This is a necessary step towards unified access to biological data sets, but it still requires solutions to query multiple endpoints for their heterogeneous data to eventually retrieve all the meaningful information. Suggested solutions are based on query federation approaches, which require the submission of SPARQL queries to endpoints. Due to the size and complexity of available data, these solutions have to be optimised for efficient retrieval times and for users in life sciences research. Last but not least, over time, the reliability of data resources in terms of access and quality has to be monitored. Our solution (BioFed) federates data over 130 SPARQL endpoints in life sciences and tailors query submission according to the provenance information. BioFed has been evaluated against the state-of-the-art solution FedX and forms an important benchmark for the life science domain. The efficient cataloguing approach of the federated query processing system 'BioFed', the triple-pattern-wise source selection and the semantic source normalisation form the core of our solution. It gathers and integrates data from newly identified public endpoints for federated access. Basic provenance information is linked to the retrieved data. Last but not least, BioFed makes use of the latest SPARQL standard (i.e., 1.1) to leverage the full benefits for query federation. The evaluation is based on 10 simple and 10 complex queries, which address data in 10 major and very popular data sources (e.g., Drugbank, Sider). BioFed is a solution for a single point of access to a large number of SPARQL endpoints providing life science data. It facilitates efficient query generation for data access and provides basic provenance information in combination with the retrieved data. BioFed fully supports SPARQL 1.1 and gives access to the
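
    The kind of SPARQL 1.1 federation BioFed optimises can be illustrated with the SERVICE keyword, which routes triple patterns to remote endpoints. The sketch below is hedged: the endpoint URLs and the predicate are placeholders (live endpoints change over time), and SPARQLWrapper is assumed to be installed.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?drug ?label WHERE {
  SERVICE <https://example.org/drugbank/sparql> {   # hypothetical endpoint
    ?drug rdfs:label ?label .
  }
  SERVICE <https://example.org/sider/sparql> {      # hypothetical endpoint
    ?drug <http://example.org/vocab/sideEffect> ?effect .
  }
}
LIMIT 10
"""

client = SPARQLWrapper("https://example.org/federation/sparql")  # placeholder entry point
client.setQuery(query)
client.setReturnFormat(JSON)
# results = client.query().convert()   # uncomment against a real endpoint
print(query)
```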

  11. Improving Students' Understanding of Waves by Plotting a Displacement-Time Graph in Class

    Science.gov (United States)

    Wei, Yajun

    2012-04-01

    The topic of waves is one that many high school physics students find difficult to understand. This is especially true when using some A-level textbooks1,2 used in the U.K., where the concept of waves is introduced prior to the concept of simple harmonic oscillations. One of the challenges my students encounter is understanding the difference between displacement-time graphs and displacement-position graphs. Many students wonder why these two graphs have the same sinusoidal shape. Having the students use multimedia simulations allows them to see, in a hands-on fashion, the relationship between the two graphs.
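
    A minimal version of that classroom exercise can be scripted: take the same travelling wave y(x, t) = A sin(kx − ωt) and plot it once against time at a fixed position and once against position at a fixed time, which makes clear why both graphs look sinusoidal. The amplitude, wavelength and period below are arbitrary example values.

```python
import numpy as np
import matplotlib.pyplot as plt

A, wavelength, period = 1.0, 2.0, 0.5
k, w = 2 * np.pi / wavelength, 2 * np.pi / period

t = np.linspace(0, 2 * period, 400)
x = np.linspace(0, 2 * wavelength, 400)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))
ax1.plot(t, A * np.sin(k * 0.0 - w * t))     # displacement-time graph at x = 0
ax1.set(xlabel="time t (s)", ylabel="displacement", title="fixed position x = 0")
ax2.plot(x, A * np.sin(k * x - w * 0.0))     # displacement-position graph at t = 0
ax2.set(xlabel="position x (m)", ylabel="displacement", title="fixed time t = 0")
plt.tight_layout()
plt.show()
```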

  12. Big Data Analytics with Datalog Queries on Spark.

    Science.gov (United States)

    Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

    2016-01-01

    There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics.
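
    Recursion is the crux: a Datalog program such as path(X,Y) :- edge(X,Y). path(X,Z) :- path(X,Y), edge(Y,Z). has no native Spark SQL equivalent, so BigDatalog compiles it to iterated distributed joins. The hedged PySpark sketch below emulates that fixed-point loop by hand for a toy edge table; it illustrates the semantics only, not BigDatalog's optimized evaluation.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("tc-sketch").getOrCreate()

# Toy edge relation; a real workload would read a large distributed dataset.
edges = spark.createDataFrame([(1, 2), (2, 3), (3, 4)], ["src", "dst"])

closure = edges
while True:
    # Join paths with edges: (src -> mid) + (mid -> dst) yields (src -> dst).
    step = (closure.withColumnRenamed("dst", "mid")
                   .join(edges.withColumnRenamed("src", "mid"), on="mid")
                   .select("src", "dst"))
    grown = closure.union(step).distinct()
    if grown.count() == closure.count():    # naive fixed-point test
        break
    closure = grown

closure.orderBy("src", "dst").show()
spark.stop()
```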

  13. AliBaba: PubMed as a graph.

    Science.gov (United States)

    Plake, Conrad; Schiemann, Torsten; Pankalla, Marcus; Hakenberg, Jörg; Leser, Ulf

    2006-10-01

    The biomedical literature contains a wealth of information on associations between many different types of objects, such as protein-protein interactions, gene-disease associations and subcellular locations of proteins. When searching such information using conventional search engines, e.g. PubMed, users see the data only one-abstract at a time and 'hidden' in natural language text. AliBaba is an interactive tool for graphical summarization of search results. It parses the set of abstracts that fit a PubMed query and presents extracted information on biomedical objects and their relationships as a graphical network. AliBaba extracts associations between cells, diseases, drugs, proteins, species and tissues. Several filter options allow for a more focused search. Thus, researchers can grasp complex networks described in various articles at a glance. http://alibaba.informatik.hu-berlin.de/

  14. Guidelines for a graph-theoretic implementation of structural equation modeling

    Science.gov (United States)

    Grace, James B.; Schoolmaster, Donald R.; Guntenspergen, Glenn R.; Little, Amanda M.; Mitchell, Brian R.; Miller, Kathryn M.; Schweiger, E. William

    2012-01-01

    Structural equation modeling (SEM) is increasingly being chosen by researchers as a framework for gaining scientific insights from the quantitative analyses of data. New ideas and methods emerging from the study of causality, influences from the field of graphical modeling, and advances in statistics are expanding the rigor, capability, and even purpose of SEM. Guidelines for implementing the expanded capabilities of SEM are currently lacking. In this paper we describe new developments in SEM that we believe constitute a third-generation of the methodology. Most characteristic of this new approach is the generalization of the structural equation model as a causal graph. In this generalization, analyses are based on graph theoretic principles rather than analyses of matrices. Also, new devices such as metamodels and causal diagrams, as well as an increased emphasis on queries and probabilistic reasoning, are now included. Estimation under a graph theory framework permits the use of Bayesian or likelihood methods. The guidelines presented start from a declaration of the goals of the analysis. We then discuss how theory frames the modeling process, requirements for causal interpretation, model specification choices, selection of estimation method, model evaluation options, and use of queries, both to summarize retrospective results and for prospective analyses. The illustrative example presented involves monitoring data from wetlands on Mount Desert Island, home of Acadia National Park. Our presentation walks through the decision process involved in developing and evaluating models, as well as drawing inferences from the resulting prediction equations. In addition to evaluating hypotheses about the connections between human activities and biotic responses, we illustrate how the structural equation (SE) model can be queried to understand how interventions might take advantage of an environmental threshold to limit Typha invasions. The guidelines presented provide for

  15. Efficient evaluation of shortest travel-time path queries through spatial mashups

    KAUST Repository

    Zhang, Detian

    2017-01-07

    In the real world, the route/path with the shortest travel time in a road network is more meaningful than that with the shortest network distance for location-based services (LBS). However, not every LBS provider has adequate resources to compute/estimate travel time for routes by themselves. A cost-effective way for LBS providers to estimate travel time for routes is to issue external route requests to Web mapping services (e.g., Google Maps, Bing Maps, and MapQuest Maps). Due to the high cost of processing such external route requests and the usage limits of Web mapping services, we take the advantage of direction sharing, parallel requesting and waypoints supported by Web mapping services to reduce the number of external route requests and the query response time for shortest travel-time route queries in this paper. We first give the definition of sharing ability to reflect the possibility of sharing the direction information of a route with others, and find out the queries that their query routes are independent with each other for parallel processing. Then, we model the problem of selecting the optimal waypoints for an external route request as finding the longest simple path in a weighted complete digraph. As it is a MAX SNP-hard problem, we propose a greedy algorithm with performance guarantee to find the best set of waypoints in an external route request. We evaluate the performance of our approach using a real Web mapping service, a real road network, real and synthetic data sets. Experimental results show the efficiency, scalability, and applicability of our approach.
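
    The greedy idea described above can be sketched as follows: treat candidate waypoints as nodes of a weighted complete digraph and build a path by repeatedly appending the unvisited node with the largest edge weight from the current endpoint. The weights below are synthetic "sharing ability" scores invented for the example; the paper's algorithm and its performance guarantee are more involved.

```python
# Hypothetical benefit of visiting v directly after u.
weights = {
    ("A", "B"): 5.0, ("A", "C"): 2.0, ("A", "D"): 1.0,
    ("B", "A"): 1.0, ("B", "C"): 4.0, ("B", "D"): 2.5,
    ("C", "A"): 0.5, ("C", "B"): 1.5, ("C", "D"): 3.0,
    ("D", "A"): 2.0, ("D", "B"): 1.0, ("D", "C"): 0.5,
}
nodes = {"A", "B", "C", "D"}

def greedy_path(start):
    path, total = [start], 0.0
    while len(path) < len(nodes):
        cur = path[-1]
        best_w, best_v = max((weights[(cur, v)], v) for v in nodes if v not in path)
        path.append(best_v)
        total += best_w
    return total, path

# Try every start node and keep the heaviest greedy path found.
print(max(greedy_path(s) for s in nodes))
```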

  16. Efficient evaluation of shortest travel-time path queries through spatial mashups

    KAUST Repository

    Zhang, Detian; Chow, Chi-Yin; Liu, An; Zhang, Xiangliang; Ding, Qingzhu; Li, Qing

    2017-01-01

    In the real world, the route/path with the shortest travel time in a road network is more meaningful than that with the shortest network distance for location-based services (LBS). However, not every LBS provider has adequate resources to compute/estimate travel time for routes by themselves. A cost-effective way for LBS providers to estimate travel time for routes is to issue external route requests to Web mapping services (e.g., Google Maps, Bing Maps, and MapQuest Maps). Due to the high cost of processing such external route requests and the usage limits of Web mapping services, we take the advantage of direction sharing, parallel requesting and waypoints supported by Web mapping services to reduce the number of external route requests and the query response time for shortest travel-time route queries in this paper. We first give the definition of sharing ability to reflect the possibility of sharing the direction information of a route with others, and find out the queries that their query routes are independent with each other for parallel processing. Then, we model the problem of selecting the optimal waypoints for an external route request as finding the longest simple path in a weighted complete digraph. As it is a MAX SNP-hard problem, we propose a greedy algorithm with performance guarantee to find the best set of waypoints in an external route request. We evaluate the performance of our approach using a real Web mapping service, a real road network, real and synthetic data sets. Experimental results show the efficiency, scalability, and applicability of our approach.

  17. Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track

    Science.gov (United States)

    2015-11-20

    ... the anchor text graph has proven useful in the general realm of query reformulation [2], we sought to quantify the value of extracting key phrases from ... anchor text in the broader setting of the task understanding track. Given a query, our approach considers a simple method for identifying a relevant

  18. Relationship Between Time Consumption and Quality of Responses to Drug-related Queries

    DEFF Research Database (Denmark)

    Amundstuen Reppe, Linda; Lydersen, Stian; Schjøtt, Jan

    2016-01-01

    Purpose The aims of this study were to assess the quality of responses produced by drug information centers (DICs) in Scandinavia, and to study the association between time consumption processing queries and the quality of the responses. Methods We posed six identical drug-related queries to seven DICs in Scandinavia, and the time consumption required for processing them was estimated. Clinical pharmacologists (internal experts) and general practitioners (external experts) reviewed responses individually. We used mixed model linear regression analyses to study the associations between time ... in score, –0.05 per hour of work; 95% CI, –0.08 to –0.01; P = 0.005). No such associations were found for the internal experts' assessment. Implications To our knowledge, this is the first study of the association between time consumption and quality of responses to drug-related queries in DICs ...

  19. Heuristic query optimization for query multiple table and multiple clausa on mobile finance application

    Science.gov (United States)

    Indrayana, I. N. E.; P, N. M. Wirasyanti D.; Sudiartha, I. KG

    2018-01-01

    Mobile applications allow many users to access data from an application without being limited by place and time. Over time the data population of such an application will increase, and data access time becomes a problem once the records reach tens of thousands to millions. The objective of this research is to maintain data execution performance for large numbers of records. One way to maintain data access time is query optimization; the optimization used in this research is the heuristic query optimization method. The application built is a mobile-based financial application using a MySQL database with stored procedures, used by more than one business entity in a single database and therefore subject to rapid data growth. Within these stored procedures, queries are optimized using the heuristic method; optimization is performed on SELECT queries that involve more than one table and multiple clauses. Evaluation is done by comparing the average access time of optimized and unoptimized queries as the data population in the database grows. The results show that data execution with heuristic query optimization is faster than execution without optimization.
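
    One classic heuristic of this kind is restructuring a multi-table, multi-clause SELECT so that restrictive filters are applied before the join rather than after it. The sketch below is a generic, hedged illustration on an in-memory SQLite database with invented tables; it is not the application's actual schema or queries.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, entity TEXT);
CREATE TABLE txn (id INTEGER PRIMARY KEY, account_id INTEGER, amount REAL, status TEXT);
INSERT INTO account VALUES (1, 'shopA'), (2, 'shopB');
INSERT INTO txn VALUES (1, 1, 100.0, 'posted'), (2, 1, 50.0, 'void'),
                       (3, 2, 75.0, 'posted'), (4, 2, 20.0, 'posted');
""")

# Unoptimized form: join first, filter the joined result afterwards.
naive = """
SELECT a.entity, SUM(t.amount)
FROM account a JOIN txn t ON t.account_id = a.id
WHERE t.status = 'posted' AND a.entity = 'shopA'
GROUP BY a.entity;
"""

# Heuristic form: push the selections into derived tables so the join sees fewer rows.
optimized = """
SELECT a.entity, SUM(t.amount)
FROM (SELECT id, entity FROM account WHERE entity = 'shopA') a
JOIN (SELECT account_id, amount FROM txn WHERE status = 'posted') t
  ON t.account_id = a.id
GROUP BY a.entity;
"""

print(con.execute(naive).fetchall())
print(con.execute(optimized).fetchall())   # same result, smaller intermediate join
```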

  20. Strategic Port Graph Rewriting: An Interactive Modelling and Analysis Framework

    Directory of Open Access Journals (Sweden)

    Maribel Fernández

    2014-07-01

    Full Text Available We present strategic portgraph rewriting as a basis for the implementation of visual modelling and analysis tools. The goal is to facilitate the specification, analysis and simulation of complex systems, using port graphs. A system is represented by an initial graph and a collection of graph rewriting rules, together with a user-defined strategy to control the application of rules. The strategy language includes constructs to deal with graph traversal and management of rewriting positions in the graph. We give a small-step operational semantics for the language, and describe its implementation in the graph transformation and visualisation tool PORGY.

  1. Combining the power of searching and querying

    NARCIS (Netherlands)

    Cohen, S.; Kanza, Y.; Kogan, Y.A.; Nutt, W.; Sagiv, Y.; Serebrenik, A.; Etzion, O.; Scheuermann, P.

    2000-01-01

    EquiX is a search language for XML that combines the power of querying with the simplicity of searching. Requirements for search languages are discussed and it is shown that EquiX meets the necessary criteria. Both a graphical abstract syntax and a formal concrete syntax are presented for EquiX

  2. Inferring ontology graph structures using OWL reasoning

    KAUST Repository

    Rodriguez-Garcia, Miguel Angel

    2018-01-05

    Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph. Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.

  3. Inferring ontology graph structures using OWL reasoning.

    Science.gov (United States)

    Rodríguez-García, Miguel Ángel; Hoehndorf, Robert

    2018-01-05

    Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.

  4. Graph Creation, Visualisation and Transformation

    Directory of Open Access Journals (Sweden)

    Maribel Fernández

    2010-03-01

    Full Text Available We describe a tool to create, edit, visualise and compute with interaction nets - a form of graph rewriting systems. The editor, called GraphPaper, allows users to create and edit graphs and their transformation rules using an intuitive user interface. The editor uses the functionalities of the TULIP system, which gives us access to a wealth of visualisation algorithms. Interaction nets are not only a formalism for the specification of graphs, but also a rewrite-based computation model. We discuss graph rewriting strategies and a language to express them in order to perform strategic interaction net rewriting.

  5. Time- and Cost-Optimal Parallel Algorithms for the Dominance and Visibility Graphs

    Directory of Open Access Journals (Sweden)

    D. Bhagavathi

    1996-01-01

    Full Text Available The compaction step of integrated circuit design motivates associating several kinds of graphs with a collection of non-overlapping rectangles in the plane. These graphs are intended to capture various visibility relations amongst the rectangles in the collection. The contribution of this paper is to propose time- and cost-optimal algorithms to construct two such graphs, namely, the dominance graph (DG, for short) and the visibility graph (VG, for short). Specifically, we show that with a collection of n non-overlapping rectangles as input, both these structures can be constructed in Θ(log n) time using n processors in the CREW model.

  6. A System for Conceptual Pathway Finding and Deductive Querying

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer

    2015-01-01

    We describe principles and design of a system for knowledge bases applying a natural logic. Natural logics are forms of logic which appear as stylized fragments of natural language sentences. Accordingly, such knowledge base sentences can be read and understood directly by a domain expert. The system applies a graph form computed from the input natural logic sentences. The graph form generalizes the usual partial-order ontological sub-class structures by accommodation of affirmative sentences comprising recursive phrase structures. In this paper we focus on the logical inference rules...

  7. Instant MDX queries for SQL Server 2012

    CERN Document Server

    Emond, Nicholas

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. This short, focused guide is a great way to get started with writing MDX queries. New developers can use this book as a reference for how to use functions and the syntax of a query as well as how to use Calculated Members and Named Sets. This book is great for new developers who want to learn the MDX query language from scratch and install SQL Server 2012 with Analysis Services.

  8. Modeling Large Time Series for Efficient Approximate Query Processing

    DEFF Research Database (Denmark)

    Perera, Kasun S; Hahmann, Martin; Lehner, Wolfgang

    2015-01-01

    query statistics derived from experiments and when running the system. Our approach can also reduce communication load by exchanging models instead of data. To allow seamless integration of model-based querying into traditional data warehouses, we introduce a SQL compatible query terminology. Our...

  9. A Framework for WWW Query Processing

    Science.gov (United States)

    Wu, Binghui Helen; Wharton, Stephen (Technical Monitor)

    2000-01-01

    Query processing is the most common operation in a DBMS. Sophisticated query processing has been mainly targeted at a single enterprise environment providing centralized control over data and metadata. Submitting queries by anonymous users on the web is different in such a way that load balancing or DBMS' accessing control becomes the key issue. This paper provides a solution by introducing a framework for WWW query processing. The success of this framework lies in the utilization of query optimization techniques and the ontological approach. This methodology has proved to be cost effective at the NASA Goddard Space Flight Center Distributed Active Archive Center (GDAAC).

  10. Using a High-Dimensional Graph of Semantic Space to Model Relationships among Words

    Directory of Open Access Journals (Sweden)

    Alice F Jackson

    2014-05-01

    Full Text Available The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition.

  11. Using a high-dimensional graph of semantic space to model relationships among words.

    Science.gov (United States)

    Jackson, Alice F; Bolger, Donald J

    2014-01-01

    The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
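
    The two relation types discussed above can be illustrated on a toy corpus: associative relatedness read directly from co-occurrence counts (first-order edges), and semantic similarity approximated by the overlap of the two words' neighbour sets. The sketch below is a hedged simplification of the idea, not the GOLD model itself (no paragraph-level window, weighting, or large corpus).

```python
from collections import defaultdict
from itertools import combinations

corpus = [
    "the doctor treated the patient in the hospital",
    "the nurse treated the patient in the clinic",
    "the dog chased the cat in the garden",
]

# Count sentence-level co-occurrences of word pairs.
cooc = defaultdict(int)
for sentence in corpus:
    for a, b in combinations(sorted(set(sentence.split())), 2):
        cooc[(a, b)] += 1

def neighbours(word):
    return {b if a == word else a for (a, b) in cooc if word in (a, b)}

def associative(a, b):                       # first-order co-occurrence count
    return cooc.get(tuple(sorted((a, b))), 0)

def similarity(a, b):                        # Jaccard overlap of neighbour sets
    na, nb = neighbours(a), neighbours(b)
    return len(na & nb) / len(na | nb) if na | nb else 0.0

print("doctor~nurse   similarity :", round(similarity("doctor", "nurse"), 2))
print("doctor-patient association:", associative("doctor", "patient"))
```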

  12. SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases.

    Science.gov (United States)

    Chiba, Hirokazu; Uchiyama, Ikuo

    2017-02-08

    Toward improved interoperability of distributed biological databases, an increasing number of datasets have been published in the standardized Resource Description Framework (RDF). Although the powerful SPARQL Protocol and RDF Query Language (SPARQL) provides a basis for exploiting RDF databases, writing SPARQL code is burdensome for users including bioinformaticians. Thus, an easy-to-use interface is necessary. We developed SPANG, a SPARQL client that has unique features for querying RDF datasets. SPANG dynamically generates typical SPARQL queries according to specified arguments. It can also call SPARQL template libraries constructed in a local system or published on the Web. Further, it enables combinatorial execution of multiple queries, each with a distinct target database. These features facilitate easy and effective access to RDF datasets and integrative analysis of distributed data. SPANG helps users to exploit RDF datasets by generation and reuse of SPARQL queries through a simple interface. This client will enhance integrative exploitation of biological RDF datasets distributed across the Web. This software package is freely available at http://purl.org/net/spang .

  13. Wiener Index, Diameter, and Stretch Factor of a Weighted Planar Graph in Subquadratic Time

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    We solve three open problems: the existence of subquadratic time algorithms for computing the Wiener index (sum of APSP distances) and the diameter (maximum distance between any vertex pair) of a planar graph with non-negative edge weights, and the stretch factor of a plane geometric graph (maximum over all pairs of distinct vertices of the ratio between the graph distance and the Euclidean distance between the two vertices). More specifically, we show that the Wiener index and diameter can be found in O(n^2*(log log n)^4/log n) worst-case time and that the stretch factor can be found in O(n^2...
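
    For reference, the two quantities are easy to compute naively with a graph library; the paper's contribution is doing so in subquadratic time, which the straightforward approach below does not attempt. The grid graph and unit weights are illustrative assumptions.

```python
import networkx as nx

G = nx.grid_2d_graph(4, 4)          # small planar graph
for u, v in G.edges:
    G[u][v]["weight"] = 1.0

wiener = nx.wiener_index(G, weight="weight")   # sum of all-pairs shortest-path distances
diameter = nx.diameter(G)                      # maximum (hop) distance between any pair

print("Wiener index:", wiener)
print("diameter    :", diameter)
```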

  14. Polynomial-time computability of the edge-reliability of graphs using Gilbert's formula

    Directory of Open Access Journals (Sweden)

    Marlowe Thomas J.

    1998-01-01

    Full Text Available Reliability is an important consideration in analyzing computer and other communication networks, but current techniques are extremely limited in the classes of graphs which can be analyzed efficiently. While Gilbert's formula establishes a theoretically elegant recursive relationship between the edge reliability of a graph and the reliability of its subgraphs, naive evaluation requires consideration of all sequences of deletions of individual vertices, and for many graphs has time complexity essentially Θ(N!). We discuss a general approach which significantly reduces complexity, encoding subgraph isomorphism in a finer partition by invariants, and recursing through the set of invariants. We illustrate this approach using threshold graphs, and show that any computation of reliability using Gilbert's formula will be polynomial-time if and only if the number of invariants considered is polynomial; we then show families of graphs with polynomial-time and non-polynomial reliability computation, and show that these encompass most previously known results. We then codify our approach to indicate how it can be used for other classes of graphs, and suggest several classes to which the technique can be applied.

  15. Neuro-symbolic representation learning on biological knowledge graphs

    KAUST Repository

    Alshahrani, Mona

    2017-04-21

    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph-structured data are becoming available, but have not yet widely been applied and evaluated on structured biological knowledge. We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode for related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing amount of Semantic Web-based knowledge bases in biology to use in machine learning and data analytics. Availability: https://github.com/bio-ontology-research-group/walking-rdf-and-owl. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online.
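
    The basic mechanism, generating random walks over a knowledge graph and feeding them to a word-embedding model as if they were sentences, can be sketched as below. This is a hedged simplification: edge labels and reasoner-inferred axioms, which the published method interleaves into the walks, are ignored, the toy graph is invented, and the gensim 4.x parameter names are assumed.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

# Toy knowledge graph (labels omitted for brevity).
G = nx.Graph([("geneA", "disease1"), ("geneB", "disease1"),
              ("geneA", "pathwayX"), ("geneB", "pathwayX"),
              ("geneC", "disease2")])

def random_walks(graph, walks_per_node=20, walk_length=8, seed=0):
    rng = random.Random(seed)
    walks = []
    for start in graph.nodes:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(list(graph.neighbors(walk[-1]))))
            walks.append(walk)
    return walks

# Treat walks as sentences and learn node embeddings with skip-gram.
model = Word2Vec(random_walks(G), vector_size=16, window=3, min_count=1,
                 sg=1, epochs=10, seed=0)
print(model.wv.most_similar("geneA", topn=3))
```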

  16. Deep Learning with Dynamic Computation Graphs

    OpenAIRE

    Looks, Moshe; Herreshoff, Marcello; Hutchins, DeLesley; Norvig, Peter

    2017-01-01

    Neural networks that compute over graph structures are a natural fit for problems in a variety of domains, including natural language (parse trees) and cheminformatics (molecular graphs). However, since the computation graph has a different shape and size for every input, such networks do not directly support batched training or inference. They are also difficult to implement in popular deep learning libraries, which are based on static data-flow graphs. We introduce a technique called dynami...

  17. Code query by example

    Science.gov (United States)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  18. StreamQRE: Modular Specification and Efficient Evaluation of Quantitative Queries over Streaming Data.

    Science.gov (United States)

    Mamouras, Konstantinos; Raghothaman, Mukund; Alur, Rajeev; Ives, Zachary G; Khanna, Sanjeev

    2017-06-01

    Real-time decision making in emerging IoT applications typically relies on computing quantitative summaries of large data streams in an efficient and incremental manner. To simplify the task of programming the desired logic, we propose StreamQRE, which provides natural and high-level constructs for processing streaming data. Our language has a novel integration of linguistic constructs from two distinct programming paradigms: streaming extensions of relational query languages and quantitative extensions of regular expressions. The former allows the programmer to employ relational constructs to partition the input data by keys and to integrate data streams from different sources, while the latter can be used to exploit the logical hierarchy in the input stream for modular specifications. We first present the core language with a small set of combinators, formal semantics, and a decidable type system. We then show how to express a number of common patterns with illustrative examples. Our compilation algorithm translates the high-level query into a streaming algorithm with precise complexity bounds on per-item processing time and total memory footprint. We also show how to integrate approximation algorithms into our framework. We report on an implementation in Java, and evaluate it with respect to existing high-performance engines for processing streaming data. Our experimental evaluation shows that (1) StreamQRE allows more natural and succinct specification of queries compared to existing frameworks, (2) the throughput of our implementation is higher than comparable systems (for example, two-to-four times greater than RxJava), and (3) the approximation algorithms supported by our implementation can lead to substantial memory savings.
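
    The StreamQRE combinators themselves are a Java library; the fragment below is only a language-agnostic sketch of the evaluation model the paper aims at, namely constant work and constant extra state per item and per key (the keys and values are invented):

      from collections import defaultdict

      # One (count, sum) summary per key; O(1) work and O(1) extra state per item.
      state = defaultdict(lambda: (0, 0.0))

      def process(item):
          key, value = item
          count, total = state[key]
          state[key] = (count + 1, total + value)
          return key, state[key][1] / state[key][0]   # running mean for that key

      for item in [("sensor1", 2.0), ("sensor2", 5.0), ("sensor1", 4.0)]:
          print(process(item))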

  19. Multi-Level Anomaly Detection on Time-Varying Graph Data

    Energy Technology Data Exchange (ETDEWEB)

    Bridges, Robert A [ORNL; Collins, John P [ORNL; Ferragut, Erik M [ORNL; Laska, Jason A [ORNL; Sullivan, Blair D [ORNL

    2015-01-01

    This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating probabilities at finer levels, and these closely related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, this multi-scale analysis facilitates intuitive visualizations by allowing users to narrow focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. We demonstrate that our graph statistics-based approach outperforms both a distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. To illustrate the accessibility of information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and precision greater than 0.786.

  20. Constructs for Programming with Graph Rewrites

    OpenAIRE

    Rodgers, Peter

    2000-01-01

    Graph rewriting is becoming increasingly popular as a method for programming with graph based data structures. We present several modifications to a basic serial graph rewriting paradigm and discuss how they improve coding programs in the Grrr graph rewriting programming language. The constructs we present are once only nodes, attractor nodes and single match rewrites. We illustrate the operation of the constructs by example. The advantages of adding these new rewrite modifiers is to reduce t...

  1. Advanced SPARQL querying in small molecule databases.

    Science.gov (United States)

    Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří

    2016-01-01

    In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.
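
    As a hedged illustration of calling such an extended endpoint from a client, the snippet below uses the SPARQLWrapper library; the endpoint URL and the proc: procedure IRI are placeholders, not the actual identifiers exposed by the chemRDF service:

      from SPARQLWrapper import SPARQLWrapper, JSON

      endpoint = SPARQLWrapper("https://example.org/sparql")   # placeholder endpoint
      endpoint.setQuery("""
      PREFIX proc: <http://example.org/procedures#>   # hypothetical procedure namespace
      SELECT ?compound WHERE {
        ?compound proc:similarCompoundSearch "CC(=O)Oc1ccccc1C(=O)O" .
      }
      LIMIT 10
      """)
      endpoint.setReturnFormat(JSON)
      results = endpoint.query().convert()
      for row in results["results"]["bindings"]:
          print(row["compound"]["value"])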

  2. Fundamentals of algebraic graph transformation

    CERN Document Server

    Ehrig, Hartmut; Prange, Ulrike; Taentzer, Gabriele

    2006-01-01

    Graphs are widely used to represent structural information in the form of objects and connections between them. Graph transformation is the rule-based manipulation of graphs, an increasingly important concept in computer science and related fields. This is the first textbook treatment of the algebraic approach to graph transformation, based on algebraic structures and category theory. Part I is an introduction to the classical case of graph and typed graph transformation. In Part II basic and advanced results are first shown for an abstract form of replacement systems, so-called adhesive high-level replacement systems based on category theory, and are then instantiated to several forms of graph and Petri net transformation systems. Part III develops typed attributed graph transformation, a technique of key relevance in the modeling of visual languages and in model transformation. Part IV contains a practical case study on model transformation and a presentation of the AGG (attributed graph grammar) tool envir...

  3. Declarative Process Mining for DCR Graphs

    DEFF Research Database (Denmark)

    Debois, Søren; Hildebrandt, Thomas T.; Laursen, Paw Høvsgaard

    2017-01-01

    We investigate process mining for the declarative Dynamic Condition Response (DCR) graphs process modelling language. We contribute (a) a process mining algorithm for DCR graphs, (b) a proposal for a set of metrics quantifying output model quality, and (c) a preliminary example-based comparison...

  4. Truth Space Method for Caching Database Queries

    Directory of Open Access Journals (Sweden)

    S. V. Mosin

    2015-01-01

    Full Text Available We propose a new method of client-side data caching for relational databases with a central server and distant clients. Data are loaded into the client cache based on queries executed on the server. Every query has a corresponding DB table – the result of the query execution. These queries have a special form called "universal relational query" based on three fundamental Relational Algebra operations: selection, projection and natural join. We have to mention that such a form is the closest one to natural language and the majority of database search queries can be expressed in this way. Besides, this form allows us to analyze query correctness by checking the lossless join property. A subsequent query may be executed in a client's local cache if we can determine that the query result is entirely contained in the cache. For this we compare the truth spaces of the logical restrictions in a new user's query and in the queries whose results are stored in the cache. Such a comparison can be performed analytically, without the need for additional database queries. This method may also be used to determine which data are missing from the cache and to execute the query on the server only for those data. To do this, the analytical approach is also used, which distinguishes our paper from the existing technologies. We propose four theorems for testing the required conditions. The conditions of the first and the third theorems allow us to determine the existence of the required data in the cache. The second and the fourth theorems state conditions for executing queries with the cache only. The problem of cache data actualization is not discussed in this paper. However, it can be solved by cataloging queries on the server and serving them by triggers in background mode. The article is published in the author's wording.
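
    For the simplest case of conjunctions of range restrictions, the containment test that the authors perform analytically can be sketched as a box-inclusion check (the attribute names and intervals are illustrative only):

      # Each conjunctive restriction is modelled as a box: attribute -> (lo, hi).
      def contained(new_query, cached_query):
          # True if every tuple satisfying new_query also satisfies cached_query,
          # i.e. the truth space of new_query lies inside that of the cached one.
          for attr, (lo, hi) in cached_query.items():
              n_lo, n_hi = new_query.get(attr, (float("-inf"), float("inf")))
              if n_lo < lo or n_hi > hi:
                  return False
          return True

      cached = {"price": (0, 100)}
      new = {"price": (10, 50), "year": (2010, 2015)}
      print(contained(new, cached))   # True: the cached result can answer it locally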

  5. Improving the Usability of OCL as an Ad-hoc Model Querying Language

    DEFF Research Database (Denmark)

    Störrle, Harald

    2013-01-01

    from our research and make it accessible to the OCL community, we propose the OCL Query API (OQAPI), a library of query-predicates to improve the user-friendliness of OCL for ad-hoc querying. The usability of OQAPI is studied using controlled experiments. We find considerable evidence to support our...

  6. BJUT at TREC 2015 Microblog Track: Real Time Filtering Using Knowledge Base

    Science.gov (United States)

    2015-11-20

    all non-English language tweets are supposed to be junk, thus we reduce that once we detect a non-English Unicode character in it. • Corpus Generation...new query. According to the new query, we use it to generate a Lemur Query Parameter File. Note that we simply select the unigram Language Model with...learning to rank of tweets. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 295–303. Association for Computational

  7. Interactive exploration of large-scale time-varying data using dynamic tracking graphs

    KAUST Repository

    Widanagamaachchi, W.

    2012-10-01

    Exploring and analyzing the temporal evolution of features in large-scale time-varying datasets is a common problem in many areas of science and engineering. One natural representation of such data is tracking graphs, i.e., constrained graph layouts that use one spatial dimension to indicate time and show the "tracks" of each feature as it evolves, merges or disappears. However, for practical data sets creating the corresponding optimal graph layouts that minimize the number of intersections can take hours to compute with existing techniques. Furthermore, the resulting graphs are often unmanageably large and complex even with an ideal layout. Finally, due to the cost of the layout, changing the feature definition, e.g. by changing an iso-value, or analyzing properly adjusted sub-graphs is infeasible. To address these challenges, this paper presents a new framework that couples hierarchical feature definitions with progressive graph layout algorithms to provide an interactive exploration of dynamically constructed tracking graphs. Our system enables users to change feature definitions on-the-fly and filter features using arbitrary attributes while providing an interactive view of the resulting tracking graphs. Furthermore, the graph display is integrated into a linked view system that provides a traditional 3D view of the current set of features and allows a cross-linked selection to enable a fully flexible spatio-temporal exploration of data. We demonstrate the utility of our approach with several large-scale scientific simulations from combustion science. © 2012 IEEE.

  8. Query-Time Optimization Techniques for Structured Queries in Information Retrieval

    Science.gov (United States)

    Cartright, Marc-Allen

    2013-01-01

    The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…

  9. Contracts for Cross-organizational Workflows as Timed Dynamic Condition Response Graphs

    DEFF Research Database (Denmark)

    Hildebrandt, Thomas; Mukkamala, Raghava Rao; Slaats, Tijs

    2013-01-01

    We conservatively extend the declarative Dynamic Condition Response (DCR) Graph process model, introduced in the PhD thesis of the second author, to allow for discrete time deadlines. We prove that safety and liveness properties can be verified by mapping finite timed DCR Graphs to finite state...

  10. Neuro-symbolic representation learning on biological knowledge graphs.

    Science.gov (United States)

    Alshahrani, Mona; Khan, Mohammad Asif; Maddouri, Omar; Kinjo, Akira R; Queralt-Rosinach, Núria; Hoehndorf, Robert

    2017-09-01

    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph-structured data are becoming available, but have not yet widely been applied and evaluated on structured biological knowledge. Results: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode for related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing amount of Semantic Web based knowledge bases in biology to use in machine learning and data analytics. https://github.com/bio-ontology-research-group/walking-rdf-and-owl. robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  11. SCRY: Enabling quantitative reasoning in SPARQL queries

    NARCIS (Netherlands)

    Meroño-Peñuela, A.; Stringer, Bas; Loizou, Antonis; Abeln, Sanne; Heringa, Jaap

    2015-01-01

    The inability to include quantitative reasoning in SPARQL queries slows down the application of Semantic Web technology in the life sciences. SCRY, our SPARQL compatible service layer, improves this by executing services at query time and making their outputs query-accessible, generating RDF data on

  12. Topics in graph theory graphs and their Cartesian product

    CERN Document Server

    Imrich, Wilfried; Rall, Douglas F

    2008-01-01

    From specialists in the field, you will learn about interesting connections and recent developments in the field of graph theory by looking in particular at Cartesian products-arguably the most important of the four standard graph products. Many new results in this area appear for the first time in print in this book. Written in an accessible way, this book can be used for personal study in advanced applications of graph theory or for an advanced graph theory course.

  13. Sum of All-Pairs Shortest Path Distances in a Planar Graph in Subquadratic Time

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    2008-01-01

    We consider the problem of computing the Wiener index of a graph, defined as the sum of distances between all pairs of its vertices. It is an open problem whether the Wiener index of a planar graph can be found in subquadratic time. We solve this problem by presenting an algorithm with O(n^2*log...

  14. Quantum walk on a chimera graph

    Science.gov (United States)

    Xu, Shu; Sun, Xiangxiang; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum; Sanders, Barry C.

    2018-05-01

    We analyse a continuous-time quantum walk on a chimera graph, which is a graph of choice for designing quantum annealers, and we discover beautiful quantum walk features such as localization that starkly distinguishes classical from quantum behaviour. Motivated by technological thrusts, we study continuous-time quantum walk on enhanced variants of the chimera graph and on diminished chimera graph with a random removal of vertices. We explain the quantum walk by constructing a generating set for a suitable subgroup of graph isomorphisms and corresponding symmetry operators that commute with the quantum walk Hamiltonian; the Hamiltonian and these symmetry operators provide a complete set of labels for the spectrum and the stationary states. Our quantum walk characterization of the chimera graph and its variants yields valuable insights into graphs used for designing quantum-annealers.
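
    A continuous-time quantum walk on any graph evolves the initial state by U(t) = e^{-iAt}, with A the adjacency matrix (one common Hamiltonian choice); the toy sketch below uses a 4-cycle rather than a chimera graph, purely to show the mechanics:

      import numpy as np
      from scipy.linalg import expm

      # Adjacency matrix of a 4-cycle.
      A = np.array([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=float)

      psi0 = np.zeros(4, dtype=complex)
      psi0[0] = 1.0                              # walker starts on vertex 0

      for t in [0.5, 1.0, 2.0]:
          psi_t = expm(-1j * A * t) @ psi0       # U(t) = e^{-iAt}
          print(t, np.round(np.abs(psi_t) ** 2, 3))   # site occupation probabilities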

  15. Generic multiset programming for language-integrated querying

    DEFF Research Database (Denmark)

    Henglein, Fritz; Larsen, Ken Friis

    2010-01-01

    This paper demonstrates how relational algebraic programming based on efficient symbolic representations of multisets and operations on them can be applied to the query sublanguage of SQL in a type-safe fashion. In essence, it provides a library for naïve programming with multisets in a generalized...... SQL-style fashion, but avoids many cases of asymptotically inefficient nested iteration through cross-products....
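
    The efficiency point, replacing nested iteration through a cross-product by a key-grouped representation, can be illustrated outside SQL with a plain hash join (the relations below are invented toy data, not the paper's library API):

      from collections import defaultdict

      orders = [(1, "alice"), (2, "bob"), (3, "alice")]           # (order_id, customer)
      payments = [("alice", 10.0), ("alice", 7.5), ("bob", 3.0)]  # (customer, amount)

      # Grouping one side by the join key first means only matching pairs are
      # touched, instead of iterating over the full cross-product.
      by_customer = defaultdict(list)
      for customer, amount in payments:
          by_customer[customer].append(amount)

      joined = [(oid, customer, amount)
                for oid, customer in orders
                for amount in by_customer[customer]]
      print(joined)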

  16. A hierarchical graph neuron scheme for real-time pattern recognition.

    Science.gov (United States)

    Nasution, B B; Khan, A I

    2008-02-01

    The hierarchical graph neuron (HGN) implements a single cycle memorization and recall operation through a novel algorithmic design. The HGN is an improvement on the already published original graph neuron (GN) algorithm. In this improved approach, it recognizes incomplete/noisy patterns. It also resolves the crosstalk problem, which is identified in the previous publications, within closely matched patterns. To accomplish this, the HGN links multiple GN networks for filtering noise and crosstalk out of pattern data inputs. Intrinsically, the HGN is a lightweight in-network processing algorithm which does not require expensive floating point computations; hence, it is very suitable for real-time applications and tiny devices such as the wireless sensor networks. This paper describes that the HGN's pattern matching capability and the small response time remain insensitive to the increases in the number of stored patterns. Moreover, the HGN does not require definition of rules or setting of thresholds by the operator to achieve the desired results nor does it require heuristics entailing iterative operations for memorization and recall of patterns.

  17. Multifractal analysis of visibility graph-based Ito-related connectivity time series.

    Science.gov (United States)

    Czechowski, Zbigniew; Lovallo, Michele; Telesca, Luciano

    2016-02-01

    In this study, we investigate multifractal properties of connectivity time series resulting from the visibility graph applied to normally distributed time series generated by the Ito equations with multiplicative power-law noise. We show that multifractality of the connectivity time series (i.e., the series of numbers of links outgoing any node) increases with the exponent of the power-law noise. The multifractality of the connectivity time series could be due to the width of connectivity degree distribution that can be related to the exit time of the associated Ito time series. Furthermore, the connectivity time series are characterized by persistence, although the original Ito time series are random; this is due to the procedure of visibility graph that, connecting the values of the time series, generates persistence but destroys most of the nonlinear correlations. Moreover, the visibility graph is sensitive for detecting wide "depressions" in input time series.
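
    For reference, the natural visibility graph used here can be built with a direct O(n^2) check of the visibility criterion of Lacasa et al.; the short series below is an arbitrary example:

      def visibility_graph(series):
          # Nodes are time indices; (a, b) is an edge when every intermediate
          # sample lies strictly below the line joining (a, y_a) and (b, y_b).
          n = len(series)
          edges = []
          for a in range(n):
              for b in range(a + 1, n):
                  ya, yb = series[a], series[b]
                  if all(series[c] < yb + (ya - yb) * (b - c) / (b - a)
                         for c in range(a + 1, b)):
                      edges.append((a, b))
          return edges

      print(visibility_graph([3.0, 1.0, 2.5, 0.5, 4.0]))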

  18. Graph Algorithm Animation with Grrr

    OpenAIRE

    Rodgers, Peter; Vidal, Natalia

    2000-01-01

    We discuss geometric positioning, highlighting of visited nodes and user defined highlighting that form the algorithm animation facilities in the Grrr graph rewriting programming language. The main purpose of animation was initially for the debugging and profiling of Grrr code, but recently it has been extended for the purpose of teaching algorithms to undergraduate students. The animation is restricted to graph based algorithms such as graph drawing, list manipulation or more traditional gra...

  19. Language-Agnostic Relation Extraction from Abstracts in Wikis

    Directory of Open Access Journals (Sweden)

    Nicolas Heist

    2018-03-01

    Full Text Available Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6 M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted. In a second series of experiments, we show how the approach can be transferred to DBkWik, a knowledge graph extracted from thousands of Wikis. We discuss the challenges and first results of extracting relations from a larger set of Wikis, using a less formalized knowledge graph.

  20. Elastic Spatial Query Processing in OpenStack Cloud Computing Environment for Time-Constraint Data Analysis

    Directory of Open Access Journals (Sweden)

    Wei Huang

    2017-03-01

    Full Text Available Geospatial big data analysis (GBDA) is extremely significant for time-constraint applications such as disaster response. However, the time-constraint analysis is not yet a trivial task in the cloud computing environment. Spatial query processing (SQP) is typically computation-intensive and indispensable for GBDA, and the spatial range query, join query, and the nearest neighbor query algorithms are not scalable without using MapReduce-like frameworks. Parallel SQP algorithms (PSQPAs) are trapped in screw-processing, which is a known issue in Geoscience. To satisfy time-constrained GBDA, we propose an elastic SQP approach in this paper. First, Spark is used to implement PSQPAs. Second, Kubernetes-managed Core Operation System (CoreOS) clusters provide self-healing Docker containers for running Spark clusters in the cloud. Spark-based PSQPAs are submitted to Docker containers, where Spark master instances reside. Finally, the horizontal pod auto-scaler (HPA) would scale Docker containers out and in to supply on-demand computing resources. Combined with an auto-scaling group of virtual instances, HPA helps to find each of the five nearest neighbors for 46,139,532 query objects from 834,158 spatial data objects in less than 300 s. The experiments conducted on an OpenStack cloud demonstrate that auto-scaling containers can satisfy time-constraint GBDA in clouds.

  1. Continuous-Time Classical and Quantum Random Walk on Direct Product of Cayley Graphs

    International Nuclear Information System (INIS)

    Salimi, S.; Jafarizadeh, M. A.

    2009-01-01

    In this paper we define the direct product of graphs and give a recipe for obtaining the probability of observing the particle on vertices in the continuous-time classical and quantum random walk. In the recipe, the probability of observing the particle on the direct product graph is obtained by multiplying the probabilities on the corresponding sub-graphs; this method is useful for determining the probability of a walk on complicated graphs. Using this method, we calculate the probability of continuous-time classical and quantum random walks on many finite direct-product Cayley graphs (complete cycle, complete graph K_n, charter and n-cube). Also, we observe that for the classical walk the stationary uniform distribution is reached as t → ∞, but for the quantum walk this is not always the case. (general)

  2. PyDecay/GraphPhys: A Unified Language and Storage System for Particle Decay Process Descriptions

    Energy Technology Data Exchange (ETDEWEB)

    Dunietz, Jesse N.; /MIT /SLAC

    2011-06-22

    To ease the tasks of Monte Carlo (MC) simulation and event reconstruction (i.e. inferring particle-decay events from experimental data) for long-term BaBar data preservation and analysis, the following software components have been designed: a language ('GraphPhys') for specifying decay processes, common to both simulation and data analysis, allowing arbitrary parameters on particles, decays, and entire processes; an automated visualization tool to show graphically what decays have been specified; and a searchable database storage mechanism for decay specifications. Unlike HepML, a proposed XML standard for HEP metadata, the specification language is designed not for data interchange between computer systems, but rather for direct manipulation by human beings as well as computers. The components are interoperable: the information parsed from files in the specification language can easily be rendered as an image by the visualization package, and conversion between decay representations was implemented. Several proof-of-concept command-line tools were built based on this framework. Applications include building easier and more efficient interfaces to existing analysis tools for current projects (e.g. BaBar/BESII), providing a framework for analyses in future experimental settings (e.g. LHC/SuperB), and outreach programs that involve giving students access to BaBar data and analysis tools to give them a hands-on feel for scientific analysis.

  3. Querying temporal databases via OWL 2 QL

    CSIR Research Space (South Africa)

    Klarman, S

    2014-06-01

    Full Text Available SQL:2011, the most recently adopted version of the SQL query language, has unprecedentedly standardized the representation of temporal data in relational databases. Following the successful paradigm of ontology-based data access, we develop a...

  4. Querying UML models using OCL and Prolog: A performance study

    NARCIS (Netherlands)

    Chimiak-Opoka, J.; Felderer, M.; Lenz, C.; Lange, C.F.J.

    2008-01-01

    The size of unified modeling language (UML) models used in practice is very large and ranges up to hundreds and thousands of classes. Querying of these models is used to support their quality assessment by information filtering and aggregating. For both, human cognition and automated analysis, there

  5. Practical querying of temporal data via OWL 2 QL and SQL: 2011

    CSIR Research Space (South Africa)

    Klarman, S

    2013-12-01

    Full Text Available We develop a practical approach to querying temporal data stored in temporal SQL:2011 databases through the semantic layer of OWL 2 QL ontologies. An interval-based temporal query language (TQL), which we propose for this task, is defined via...

  6. Formal language constrained path problems

    Energy Technology Data Exchange (ETDEWEB)

    Barrett, C.; Jacob, R.; Marathe, M.

    1997-07-08

    In many path finding problems arising in practice, certain patterns of edge/vertex labels in the labeled graph being traversed are allowed/preferred, while others are disallowed. Motivated by such applications as intermodal transportation planning, the authors investigate the complexity of finding feasible paths in a labeled network, where the mode choice for each traveler is specified by a formal language. The main contributions of this paper include the following: (1) the authors show that the problem of finding a shortest path between a source and destination for a traveler whose mode choice is specified as a context free language is solvable efficiently in polynomial time, when the mode choice is specified as a regular language they provide algorithms with improved space and time bounds; (2) in contrast, they show that the problem of finding simple paths between a source and a given destination is NP-hard, even when restricted to very simple regular expressions and/or very simple graphs; (3) for the class of treewidth bounded graphs, they show that (i) the problem of finding a regular language constrained simple path between source and a destination is solvable in polynomial time and (ii) the extension to finding context free language constrained simple paths is NP-complete. Several extensions of these results are presented in the context of finding shortest paths with additional constraints. These results significantly extend the results in [MW95]. As a corollary of the results, they obtain a polynomial time algorithm for the BEST k-SIMILAR PATH problem studied in [SJB97]. The previous best algorithm was given by [SJB97] and takes exponential time in the worst case.
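
    The polynomial-time result for regular-language constraints is typically realized by searching the product of the labeled graph with a DFA for the language; the sketch below (a toy graph and a toy "at most one bus leg" language, both assumptions) illustrates that construction with Dijkstra's algorithm:

      import heapq

      # Directed, edge-labelled graph: u -> list of (label, v, weight).
      graph = {
          "s": [("walk", "a", 2.0), ("bus", "b", 1.0)],
          "a": [("bus", "t", 1.0)],
          "b": [("walk", "t", 5.0)],
          "t": [],
      }

      # DFA for "walk* bus walk*": state 0 = no bus yet, state 1 = bus taken.
      dfa_delta = {(0, "walk"): 0, (0, "bus"): 1, (1, "walk"): 1}
      dfa_start, dfa_accept = 0, {1}

      def constrained_shortest_path(source, target):
          # Dijkstra on the product of the graph and the DFA.
          dist = {(source, dfa_start): 0.0}
          heap = [(0.0, source, dfa_start)]
          while heap:
              d, u, q = heapq.heappop(heap)
              if u == target and q in dfa_accept:
                  return d
              if d > dist.get((u, q), float("inf")):
                  continue
              for label, v, w in graph[u]:
                  if (q, label) in dfa_delta:
                      q2 = dfa_delta[(q, label)]
                      if d + w < dist.get((v, q2), float("inf")):
                          dist[(v, q2)] = d + w
                          heapq.heappush(heap, (d + w, v, q2))
          return None

      print(constrained_shortest_path("s", "t"))   # 3.0: walk s->a, then bus a->t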

  7. A Query Evaluation Approach using Opinions of Turkish Financial Market Professionals

    Directory of Open Access Journals (Sweden)

    Bora Uğurlu

    2015-08-01

    Full Text Available People who do not have expertise in the financial area may not see the relationship between numerical and linguistic data. In our study, a knowledge discovery approach using Turkish natural language processing is recommended in order to respond to meaningful queries and classify them with high accuracy. The query corpus consists of randomly selected unique keywords. Quantitative evaluation is done in order to measure the classification performance. Experimental results indicate that our proposed approach is sufficiently consistent and able to make categorical classifications correctly. The approach highlights the relationship between numerical and linguistic data obtained from the Turkish financial market.

  8. Mapping Between Semantic Graphs and Sentences in Grammar Induction System

    Directory of Open Access Journals (Sweden)

    Laszlo Kovacs

    2010-06-01

    Full Text Available The proposed transformation module performs mapping between two different knowledge representation forms used in grammar induction systems. The kernel knowledge representation form is a special predicate centered conceptual graph called ECG. The ECG provides a semantic-based, language independent description of the environment. The other base representation form is some kind of language. The sentences of the language should meet the corresponding grammatical rules. The pilot project demonstrates the functionality of a translator module using this transformation engine between the ECG graph and the Hungarian language.

  9. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit; Harrison, Rob; Pawar, Ankita; Birkner, Rü diger; Canini, Marco; Feamster, Nick; Rexford, Jennifer; Willinger, Walter

    2017-01-01

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  10. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit

    2017-05-02

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  11. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    We present here an approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach is motivated by an obvious need for users to survey huge volumes of objects in query answers. An ontology formalism and a special notion of "instantiated ontology" are introduced. The latter is a structure reflecting the content of the document collection in that it is a restriction of a general world knowledge ontology to the concepts instantiated in the collection. The notion of ontology-based similarity is briefly described, language constructs for direct navigation and retrieval of concepts in the ontology are discussed, and approaches to conceptual summarization are presented.

  12. Analysis of queries sent to PubMed at the point of care: Observation of search behaviour in a medical teaching hospital

    Science.gov (United States)

    Hoogendam, Arjen; Stalenhoef, Anton FH; Robbé, Pieter F de Vries; Overbeke, A John PM

    2008-01-01

    Background The use of PubMed to answer daily medical care questions is limited because it is challenging to retrieve a small set of relevant articles and time is restricted. Knowing what aspects of queries are likely to retrieve relevant articles can increase the effectiveness of PubMed searches. The objectives of our study were to identify queries that are likely to retrieve relevant articles by relating PubMed search techniques and tools to the number of articles retrieved and the selection of articles for further reading. Methods This was a prospective observational study of queries regarding patient-related problems sent to PubMed by residents and internists in internal medicine working in an Academic Medical Centre. We analyzed queries, search results, query tools (Mesh, Limits, wildcards, operators), selection of abstract and full-text for further reading, using a portal that mimics PubMed. Results PubMed was used to solve 1121 patient-related problems, resulting in 3205 distinct queries. Abstracts were viewed in 999 (31%) of these queries, and in 126 (39%) of 321 queries using query tools. The average term count per query was 2.5. Abstracts were selected in more than 40% of queries using four or five terms, increasing to 63% if the use of four or five terms yielded 2–161 articles. Conclusion Queries sent to PubMed by physicians at our hospital during daily medical care contain fewer than three terms. Queries using four to five terms, retrieving less than 161 article titles, are most likely to result in abstract viewing. PubMed search tools are used infrequently by our population and are less effective than the use of four or five terms. Methods to facilitate the formulation of precise queries, using more relevant terms, should be the focus of education and research. PMID:18816391

  13. Dynamic Planar Range Maxima Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Tsakalidis, Konstantinos

    2011-01-01

    We consider the dynamic two-dimensional maxima query problem. Let P be a set of n points in the plane. A point is maximal if it is not dominated by any other point in P. We describe two data structures that support the reporting of the t maximal points that dominate a given query point, and allow for insertions and deletions of points in P. In the pointer machine model we present a linear space data structure with O(log n + t) worst case query time and O(log n) worst case update time. This is the first dynamic data structure for the planar maxima dominance query problem that achieves these bounds... are integers in the range U = {0, …, 2^w − 1}. We present a linear space data structure that supports 3-sided range maxima queries in O(log n / log log n + t) worst case time and updates in O(log n / log log n) worst case time. These are the first sublogarithmic worst case bounds for all operations in the RAM model.

  14. Fundamental group of dual graphs and applications to quantum space time

    International Nuclear Information System (INIS)

    Nada, S.I.; Hamouda, E.H.

    2009-01-01

    Let G be a connected planar graph with n vertices and m edges. It is known that the fundamental group of G has 1 -(n - m) generators. In this paper, we show that if G is a self-dual graph, then its fundamental group has (n - 1) generators. We indicate that these results are relevant to quantum space time.
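
    A short check of the stated counts, using the standard fact that the fundamental group of a connected graph is free of rank m − n + 1 and assuming G is a connected self-dual plane graph:

      \beta_1(G) = m - n + 1 = 1 - (n - m) \quad \text{(connected graph)}
      n - m + f = 2 \ \text{(Euler's formula)}, \qquad f = n \ \text{(self-duality)}
      \Rightarrow\; m = 2n - 2 \;\Rightarrow\; \beta_1(G) = (2n - 2) - n + 1 = n - 1 .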

  15. Time series analysis of the developed financial markets' integration using visibility graphs

    Science.gov (United States)

    Zhuang, Enyu; Small, Michael; Feng, Gang

    2014-09-01

    A time series representing the developed financial markets' segmentation from 1973 to 2012 is studied. The time series reveals an obvious market integration trend. To further uncover the features of this time series, we divide it into seven windows and generate seven visibility graphs. The measuring capabilities of the visibility graphs provide means to quantitatively analyze the original time series. It is found that the important historical incidents that influenced market integration coincide with variations in the measured graphical node degree. Through the measure of neighborhood span, the frequencies of the historical incidents are disclosed. Moreover, it is also found that large "cycles" and significant noise in the time series are linked to large and small communities in the generated visibility graphs. For large cycles, how historical incidents significantly affected market integration is distinguished by density and compactness of the corresponding communities.

  16. MetricForensics: A Multi-Level Approach for Mining Volatile Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Henderson, Keith [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Eliassi-Rad, Tina [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Faloutsos, Christos [Carnegie Mellon Univ., Pittsburgh, PA (United States); Akoglu, Leman [Carnegie Mellon Univ., Pittsburgh, PA (United States); Li, Lei [Carnegie Mellon Univ., Pittsburgh, PA (United States); Maruhashi, Koji [Fujitsu Laboratories Ltd., Kanagawa (Japan); Prakash, B. Aditya [Carnegie Mellon Univ., Pittsburgh, PA (United States); Tong, H [Carnegie Mellon Univ., Pittsburgh, PA (United States)

    2010-02-08

    Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns and anomalies is desired. We propose MetricForensics, a scalable framework for analysis of volatile graphs. MetricForensics combines a multi-level "drill down" approach, a collection of user-selected graph metrics and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at a finer temporal resolution. In this way, MetricForensics scales to highly volatile graphs by only allocating resources for computationally expensive analysis when an interesting event is discovered at a coarser resolution first. We test MetricForensics on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has ≈3M vertices and ≈32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of MetricForensics in analyzing volatile graphs; and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and strange stars.

  17. Interactive Graph Layout of a Million Nodes

    Directory of Open Access Journals (Sweden)

    Peng Mi

    2016-12-01

    Full Text Available Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph topology. This algorithm can interactively layout graphs with millions of nodes, and support real-time interaction to explore alternative graph layouts. Users can directly manipulate the layout of vertices in a force-directed fashion. The complexity of traditional repulsive force computation is reduced by approximating calculations based on the hierarchical structure of multi-level clustered graphs. We evaluate the algorithm performance, and demonstrate human-in-the-loop layout in two sensemaking case studies. Moreover, we summarize lessons learned for designing interactive large graph layout algorithms on the GPU.
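
    The per-iteration rule underneath is the classical force-directed scheme (all pairs repel, adjacent pairs attract); the naive O(n^2) CPU sketch below shows one such step; the paper's contribution is running this on the GPU with a hierarchical approximation of the repulsive term, which this sketch does not attempt:

      import math, random

      def layout_step(pos, edges, k=1.0, step=0.05):
          # One iteration of a naive force-directed layout.
          disp = {v: [0.0, 0.0] for v in pos}
          nodes = list(pos)
          for i, u in enumerate(nodes):           # repulsion between every pair
              for v in nodes[i + 1:]:
                  dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                  d = math.hypot(dx, dy) or 1e-9
                  f = k * k / d
                  disp[u][0] += f * dx / d; disp[u][1] += f * dy / d
                  disp[v][0] -= f * dx / d; disp[v][1] -= f * dy / d
          for u, v in edges:                      # attraction along edges
              dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
              d = math.hypot(dx, dy) or 1e-9
              f = d * d / k
              disp[u][0] -= f * dx / d; disp[u][1] -= f * dy / d
              disp[v][0] += f * dx / d; disp[v][1] += f * dy / d
          for v in pos:
              pos[v] = (pos[v][0] + step * disp[v][0], pos[v][1] + step * disp[v][1])

      pos = {v: (random.random(), random.random()) for v in "abcd"}
      for _ in range(100):
          layout_step(pos, [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
      print(pos)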

  18. Characterizing graphs of maximum matching width at most 2

    DEFF Research Database (Denmark)

    Jeong, Jisu; Ok, Seongmin; Suh, Geewon

    2017-01-01

    The maximum matching width is a width-parameter that is defined on a branch-decomposition over the vertex set of a graph. The size of a maximum matching in the bipartite graph is used as a cut-function. In this paper, we characterize the graphs of maximum matching width at most 2 using the minor o...

  19. Professional XMPP Programming with JavaScript and jQuery

    CERN Document Server

    Moffitt, Jack

    2010-01-01

    Create real-time, highly interactive apps quickly with the powerful XMPP protocol. XMPP is a robust protocol used for a wide range of applications, including instant messaging, multi-user chat, voice and video conferencing, collaborative spaces, real-time gaming, data synchronization, and search. This book teaches you how to harness the power of XMPP in your own apps and presents you with all the tools you need to build the next generation of apps using XMPP or add new features to your current apps. Featuring the JavaScript language throughout and making use of the jQuery library, the book con

  20. Query responses

    Directory of Open Access Journals (Sweden)

    Paweł Łupkowski

    2017-05-01

    Full Text Available In this article we consider the phenomenon of answering a query with a query. Although such answers are common, no large scale, corpus-based characterization exists, with the exception of clarification requests. After briefly reviewing different theoretical approaches on this subject, we present a corpus study of query responses in the British National Corpus and develop a taxonomy for query responses. We point at a variety of response categories that have not been formalized in previous dialogue work, particularly those relevant to adversarial interaction. We show that different response categories have significantly different rates of subsequent answer provision. We provide a formal analysis of the response categories in the framework of KoS.

  1. Web-based topology queries on a BIM model

    DEFF Research Database (Denmark)

    Rasmussen, Mads Holten; Hviid, Christian Anker; Karlshøj, Jan

    Building Information Modeling (BIM) is often confused in the industry with 3D modelling, regardless of the fact that the potential of modelling information goes way beyond performing clash detections on geometrical objects occupying the same physical space. Lately, several research projects have tried to change...... that by extending BIM with information using linked data technologies. However, when showing information alone the strong communication benefits of 3D are neglected, and a practical way of connecting the two worlds is currently missing. In this paper, we present a prototype of a visual query interface running...... is to establish a baseline for discussion of the general design choices that have been considered, and the developed application further serves as a proof of concept for combining BIM model data with a knowledge graph and potentially other sources of Linked Open Data, in a simple web interface.

  2. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    Science.gov (United States)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  3. SPARTex: A Vertex-Centric Framework for RDF Data Analytics

    KAUST Repository

    Abdelaziz, Ibrahim

    2015-08-31

    A growing number of applications require combining SPARQL queries with generic graph search on RDF data. However, the lack of procedural capabilities in SPARQL makes it inappropriate for graph analytics. Moreover, RDF engines focus on SPARQL query evaluation whereas graph management frameworks perform only generic graph computations. In this work, we bridge the gap by introducing SPARTex, an RDF analytics framework based on the vertex-centric computation model. In SPARTex, user-defined vertex centric programs can be invoked from SPARQL as stored procedures. SPARTex allows the execution of a pipeline of graph algorithms without the need for multiple reads/writes of input data and intermediate results. We use a cost-based optimizer for minimizing the communication cost. SPARTex evaluates queries that combine SPARQL and generic graph computations orders of magnitude faster than existing RDF engines. We demonstrate a real system prototype of SPARTex running on a local cluster using real and synthetic datasets. SPARTex has a real-time graphical user interface that allows the participants to write regular SPARQL queries, use our proposed SPARQL extension to declaratively invoke graph algorithms or combine/pipeline both SPARQL querying and generic graph analytics.
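
    The vertex-centric programs that SPARTex invokes follow the Pregel-style superstep model; as a rough illustration of that model (not SPARTex's actual API), a single-source shortest-path vertex program can be sketched as follows:

      # Pregel-style SSSP: each superstep, a vertex applies incoming messages
      # and, if its value improved, sends updated distances to its neighbours.
      graph = {"A": {"B": 1, "C": 4}, "B": {"C": 2}, "C": {}}
      value = {v: float("inf") for v in graph}
      messages = {"A": [0]}                       # seed the source vertex

      while messages:
          next_messages = {}
          for v, incoming in messages.items():
              candidate = min(incoming)
              if candidate < value[v]:            # vertex program: keep the minimum
                  value[v] = candidate
                  for w, weight in graph[v].items():
                      next_messages.setdefault(w, []).append(candidate + weight)
          messages = next_messages

      print(value)   # {'A': 0, 'B': 1, 'C': 3}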

  4. Accelerating SPARQL Queries and Analytics on RDF Data

    KAUST Repository

    Al-Harbi, Razen

    2016-11-09

    The complexity of SPARQL queries and RDF applications poses great challenges on distributed RDF management systems. SPARQL workloads are dynamic and consist of queries with variable complexities. Hence, systems that use static partitioning suffer from communication overhead for workloads that generate excessive communication. Concurrently, RDF applications are becoming more sophisticated, mandating analytical operations that extend beyond SPARQL queries. Being primarily designed and optimized to execute SPARQL queries, which lack procedural capabilities, existing systems are not suitable for rich RDF analytics. This dissertation tackles the problem of accelerating SPARQL queries and RDF analytics on distributed shared-nothing RDF systems. First, a distributed RDF engine, coined AdPart, is introduced. AdPart uses lightweight hash partitioning for sharding triples using their subject values, rendering its startup overhead very low. The locality-aware query optimizer of AdPart takes full advantage of the partitioning to (i) support the fully parallel processing of join patterns on subjects and (ii) minimize data communication for general queries by applying hash distribution of intermediate results instead of broadcasting, wherever possible. By exploiting hash-based locality, AdPart achieves better or comparable performance to systems that employ sophisticated partitioning schemes. To cope with workload dynamism, AdPart is extended to dynamically adapt to workload changes. AdPart monitors the data access patterns and dynamically redistributes and replicates the instances of the most frequent patterns among workers. Consequently, the communication cost for future queries is drastically reduced or even eliminated. Experiments with synthetic and real data verify that AdPart starts faster than all existing systems and gracefully adapts to the query load. Finally, to support and accelerate rich RDF analytical tasks, a vertex-centric RDF analytics framework is

  5. CORECLUSTER: A Degeneracy Based Graph Clustering Framework

    OpenAIRE

    Giatsidis , Christos; Malliaros , Fragkiskos; Thilikos , Dimitrios M. ,; Vazirgiannis , Michalis

    2014-01-01

    International audience; Graph clustering or community detection constitutes an important task for investigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such as spectral methods, typically suffer from high time and space complexity. In this article, we present CoreCluster, an efficient graph clustering framework based on the concept of graph degeneracy, that can be used along with any known graph clusteri...
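
    Graph degeneracy is obtained by repeatedly peeling a minimum-degree vertex; a minimal, non-optimized sketch of the core decomposition that such a framework builds on (the example graph is arbitrary):

      def core_numbers(adj):
          # The value a vertex has when it is peeled is its core number;
          # the graph's degeneracy is the maximum core number.
          degree = {v: len(nbrs) for v, nbrs in adj.items()}
          removed, core, current = set(), {}, 0
          while degree:
              v = min(degree, key=degree.get)
              current = max(current, degree[v])
              core[v] = current
              removed.add(v)
              del degree[v]
              for w in adj[v]:
                  if w not in removed:
                      degree[w] -= 1
          return core

      adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
      print(core_numbers(adj))   # triangle vertices lie in the 2-core, vertex 4 in the 1-core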

  6. Using Behavior Over Time Graphs to Spur Systems Thinking Among Public Health Practitioners.

    Science.gov (United States)

    Calancie, Larissa; Anderson, Seri; Branscomb, Jane; Apostolico, Alexsandra A; Lich, Kristen Hassmiller

    2018-02-01

    Public health practitioners can use Behavior Over Time (BOT) graphs to spur discussion and systems thinking around complex challenges. Multiple large systems, such as health care, the economy, and education, affect chronic disease rates in the United States. System thinking tools can build public health practitioners' capacity to understand these systems and collaborate within and across sectors to improve population health. BOT graphs show a variable, or variables (y axis) over time (x axis). Although analyzing trends is not new to public health, drawing BOT graphs, annotating the events and systemic forces that are likely to influence the depicted trends, and then discussing the graphs in a diverse group provides an opportunity for public health practitioners to hear each other's perspectives and creates a more holistic understanding of the key factors that contribute to a trend. We describe how BOT graphs are used in public health, how they can be used to generate group discussion, and how this process can advance systems-level thinking. Then we describe how BOT graphs were used with groups of maternal and child health (MCH) practitioners and partners (N = 101) during a training session to advance their thinking about MCH challenges. Eighty-six percent of the 84 participants who completed an evaluation agreed or strongly agreed that they would use this BOT graph process to engage stakeholders in their home states and jurisdictions. The BOT graph process we describe can be applied to a variety of public health issues and used by practitioners, stakeholders, and researchers.

  7. Exponential-Time Algorithms and Complexity of NP-Hard Graph Problems

    DEFF Research Database (Denmark)

    Taslaman, Nina Sofia

    NP-hard problems are deemed highly unlikely to be solvable in polynomial time. Still, one can often find algorithms that are substantially faster than brute force solutions. This thesis concerns such algorithms for problems from graph theory: techniques for constructing and improving this type of algorithms, as well as investigations into how far such improvements can get under reasonable assumptions. The first part is concerned with detection of cycles in graphs, especially parameterized generalizations of Hamiltonian cycles. A remarkably simple Monte Carlo algorithm is presented, and with high probability any found solution is shortest possible. Moreover, the algorithm can be used to find a cycle of given parity through the specified elements. The second part concerns the hardness of problems encoded as evaluations of the Tutte polynomial at some fixed point in the rational plane...

  8. Quantum complexity of graph and algebraic problems

    International Nuclear Information System (INIS)

    Doern, Sebastian

    2008-01-01

    This thesis is organized as follows: In Chapter 2 we give some basic notations, definitions and facts from linear algebra, graph theory, group theory and quantum computation. In Chapter 3 we describe three important methods for the construction of quantum algorithms. We present the quantum search algorithm by Grover, the quantum amplitude amplification and the quantum walk search technique by Magniez et al. These three tools are the basis for the development of our new quantum algorithms for graph and algebra problems. In Chapter 4 we present two tools for proving quantum query lower bounds. We present the quantum adversary method by Ambainis and the polynomial method introduced by Beals et al. The quantum adversary tool is very useful to prove good lower bounds for many graph and algebra problems. The part of the thesis containing the original results is organized in two parts. In the first part we consider the graph problems. In Chapter 5 we give a short summary of known quantum graph algorithms. In Chapters 6 to 8 we study the complexity of our new algorithms for matching problems, graph traversal and independent set problems on quantum computers. In the second part of our thesis we present new quantum algorithms for algebraic problems. In Chapters 9 and 10 we consider group testing problems and prove quantum complexity bounds for important problems from linear algebra. (orig.)

  9. Quantum complexity of graph and algebraic problems

    Energy Technology Data Exchange (ETDEWEB)

    Doern, Sebastian

    2008-02-04

    This thesis is organized as follows: In Chapter 2 we give some basic notations, definitions and facts from linear algebra, graph theory, group theory and quantum computation. In Chapter 3 we describe three important methods for the construction of quantum algorithms. We present the quantum search algorithm by Grover, the quantum amplitude amplification and the quantum walk search technique by Magniez et al. These three tools are the basis for the development of our new quantum algorithms for graph and algebra problems. In Chapter 4 we present two tools for proving quantum query lower bounds. We present the quantum adversary method by Ambainis and the polynomial method introduced by Beals et al. The quantum adversary tool is very useful to prove good lower bounds for many graph and algebra problems. The part of the thesis containing the original results is organized in two parts. In the first part we consider the graph problems. In Chapter 5 we give a short summary of known quantum graph algorithms. In Chapter 6 to 8 we study the complexity of our new algorithms for matching problems, graph traversal and independent set problems on quantum computers. In the second part of our thesis we present new quantum algorithms for algebraic problems. In Chapter 9 to 10 we consider group testing problems and prove quantum complexity bounds for important problems from linear algebra. (orig.)

  10. A Distributed Approach to Continuous Monitoring of Constrained k-Nearest Neighbor Queries in Road Networks

    Directory of Open Access Journals (Sweden)

    Hyung-Ju Cho

    2012-01-01

    Full Text Available Given two positive parameters k and r, a constrained k-nearest neighbor (CkNN) query returns the k closest objects within a network distance r of the query location in road networks. In terms of the scalability of monitoring these CkNN queries, existing solutions based on central processing at a server suffer from a sudden and sharp rise in server load as well as messaging cost as the number of queries increases. In this paper, we propose a distributed and scalable scheme called DAEMON for the continuous monitoring of CkNN queries in road networks. Our query processing is distributed among clients (query objects) and the server. Specifically, the server evaluates CkNN queries issued at intersections of road segments, retrieves the objects on the road segments between neighboring intersections, and sends responses to the query objects. Finally, each client makes its own query result using this server response. As a result, our distributed scheme achieves close-to-optimal communication costs and scales well to large numbers of monitoring queries. Exhaustive experimental results demonstrate that our scheme substantially outperforms its competitor in terms of query processing time and messaging cost.
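
    The core CkNN query itself can be sketched with a bounded graph search. Below is a minimal, centralized illustration (not the distributed DAEMON protocol) of returning up to k objects within network distance r on a toy road network; the node names, edge weights, and object placements are hypothetical.

```python
# A minimal, centralized sketch of a constrained k-nearest neighbor (CkNN) query on a
# road network modeled as a weighted graph; the distributed DAEMON scheme is not
# reproduced here. Node ids, edge weights, and object placements are hypothetical.
import heapq

def cknn(graph, objects_at, query_node, k, r):
    """Return up to k objects within network distance r of query_node."""
    dist = {query_node: 0.0}
    heap = [(0.0, query_node)]
    found = []
    while heap and len(found) < k:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")) or d > r:
            continue
        for obj in objects_at.get(node, []):
            found.append((obj, d))
            if len(found) == k:
                break
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd <= r and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return sorted(found, key=lambda x: x[1])[:k]

# Tiny road network: node -> [(neighbor, segment length), ...]
road = {"A": [("B", 2.0), ("C", 5.0)],
        "B": [("A", 2.0), ("C", 1.5), ("D", 4.0)],
        "C": [("A", 5.0), ("B", 1.5), ("D", 2.5)],
        "D": [("B", 4.0), ("C", 2.5)]}
objects = {"B": ["cafe1"], "C": ["cafe2"], "D": ["cafe3"]}
print(cknn(road, objects, "A", k=2, r=6.0))   # [('cafe1', 2.0), ('cafe2', 3.5)]
```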

  11. A weak zero-one law for sequences of random distance graphs

    Energy Technology Data Exchange (ETDEWEB)

    Zhukovskii, Maksim E [M. V. Lomonosov Moscow State University, Faculty of Mechanics and Mathematics, Moscow (Russian Federation)

    2012-07-31

    We study zero-one laws for properties of random distance graphs. Properties written in a first-order language are considered. For p(N) such that pN^α → ∞ as N → ∞, and (1-p)N^α → ∞ as N → ∞ for any α > 0, we succeed in refuting the law. In this connection, we consider a weak zero-one j-law. For this law, we obtain results for random distance graphs which are similar to the assertions concerning the classical zero-one law for random graphs. Bibliography: 18 titles.

  12. Evaluation of Static JavaScript Call Graph Algorithms

    NARCIS (Netherlands)

    J.-J. Dijkstra (Jorryt-Jan)

    2014-01-01

    This thesis consists of a replication study in which two algorithms to compute JavaScript call graphs have been implemented and evaluated. Existing IDE support for JavaScript is hampered due to the dynamic nature of the language. Previous studies partially solve call graph computation

  13. JavaScript & jQuery The Missing Manual

    CERN Document Server

    McFarland, David

    2011-01-01

    JavaScript lets you supercharge your HTML with animation, interactivity, and visual effects, but many web designers find the language hard to learn. This jargon-free guide covers JavaScript basics and shows you how to save time and effort with the jQuery library of prewritten JavaScript code. You'll soon be building web pages that feel and act like desktop programs, without having to do much programming. The important stuff you need to know: Make your pages interactive. Create JavaScript events that react to visitor actions. Use animations and effects. Build drop-down navigation menus, pop-ups

  14. Designing and Implementing a Cross-Language Information Retrieval System Using Linguistic Corpora

    Directory of Open Access Journals (Sweden)

    Amin Nezarat

    2012-03-01

    Full Text Available Information retrieval (IR) is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval (CLIR) refers to a kind of information retrieval in which the language of the query and that of the searched document are different. In fact, it is a retrieval process where the user presents queries in one language to retrieve documents in another language. This paper tried to construct a bilingual lexicon of parallel chunks of English and Persian from two very large monolingual corpora and an English-Persian parallel corpus, which could be directly applied to cross-language information retrieval tasks. For this purpose, a statistical measure known as Association Score (AS) was used to compute the association value between every two corresponding chunks in the corpus using a couple of complicated algorithms. Once the CLIR system was developed using this bilingual lexicon, an experiment was performed on a set of one hundred English and Persian phrases and collocations to see to what extent this system was effective in assisting the users find the most relevant and suitable equivalents of their queries in either language.
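
    The lexicon-building step can be illustrated with a few lines of Python. The paper's exact Association Score formula is not reproduced here; the sketch below uses a Dice-style co-occurrence score as a stand-in, and the aligned chunk pairs are invented for illustration.

```python
# A hedged sketch of building a tiny bilingual lexicon by scoring co-occurring chunk
# pairs across aligned sentences. A Dice-style score stands in for the paper's
# Association Score (AS); the aligned sentence pairs below are invented.
from collections import Counter
from itertools import product

aligned = [
    (["information retrieval", "system"], ["bazyabi ettelaat", "samane"]),
    (["information retrieval", "query"], ["bazyabi ettelaat", "porsojoo"]),
    (["query", "system"], ["porsojoo", "samane"]),
]

en_freq, fa_freq, pair_freq = Counter(), Counter(), Counter()
for en_chunks, fa_chunks in aligned:
    en_freq.update(en_chunks)
    fa_freq.update(fa_chunks)
    pair_freq.update(product(en_chunks, fa_chunks))

def association_score(en, fa):
    # Dice-style association between an English and a Persian chunk.
    return 2.0 * pair_freq[(en, fa)] / (en_freq[en] + fa_freq[fa])

lexicon = {}
for en in en_freq:
    best = max(fa_freq, key=lambda fa: association_score(en, fa))
    lexicon[en] = best
print(lexicon)   # each English chunk mapped to its highest-scoring Persian chunk
```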

  15. Combining Vertex-centric Graph Processing with SPARQL for Large-scale RDF Data Analytics

    KAUST Repository

    Abdelaziz, Ibrahim; Al-Harbi, Mohammad Razen; Salihoglu, Semih; Kalnis, Panos

    2017-01-01

    , but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to support programs that combine seamlessly generic graph algorithms (e.g., PageRank, Shortest Paths, etc.) with SPARQL queries

  16. Graph processing platforms at scale: practices and experiences

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Seung-Hwan [ORNL; Lee, Sangkeun (Matt) [ORNL; Brown, Tyler C [ORNL; Sukumar, Sreenivas R [ORNL; Ganesh, Gautam [ORNL

    2015-01-01

    Graph analysis unveils hidden associations of data in many phenomena and artifacts, such as road networks, social networks, genomic information, and scientific collaboration. Unfortunately, a wide diversity in the characteristics of graphs and graph operations makes it challenging to find the right combination of tools and implementation of algorithms to discover desired knowledge from the target data set. This study presents an extensive empirical study of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmarked each platform using three popular graph operations, degree distribution, connected components, and PageRank, over a variety of real-world graphs. Our experiments show that each graph processing platform shows different strengths, depending on the type of graph operation. While Urika performs the best in non-iterative operations like degree distribution, GraphX outperforms the others in iterative operations like connected components and PageRank. In addition, we discuss challenges in optimizing the performance of each platform over large-scale real-world graphs.
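
    Two of the benchmarked operations are simple enough to sketch directly. The toy example below shows degree distribution and PageRank on an in-memory edge list; the platforms compared in the study run the same logic at far larger scale, and the edge list here is invented.

```python
# A small sketch of degree distribution and PageRank on an in-memory edge list.
from collections import Counter, defaultdict

edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "a")]
nodes = {u for e in edges for u in e}

# Degree distribution: how many nodes have each out-degree.
out_deg = Counter(u for u, _ in edges)
degree_distribution = Counter(out_deg.get(n, 0) for n in nodes)
print(degree_distribution)           # Counter({1: 3, 2: 1})

# PageRank by power iteration with damping factor 0.85.
succ = defaultdict(list)
for u, v in edges:
    succ[u].append(v)
rank = {n: 1.0 / len(nodes) for n in nodes}
for _ in range(50):
    new = {n: (1 - 0.85) / len(nodes) for n in nodes}
    for u in nodes:
        for v in succ[u]:
            new[v] += 0.85 * rank[u] / len(succ[u])
    rank = new
print({n: round(r, 3) for n, r in sorted(rank.items())})
```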

  17. Interactive Graph Layout of a Million Nodes

    OpenAIRE

    Peng Mi; Maoyuan Sun; Moeti Masiane; Yong Cao; Chris North

    2016-01-01

    Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph to...

  18. Graphs as a Problem-Solving Tool in 1-D Kinematics

    Science.gov (United States)

    Desbien, Dwain M.

    2008-01-01

    In this age of the microcomputer-based lab (MBL), students are quite accustomed to looking at graphs of position, velocity, and acceleration versus time. A number of textbooks argue convincingly that the slope of the velocity graph gives the acceleration, the area under the velocity graph yields the displacement, and the area under the…

  19. Geospatial-temporal semantic graph representations of trajectories from remote sensing and geolocation data

    Science.gov (United States)

    Perkins, David Nikolaus; Brost, Randolph; Ray, Lawrence P.

    2017-08-08

    Various technologies for facilitating analysis of large remote sensing and geolocation datasets to identify features of interest are described herein. A search query can be submitted to a computing system that executes searches over a geospatial temporal semantic (GTS) graph to identify features of interest. The GTS graph comprises nodes corresponding to objects described in the remote sensing and geolocation datasets, and edges that indicate geospatial or temporal relationships between pairs of nodes. Trajectory information is encoded in the GTS graph by the inclusion of movable nodes to facilitate searches for features of interest in the datasets relative to moving objects such as vehicles.

  20. Polynomial-time computability of the edge-reliability of graphs using Gilbert's formula

    Directory of Open Access Journals (Sweden)

    Thomas J. Marlowe

    1998-01-01

    Full Text Available Reliability is an important consideration in analyzing computer and other communication networks, but current techniques are extremely limited in the classes of graphs which can be analyzed efficiently. While Gilbert's formula establishes a theoretically elegant recursive relationship between the edge reliability of a graph and the reliability of its subgraphs, naive evaluation requires consideration of all sequences of deletions of individual vertices, and for many graphs has time complexity essentially Θ(N!). We discuss a general approach which significantly reduces complexity, encoding subgraph isomorphism in a finer partition by invariants, and recursing through the set of invariants.
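
    To make the cost being avoided concrete, the following is a brute-force baseline sketch (not Gilbert's formula) that enumerates all 2^m edge subsets to compute all-terminal edge reliability; the graph and edge-operation probability are hypothetical.

```python
# Brute-force all-terminal edge reliability: probability that the graph stays connected
# when each edge fails independently. Enumerating all 2^m subsets is exactly the kind of
# cost the paper's approach avoids; graph and probability below are hypothetical.
from itertools import combinations

def connected(n, edges):
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)}) == 1

def edge_reliability(n, edges, p):
    """p = probability that an individual edge is operational."""
    m = len(edges)
    total = 0.0
    for k in range(m + 1):
        for surviving in combinations(edges, k):
            if connected(n, surviving):
                total += p ** k * (1 - p) ** (m - k)
    return total

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(edge_reliability(4, cycle4, 0.9))   # p^4 + 4*p^3*(1-p) = 0.9477
```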

  1. Quantum walks on quotient graphs

    International Nuclear Information System (INIS)

    Krovi, Hari; Brun, Todd A.

    2007-01-01

    A discrete-time quantum walk on a graph Γ is the repeated application of a unitary evolution operator to a Hilbert space corresponding to the graph. If this unitary evolution operator has an associated group of symmetries, then for certain initial states the walk will be confined to a subspace of the original Hilbert space. Symmetries of the original graph, given by its automorphism group, can be inherited by the evolution operator. We show that a quantum walk confined to the subspace corresponding to this symmetry group can be seen as a different quantum walk on a smaller quotient graph. We give an explicit construction of the quotient graph for any subgroup H of the automorphism group and illustrate it with examples. The automorphisms of the quotient graph which are inherited from the original graph are the original automorphism group modulo the subgroup H used to construct it. The quotient graph is constructed by removing the symmetries of the subgroup H from the original graph. We then analyze the behavior of hitting times on quotient graphs. Hitting time is the average time it takes a walk to reach a given final vertex from a given initial vertex. It has been shown in earlier work [Phys. Rev. A 74, 042334 (2006)] that the hitting time for certain initial states of a quantum walks can be infinite, in contrast to classical random walks. We give a condition which determines whether the quotient graph has infinite hitting times given that they exist in the original graph. We apply this condition for the examples discussed and determine which quotient graphs have infinite hitting times. All known examples of quantum walks with hitting times which are short compared to classical random walks correspond to systems with quotient graphs much smaller than the original graph; we conjecture that the existence of a small quotient graph with finite hitting times is necessary for a walk to exhibit a quantum speedup

  2. Man vs. Machine: Differences in SPARQL Queries

    NARCIS (Netherlands)

    Rietveld, L.; Hoekstra, R.

    2014-01-01

    Server-side SPARQL query logs have been a topic of study for some time now. The USEWOD collection of query logs is currently the primary source of information for researchers. A recurring problem is that these logs leave application queries and queries created by humans indistinguishable. In this

  3. Research in Mobile Database Query Optimization and Processing

    Directory of Open Access Journals (Sweden)

    Agustinus Borgy Waluyo

    2005-01-01

    Full Text Available The emergence of mobile computing provides the ability to access information at any time and place. However, as mobile computing environments have inherent factors like power, storage, asymmetric communication cost, and bandwidth limitations, efficient query processing and minimum query response time are definitely of great interest. This survey groups a variety of query optimization and processing mechanisms in mobile databases into two main categories, namely: (i) query processing strategy, and (ii) caching management strategy. Query processing includes both pull and push operations (broadcast mechanisms). We further classify push operation into on-demand broadcast and periodic broadcast. Push operation (on-demand broadcast) relates to designing techniques that enable the server to accommodate multiple requests so that the request can be processed efficiently. Push operation (periodic broadcast) corresponds to data dissemination strategies. In this scheme, several techniques to improve the query performance by broadcasting data to a population of mobile users are described. A caching management strategy defines a number of methods for maintaining cached data items in clients' local storage. This strategy considers critical caching issues such as caching granularity, caching coherence strategy and caching replacement policy. Finally, this survey concludes with several open issues relating to mobile query optimization and processing strategy.

  4. jQuery for designers beginner's guide

    CERN Document Server

    MacLees, Natalie

    2014-01-01

    A step-by-step guide that spices up your web pages and designs them in the way you want using the most widely used JavaScript library, jQuery. The beginner-friendly and easy-to-understand approach of the book will help you get to grips with jQuery in no time. If you know the fundamentals of HTML and CSS, and want to extend your knowledge by learning to use JavaScript, then this is just the book for you. jQuery makes JavaScript straightforward and approachable - you'll be surprised at how easy it can be to add animations and special effects to your beautifully designed pages.

  5. A Tool for Query Optimization in Oracle Using SQL Restructuring

    Directory of Open Access Journals (Sweden)

    Darlis Heru Murti

    2006-07-01

    Full Text Available A query is the part of the SQL (Structured Query Language) programming language that serves to retrieve (read) data in a DBMS (Database Management System), including Oracle [3]. In Oracle, query execution involves three processing stages: Parsing, Execute, and Fetch. Before the execute step is run, Oracle first builds an execution plan that becomes the scenario for the execute step. During query execution there are factors that affect query performance, among them the access path (the way data is read from a table) and the join operation (the way data from two tables is combined). Obtaining a query with optimal performance therefore requires careful consideration of these factors. Query optimization is a way to obtain a query with performance that is as optimal as possible, especially from the standpoint of execution time. There are many methods for optimizing queries, but in this study the authors built an application that optimizes queries by restructuring the SQL statement. In this method, the object of analysis is the structure of the clauses that make up a query. The application has one input and five kinds of output. The input is a query, while the five kinds of output are the optimized query, improvement suggestions, suggestions for creating new indexes, the execution plan, and statistical data. The application works in four stages: decomposing the query into subqueries, decomposing the query clause by clause, determining the access path and join operation, and restructuring the query. In a series of tests conducted by the authors, the application performed according to the aim of this study, namely obtaining queries with optimal performance. Keywords: Query, SQL, DBMS, Oracle, Parsing, Execute, Fetch, Execution Plan, Access Path, Join Operation, SQL Statement Restructuring.

  6. Lazy Toggle PRM: A single-query approach to motion planning

    KAUST Repository

    Denny, Jory

    2013-05-01

    Probabilistic RoadMaps (PRMs) are quite successful in solving complex and high-dimensional motion planning problems. While particularly suited for multiple-query scenarios and expansive spaces, they lack efficiency in both solving single-query scenarios and mapping narrow spaces. Two PRM variants separately tackle these gaps. Lazy PRM reduces the computational cost of roadmap construction for single-query scenarios by delaying roadmap validation until query time. Toggle PRM is well suited for mapping narrow spaces by mapping both Cfree and Cobst, which gives certain theoretical benefits. However, fully validating the two resulting roadmaps can be costly. We present a strategy, Lazy Toggle PRM, for integrating these two approaches into a method which is both suited for narrow passages and efficient single-query calculations. This simultaneously addresses two challenges of PRMs. Like Lazy PRM, Lazy Toggle PRM delays validation of roadmaps until query time, but if no path is found, the algorithm augments the roadmap using the Toggle PRM methodology. We demonstrate the effectiveness of Lazy Toggle PRM in a wide range of scenarios, including those with narrow passages and high descriptive complexity (e.g., those described by many triangles), concluding that it is more effective than existing methods in solving difficult queries. © 2013 IEEE.

  7. An approach to multiscale modelling with graph grammars.

    Science.gov (United States)

    Ong, Yongzhi; Streit, Katarína; Henke, Michael; Kurth, Winfried

    2014-09-01

    Functional-structural plant models (FSPMs) simulate biological processes at different spatial scales. Methods exist for multiscale data representation and modification, but the advantages of using multiple scales in the dynamic aspects of FSPMs remain unclear. Results from multiscale models in various other areas of science that share fundamental modelling issues with FSPMs suggest that potential advantages do exist, and this study therefore aims to introduce an approach to multiscale modelling in FSPMs. A three-part graph data structure and grammar is revisited, and presented with a conceptual framework for multiscale modelling. The framework is used for identifying roles, categorizing and describing scale-to-scale interactions, thus allowing alternative approaches to model development as opposed to correlation-based modelling at a single scale. Reverse information flow (from macro- to micro-scale) is catered for in the framework. The methods are implemented within the programming language XL. Three example models are implemented using the proposed multiscale graph model and framework. The first illustrates the fundamental usage of the graph data structure and grammar, the second uses probabilistic modelling for organs at the fine scale in order to derive crown growth, and the third combines multiscale plant topology with ozone trends and metabolic network simulations in order to model juvenile beech stands under exposure to a toxic trace gas. The graph data structure supports data representation and grammar operations at multiple scales. The results demonstrate that multiscale modelling is a viable method in FSPM and an alternative to correlation-based modelling. Advantages and disadvantages of multiscale modelling are illustrated by comparisons with single-scale implementations, leading to motivations for further research in sensitivity analysis and run-time efficiency for these models.

  8. Implementing and evaluating a regional strategy to improve testing rates in VA patients at risk for HIV, utilizing the QUERI process as a guiding framework: QUERI Series.

    Science.gov (United States)

    Goetz, Matthew B; Bowman, Candice; Hoang, Tuyen; Anaya, Henry; Osborn, Teresa; Gifford, Allen L; Asch, Steven M

    2008-03-19

    We describe how we used the framework of the U.S. Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) to develop a program to improve rates of diagnostic testing for the Human Immunodeficiency Virus (HIV). This venture was prompted by the observation by the CDC that 25% of HIV-infected patients do not know their diagnosis - a point of substantial importance to the VA, which is the largest provider of HIV care in the United States. Following the QUERI steps (or process), we evaluated: 1) whether undiagnosed HIV infection is a high-risk, high-volume clinical issue within the VA, 2) whether there are evidence-based recommendations for HIV testing, 3) whether there are gaps in the performance of VA HIV testing, and 4) the barriers and facilitators to improving current practice in the VA.Based on our findings, we developed and initiated a QUERI step 4/phase 1 pilot project using the precepts of the Chronic Care Model. Our improvement strategy relies upon electronic clinical reminders to provide decision support; audit/feedback as a clinical information system, and appropriate changes in delivery system design. These activities are complemented by academic detailing and social marketing interventions to achieve provider activation. Our preliminary formative evaluation indicates the need to ensure leadership and team buy-in, address facility-specific barriers, refine the reminder, and address factors that contribute to inter-clinic variances in HIV testing rates. Preliminary unadjusted data from the first seven months of our program show 3-5 fold increases in the proportion of at-risk patients who are offered HIV testing at the VA sites (stations) where the pilot project has been undertaken; no change was seen at control stations. This project demonstrates the early success of the application of the QUERI process to the development of a program to improve HIV testing rates. Preliminary unadjusted results show that the coordinated use of

  9. Query transformations and their role in Web searching by the members of the general public

    Directory of Open Access Journals (Sweden)

    Martin Whittle

    2006-01-01

    Full Text Available Introduction. This paper reports preliminary research in a primarily experimental study of how the general public search for information on the Web. The focus is on the query transformation patterns that characterise searching. Method. In this work, we have used transaction logs from the Excite search engine to develop methods for analysing query transformations that should aid the analysis of our ongoing experimental work. Our methods involve the use of similarity techniques to link queries with the most similar previous query in a train. The resulting query transformations are represented as a list of codes representing a whole search. Analysis. It is shown how query transformation sequences can be represented as graphical networks and some basic statistical results are shown. A correlation analysis is performed to examine the co-occurrence of Boolean and quotation mark changes with the syntactic changes. Results. A frequency analysis of the occurrence of query transformation codes is presented. The connectivity of graphs obtained from the query transformation is investigated and found to follow an exponential scaling law. The correlation analysis reveals a number of patterns that provide some interesting insights into Web searching by the general public. Conclusion. We have developed analytical methods based on query similarity that can be applied to our current experimental work with volunteer subjects. The results of these will form part of a database with the aim of developing an improved understanding of how the public search the Web.
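
    The linking step described in the Method section can be sketched compactly. The session, the similarity measure (Jaccard over terms), and the transformation codes below are invented stand-ins for the paper's actual coding scheme.

```python
# A rough sketch of query-transformation linking: each query in a session is linked to
# the most similar previous query (Jaccard similarity over terms) and the change is
# coded. Session and code labels are hypothetical.
def jaccard(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def classify(prev, cur):
    p, c = set(prev.split()), set(cur.split())
    if p == c:
        return "repeat"
    if p < c:
        return "specialize"   # terms only added
    if c < p:
        return "generalize"   # terms only removed
    return "reformulate"      # terms both added and removed

session = ["cheap flights", "cheap flights paris", "flights paris",
           "hotels paris", "cheap flights paris"]
codes = []
for i, query in enumerate(session[1:], start=1):
    # Link to the most similar earlier query in the train.
    j = max(range(i), key=lambda k: jaccard(session[k], query))
    codes.append((session[j], query, classify(session[j], query)))
for prev, cur, code in codes:
    print(f"{prev!r} -> {cur!r}: {code}")
```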

  10. Occupation times distribution for Brownian motion on graphs

    CERN Document Server

    Desbois, J

    2002-01-01

    Considering a Brownian motion on a general graph, we study the joint law for the occupation times on all the bonds. In particular, we show that the Laplace transform of this distribution can be expressed as the ratio of two determinants. We give two formulations, with arc or vertex matrices, for this result and discuss a simple example. (letter to the editor)

  11. What Would a Graph Look Like in this Layout? A Machine Learning Approach to Large Graph Visualization.

    Science.gov (United States)

    Kwon, Oh-Hyun; Crnovrsanin, Tarik; Ma, Kwan-Liu

    2018-01-01

    Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a "good" layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.
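
    The "compare graphs by kernel instead of laying them out" idea can be illustrated with a standard kernel. The sketch below uses a Weisfeiler-Lehman subtree kernel as a stand-in (the paper's own kernels differ) to score topological similarity between two small graphs.

```python
# A condensed sketch of topological similarity via a Weisfeiler-Lehman subtree kernel.
from collections import Counter

def wl_features(adj, iterations=3):
    """adj: node -> set of neighbors. Returns a bag of WL labels."""
    labels = {v: str(len(adj[v])) for v in adj}          # initial label: degree
    features = Counter(labels.values())
    for _ in range(iterations):
        labels = {v: labels[v] + "|" + ",".join(sorted(labels[u] for u in adj[v]))
                  for v in adj}
        features.update(labels.values())
    return features

def kernel(adj1, adj2):
    f1, f2 = wl_features(adj1), wl_features(adj2)
    dot = sum(f1[k] * f2[k] for k in f1)
    norm = (sum(v * v for v in f1.values()) * sum(v * v for v in f2.values())) ** 0.5
    return dot / norm if norm else 0.0

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(kernel(triangle, triangle))   # 1.0: identical structure
print(kernel(triangle, path))       # lower: different structure
```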

  12. A seminar on graph theory

    CERN Document Server

    Harary, Frank

    2015-01-01

    Presented in 1962-63 by experts at University College, London, these lectures offer a variety of perspectives on graph theory. Although the opening chapters form a coherent body of graph theoretic concepts, this volume is not a text on the subject but rather an introduction to the extensive literature of graph theory. The seminar's topics are geared toward advanced undergraduate students of mathematics. Lectures by this volume's editor, Frank Harary, include "Some Theorems and Concepts of Graph Theory," "Topological Concepts in Graph Theory," "Graphical Reconstruction," and other introduc

  13. Query containment in entity SQL

    OpenAIRE

    Rull Fort, Guillem; Bernstein, Philip A.; Garcia dos Santos, Ivo; Katsis, Yannis; Melnik, Sergey; Teniente López, Ernest

    2013-01-01

    We describe a software architecture we have developed for a constructive containment checker of Entity SQL queries defined over extended ER schemas expressed in Microsoft's Entity Data Model. Our application of interest is compilation of object-to-relational mappings for Microsoft's ADO.NET Entity Framework, which has been shipping since 2007. The supported language includes several features which have been individually addressed in the past but, to the best of our knowledge, they have not be...

  14. Infinitesimal deformations of Poisson bi-vectors using the Kontsevich graph calculus

    Science.gov (United States)

    Buring, Ricardo; Kiselev, Arthemy V.; Rutten, Nina

    2018-02-01

    Let \\mathscr{P} be a Poisson structure on a finite-dimensional affine real manifold. Can \\mathscr{P} be deformed in such a way that it stays Poisson? The language of Kontsevich graphs provides a universal approach - with respect to all affine Poisson manifolds - to finding a class of solutions to this deformation problem. For that reasoning, several types of graphs are needed. In this paper we outline the algorithms to generate those graphs. The graphs that encode deformations are classified by the number of internal vertices k; for k ≤ 4 we present all solutions of the deformation problem. For k ≥ 5, first reproducing the pentagon-wheel picture suggested at k = 6 by Kontsevich and Willwacher, we construct the heptagon-wheel cocycle that yields a new unique solution without 2-loops and tadpoles at k = 8.

  15. Computing the Maximum Detour of a Plane Graph in Subquadratic Time

    DEFF Research Database (Denmark)

    Wulff-Nilsen, Christian

    Let G be a plane graph where each edge is a line segment. We consider the problem of computing the maximum detour of G, defined as the maximum over all pairs of distinct points p and q of G of the ratio between the distance between p and q in G and the distance |pq|. The fastest known algorithm for this problem has O(n^2) running time. We show how to obtain O(n^{3/2}*(log n)^3) expected running time. We also show that if G has bounded treewidth, its maximum detour can be computed in O(n*(log n)^3) expected time.
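
    As a point of reference for the quantity being computed, the following is a naive baseline sketch of maximum detour restricted to graph vertices (the paper's quantity ranges over all points of the segments, and its subquadratic algorithm is more involved). The plane graph below is a hypothetical example; edges are line segments weighted by Euclidean length.

```python
# Naive vertex-to-vertex maximum detour: graph distance divided by Euclidean distance.
import heapq, math

points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
edges = [(0, 1), (1, 2), (2, 3)]          # a path along three sides of a unit square

adj = {v: [] for v in points}
for u, v in edges:
    w = math.dist(points[u], points[v])
    adj[u].append((v, w))
    adj[v].append((u, w))

def dijkstra(src):
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

max_detour = 0.0
for u in points:
    dist = dijkstra(u)
    for v in points:
        if v != u:
            max_detour = max(max_detour, dist[v] / math.dist(points[u], points[v]))
print(max_detour)   # 3.0: graph distance 3 between nodes 0 and 3, Euclidean distance 1
```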

  16. XML Graphs in Program Analysis

    DEFF Research Database (Denmark)

    Møller, Anders; Schwartzbach, Michael Ignatieff

    2007-01-01

    XML graphs have been shown to be a simple and effective formalism for representing sets of XML documents in program analysis. It has evolved through a six year period with variants tailored for a range of applications. We present a unified definition, outline the key properties including validation of XML graphs against different XML schema languages, and provide a software package that enables others to make use of these ideas. We also survey four very different applications: XML in Java, Java Servlets and JSP, transformations between XML and non-XML data, and XSLT.

  17. A generative blog post retrieval model that uses query expansion based on external collections

    NARCIS (Netherlands)

    Weerkamp, W.; Balog, K.; de Rijke, M.; Su, K.-Y.; Su, J.; Wiebe, J.; Li, H.

    2009-01-01

    User generated content is characterized by short, noisy documents, with many spelling errors and unexpected language usage. To bridge the vocabulary gap between the user's information need and documents in a specific user generated content environment, the blogosphere, we apply a form of query

  18. jQuery UI cookbook

    CERN Document Server

    Boduch, Adam

    2013-01-01

    Filled with a practical collection of recipes, jQuery UI Cookbook is full of clear, step-by-step instructions that will help you harness the powerful UI framework in jQuery. Depending on your needs, you can dip in and out of the Cookbook and its recipes, or follow the book from start to finish.If you are a jQuery UI developer looking to improve your existing applications, extract ideas for your new application, or to better understand the overall widget architecture, then jQuery UI Cookbook is a must-have for you. The reader should at least have a rudimentary understanding of what jQuery UI is

  19. Higher-Order Minimal Functional Graphs

    DEFF Research Database (Denmark)

    Jones, Neil D; Rosendahl, Mads

    1994-01-01

    We present a minimal function graph semantics for a higher-order functional language with applicative evaluation order. The semantics captures the intermediate calls performed during the evaluation of a program. This information may be used in abstract interpretation as a basis for proving...

  20. Instant jQuery selectors

    CERN Document Server

    De Rosa, Aurelio

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. Instant jQuery Selectors follows a simple how-to format with recipes aimed at making you well versed with the wide range of selectors that jQuery has to offer through a myriad of examples.Instant jQuery Selectors is for web developers who want to delve into jQuery from its very starting point: selectors. Even if you're already familiar with the framework and its selectors, you could find several tips and tricks that you aren't aware of, especially about performance and how jQuery ac

  1. Integration, Provenance, and Temporal Queries for Large-Scale Knowledge Bases

    OpenAIRE

    Gao, Shi

    2016-01-01

    Knowledge bases that summarize web information in RDF triples deliver many benefits, including support for natural language question answering and powerful structured queries that extract encyclopedic knowledge via SPARQL. Large scale knowledge bases grow rapidly in terms of scale and significance, and undergo frequent changes in both schema and content. Two critical problems have thus emerged: (i) how to support temporal queries that explore the history of knowledge bases or flash-back to th...

  2. Performance of Distributed Query Optimization in Client/Server Systems

    NARCIS (Netherlands)

    Skowronek, J.; Blanken, Henk; Wilschut, A.N.

    The design, implementation and performance of an optimizer for a nested query language is considered. The optimizer operates in a client/server environment, in particular an Intranet setting. The paper deals with the scalability challenge by tackling the load of many clients by allocating optimizer

  3. Conceptual structures: Information processing in mind and machine

    Energy Technology Data Exchange (ETDEWEB)

    Sowa, J.F.

    1983-01-01

    The broad coverage in this book combines material from philosophy, linguistics, psychology, and artificial intelligence. As a unifying force, the book focuses on conceptual graphs - networks of concepts and conceptual relations developed over the past twenty years. Important topics include: philosophical issues of knowledge and meaning; cognitive psychology in relation to linguistics and artificial intelligence; conceptual graphs as a knowledge representation language; applications for formal logic, plausible reasoning, heuristics, and computation; natural language parsing, generation, and pragmatics; knowledge-based systems, conceptual analysis, and database design and query; limits of conceptualization and unsolved problems. To accommodate readers from various disciplines, this book is designed to be read at different levels. Contents: Philosophical basis; Psychological evidence; Conceptual graphs; Reasoning and computation; Language; Knowledge engineering; Epilog; Limits of conceptualization; Appendices; Index.

  4. PAQ: Persistent Adaptive Query Middleware for Dynamic Environments

    Science.gov (United States)

    Rajamani, Vasanth; Julien, Christine; Payton, Jamie; Roman, Gruia-Catalin

    Pervasive computing applications often entail continuous monitoring tasks, issuing persistent queries that return continuously updated views of the operational environment. We present PAQ, a middleware that supports applications' needs by approximating a persistent query as a sequence of one-time queries. PAQ introduces an integration strategy abstraction that allows composition of one-time query responses into streams representing sophisticated spatio-temporal phenomena of interest. A distinguishing feature of our middleware is the realization that the suitability of a persistent query's result is a function of the application's tolerance for accuracy weighed against the associated overhead costs. In PAQ, programmers can specify an inquiry strategy that dictates how information is gathered. Since network dynamics impact the suitability of a particular inquiry strategy, PAQ associates an introspection strategy with a persistent query, that evaluates the quality of the query's results. The result of introspection can trigger application-defined adaptation strategies that alter the nature of the query. PAQ's simple API makes developing adaptive querying systems easily realizable. We present the key abstractions, describe their implementations, and demonstrate the middleware's usefulness through application examples and evaluation.
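
    The loop described above (one-time queries folded into a running view, with introspection driving adaptation) can be sketched schematically. All strategy choices, thresholds, and the simulated sensor readings below are hypothetical placeholders, not PAQ's actual API.

```python
# A schematic sketch of approximating a persistent query as a sequence of one-time
# queries: an integration strategy folds responses into a running view, an introspection
# strategy judges result quality, and an adaptation strategy changes the inquiry rate.
import random, time

def one_time_query():
    # Stand-in for querying nearby nodes once; returns (reading, fraction of replies).
    return random.gauss(20.0, 2.0), random.uniform(0.5, 1.0)

def persistent_query(rounds=10, period=1.0, min_coverage=0.7):
    view = []
    for _ in range(rounds):
        reading, coverage = one_time_query()
        view.append(reading)                        # integration: append to the stream
        if coverage < min_coverage:                 # introspection: low reply coverage
            period = max(0.1, period / 2)           # adaptation: query more aggressively
        else:
            period = min(2.0, period * 1.2)         # relax when results look adequate
        print(f"reading={reading:.1f} coverage={coverage:.2f} next period={period:.2f}s")
        time.sleep(period)
    return view

if __name__ == "__main__":
    persistent_query(rounds=5)
```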

  5. GfaPy: a flexible and extensible software library for handling sequence graphs in Python.

    Science.gov (United States)

    Gonnella, Giorgio; Kurtz, Stefan

    2017-10-01

    GFA 1 and GFA 2 are recently defined formats for representing sequence graphs, such as assembly, variation or splicing graphs. The formats are adopted by several software tools. Here, we present GfaPy, a software package for creating, parsing and editing GFA graphs using the programming language Python. GfaPy supports GFA 1 and GFA 2, using the same interface and allows for interconversion between both formats. The software package provides a simple interface for custom record types, which is an important new feature of GFA 2 (compared to GFA 1). This enables new applications of the format. GfaPy is available open source at https://github.com/ggonnella/gfapy and installable via pip.
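
    A brief usage sketch follows, based on the pattern in GfaPy's documentation (gfapy.Gfa, add_line, the segments property, and from_file); the exact attribute names should be treated as assumptions to check against the library's docs.

```python
# A brief sketch of creating and querying a small GFA 1 graph with GfaPy.
import gfapy

gfa = gfapy.Gfa()
gfa.add_line("S\tseg1\tACGTACGT")          # segment with a sequence
gfa.add_line("S\tseg2\tTTGCA")
gfa.add_line("L\tseg1\t+\tseg2\t+\t3M")    # dovetail overlap between the two segments

for seg in gfa.segments:
    print(seg.name, len(seg.sequence))      # iterate over segment records

print(str(gfa))                             # serialize back to GFA text
# An existing file could be loaded with: gfa = gfapy.Gfa.from_file("graph.gfa")
```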

  6. A Graph Summarization Algorithm Based on RFID Logistics

    Science.gov (United States)

    Sun, Yan; Hu, Kongfa; Lu, Zhipeng; Zhao, Li; Chen, Ling

    Radio Frequency Identification (RFID) applications are set to play an essential role in object tracking and supply chain management systems. The volume of data generated by a typical RFID application will be enormous as each item will generate a complete history of all the individual locations that it occupied at every point in time. The movement trails of such RFID data form a gigantic commodity flow graph representing the locations and durations of the path stages traversed by each item. In this paper, we use graphs to construct a warehouse of RFID commodity flows, and introduce a database-style operation to summarize graphs, which produces a summary graph by grouping nodes based on user-selected node attributes, and further allows users to control the hierarchy of summaries. It can cut down the size of graphs, and makes it convenient for users to study just the shrunk graph they are interested in. Through extensive experiments, we demonstrate the effectiveness and efficiency of the proposed method.
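
    The attribute-based summarization operation itself is easy to sketch: nodes of an item-flow graph are grouped by user-selected attributes and edges are aggregated between the resulting groups. The location records and attribute names below are invented for illustration.

```python
# A compact sketch of graph summarization by grouping nodes on chosen attributes.
from collections import Counter

# node -> attributes, and edges as (from_node, to_node) movements of RFID-tagged items
nodes = {"dock1": {"site": "warehouse", "type": "dock"},
         "shelfA": {"site": "warehouse", "type": "shelf"},
         "shelfB": {"site": "warehouse", "type": "shelf"},
         "store7": {"site": "store", "type": "backroom"}}
edges = [("dock1", "shelfA"), ("dock1", "shelfB"), ("shelfA", "store7"),
         ("shelfB", "store7")]

def summarize(nodes, edges, group_by):
    """Group nodes on the chosen attributes; edge weights count collapsed movements."""
    group_of = {n: tuple(attrs[a] for a in group_by) for n, attrs in nodes.items()}
    summary_nodes = set(group_of.values())
    summary_edges = Counter((group_of[u], group_of[v]) for u, v in edges)
    return summary_nodes, summary_edges

# Coarse summary by site only, then a finer one by site and location type,
# illustrating user control over the hierarchy of summaries.
print(summarize(nodes, edges, ["site"]))
print(summarize(nodes, edges, ["site", "type"]))
```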

  7. Optimal Planar Orthogonal Skyline Counting Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Larsen, Kasper Green

    2014-01-01

    counting queries, i.e. given a query rectangle R to report the size of the skyline of P ∩ R. We present a data structure for storing n points with integer coordinates having query time O(lg n / lg lg n) and space usage O(n). The model of computation is a unit cost RAM with logarithmic word size. We prove...

  8. A study of medical and health queries to web search engines.

    Science.gov (United States)

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i). comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii). comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii). medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i). a small percentage of web queries are medical or health related, (ii). the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii). over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggests some implications for the use of the general web search engines when seeking medical/health information.

  9. A three-colour graph as a complete topological invariant for gradient-like diffeomorphisms of surfaces

    International Nuclear Information System (INIS)

    Grines, V Z; Pochinka, O V; Kapkaeva, S Kh

    2014-01-01

    In a paper of Oshemkov and Sharko, three-colour graphs were used to make the topological equivalence of Morse-Smale flows on surfaces obtained by Peixoto more precise. In the present paper, in the language of three-colour graphs equipped with automorphisms, we obtain a complete (including realization) topological classification of gradient-like cascades on surfaces. Bibliography: 25 titles

  10. A three-colour graph as a complete topological invariant for gradient-like diffeomorphisms of surfaces

    Energy Technology Data Exchange (ETDEWEB)

    Grines, V Z; Pochinka, O V [N.I. Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod (Russian Federation); Kapkaeva, S Kh [N.P. Ogarev Mordovian State University, Saransk (Russian Federation)

    2014-10-31

    In a paper of Oshemkov and Sharko, three-colour graphs were used to make the topological equivalence of Morse-Smale flows on surfaces obtained by Peixoto more precise. In the present paper, in the language of three-colour graphs equipped with automorphisms, we obtain a complete (including realization) topological classification of gradient-like cascades on surfaces. Bibliography: 25 titles.

  11. Implementing and evaluating a regional strategy to improve testing rates in VA patients at risk for HIV, utilizing the QUERI process as a guiding framework: QUERI Series

    Directory of Open Access Journals (Sweden)

    Osborn Teresa

    2008-03-01

    Full Text Available Abstract Background We describe how we used the framework of the U.S. Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) to develop a program to improve rates of diagnostic testing for the Human Immunodeficiency Virus (HIV). This venture was prompted by the observation by the CDC that 25% of HIV-infected patients do not know their diagnosis – a point of substantial importance to the VA, which is the largest provider of HIV care in the United States. Methods Following the QUERI steps (or process), we evaluated: 1) whether undiagnosed HIV infection is a high-risk, high-volume clinical issue within the VA, 2) whether there are evidence-based recommendations for HIV testing, 3) whether there are gaps in the performance of VA HIV testing, and 4) the barriers and facilitators to improving current practice in the VA. Based on our findings, we developed and initiated a QUERI step 4/phase 1 pilot project using the precepts of the Chronic Care Model. Our improvement strategy relies upon electronic clinical reminders to provide decision support; audit/feedback as a clinical information system, and appropriate changes in delivery system design. These activities are complemented by academic detailing and social marketing interventions to achieve provider activation. Results Our preliminary formative evaluation indicates the need to ensure leadership and team buy-in, address facility-specific barriers, refine the reminder, and address factors that contribute to inter-clinic variances in HIV testing rates. Preliminary unadjusted data from the first seven months of our program show 3–5 fold increases in the proportion of at-risk patients who are offered HIV testing at the VA sites (stations) where the pilot project has been undertaken; no change was seen at control stations. Discussion This project demonstrates the early success of the application of the QUERI process to the development of a program to improve HIV testing rates

  12. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Directory of Open Access Journals (Sweden)

    Toni U Wagner

    Full Text Available Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  13. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Science.gov (United States)

    Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred

    2011-01-01

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  14. The conceptual basis of mathematics in cardiology: (I) algebra, functions and graphs.

    Science.gov (United States)

    Bates, Jason H T; Sobel, Burton E

    2003-02-01

    This is the first in a series of four articles developed for the readers of Coronary Artery Disease. Without language, ideas cannot be articulated. What may not be so immediately obvious is that they cannot be formulated either. One of the essential languages of cardiology is mathematics. Unfortunately, medical education does not emphasize, and in fact, often neglects empowering physicians to think mathematically. Reference to statistics, conditional probability, multicompartmental modeling, algebra, calculus and transforms is common but often without provision of genuine conceptual understanding. At the University of Vermont College of Medicine, Professor Bates developed a course designed to address these deficiencies. The course covered mathematical principles pertinent to clinical cardiovascular and pulmonary medicine and research. It focused on fundamental concepts to facilitate formulation and grasp of ideas. This series of four articles was developed to make the material available for a wider audience. The articles will be published sequentially in Coronary Artery Disease. Beginning with fundamental axioms and basic algebraic manipulations they address algebra, function and graph theory, real and complex numbers, calculus and differential equations, mathematical modeling, linear system theory and integral transforms and statistical theory. The principles and concepts they address provide the foundation needed for in-depth study of any of these topics. Perhaps of even more importance, they should empower cardiologists and cardiovascular researchers to utilize the language of mathematics in assessing the phenomena of immediate pertinence to diagnosis, pathophysiology and therapeutics. The presentations are interposed with queries (by Coronary Artery Disease, abbreviated as CAD) simulating the nature of interactions that occurred during the course itself. Each article concludes with one or more examples illustrating application of the concepts covered to cardiovascular medicine and

  15. WQO is Decidable for Factorial Languages

    KAUST Repository

    Atminas, Aistis; Lozin, Vadim; Moshkov, Mikhail

    2017-01-01

    The question whether a factorial language given by a finite antidictionary is well-quasi-ordered under the factor containment relation can be solved in polynomial time. We also discuss possible ways to extend our solution to permutations and graphs.

  16. Hierarchical graphs for rule-based modeling of biochemical systems

    Directory of Open Access Journals (Sweden)

    Hu Bin

    2011-02-01

    Full Text Available Abstract Background In rule-based modeling, graphs are used to represent molecules: a colored vertex represents a component of a molecule, a vertex attribute represents the internal state of a component, and an edge represents a bond between components. Components of a molecule share the same color. Furthermore, graph-rewriting rules are used to represent molecular interactions. A rule that specifies addition (removal) of an edge represents a class of association (dissociation) reactions, and a rule that specifies a change of a vertex attribute represents a class of reactions that affect the internal state of a molecular component. A set of rules comprises an executable model that can be used to determine, through various means, the system-level dynamics of molecular interactions in a biochemical system. Results For purposes of model annotation, we propose the use of hierarchical graphs to represent structural relationships among components and subcomponents of molecules. We illustrate how hierarchical graphs can be used to naturally document the structural organization of the functional components and subcomponents of two proteins: the protein tyrosine kinase Lck and the T cell receptor (TCR) complex. We also show that computational methods developed for regular graphs can be applied to hierarchical graphs. In particular, we describe a generalization of Nauty, a graph isomorphism and canonical labeling algorithm. The generalized version of the Nauty procedure, which we call HNauty, can be used to assign canonical labels to hierarchical graphs or more generally to graphs with multiple edge types. The difference between the Nauty and HNauty procedures is minor, but for completeness, we provide an explanation of the entire HNauty algorithm. Conclusions Hierarchical graphs provide more intuitive formal representations of proteins and other structured molecules with multiple functional components than do the regular graphs of current languages for

  17. Advanced Query and Data Mining Capabilities for MaROS

    Science.gov (United States)

    Wang, Paul; Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Hy, Franklin H.

    2013-01-01

    The Mars Relay Operational Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay network. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. As part of MaROS, the innovators have developed and implemented a feature set that operates on several levels of the software architecture. This new feature is an advanced querying capability through either the Web-based user interface, or through a back-end REST interface to access all of the data gathered from the network. This software is not meant to replace the REST interface, but to augment and expand the range of available data. The current REST interface provides specific data that is used by the MaROS Web application to display and visualize the information; however, the returned information from the REST interface has typically been pre-processed to return only a subset of the entire information within the repository, particularly only the information that is of interest to the GUI (graphical user interface). The new, advanced query and data mining capabilities allow users to retrieve the raw data and/or to perform their own data processing. The query language used to access the repository is a restricted subset of the structured query language (SQL) that can be built safely from the Web user interface, or entered as freeform SQL by a user. The results are returned in a CSV (Comma Separated Values) format for easy exporting to third party tools and applications that can be used for data mining or user-defined visualization and interpretation. This is the first time that a service is capable of providing access to all cross-project relay data from a single Web resource. Because MaROS contains the data for a variety of missions from the Mars network, which span both NASA and ESA, the software also establishes an access control list (ACL) on each data record
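
    The "restricted SQL in, CSV out" pattern described above can be sketched with the standard library. The table name, columns, and the whitelist check below are hypothetical stand-ins, not the actual MaROS schema or access-control logic; an in-memory SQLite table stands in for the relay data repository.

```python
# A loose sketch of accepting a restricted SELECT query and returning CSV.
import csv, io, sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relay_passes (orbiter TEXT, lander TEXT, start_utc TEXT, "
             "data_volume_mb REAL)")
conn.executemany("INSERT INTO relay_passes VALUES (?, ?, ?, ?)", [
    ("MRO", "MSL", "2013-01-05T02:10", 310.5),
    ("ODY", "MSL", "2013-01-05T11:42", 145.0),
    ("MRO", "MER-B", "2013-01-06T03:55", 98.7),
])

def run_query(sql):
    # Crude guard emulating a restricted SQL subset: single read-only SELECTs only.
    if not sql.lstrip().upper().startswith("SELECT") or ";" in sql.rstrip(";"):
        raise ValueError("only single SELECT statements are allowed")
    cur = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])   # header row
    writer.writerows(cur.fetchall())
    return out.getvalue()

print(run_query("SELECT orbiter, SUM(data_volume_mb) AS total_mb "
                "FROM relay_passes GROUP BY orbiter"))
```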

  18. Design and analysis of stochastic DSS query optimizers in a distributed database system

    Directory of Open Access Journals (Sweden)

    Manik Sharma

    2016-07-01

    Full Text Available Query optimization is a stimulating task of any database system. A number of heuristics have been applied in recent times, which proposed new algorithms for substantially improving the performance of a query. The hunt for a better solution still continues. The imperishable developments in the field of Decision Support System (DSS) databases are presenting data at an exceptional rate. The massive volume of DSS data is consequential only when it can be accessed and analyzed by distinctive researchers. Here, an innovative stochastic framework of a DSS query optimizer is proposed to further optimize the design of existing genetic query optimization approaches. The results of the Entropy Based Restricted Stochastic Query Optimizer (ERSQO) are compared with the results of the Exhaustive Enumeration Query Optimizer (EAQO), the Simple Genetic Query Optimizer (SGQO), the Novel Genetic Query Optimizer (NGQO) and the Restricted Stochastic Query Optimizer (RSQO). In terms of Total Costs, EAQO outperforms SGQO, NGQO, RSQO and ERSQO. However, the stochastic approaches dominate in terms of runtime. The Total Costs produced by ERSQO are better than those of SGQO, NGQO and RSQO by 12%, 8% and 5%, respectively. Moreover, the effect of replicating data on the Total Costs of DSS queries is also examined. In addition, the statistical analysis revealed a 2-tailed significant correlation between the number of join operations and the Total Costs of distributed DSS queries. Finally, in regard to the consistency of the stochastic query optimizers, the results of SGQO, NGQO, RSQO and ERSQO are 96.2%, 97.2%, 97.4% and 97.8% consistent, respectively.
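
    The stochastic optimizers compared above all search the space of join orders with genetic operators. A minimal sketch of such a genetic join-order search is given below; the cost function is a toy stand-in rather than the paper's Total Cost model, and none of the SGQO/NGQO/RSQO/ERSQO specifics are reproduced.

    ```python
    # Illustrative genetic search over join orders for a distributed DSS query.
    # The cost model below is a toy stand-in, not the Total Cost model of the paper.
    import random

    def toy_cost(order, card):
        """Accumulate intermediate-result sizes as a crude proxy for Total Costs."""
        size, cost = card[order[0]], 0
        for rel in order[1:]:
            size = size * card[rel] * 0.001  # assumed join selectivity of 0.001
            cost += size
        return cost

    def crossover(a, b):
        cut = random.randint(1, len(a) - 1)
        head = a[:cut]
        return head + [r for r in b if r not in head]

    def mutate(order, p=0.2):
        if random.random() < p:
            i, j = random.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]
        return order

    def genetic_join_order(card, pop_size=30, generations=50):
        rels = list(card)
        pop = [random.sample(rels, len(rels)) for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=lambda o: toy_cost(o, card))
            elite = pop[: pop_size // 2]
            pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in elite]
        return min(pop, key=lambda o: toy_cost(o, card))

    if __name__ == "__main__":
        cardinalities = {"orders": 1_000_000, "customer": 150_000,
                         "lineitem": 6_000_000, "nation": 25}
        print(genetic_join_order(cardinalities))
    ```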

  19. Performance analysis of chi models using discrete-time probabilistic reward graphs

    NARCIS (Netherlands)

    Trcka, N.; Georgievska, S.; Markovski, J.; Andova, S.; Vink, de E.P.

    2008-01-01

    We propose the model of discrete-time probabilistic reward graphs (DTPRGs) for performance analysis of systems exhibiting discrete deterministic time delays and probabilistic behavior, via their interpretation as discrete-time Markov reward chains, full-fledged platform for qualitative and

  20. Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of Cerebrotendinous xanthomatosis.

    Science.gov (United States)

    Taboada, María; Martínez, Diego; Pilo, Belén; Jiménez-Escrig, Adriano; Robinson, Peter N; Sobrido, María J

    2012-07-31

    Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction. Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain named cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopts the standard logical model of an open-world assumption. This work demonstrates how semantic web technologies can be used to support flexible representation and computational inference mechanisms

  1. Approximate furthest neighbor with application to annulus query

    DEFF Research Database (Denmark)

    Pagh, Rasmus; Silvestri, Francesco; Sivertsen, Johan von Tangen

    2016-01-01

    -dimensional Euclidean space. The method builds on the technique of Indyk (SODA 2003), storing random projections to provide sublinear query time for AFN. However, we introduce a different query algorithm, improving on Indyk's approximation factor and reducing the running time by a logarithmic factor. We also present......, the query-dependent approach is used for deriving a data structure for the approximate annulus query problem, which is defined as follows: given an input set S and two parameters r>0 and w≥1, construct a data structure that returns for each query point q a point p∈S such that the distance between p and q...
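
    The record above describes projection-based approximate furthest neighbor (AFN) search. The sketch below illustrates the general idea of indexing random projections and answering an AFN query from the projection extremes; it is not the authors' query-dependent algorithm, and all parameters are arbitrary.

    ```python
    # Sketch of projection-based approximate furthest neighbor (AFN) search.
    # Follows the general idea of storing random projections; it is not the
    # improved query algorithm of the paper, and constants are arbitrary.
    import numpy as np

    rng = np.random.default_rng(0)

    def build_index(points, num_projections=8):
        """Project the data set onto a few random directions and keep, for each
        direction, the point indices sorted by projected value."""
        d = points.shape[1]
        dirs = rng.normal(size=(num_projections, d))
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        order = np.argsort(points @ dirs.T, axis=0)   # per-direction sort of indices
        return dirs, order

    def afn_query(q, points, dirs, order, candidates_per_dir=3):
        """Examine only the extreme points along each direction and return the
        candidate that is actually furthest from q."""
        cand = set()
        for j in range(dirs.shape[0]):
            cand.update(order[:candidates_per_dir, j])    # smallest projections
            cand.update(order[-candidates_per_dir:, j])   # largest projections
        cand = np.fromiter(cand, dtype=int)
        dists = np.linalg.norm(points[cand] - q, axis=1)
        return cand[np.argmax(dists)]

    if __name__ == "__main__":
        pts = rng.normal(size=(1000, 16))
        idx = build_index(pts)
        print("approx. furthest neighbor of the origin:", afn_query(np.zeros(16), pts, *idx))
    ```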

  2. "Our teacher speaks English at all times!" The mining of profesors usage of language at forin language lesson"

    Directory of Open Access Journals (Sweden)

    Urška Sešek

    2009-12-01

    Full Text Available Different approaches to foreign language teaching can entail very different approaches to the use of the target language in the classroom. The currently prevailing opinion is that the teacher should not primarily use the learners' mother tongue but the target language, as far as that is possible and meaningful. This is important even though today's learners of mainstream-taught foreign languages in Slovenia are much more exposed to their target language outside of school than they were even 10 years ago. The teacher's use of the target language namely represents not only a source of input and a model of its active usage but is also a means of establishing authority and a tool for execution of classroom activities. In order to successfully carry out all of her/his increasingly demanding professional tasks, the teacher should maintain and develop their target language competences in terms of accuracy, appropriateness and modification strategies to adapt to learner needs. It is also very useful to look at the teacher's target language use from a functional perspective to become aware of how different types of utterances / speech acts / language forms can contribute to achieving different educational goals.

  3. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-01

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. PMID:27733503
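
    To illustrate the advanced SPARQL interface mentioned above, the following sketch issues a custom query through SPARQLWrapper. The endpoint URL is a placeholder and the label-based query assumes a generic RDF layout; neither is a documented Ontobee example.

    ```python
    # Minimal sketch of a custom SPARQL query against a linked ontology data
    # server such as Ontobee. The endpoint URL and the graph layout assumed by
    # the query are illustrative assumptions, not taken from the paper.
    from SPARQLWrapper import SPARQLWrapper, JSON

    ENDPOINT = "http://example.org/sparql"  # replace with the server's SPARQL endpoint

    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?term ?label
        WHERE {
            ?term rdfs:label ?label .
            FILTER regex(?label, "kinase", "i")
        }
        LIMIT 10
    """)
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["term"]["value"], "-", row["label"]["value"])
    ```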

  4. A Deductive and Typed Object-Oriented Language

    NARCIS (Netherlands)

    Bal, C.M.R.; Balsters, H.

    1993-01-01

    In this paper we introduce a logical query language extended with object-oriented typing facilities. This language, called DTL (from DataTypeLog), can be seen as an extension of Datalog equipped with complex objects, object identities, and multiple inheritance based on Cardelli type theory. The

  5. WQO is Decidable for Factorial Languages

    KAUST Repository

    Atminas, Aistis

    2017-08-08

    A language is factorial if it is closed under taking factors, i.e. contiguous subwords. Every factorial language can be described by an antidictionary, i.e. a minimal set of forbidden factors. We show that the problem of deciding whether a factorial language given by a finite antidictionary is well-quasi-ordered under the factor containment relation can be solved in polynomial time. We also discuss possible ways to extend our solution to permutations and graphs.
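
    As a small illustration of the definitions above (factors as contiguous subwords, languages given by an antidictionary of forbidden factors), the following sketch tests membership in such a factorial language; it is not the polynomial-time well-quasi-order decision procedure of the paper.

    ```python
    # The factor (contiguous subword) relation and a membership test for a
    # factorial language given by a finite antidictionary. This only illustrates
    # the definitions; it is not the WQO decision procedure of the paper.
    def is_factor(u: str, w: str) -> bool:
        """u is a factor of w iff u occurs in w as a contiguous subword."""
        return u in w

    def in_factorial_language(w: str, antidictionary: set) -> bool:
        """w belongs to the language iff no forbidden factor occurs in w."""
        return not any(is_factor(f, w) for f in antidictionary)

    if __name__ == "__main__":
        forbidden = {"aa", "bbb"}          # minimal set of forbidden factors
        print(in_factorial_language("ababba", forbidden))   # True
        print(in_factorial_language("abaab", forbidden))    # False ("aa" occurs)
    ```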

  6. VAL language: description and analysis

    International Nuclear Information System (INIS)

    McGraw, J.R.

    1982-01-01

    VAL is a high-level, function-based language designed for use on data flow computers. A data flow computer has many small processors organized to cooperate in the execution of a single computation. A computation is represented by its data flow graph; each operator in a graph is scheduled for execution on one of the processors after all of its operands' values are known. VAL promotes the identification of concurrency in algorithms and simplifies the mapping into data flow graphs. This paper presents a detailed introduction to VAL and analyzes its usefulness for programming in a highly concurrent environment. VAL provides implicit concurrency (operations that can execute simultaneously are evident without the need for any explicit language notation). The language uses function- and expression-based features that prohibit all side effects, which simplifies translation to graphs. The salient language features are described and illustrated through examples taken from a complete VAL program for adaptive quadrature. Analysis of the language shows that VAL meets the critical needs for a data flow environment. The language encourages programmers to think in terms of general concurrency, enhances readability (due to the absence of side effects), and possesses a structure amenable to verification techniques. However, VAL is still evolving. The language definition needs refining, and more support tools for programmer use need to be developed. Also, some new kinds of optimization problems should be addressed.

  7. Right-Linear Languages Generated in Systems of Knowledge Representation based on LSG

    Directory of Open Access Journals (Sweden)

    Daniela Danciulescu

    2017-04-01

    Full Text Available In Tudor (Preda) (2010), a method for formal language generation based on labeled stratified graph representations is sketched. The author proves that the considered method can generate regular languages and context-sensitive languages by considering an exemplification of the proposed method for a particular regular language and another one for a particular context-sensitive language. At the end of the study, the author highlights some open problems for future research, among which we recall: (1) the study of the language families that can be generated by means of these structures; (2) the study of the infiniteness of the languages that can be represented in stratified graphs. In this paper, we extend the method presented in Tudor (Preda) (2010) by considering the stratified graph formalism in a system of knowledge representation and reasoning. More precisely, we propose a method that can be applied for generating any Right Linear Language construction. Our method is proved and exemplified in several cases.

  8. Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech

    Science.gov (United States)

    Furui, Sadaoki

    This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages. For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation to the language model to recognize spoken-style utterances. For Indonesian, we have applied proper noun-specific adaptation to acoustic modeling, and rule-based English-to-Indonesian phoneme mapping to solve the problem of large variation in proper noun and English word pronunciation in a spoken-query information retrieval system. In spoken Chinese, long organization names are frequently abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary using the generated abbreviations, we have significantly improved the performance of spoken query-based search.

  9. Profinite graphs and groups

    CERN Document Server

    Ribes, Luis

    2017-01-01

    This book offers a detailed introduction to graph theoretic methods in profinite groups and applications to abstract groups. It is the first to provide a comprehensive treatment of the subject. The author begins by carefully developing relevant notions in topology, profinite groups and homology, including free products of profinite groups, cohomological methods in profinite groups, and fixed points of automorphisms of free pro-p groups. The final part of the book is dedicated to applications of the profinite theory to abstract groups, with sections on finitely generated subgroups of free groups, separability conditions in free and amalgamated products, and algorithms in free groups and finite monoids. Profinite Graphs and Groups will appeal to students and researchers interested in profinite groups, geometric group theory, graphs and connections with the theory of formal languages. A complete reference on the subject, the book includes historical and bibliographical notes as well as a discussion of open quest...

  10. Predictions of first passage times in sparse discrete fracture networks using graph-based reductions

    Science.gov (United States)

    Hyman, J.; Hagberg, A.; Srinivasan, G.; Mohd-Yusof, J.; Viswanathan, H. S.

    2017-12-01

    We present a graph-based methodology to reduce the computational cost of obtaining first passage times through sparse fracture networks. We derive graph representations of generic three-dimensional discrete fracture networks (DFNs) using the DFN topology and flow boundary conditions. Subgraphs corresponding to the union of the k shortest paths between the inflow and outflow boundaries are identified and transport on their equivalent subnetworks is compared to transport through the full network. The number of paths included in the subgraphs is based on the scaling behavior of the number of edges in the graph with the number of shortest paths. First passage times through the subnetworks are in good agreement with those obtained in the full network, both for individual realizations and in distribution. Accurate estimates of first passage times are obtained with an order of magnitude reduction of CPU time and mesh size using the proposed method.
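
    The reduction described above keeps only the union of the k shortest inflow-to-outflow paths. A minimal sketch of that subnetwork extraction with networkx is shown below; the toy grid graph and unit edge weights stand in for a real DFN-derived graph and are assumptions for illustration.

    ```python
    # Sketch of the graph-based reduction: keep only the union of the k shortest
    # paths between the inflow and outflow nodes and study transport on that
    # subnetwork. The DFN-to-graph mapping and edge weights are assumed here.
    from itertools import islice
    import networkx as nx

    def k_shortest_path_subgraph(G, source, target, k):
        """Union of the k shortest simple paths between source and target."""
        paths = islice(nx.shortest_simple_paths(G, source, target, weight="weight"), k)
        H = nx.Graph()
        for path in paths:
            nx.add_path(H, path)
        return H

    if __name__ == "__main__":
        G = nx.grid_2d_graph(6, 6)                      # stand-in for a fracture network graph
        nx.set_edge_attributes(G, 1.0, "weight")
        sub = k_shortest_path_subgraph(G, (0, 0), (5, 5), k=5)
        print(f"full graph: {G.number_of_edges()} edges, "
              f"subnetwork: {sub.number_of_edges()} edges")
    ```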

  11. Query deforestation

    OpenAIRE

    Grust, Torsten; Scholl, Marc H.

    1998-01-01

    The construction of a declarative query engine for a DBMS includes the challenge of compiling algebraic queries into efficient execution plans that can be run on top of the persistent storage. This work pursues the goal of employing foldr-build deforestation for the derivation of efficient streaming programs - programs that do not allocate intermediate data structures to perform their task - from algebraic (combinator) query plans. The query engine is based on the insertion representation of ...

  12. Minimum nonuniform graph partitioning with unrelated weights

    Science.gov (United States)

    Makarychev, K. S.; Makarychev, Yu S.

    2017-12-01

    We give a bi-criteria approximation algorithm for the Minimum Nonuniform Graph Partitioning problem, recently introduced by Krauthgamer, Naor, Schwartz and Talwar. In this problem, we are given a graph $G=(V,E)$ and $k$ numbers $\rho_1, \dots, \rho_k$. The goal is to partition $V$ into $k$ disjoint sets (bins) $P_1, \dots, P_k$ satisfying $|P_i| \le \rho_i |V|$ for all $i$, so as to minimize the number of edges cut by the partition. Our bi-criteria algorithm gives an $O(\sqrt{\log |V| \log k})$ approximation for the objective function in general graphs and an $O(1)$ approximation in graphs excluding a fixed minor. The approximate solution satisfies the relaxed capacity constraints $|P_i| \le (5+\varepsilon)\rho_i |V|$. This algorithm is an improvement upon the $O(\log |V|)$-approximation algorithm by Krauthgamer, Naor, Schwartz and Talwar. We extend our results to the case of 'unrelated weights' and to the case of 'unrelated d-dimensional weights'. A preliminary version of this work was presented at the 41st International Colloquium on Automata, Languages and Programming (ICALP 2014). Bibliography: 7 titles.

  13. Request language of the JINR information retrieval system

    International Nuclear Information System (INIS)

    Arnaudov, D.D.; Kozhenkova, Z.I.

    1975-01-01

    A classification of operating languages of information retrieval systems (IRS) is given. A justification is made for the selection of the descriptor no-grammar language for coding documents and queries in the JINR IRS. A Boolean form query minimization algorithm is applied.

  14. ARTICLE Robust Diagnosis of Mechatronics System by Bond Graph Approach

    Directory of Open Access Journals (Sweden)

    Abderrahmene Sellami

    2018-03-01

    Full Text Available This article presents the design of a robust diagnostic system based on a bond graph model for a mechatronic system. Mechatronics is the synergistic and systemic combination of mechanics, electronics and computer science. The design of a mechatronic system modeled by the bond graph model becomes easier and more generous. The bond graph tool is a unified graphical language for all areas of engineering sciences and is confirmed as a structured approach to modeling and simulation of multidisciplinary systems.

  15. GMB: An Efficient Query Processor for Biological Data

    Directory of Open Access Journals (Sweden)

    Taha Kamal

    2011-06-01

    Full Text Available Bioinformatics applications manage complex biological data stored in distributed and often heterogeneous databases and require large computing power. These databases are too big and complicated to be rapidly queried every time a user submits a query, due to the overhead involved in decomposing the queries, sending the decomposed queries to remote databases, and composing the results. There are also considerable communication costs involved. This study addresses the mentioned problems in a Grid-based environment for bioinformatics. We propose a Grid middleware called GMB that alleviates these problems by caching the results of Frequently Used Queries (FUQ). Queries are classified based on their types and frequencies. FUQ are answered from the middleware, which improves their response time. GMB acts as a gateway to the TeraGrid Grid: it resides between users' applications and the TeraGrid Grid. We evaluate GMB experimentally.
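
    A minimal sketch of the caching idea described above, in which results of frequently used queries are served from a middleware layer, follows. The frequency threshold, time-to-live, and backend call are illustrative assumptions, not GMB's actual design parameters.

    ```python
    # Toy sketch of caching results of frequently used queries (FUQ) in a
    # middleware layer, so that popular queries are answered locally instead of
    # being decomposed and sent to remote databases.
    import time

    class QueryCache:
        def __init__(self, backend, freq_threshold=3, ttl=300):
            self.backend = backend              # callable that runs the query remotely
            self.freq_threshold = freq_threshold
            self.ttl = ttl                      # seconds a cached result stays valid
            self.counts = {}
            self.cache = {}

        def execute(self, query):
            self.counts[query] = self.counts.get(query, 0) + 1
            hit = self.cache.get(query)
            if hit and time.time() - hit[1] < self.ttl:
                return hit[0]                   # answered from the middleware
            result = self.backend(query)
            if self.counts[query] >= self.freq_threshold:
                self.cache[query] = (result, time.time())
            return result

    if __name__ == "__main__":
        cache = QueryCache(backend=lambda q: f"rows for {q!r}")
        for _ in range(5):
            print(cache.execute("SELECT * FROM genes WHERE symbol = 'TP53'"))
    ```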

  16. Time signal filtering by relative neighborhood graph localized linear approximation

    DEFF Research Database (Denmark)

    Sørensen, John Aasted

    1994-01-01

    A time signal filtering algorithm based on the relative neighborhood graph (RNG) used for localization of linear filters is proposed. The filter is constructed from a training signal during two stages. During the first stage an RNG is constructed. During the second stage, localized linear filters...

  17. Canonical Labelling of Site Graphs

    Directory of Open Access Journals (Sweden)

    Nicolas Oury

    2013-06-01

    Full Text Available We investigate algorithms for canonical labelling of site graphs, i.e. graphs in which edges bind vertices on sites with locally unique names. We first show that the problem of canonical labelling of site graphs reduces to the problem of canonical labelling of graphs with edge colourings. We then present two canonical labelling algorithms based on edge enumeration, and a third based on an extension of Hopcroft's partition refinement algorithm. All run in quadratic worst case time individually. However, one of the edge enumeration algorithms runs in sub-quadratic time for graphs with "many" automorphisms, and the partition refinement algorithm runs in sub-quadratic time for graphs with "few" bisimulation equivalences. This suite of algorithms was chosen based on the expectation that graphs fall in one of those two categories. If that is the case, a combined algorithm runs in sub-quadratic worst case time. Whether this expectation is reasonable remains an interesting open problem.

  18. The Data Cyclotron query processing scheme

    NARCIS (Netherlands)

    Goncalves, R.; Kersten, M.

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  19. Querying archetype-based EHRs by search ontology-based XPath engineering.

    Science.gov (United States)

    Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich

    2018-05-11

    Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free-text searches, which lead to false positive results, the precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts of the XML document structure. Such a query specification method is practically introduced and evaluated by applying concrete research questions, formulated in natural language, to a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the usage of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content are necessary for a further improvement of precision and recall. A key limitation is that the application of the introduced process requires skills in ontology and software development. In the future, the time-consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects Search Terms to certain parts of XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering through the specification of complex XPath expressions without deep knowledge of XPath syntax.
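
    The following sketch illustrates the underlying idea of constraining a search to certain parts of an XML document via an XPath expression; the sample EHR fragment and element names are invented, and lxml is used here merely as a stand-in XPath engine, not as the system described above.

    ```python
    # Constraining a search to specific parts of an XML document with an XPath
    # expression, in the spirit of querying XML-based EHRs. The sample document
    # and element names are invented for illustration.
    from lxml import etree

    xml_doc = b"""<ehr>
      <observation code="diagnosis"><value>chronic lymphocytic leukemia</value></observation>
      <observation code="medication"><value>ibrutinib</value></observation>
    </ehr>"""

    root = etree.fromstring(xml_doc)

    # Restrict matching to diagnosis observations instead of a free-text search
    # over the whole document.
    hits = root.xpath(
        "//observation[@code='diagnosis']/value[contains(text(), 'leukemia')]/text()")
    print(hits)   # ['chronic lymphocytic leukemia']
    ```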

  20. The Data Cyclotron query processing scheme.

    NARCIS (Netherlands)

    R.A. Goncalves (Romulo); M.L. Kersten (Martin)

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  1. Advanced SPARQL querying in small molecule databases

    Czech Academy of Sciences Publication Activity Database

    Galgonek, Jakub; Hurt, T.; Michlíková, V.; Onderka, P.; Schwarz, J.; Vondrášek, Jiří

    2016-01-01

    Vol. 8, Jun 6 (2016), article no. 31. ISSN 1758-2946. R&D Projects: GA MŠk(CZ) LM2015047. Institutional support: RVO:61388963. Keywords: Resource Description Framework * SPARQL query language * Database of small molecules. Subject RIV: CF - Physical; Theoretical Chemistry. Impact factor: 4.220, year: 2016. http://jcheminf.springeropen.com/articles/10.1186/s13321-016-0144-4

  2. Experiments with Cross-Language Information Retrieval on a Health Portal for Psychology and Psychotherapy.

    Science.gov (United States)

    Andrenucci, Andrea

    2016-01-01

    Few studies have been performed within cross-language information retrieval (CLIR) in the field of psychology and psychotherapy. The aim of this paper is to analyze and assess the quality of available query translation methods for CLIR on a health portal for psychology. A test base of 100 user queries, 50 Multi Word Units (WUs) and 50 Single WUs, was used. Swedish was the source language and English the target language. Query translation methods based on machine translation (MT) and dictionary look-up were utilized in order to submit query translations to two search engines: Google Site Search and Quick Ask. Standard IR evaluation measures and a qualitative analysis were utilized to assess the results. The lexicon extracted with word alignment of the portal's parallel corpus provided better statistical results among dictionary look-ups. Google Translate provided more linguistically correct translations overall and also delivered better retrieval results in MT.

  3. TrajGraph: A Graph-Based Visual Analytics Approach to Studying Urban Network Centralities Using Taxi Trajectory Data.

    Science.gov (United States)

    Huang, Xiaoke; Zhao, Ye; Yang, Jing; Zhang, Chong; Ma, Chao; Ye, Xinyue

    2016-01-01

    We propose TrajGraph, a new visual analytics method, for studying urban mobility patterns by integrating graph modeling and visual analysis with taxi trajectory data. A special graph is created to store and manifest real traffic information recorded by taxi trajectories over city streets. It conveys urban transportation dynamics which can be discovered by applying graph analysis algorithms. To support interactive, multiscale visual analytics, a graph partitioning algorithm is applied to create region-level graphs which have smaller size than the original street-level graph. Graph centralities, including Pagerank and betweenness, are computed to characterize the time-varying importance of different urban regions. The centralities are visualized by three coordinated views including a node-link graph view, a map view and a temporal information view. Users can interactively examine the importance of streets to discover and assess city traffic patterns. We have implemented a fully working prototype of this approach and evaluated it using massive taxi trajectories of Shenzhen, China. TrajGraph's capability in revealing the importance of city streets was evaluated by comparing the calculated centralities with the subjective evaluations from a group of drivers in Shenzhen. Feedback from a domain expert was collected. The effectiveness of the visual interface was evaluated through a formal user study. We also present several examples and a case study to demonstrate the usefulness of TrajGraph in urban transportation analysis.
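
    A small illustration of the centrality computation described above follows, using networkx on a toy directed street graph; the graph and its edge weights are invented and do not come from the Shenzhen dataset.

    ```python
    # Computing the two centralities used in the TrajGraph analysis, PageRank and
    # betweenness, on a toy stand-in street graph with networkx. The edge weights
    # are interpreted here as observed taxi transitions and are invented.
    import networkx as nx

    streets = nx.DiGraph()
    streets.add_weighted_edges_from([
        ("A", "B", 120), ("B", "C", 80), ("C", "A", 60),
        ("B", "D", 40), ("D", "C", 30), ("C", "D", 20),
    ])

    pagerank = nx.pagerank(streets, weight="weight")
    # Betweenness is computed on the unweighted topology here; plugging in travel
    # times as distances would be the more faithful choice for real data.
    betweenness = nx.betweenness_centrality(streets)

    for node in streets:
        print(f"{node}: pagerank={pagerank[node]:.3f}, betweenness={betweenness[node]:.3f}")
    ```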

  4. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration.

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-04

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Giant Components in Biased Graph Processes

    OpenAIRE

    Amir, Gideon; Gurel-Gurevich, Ori; Lubetzky, Eyal; Singer, Amit

    2005-01-01

    A random graph process on $n$ vertices is a sequence of graphs which begins with the edgeless graph, and where at each step a single edge is added according to a uniform distribution on the missing edges. It is well known that in such a process a giant component (of linear size) typically emerges after $(1+o(1))\frac{n}{2}$ edges (a phenomenon known as "the double jump"), i.e., at time $t=1$ when using a timescale of $n/2$ edges in each step. We consider a generalization of ...

  6. A Novel Two-Tier Cooperative Caching Mechanism for the Optimization of Multi-Attribute Periodic Queries in Wireless Sensor Networks

    Science.gov (United States)

    Zhou, ZhangBing; Zhao, Deng; Shu, Lei; Tsang, Kim-Fung

    2015-01-01

    Wireless sensor networks, serving as an important interface between physical environments and computational systems, have been used extensively for supporting domain applications, where multiple-attribute sensory data are queried from the network continuously and periodically. Usually, certain sensory data may not vary significantly within a certain time duration for certain applications. In this setting, sensory data gathered at a certain time slot can be used for answering concurrent queries and may be reused for answering the forthcoming queries when the variation of these data is within a certain threshold. To address this challenge, a popularity-based cooperative caching mechanism is proposed in this article, where the popularity of sensory data is calculated according to the queries issued in recent time slots. This popularity reflects the likelihood that sensory data will be of interest to the forthcoming queries. Generally, sensory data with the highest popularity are cached at the sink node, while sensory data that may not be of interest to the forthcoming queries are cached in the head nodes of divided grid cells. Leveraging these cooperatively cached sensory data, queries are answered through composing these two-tier cached data. Experimental evaluation shows that this approach can reduce the network communication cost significantly and increase the network capability. PMID:26131665

  7. Matching health information seekers' queries to medical terms.

    Science.gov (United States)

    Soualmia, Lina F; Prieur-Gaston, Elise; Moalla, Zied; Lecroq, Thierry; Darmoni, Stéfan J

    2012-01-01

    The Internet is a major source of health information but most seekers are not familiar with medical vocabularies. Hence, their searches fail due to bad query formulation. Several methods have been proposed to improve information retrieval: query expansion, syntactic and semantic techniques or knowledge-based methods. However, it would be useful to clean those queries which are misspelled. In this paper, we propose a simple yet efficient method in order to correct misspellings of queries submitted by health information seekers to a medical online search tool. In addition to query normalizations and exact phonetic term matching, we tested two approximate string comparators: the similarity score function of Stoilos and the normalized Levenshtein edit distance. We propose here to combine them to increase the number of matched medical terms in French. We first took a sample of query logs to determine the thresholds and processing times. In the second run, at a greater scale we tested different combinations of query normalizations before or after misspelling correction with the retained thresholds in the first run. According to the total number of suggestions (around 163, the number of the first sample of queries), at a threshold comparator score of 0.3, the normalized Levenshtein edit distance gave the highest F-Measure (88.15%) and at a threshold comparator score of 0.7, the Stoilos function gave the highest F-Measure (84.31%). By combining Levenshtein and Stoilos, the highest F-Measure (80.28%) is obtained with 0.2 and 0.7 thresholds respectively. However, queries are composed by several terms that may be combination of medical terms. The process of query normalization and segmentation is thus required. The highest F-Measure (64.18%) is obtained when this process is realized before spelling-correction. Despite the widely known high performance of the normalized edit distance of Levenshtein, we show in this paper that its combination with the Stoilos algorithm improved
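
    The following sketch shows the normalized Levenshtein edit distance and a threshold-based matcher of the kind described above; the Stoilos similarity function used in the paper is not reproduced, and the term list and the 0.3 threshold are only illustrative.

    ```python
    # Normalized Levenshtein edit distance and a simple threshold matcher for
    # correcting misspelled queries against a list of medical terms. The Stoilos
    # similarity used in the paper is omitted; terms and thresholds are examples.
    def levenshtein(a, b):
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (ca != cb)))    # substitution
            prev = cur
        return prev[-1]

    def normalized_levenshtein(a, b):
        return levenshtein(a, b) / max(len(a), len(b), 1)

    def correct(query, vocabulary, threshold=0.3):
        best = min(vocabulary, key=lambda term: normalized_levenshtein(query, term))
        return best if normalized_levenshtein(query, best) <= threshold else None

    if __name__ == "__main__":
        terms = ["asthme", "diabète", "hypertension"]
        print(correct("diabte", terms))         # 'diabète'
        print(correct("hypertention", terms))   # 'hypertension'
    ```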

  8. Graphs cospectral with a friendship graph or its complement

    Directory of Open Access Journals (Sweden)

    Alireza Abdollahi

    2013-12-01

    Full Text Available Let $n$ be any positive integer and let $F_n$ be the friendship (or Dutch windmill) graph with $2n+1$ vertices and $3n$ edges. Here we study graphs with the same adjacency spectrum as $F_n$. Two graphs are called cospectral if the multisets of eigenvalues of their adjacency matrices are the same. Let $G$ be a graph cospectral with $F_n$. Here we prove that if $G$ has no cycle of length $4$ or $5$, then $G\cong F_n$. Moreover, if $G$ is connected and planar then $G\cong F_n$. All but one of the connected components of $G$ are isomorphic to $K_2$. The complement $\overline{F_n}$ of the friendship graph is determined by its adjacency eigenvalues, that is, if $\overline{F_n}$ is cospectral with a graph $H$, then $H\cong \overline{F_n}$.

  9. XQOWL: An Extension of XQuery for OWL Querying and Reasoning

    Directory of Open Access Journals (Sweden)

    Jesús M. Almendros-Jiménez

    2015-01-01

    Full Text Available One of the main aims of the so-called Web of Data is to be able to handle heterogeneous resources where data can be expressed in either XML or RDF. The design of programming languages able to handle both XML and RDF data is a key target in this context. In this paper we present a framework called XQOWL that makes it possible to handle XML and RDF/OWL data with XQuery. XQOWL can be considered as an extension of the XQuery language that connects XQuery with SPARQL and OWL reasoners. XQOWL embeds SPARQL queries (via the Jena SPARQL engine) in XQuery and enables calls to OWL reasoners (HermiT, Pellet and FaCT++) from XQuery. It permits combining queries against XML and RDF/OWL resources as well as reasoning with RDF/OWL data. Therefore input data can be either XML or RDF/OWL and output data can be formatted in XML (also using RDF/OWL XML serialization).

  10. A high performance, ad-hoc, fuzzy query processing system for relational databases

    Science.gov (United States)

    Mansfield, William H., Jr.; Fleischman, Robert M.

    1992-01-01

    Database queries involving imprecise or fuzzy predicates are currently an evolving area of academic and industrial research. Such queries place severe stress on the indexing and I/O subsystems of conventional database environments since they involve the search of large numbers of records. The Datacycle architecture and research prototype is a database environment that uses filtering technology to perform an efficient, exhaustive search of an entire database. It has recently been modified to include fuzzy predicates in its query processing. The approach obviates the need for complex index structures, provides unlimited query throughput, permits the use of ad-hoc fuzzy membership functions, and provides a deterministic response time largely independent of query complexity and load. This paper describes the Datacycle prototype implementation of fuzzy queries and some recent performance results.
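
    To make the idea of an ad-hoc fuzzy predicate concrete, the sketch below scans all records and ranks them by a trapezoidal membership function; the function, cutoff, and data are invented and are unrelated to the Datacycle implementation.

    ```python
    # A fuzzy predicate evaluated by exhaustive scan, in the spirit of filtering
    # every record against an ad-hoc membership function. The membership
    # function, cutoff, and records are illustrative assumptions.
    def young(age):
        """Degree to which an age is considered 'young' (trapezoidal membership)."""
        if age <= 25:
            return 1.0
        if age >= 40:
            return 0.0
        return (40 - age) / 15          # linear descent between 25 and 40

    employees = [("ann", 23), ("bob", 31), ("carla", 38), ("dmitri", 45)]

    # Rank all records by their membership degree; keep those above a cutoff.
    cutoff = 0.4
    matches = sorted(((young(age), name) for name, age in employees), reverse=True)
    print([(name, round(mu, 2)) for mu, name in matches if mu >= cutoff])
    ```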

  11. A local search for a graph clustering problem

    Science.gov (United States)

    Navrotskaya, Anna; Il'ev, Victor

    2016-10-01

    In clustering problems one has to partition a given set of objects (a data set) into some subsets (called clusters) taking into consideration only the similarity of the objects. One of the most visual formalizations of clustering is graph clustering, that is, grouping the vertices of a graph into clusters taking into consideration the edge structure of the graph whose vertices are objects and edges represent similarities between the objects. In the graph k-clustering problem the number of clusters does not exceed k and the goal is to minimize the number of edges between clusters and the number of missing edges within clusters. This problem is NP-hard for any k ≥ 2. We propose a polynomial time (2k-1)-approximation algorithm for graph k-clustering. Then we apply a local search procedure to the feasible solution found by this algorithm and conduct an experimental study of the obtained heuristics.

  12. Approximate dictionary queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Gasieniec, Leszek

    1996-01-01

    Given a set of n binary strings of length m each. We consider the problem of answering d-queries. Given a binary query string of length m, a d-query is to report if there exists a string in the set within Hamming distance d of . We present a data structure of size O(nm) supporting 1-queries in ti...

  13. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    Science.gov (United States)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.

  14. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  15. Managing and Querying Image Annotation and Markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167

  16. The color space of a graph

    DEFF Research Database (Denmark)

    Jensen, T.R.; Thomassen, Carsten

    2000-01-01

    If k is a prime power, and G is a graph with n vertices, then a k-coloring of G may be considered as a vector in GF(k)(n). We prove that the subspace of GF(3)(n) spanned by all 3-colorings of a planar triangle-free graph with n vertices has dimension n. In particular, any such graph has at least n...... - 1 nonequivalent 3-colorings, and the addition of any edge or any vertex of degree 3 results in a 3-colorable graph. (C) 2000 John Wiley & Sons, Inc....

  17. A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem

    Directory of Open Access Journals (Sweden)

    Zhan Peng

    2017-08-01

    Full Text Available Searching for the Multiple Longest Common Subsequences (MLCS) of multiple sequences is a classical NP-hard problem, which has been used in many applications. One of the most effective exact approaches for the MLCS problem is based on the dominant point graph, which is a kind of directed acyclic graph (DAG). However, the time and space efficiency of the leading dominant point graph based approaches is still unsatisfactory: constructing the dominant point graph used by these approaches requires a huge amount of time and space, which hinders the application of these approaches to large-scale and long sequences. To address this issue, in this paper, we propose a new time and space efficient graph model called the Leveled-DAG for the MLCS problem. The Leveled-DAG can eliminate, during construction, all the nodes in the graph that cannot contribute to the construction of the MLCS. At any moment, only the current level and some previously generated nodes in the graph need to be kept in memory, which can greatly reduce the memory consumption. Also, the final graph contains only one node in which all of the wanted MLCS are saved; thus, no additional operations for searching the MLCS are needed. The experiments are conducted on real biological sequences with different numbers and lengths, and the proposed algorithm is compared with three state-of-the-art algorithms. The experimental results show that the time and space needed for the Leveled-DAG approach are smaller than those for the compared algorithms, especially on large-scale and long sequences.

  18. AGM: A DSL for mobile cloud computing based on directed graph

    Science.gov (United States)

    Tanković, Nikola; Grbac, Tihana Galinac

    2016-06-01

    This paper summarizes a novel approach for consuming a domain specific language (DSL) by transforming it to a directed graph representation persisted by a graph database. Using such a specialized database enables advanced navigation through the stored model, exposing only relevant subsets of meta-data to different involved services and components. We applied this approach in a mobile cloud computing system and used it to model several mobile applications in the retail, supply chain management and merchandising domains. These applications are distributed in a Software-as-a-Service (SaaS) fashion and used by thousands of customers in Croatia. We report on lessons learned and propose further research on this topic.

  19. Acyclicity in edge-colored graphs

    DEFF Research Database (Denmark)

    Gutin, Gregory; Jones, Mark; Sheng, Bin

    2017-01-01

    A walk W in edge-colored graphs is called properly colored (PC) if every pair of consecutive edges in W is of different color. We introduce and study five types of PC acyclicity in edge-colored graphs such that graphs of PC acyclicity of type i is a proper superset of graphs of acyclicity of type i......+1, i=1,2,3,4. The first three types are equivalent to the absence of PC cycles, PC closed trails, and PC closed walks, respectively. While graphs of types 1, 2 and 3 can be recognized in polynomial time, the problem of recognizing graphs of type 4 is, somewhat surprisingly, NP-hard even for 2-edge-colored...... graphs (i.e., when only two colors are used). The same problem with respect to type 5 is polynomial-time solvable for all edge-colored graphs. Using the five types, we investigate the border between intractability and tractability for the problems of finding the maximum number of internally vertex...

  20. Recommending Multidimensional Queries

    Science.gov (United States)

    Giacometti, Arnaud; Marcel, Patrick; Negre, Elsa

    Interactive analysis of datacube, in which a user navigates a cube by launching a sequence of queries is often tedious since the user may have no idea of what the forthcoming query should be in his current analysis. To better support this process we propose in this paper to apply a Collaborative Work approach that leverages former explorations of the cube to recommend OLAP queries. The system that we have developed adapts Approximate String Matching, a technique popular in Information Retrieval, to match the current analysis with the former explorations and help suggesting a query to the user. Our approach has been implemented with the open source Mondrian OLAP server to recommend MDX queries and we have carried out some preliminary experiments that show its efficiency for generating effective query recommendations.

  1. Graphing trillions of triangles.

    Science.gov (United States)

    Burkhardt, Paul

    2017-07-01

    The increasing size of Big Data is often heralded but how data are transformed and represented is also profoundly important to knowledge discovery, and this is exemplified in Big Graph analytics. Much attention has been placed on the scale of the input graph but the product of a graph algorithm can be many times larger than the input. This is true for many graph problems, such as listing all triangles in a graph. Enabling scalable graph exploration for Big Graphs requires new approaches to algorithms, architectures, and visual analytics. A brief tutorial is given to aid the argument for thoughtful representation of data in the context of graph analysis. Then a new algebraic method to reduce the arithmetic operations in counting and listing triangles in graphs is introduced. Additionally, a scalable triangle listing algorithm in the MapReduce model will be presented followed by a description of the experiments with that algorithm that led to the current largest and fastest triangle listing benchmarks to date. Finally, a method for identifying triangles in new visual graph exploration technologies is proposed.
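
    As a concrete point of reference for triangle listing, the sketch below uses the standard neighbor-set-intersection approach with a vertex ordering that reports each triangle exactly once; it is not the algebraic method or the MapReduce algorithm discussed in the article.

    ```python
    # Listing triangles by intersecting neighbor sets, with each triangle
    # reported exactly once by ordering its vertices. This is the standard
    # set-intersection approach, not the algebraic method of the article.
    from collections import defaultdict

    def triangles(edges):
        adj = defaultdict(set)
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        for u in adj:
            for v in adj[u]:
                if u < v:                       # consider each edge once
                    for w in adj[u] & adj[v]:   # common neighbors close a triangle
                        if v < w:               # enforce u < v < w
                            yield (u, v, w)

    if __name__ == "__main__":
        E = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
        print(list(triangles(E)))   # [(1, 2, 3), (2, 3, 4)]
    ```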

  2. Language-concordant automated telephone queries to assess medication adherence in a diverse population: a cross-sectional analysis of convergent validity with pharmacy claims.

    Science.gov (United States)

    Ratanawongsa, Neda; Quan, Judy; Handley, Margaret A; Sarkar, Urmimala; Schillinger, Dean

    2018-04-06

    Clinicians have difficulty accurately assessing medication non-adherence within chronic disease care settings. Health information technology (HIT) could offer novel tools to assess medication adherence in diverse populations outside of usual health care settings. In a multilingual urban safety net population, we examined the validity of assessing adherence using automated telephone self-management (ATSM) queries, when compared with non-adherence using continuous medication gap (CMG) on pharmacy claims. We hypothesized that patients reporting greater days of missed pills to ATSM queries would have higher rates of non-adherence as measured by CMG, and that ATSM adherence assessments would perform as well as structured interview assessments. As part of an ATSM-facilitated diabetes self-management program, low-income health plan members typed numeric responses to rotating weekly ATSM queries: "In the last 7 days, how many days did you MISS taking your …" diabetes, blood pressure, or cholesterol pill. Research assistants asked similar questions in computer-assisted structured telephone interviews. We measured continuous medication gap (CMG) by claims over 12 preceding months. To evaluate convergent validity, we compared rates of optimal adherence (CMG ≤ 20%) across respondents reporting 0, 1, and ≥ 2 missed pill days on ATSM and on structured interview. Among 210 participants, 46% had limited health literacy, 57% spoke Cantonese, and 19% Spanish. ATSM respondents reported ≥1 missed day for diabetes (33%), blood pressure (19%), and cholesterol (36%) pills. Interview respondents reported ≥1 missed day for diabetes (28%), blood pressure (21%), and cholesterol (26%) pills. Optimal adherence rates by CMG were lower among ATSM respondents reporting more missed days for blood pressure (p = 0.02) and cholesterol (p < 0.01); by interview, differences were significant for cholesterol (p = 0.01). Language-concordant ATSM demonstrated modest potential

  3. Analyzing and synthesizing phylogenies using tree alignment graphs.

    Directory of Open Access Journals (Sweden)

    Stephen A Smith

    Full Text Available Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

  4. Analyzing and synthesizing phylogenies using tree alignment graphs.

    Science.gov (United States)

    Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

  5. On a conjecture concerning helly circle graphs

    Directory of Open Access Journals (Sweden)

    Durán Guillermo

    2003-01-01

    Full Text Available We say that G is an e-circle graph if there is a bijection between its vertices and straight lines in the Cartesian plane such that two vertices are adjacent in G if and only if the corresponding lines intersect inside the circle of radius one. This definition suggests a method for deciding whether a given graph G is an e-circle graph, by constructing a convenient system S of equations and inequations which represents the structure of G, in such a way that G is an e-circle graph if and only if S has a solution. In fact, e-circle graphs are exactly the circle graphs (intersection graphs of chords in a circle), and thus this method provides an analytic way for recognizing circle graphs. A graph G is a Helly circle graph if G is a circle graph and there exists a model of G by chords such that every three pairwise intersecting chords intersect at the same point. A conjecture by Durán (2000) states that G is a Helly circle graph if and only if G is a circle graph and contains no induced diamonds (a diamond is a graph formed by four vertices and five edges). Many unsuccessful efforts - mainly based on combinatorial and geometrical approaches - have been made to validate this conjecture. In this work, we utilize the ideas behind the definition of e-circle graphs and restate this conjecture in terms of an equivalence between two systems of equations and inequations, providing a new, analytic tool to deal with it.

  6. Centrosymmetric Graphs And A Lower Bound For Graph Energy Of Fullerenes

    Directory of Open Access Journals (Sweden)

    Katona Gyula Y.

    2014-11-01

    Full Text Available The energy of a molecular graph G is defined as the summation of the absolute values of the eigenvalues of the adjacency matrix of G. In this paper, an infinite class of fullerene graphs with 10n vertices, n ≥ 2, is considered. By proving the centrosymmetry of the adjacency matrix of these fullerene graphs, a lower bound for their energy is given. Our method is general and can be extended to other classes of fullerene graphs.
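
    The definition of graph energy above translates directly into a few lines of numpy; the example below computes it for a small cycle graph rather than a fullerene.

    ```python
    # Graph energy as the sum of absolute eigenvalues of the adjacency matrix,
    # computed with numpy for a small example graph (C_5, not a fullerene).
    import numpy as np

    def graph_energy(adjacency):
        eigenvalues = np.linalg.eigvalsh(adjacency)   # adjacency is symmetric
        return float(np.sum(np.abs(eigenvalues)))

    # Adjacency matrix of the 5-cycle C_5.
    C5 = np.array([[0, 1, 0, 0, 1],
                   [1, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0],
                   [0, 0, 1, 0, 1],
                   [1, 0, 0, 1, 0]])
    print(round(graph_energy(C5), 4))   # ≈ 6.4721
    ```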

  7. Evaluating Trajectory Queries over Imprecise Location Data

    DEFF Research Database (Denmark)

    Xie, Scott, Xike; Cheng, Reynold; Yiu, Man Lung

    2012-01-01

    Trajectory queries, which retrieve nearby objects for every point of a given route, can be used to identify alerts of potential threats along a vessel route, or monitor the adjacent rescuers to a travel path. However, the locations of these objects (e.g., threats, succours) may not be precisely obtained due to hardware limitations of measuring devices, as well as the constantly-changing nature of the external environment. Ignoring data uncertainty can render low query quality, and cause undesirable consequences such as missing alerts of threats and poor response time in rescue operations. Also, the query is quite time-consuming, since all the points on the trajectory are considered. In this paper, we study how to efficiently evaluate trajectory queries over imprecise location data, by proposing a new concept called the u-bisector. In general, the u-bisector is an extension of bisector to handle...

  8. The role of economics in the QUERI program: QUERI Series.

    Science.gov (United States)

    Smith, Mark W; Barnett, Paul G

    2008-04-22

    The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  9. Cellular Automata on Graphs: Topological Properties of ER Graphs Evolved towards Low-Entropy Dynamics

    Directory of Open Access Journals (Sweden)

    Marc-Thorsten Hütt

    2012-06-01

    Full Text Available Cellular automata (CA) are a remarkably efficient tool for exploring general properties of complex systems and spatiotemporal patterns arising from local rules. Totalistic cellular automata, where the update rules depend only on the density of neighboring states, are at the same time a versatile tool for exploring dynamical processes on graphs. Here we briefly review our previous results on cellular automata on graphs, emphasizing some systematic relationships between network architecture and dynamics identified in this way. We then extend the investigation towards graphs obtained in a simulated-evolution procedure, starting from Erdős–Rényi (ER) graphs and selecting for low entropies of the CA dynamics. Our key result is a strong association of low Shannon entropies with a broadening of the graph’s degree distribution.

  10. jQuery Tools UI Library

    CERN Document Server

    Libby, Alex

    2012-01-01

    A practical tutorial with powerful yet simple projects that are quick to implement. This book is aimed at developers who have prior jQuery knowledge, but may not have any prior experience with jQuery Tools. It is possible that they may have started with the basics of jQuery Tools, but want to learn more about how it can be used, as well as get ideas for future projects.

  11. Conceptual Pathway Querying of Natural Logic Knowledge Bases from Text Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer

    2013-01-01

    language than predicate logic. Natural logic accommodates a variety of scientific parlance, ontologies and domain models. It also supports a semantic net or graph view of the knowledge base. This admits computation of relationships between concepts simultaneously through pathfinding in the knowledge base...

  12. Digital Workflows for a 3D Semantic Representation of an Ancient Mining Landscape

    Science.gov (United States)

    Hiebel, G.; Hanke, K.

    2017-08-01

    The ancient mining landscape of Schwaz/Brixlegg in the Tyrol, Austria witnessed mining from prehistoric times to modern times creating a first order cultural landscape when it comes to one of the most important inventions in human history: the production of metal. In 1991 a part of this landscape was lost due to an enormous landslide that reshaped part of the mountain. With our work we want to propose a digital workflow to create a 3D semantic representation of this ancient mining landscape with its mining structures to preserve it for posterity. First, we define a conceptual model to integrate the data. It is based on the CIDOC CRM ontology and CRMgeo for geometric data. To transform our information sources to a formal representation of the classes and properties of the ontology we applied semantic web technologies and created a knowledge graph in RDF (Resource Description Framework). Through the CRMgeo extension coordinate information of mining features can be integrated into the RDF graph and thus related to the detailed digital elevation model that may be visualized together with the mining structures using Geoinformation systems or 3D visualization tools. The RDF network of the triple store can be queried using the SPARQL query language. We created a snapshot of mining, settlement and burial sites in the Bronze Age. The results of the query were loaded into a Geoinformation system and a visualization of known bronze age sites related to mining, settlement and burial activities was created.
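
    Purely as an illustration of the final querying step described above, the sketch below loads a few invented triples with the rdflib Python library and runs a SPARQL query for Bronze Age sites with coordinates. The ex: vocabulary and property names are placeholders, not the project's actual CIDOC CRM / CRMgeo mapping.

        # Illustrative only: a tiny RDF graph and SPARQL query in the spirit of the
        # workflow described above; the vocabulary is a made-up placeholder.
        from rdflib import Graph

        data = """
        @prefix ex: <http://example.org/> .
        ex:site1 a ex:MiningSite ; ex:period "Bronze Age" ; ex:lat "47.44" ; ex:long "11.88" .
        ex:site2 a ex:Settlement ; ex:period "Bronze Age" ; ex:lat "47.43" ; ex:long "11.90" .
        ex:site3 a ex:MiningSite ; ex:period "Medieval"   ; ex:lat "47.45" ; ex:long "11.87" .
        """

        g = Graph()
        g.parse(data=data, format="turtle")

        query = """
        PREFIX ex: <http://example.org/>
        SELECT ?site ?type ?lat ?long WHERE {
            ?site a ?type ; ex:period "Bronze Age" ; ex:lat ?lat ; ex:long ?long .
        }
        """
        for row in g.query(query):
            print(row.site, row.type, row.lat, row.long)   # Bronze Age sites with coordinates

    The query results, carrying coordinates, are the kind of output that can then be loaded into a Geoinformation system for visualization, as the abstract describes.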

  13. A method to implement fine-grained access control for personal health records through standard relational database queries.

    Science.gov (United States)

    Sujansky, Walter V; Faus, Sam A; Stone, Ethan; Brennan, Patricia Flatley

    2010-10-01

    Online personal health records (PHRs) enable patients to access, manage, and share certain of their own health information electronically. This capability creates the need for precise access-control mechanisms that restrict the sharing of data to that intended by the patient. The authors describe the design and implementation of an access-control mechanism for PHR repositories that is modeled on the eXtensible Access Control Markup Language (XACML) standard, but intended to reduce the cognitive and computational complexity of XACML. The authors implemented the mechanism entirely in a relational database system using ANSI-standard SQL statements. Based on a set of access-control rules encoded as relational table rows, the mechanism determines via a single SQL query whether a user who accesses patient data from a specific application is authorized to perform a requested operation on a specified data object. Testing of this query on a moderately large database has demonstrated execution times consistently below 100ms. The authors include the details of the implementation, including algorithms, examples, and a test database as Supplementary materials. Copyright © 2010 Elsevier Inc. All rights reserved.
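
    As a minimal sketch of the general idea, assuming a simplified rule-table layout rather than the authors' published schema, the snippet below shows how a single SQL query over access-control rule rows can decide an authorization request; Python's sqlite3 module is used only to keep the example self-contained.

        # Rule-based authorization resolved by one SQL query over a rules table.
        # Table layout and matching semantics are simplified assumptions.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("""CREATE TABLE access_rules (
            grantee TEXT, application TEXT, operation TEXT, data_class TEXT, allowed INTEGER)""")
        conn.executemany(
            "INSERT INTO access_rules VALUES (?, ?, ?, ?, ?)",
            [("dr_smith", "portal", "read",  "medications", 1),
             ("dr_smith", "portal", "write", "medications", 0),
             ("*",        "portal", "read",  "allergies",   1)])   # '*' acts as a wildcard grantee

        def is_authorized(user, app, op, data_class):
            # One query decides the request; wildcard rows match any user.
            row = conn.execute(
                """SELECT MAX(allowed) FROM access_rules
                   WHERE (grantee = ? OR grantee = '*') AND application = ?
                     AND operation = ? AND data_class = ?""",
                (user, app, op, data_class)).fetchone()
            return bool(row[0])

        print(is_authorized("dr_smith", "portal", "read",  "medications"))  # True
        print(is_authorized("dr_smith", "portal", "write", "medications"))  # False
        print(is_authorized("nurse_j",  "portal", "read",  "allergies"))    # True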

  14. Query Migration from Object Oriented World to Semantic World

    Directory of Open Access Journals (Sweden)

    Nassima Soussi

    2016-06-01

    Full Text Available In the last decades, the object-oriented approach was able to take a large share of the database market, aiming to design and implement structured and reusable software through the composition of independent elements in order to obtain high-performance programs. On the other hand, the mass of information stored in the web is increasing day after day at a vertiginous speed, confronting the current web with the problem of creating a bridge so as to facilitate access to data between different applications and systems, as well as to look for the relevant and exact information desired by users. In addition, all existing approaches for rewriting object-oriented languages to the SPARQL language rely on a model transformation process to guarantee this mapping. All of these reasons have prompted us to write this paper in order to bridge an important gap between these two heterogeneous worlds (object-oriented and semantic web) by proposing the first provably semantics-preserving OQL-to-SPARQL translation algorithm for each element of an OQL query (SELECT clause, FROM clause, FILTER constraint, implicit/explicit join, and union/intersection SELECT queries).
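
    As a purely illustrative, string-level toy of the kind of mapping described, and emphatically not the paper's provably semantics-preserving algorithm, the following function rewrites a simple OQL-style SELECT/FROM/WHERE query into a SPARQL pattern with a FILTER; the ex: namespace and the attribute naming convention are assumptions.

        # Toy OQL-to-SPARQL rewriting: SELECT/FROM become a typed triple pattern,
        # WHERE becomes a FILTER. Deliberately naive, for illustration only.
        def oql_to_sparql(select_attr, from_class, where_attr=None, where_value=None):
            lines = [
                "PREFIX ex: <http://example.org/>",
                f"SELECT ?{select_attr} WHERE {{",
                f"  ?obj a ex:{from_class} ;",
                f"       ex:{select_attr} ?{select_attr} .",
            ]
            if where_attr is not None:
                lines.append(f"  ?obj ex:{where_attr} ?{where_attr} .")
                lines.append(f'  FILTER (?{where_attr} = "{where_value}")')
            lines.append("}")
            return "\n".join(lines)

        # OQL: SELECT p.name FROM Person p WHERE p.city = "Leuven"
        print(oql_to_sparql("name", "Person", "city", "Leuven"))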

  15. Querying Large Physics Data Sets Over an Information Grid

    CERN Document Server

    Baker, N; Kovács, Z; Le Goff, J M; McClatchey, R

    2001-01-01

    Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects such as the role of 'middleware' for Grids are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just Computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially...

  16. An OO visual language definition approach supporting multiple views

    OpenAIRE

    Akehurst, David H.; I.E.E.E. Computer Society

    2000-01-01

    The formal approach to visual language definition is to use graph grammars and/or graph transformation techniques. These techniques focus on specifying the syntax and manipulation rules of the concrete representation. This paper presents a constraint and object-oriented approach to defining visual languages that uses UML and OCL as a definition language. Visual language definitions specify a mapping between concrete and abstract models of possible visual sentences, which can subsequently be ...

  17. Foundations of RDF Databases

    Science.gov (United States)

    Arenas, Marcelo; Gutierrez, Claudio; Pérez, Jorge

    The goal of this paper is to give an overview of the basics of the theory of RDF databases. We provide a formal definition of RDF that includes the features that distinguish this model from other graph data models. We then move into the fundamental issue of querying RDF data. We start by considering the RDF query language SPARQL, which is a W3C Recommendation since January 2008. We provide an algebraic syntax and a compositional semantics for this language, study the complexity of the evaluation problem for different fragments of SPARQL, and consider the problem of optimizing the evaluation of SPARQL queries, showing that a natural fragment of this language has some good properties in this respect. We furthermore study the expressive power of SPARQL, by comparing it with some well-known query languages such as relational algebra. We conclude by considering the issue of querying RDF data in the presence of RDFS vocabulary. In particular, we present a recently proposed extension of SPARQL with navigational capabilities.
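
    As a small, hedged illustration of the operators whose algebra and evaluation complexity the paper studies (conjunction of patterns, OPTIONAL, UNION), the snippet below evaluates one such query with the rdflib Python library over invented data; it shows the constructs only, not the paper's formal semantics.

        # SPARQL graph pattern with UNION and OPTIONAL over a tiny invented dataset.
        from rdflib import Graph

        g = Graph()
        g.parse(format="turtle", data="""
        @prefix ex: <http://example.org/> .
        ex:alice ex:name "Alice" ; ex:email "alice@example.org" .
        ex:bob   ex:name "Bob" .
        ex:carol ex:nick "cc" .
        """)

        results = g.query("""
        PREFIX ex: <http://example.org/>
        SELECT ?label ?email WHERE {
            { ?person ex:name ?label . } UNION { ?person ex:nick ?label . }
            OPTIONAL { ?person ex:email ?email . }
        }
        """)
        for row in results:
            print(row.label, row.email)   # ?email stays unbound where the OPTIONAL part fails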

  18. Pattern graph rewrite systems

    Directory of Open Access Journals (Sweden)

    Aleks Kissinger

    2014-03-01

    Full Text Available String diagrams are a powerful tool for reasoning about physical processes, logic circuits, tensor networks, and many other compositional structures. Dixon, Duncan and Kissinger introduced string graphs, which are combinatoric representations of string diagrams, amenable to automated reasoning about diagrammatic theories via graph rewrite systems. In this extended abstract, we show how the power of such rewrite systems can be greatly extended by introducing pattern graphs, which provide a means of expressing infinite families of rewrite rules where certain marked subgraphs, called !-boxes ("bang boxes"), on both sides of a rule can be copied any number of times or removed. After reviewing the string graph formalism, we show how string graphs can be extended to pattern graphs and how pattern graphs and pattern rewrite rules can be instantiated to concrete string graphs and rewrite rules. We then provide examples demonstrating the expressive power of pattern graphs and how they can be applied to study interacting algebraic structures that are central to categorical quantum mechanics.

  19. The role of economics in the QUERI program: QUERI Series

    Directory of Open Access Journals (Sweden)

    Smith Mark W

    2008-04-01

    Full Text Available Abstract Background The United States (U.S. Department of Veterans Affairs (VA Quality Enhancement Research Initiative (QUERI has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses. Conclusion Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  20. Searching for rare diseases in PubMed: a blind comparison of Orphanet expert query and query based on terminological knowledge.

    Science.gov (United States)

    Griffon, N; Schuers, M; Dhombres, F; Merabti, T; Kerdelhué, G; Rollin, L; Darmoni, S J

    2016-08-02

    Despite international initiatives like Orphanet, it remains difficult to find up-to-date information about rare diseases. The aim of this study is to propose an exhaustive set of queries for PubMed based on terminological knowledge and to evaluate it versus the queries based on expertise provided by the most frequently used resource in Europe: Orphanet. Four rare disease terminologies (MeSH, OMIM, HPO and HRDO) were manually mapped to each other permitting the automatic creation of expended terminological queries for rare diseases. For 30 rare diseases, 30 citations retrieved by Orphanet expert query and/or query based on terminological knowledge were assessed for relevance by two independent reviewers unaware of the query's origin. An adjudication procedure was used to resolve any discrepancy. Precision, relative recall and F-measure were all computed. For each Orphanet rare disease (n = 8982), there was a corresponding terminological query, in contrast with only 2284 queries provided by Orphanet. Only 553 citations were evaluated due to queries with 0 or only a few hits. There were no significant differences between the Orpha query and terminological query in terms of precision, respectively 0.61 vs 0.52 (p = 0.13). Nevertheless, terminological queries retrieved more citations more often than Orpha queries (0.57 vs. 0.33; p = 0.01). Interestingly, Orpha queries seemed to retrieve older citations than terminological queries (p < 0.0001). The terminological queries proposed in this study are now currently available for all rare diseases. They may be a useful tool for both precision or recall oriented literature search.

  1. Genetic algorithms for RDF chain query optimization

    NARCIS (Netherlands)

    Hogenboom, A.C.; Milea, D.V.; Frasincar, F.; Kaymak, U.; Calders, T.; Tuyls, K.; Pechenizkiy, M.

    2009-01-01

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are required for efficient real-time querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL

  2. Contracting a planar graph efficiently

    DEFF Research Database (Denmark)

    Holm, Jacob; Italiano, Giuseppe F.; Karczmarz, Adam

    2017-01-01

    the data structure, we can achieve optimal running times for decremental bridge detection, 2-edge connectivity, maximal 3-edge connected components, and the problem of finding a unique perfect matching for a static planar graph. Furthermore, we improve the running times of algorithms for several planar...

  3. Advanced software development workstation. Engineering scripting language graphical editor: DRAFT design document

    Science.gov (United States)

    1991-01-01

    The Engineering Scripting Language (ESL) is a language designed to allow nonprogramming users to write Higher Order Language (HOL) programs by drawing directed graphs to represent the program and having the system generate the corresponding program in HOL. The ESL system supports user generation of HOL programs through the manipulation of directed graphs. The components of these graphs (nodes, ports, and connectors) are objects, each of which has its own properties and property values. The purpose of the ESL graphical editor is to allow the user to create or edit graph objects which represent programs.

  4. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories

    Science.gov (United States)

    Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

    2017-01-01

    To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution. PMID:29854239

  5. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories.

    Science.gov (United States)

    Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

    2017-01-01

    To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution.

  6. A Comparison and Evaluation of Real-Time Software Systems Modeling Languages

    Science.gov (United States)

    Evensen, Kenneth D.; Weiss, Kathryn Anne

    2010-01-01

    A model-driven approach to real-time software systems development enables the conceptualization of software, fostering a more thorough understanding of its often complex architecture and behavior while promoting the documentation and analysis of concerns common to real-time embedded systems such as scheduling, resource allocation, and performance. Several modeling languages have been developed to assist in the model-driven software engineering effort for real-time systems, and these languages are beginning to gain traction with practitioners throughout the aerospace industry. This paper presents a survey of several real-time software system modeling languages, namely the Architectural Analysis and Design Language (AADL), the Unified Modeling Language (UML), Systems Modeling Language (SysML), the Modeling and Analysis of Real-Time Embedded Systems (MARTE) UML profile, and the AADL for UML profile. Each language has its advantages and disadvantages, and in order to adequately describe a real-time software system's architecture, a complementary use of multiple languages is almost certainly necessary. This paper aims to explore these languages in the context of understanding the value each brings to the model-driven software engineering effort and to determine if it is feasible and practical to combine aspects of the various modeling languages to achieve more complete coverage in architectural descriptions. To this end, each language is evaluated with respect to a set of criteria such as scope, formalisms, and architectural coverage. An example is used to help illustrate the capabilities of the various languages.

  7. Classical dynamics on graphs

    International Nuclear Information System (INIS)

    Barra, F.; Gaspard, P.

    2001-01-01

    We consider the classical evolution of a particle on a graph by using a time-continuous Frobenius-Perron operator that generalizes previous propositions. In this way, the relaxation rates as well as the chaotic properties can be defined for the time-continuous classical dynamics on graphs. These properties are given as the zeros of some periodic-orbit zeta functions. We consider in detail the case of infinite periodic graphs where the particle undergoes a diffusion process. The infinite spatial extension is taken into account by Fourier transforms that decompose the observables and probability densities into sectors corresponding to different values of the wave number. The hydrodynamic modes of diffusion are studied by an eigenvalue problem of a Frobenius-Perron operator corresponding to a given sector. The diffusion coefficient is obtained from the hydrodynamic modes of diffusion and has the Green-Kubo form. Moreover, we study finite but large open graphs that converge to the infinite periodic graph when their size goes to infinity. The lifetime of the particle on the open graph is shown to correspond to the lifetime of a system that undergoes a diffusion process before it escapes

  8. Graph-based Techniques for Topic Classification of Tweets in Spanish

    Directory of Open Access Journals (Sweden)

    Hector Cordobés

    2014-03-01

    Full Text Available Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from the input texts. In this work we present techniques based on graph similarity to classify short texts by topic. In our classifier we build graphs from the input texts, and then use properties of these graphs to classify them. We have tested the resulting algorithm by classifying Twitter messages in Spanish among a predefined set of topics, achieving more than 70% accuracy.

  9. Does query expansion limit our learning? A comparison of social-based expansion to content-based expansion for medical queries on the internet.

    Science.gov (United States)

    Pentoney, Christopher; Harwell, Jeff; Leroy, Gondy

    2014-01-01

    Searching for medical information online is a common activity. While it has been shown that forming good queries is difficult, Google's query suggestion tool, a type of query expansion, aims to facilitate query formation. However, it is unknown how this expansion, which is based on what others searched for, affects the information gathering of the online community. To measure the impact of social-based query expansion, this study compared it with content-based expansion, i.e., what is really in the text. We used 138,906 medical queries from the AOL User Session Collection and expanded them using Google's Autocomplete method (social-based) and the content of the Google Web Corpus (content-based). We evaluated the specificity and ambiguity of the expansion terms for trigram queries. We also looked at the impact on the actual results using domain diversity and expansion edit distance. Results showed that the social-based method provided more precise expansion terms as well as terms that were less ambiguous. Expanded queries do not differ significantly in diversity when expanded using the social-based method (6.72 different domains returned in the first ten results, on average) vs. content-based method (6.73 different domains, on average).

  10. Semantic content-based recommendations using semantic graphs.

    Science.gov (United States)

    Guo, Weisen; Kraines, Steven B

    2010-01-01

    Recommender systems (RSs) can be useful for suggesting items that might be of interest to specific users. Most existing content-based recommendation (CBR) systems are designed to recommend items based on text content, and the items in these systems are usually described with keywords. However, similarity evaluations based on keywords suffer from the ambiguity of natural languages. We present a semantic CBR method that uses Semantic Web technologies to recommend items that are more similar semantically with the items that the user prefers. We use semantic graphs to represent the items and we calculate the similarity scores for each pair of semantic graphs using an inverse graph frequency algorithm. The items having higher similarity scores to the items that are known to be preferred by the user are recommended.

  11. Query-by-example surgical activity detection.

    Science.gov (United States)

    Gao, Yixin; Vedula, S Swaroop; Lee, Gyusung I; Lee, Mija R; Khudanpur, Sanjeev; Hager, Gregory D

    2016-06-01

    Easy acquisition of surgical data opens many opportunities to automate skill evaluation and teaching. Current technology to search tool motion data for surgical activity segments of interest is limited by the need for manual pre-processing, which can be prohibitive at scale. We developed a content-based information retrieval method, query-by-example (QBE), to automatically detect activity segments within surgical data recordings of long duration that match a query. The example segment of interest (query) and the surgical data recording (target trial) are time series of kinematics. Our approach includes an unsupervised feature learning module using a stacked denoising autoencoder (SDAE), two scoring modules based on asymmetric subsequence dynamic time warping (AS-DTW) and template matching, respectively, and a detection module. A distance matrix of the query against the trial is computed using the SDAE features, followed by AS-DTW combined with template scoring, to generate a ranked list of candidate subsequences (substrings). To evaluate the quality of the ranked list against the ground-truth, thresholding conventional DTW distances and bipartite matching are applied. We computed the recall, precision, F1-score, and a Jaccard index-based score on three experimental setups. We evaluated our QBE method using a suture throw maneuver as the query, on two tool motion datasets (JIGSAWS and MISTIC-SL) captured in a training laboratory. We observed a recall of 93, 90 and 87 % and a precision of 93, 91, and 88 % with same surgeon same trial (SSST), same surgeon different trial (SSDT) and different surgeon (DS) experiment setups on JIGSAWS, and a recall of 87, 81 and 75 % and a precision of 72, 61, and 53 % with SSST, SSDT and DS experiment setups on MISTIC-SL, respectively. We developed a novel, content-based information retrieval method to automatically detect multiple instances of an activity within long surgical recordings. Our method demonstrated adequate recall
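
    For readers unfamiliar with the underlying distance, a plain dynamic time warping (DTW) implementation is sketched below. The paper's method uses SDAE-learned features and an asymmetric subsequence DTW variant, so this is only the basic building block, and the toy sequences are invented.

        # Standard dynamic time warping (DTW) between two 1-D kinematic sequences.
        def dtw(a, b):
            n, m = len(a), len(b)
            INF = float("inf")
            cost = [[INF] * (m + 1) for _ in range(n + 1)]
            cost[0][0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    d = abs(a[i - 1] - b[j - 1])
                    cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                         cost[i][j - 1],      # deletion
                                         cost[i - 1][j - 1])  # match
            return cost[n][m]

        query = [0.0, 0.5, 1.0, 0.5, 0.0]           # e.g. one motion cycle used as the query
        trial = [0.0, 0.1, 0.6, 1.0, 0.9, 0.4, 0.0]  # a longer recording to compare against
        print(dtw(query, trial))                     # smaller value = better temporal alignment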

  12. ViSlang: A system for interpreted domain-specific languages for scientific visualization

    KAUST Repository

    Rautek, Peter

    2014-12-31

    Researchers from many domains use scientific visualization in their daily practice. Existing implementations of algorithms usually come with a graphical user interface (high-level interface), or as software library or source code (low-level interface). In this paper we present a system that integrates domain-specific languages (DSLs) and facilitates the creation of new DSLs. DSLs provide an effective interface for domain scientists avoiding the difficulties involved with low-level interfaces and at the same time offering more flexibility than high-level interfaces. We describe the design and implementation of ViSlang, an interpreted language specifically tailored for scientific visualization. A major contribution of our design is the extensibility of the ViSlang language. Novel DSLs that are tailored to the problems of the domain can be created and integrated into ViSlang. We show that our approach can be added to existing user interfaces to increase the flexibility for expert users on demand, but at the same time does not interfere with the user experience of novice users. To demonstrate the flexibility of our approach we present new DSLs for volume processing, querying and visualization. We report the implementation effort for new DSLs and compare our approach with Matlab and Python implementations in terms of run-time performance.

  13. Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches.

    Science.gov (United States)

    Crichton, Gamal; Guo, Yufan; Pyysalo, Sampo; Korhonen, Anna

    2018-05-21

    Link prediction in biomedical graphs has several important applications including predicting Drug-Target Interactions (DTI), Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary works were done on this and report promising results. However they did not use realistic settings like time-slicing, evaluate performances with comprehensive metrics or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼ 6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to those of established baselines and report performance across five metrics. In random- and time-sliced experiments when the neural network methods were able to learn good node representations and there was a negligible amount of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼ 15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼ 0.3) the approaches were mostly equal, but at higher recall levels across all nodes and average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours; potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them. Our results indicate
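
    The Common Neighbours baseline mentioned above can be sketched in a few lines with the networkx library; the toy graph and candidate pairs below are invented, whereas the paper's evaluations use graphs with millions of edges and time-sliced splits.

        # Common Neighbours link prediction: score a candidate link by how many
        # neighbours its endpoints share, then rank the candidate pairs.
        import networkx as nx

        g = nx.Graph()
        g.add_edges_from([("drugA", "geneX"), ("drugB", "geneX"), ("drugA", "geneY"),
                          ("drugC", "geneY"), ("geneX", "geneZ"), ("geneY", "geneZ")])

        candidates = [("drugA", "drugB"), ("drugA", "drugC"), ("drugB", "drugC")]
        scores = {(u, v): len(list(nx.common_neighbors(g, u, v))) for u, v in candidates}

        for (u, v), s in sorted(scores.items(), key=lambda kv: -kv[1]):
            print(u, "-", v, "common neighbours:", s)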

  14. Modern graph theory

    CERN Document Server

    Bollobás, Béla

    1998-01-01

    The time has now come when graph theory should be part of the education of every serious student of mathematics and computer science, both for its own sake and to enhance the appreciation of mathematics as a whole. This book is an in-depth account of graph theory, written with such a student in mind; it reflects the current state of the subject and emphasizes connections with other branches of pure mathematics. The volume grew out of the author's earlier book, Graph Theory -- An Introductory Course, but its length is well over twice that of its predecessor, allowing it to reveal many exciting new developments in the subject. Recognizing that graph theory is one of several courses competing for the attention of a student, the book contains extensive descriptive passages designed to convey the flavor of the subject and to arouse interest. In addition to a modern treatment of the classical areas of graph theory such as coloring, matching, extremal theory, and algebraic graph theory, the book presents a detailed ...

  15. Experimental quantum private queries with linear optics

    International Nuclear Information System (INIS)

    De Martini, Francesco; Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo; Nagali, Eleonora; Sansoni, Linda; Sciarrino, Fabio

    2009-01-01

    The quantum private query is a quantum cryptographic protocol to recover information from a database, preserving both user and data privacy: the user can test whether someone has retained information on which query was asked and the database provider can test the amount of information released. Here we discuss a variant of the quantum private query algorithm that admits a simple linear optical implementation: it employs the photon's momentum (or time slot) as address qubits and its polarization as bus qubit. A proof-of-principle experimental realization is implemented.

  16. jQuery Mobile Up and Running

    CERN Document Server

    Firtman, Maximiliano

    2012-01-01

    Would you like to build one mobile web application that works on iPad and Kindle Fire as well as iPhone and Android smartphones? This introductory guide to jQuery Mobile shows you how. Through a series of hands-on exercises, you'll learn the best ways to use this framework's many interface components to build customizable, multiplatform apps. You don't need any programming skills or previous experience with jQuery to get started. By the time you finish this book, you'll know how to create responsive, Ajax-based interfaces that work on a variety of smartphones and tablets, using jQuery Mobile

  17. PatternQuery: web application for fast detection of biomacromolecular structural patterns in the entire Protein Data Bank.

    Science.gov (United States)

    Sehnal, David; Pravda, Lukáš; Svobodová Vařeková, Radka; Ionescu, Crina-Maria; Koča, Jaroslav

    2015-07-01

    Well defined biomacromolecular patterns such as binding sites, catalytic sites, specific protein or nucleic acid sequences, etc. precisely modulate many important biological phenomena. We introduce PatternQuery, a web-based application designed for detection and fast extraction of such patterns. The application uses a unique query language with Python-like syntax to define the patterns that will be extracted from datasets provided by the user, or from the entire Protein Data Bank (PDB). Moreover, the database-wide search can be restricted using a variety of criteria, such as PDB ID, resolution, and organism of origin, to provide only relevant data. The extraction generally takes a few seconds for several hundreds of entries, up to approximately one hour for the whole PDB. The detected patterns are made available for download to enable further processing, as well as presented in a clear tabular and graphical form directly in the browser. The unique design of the language and the provided service could pave the way towards novel PDB-wide analyses, which were either difficult or unfeasible in the past. The application is available free of charge at http://ncbr.muni.cz/PatternQuery. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Time-related patient data retrieval for the case studies from the pharmacogenomics research network

    Science.gov (United States)

    Zhu, Qian; Tao, Cui; Ding, Ying; Chute, Christopher G.

    2012-01-01

    There are lots of question-based data elements from the pharmacogenomics research network (PGRN) studies. Many data elements contain temporal information. To semantically represent these elements so that they can be machine processable is a challenging problem for the following reasons: (1) the designers of these studies usually do not have the knowledge of any computer modeling and query languages, so that the original data elements usually are represented in spreadsheets in human languages; and (2) the time aspects in these data elements can be too complex to be represented faithfully in a machine-understandable way. In this paper, we introduce our efforts on representing these data elements using semantic web technologies. We have developed an ontology, CNTRO, for representing clinical events and their temporal relations in the web ontology language (OWL). Here we use CNTRO to represent the time aspects in the data elements. We have evaluated 720 time-related data elements from PGRN studies. We adapted and extended the knowledge representation requirements for EliXR-TIME to categorize our data elements. A CNTRO-based SPARQL query builder has been developed to customize users’ own SPARQL queries for each knowledge representation requirement. The SPARQL query builder has been evaluated with a simulated EHR triple store to ensure its functionalities. PMID:23076712

  19. Time-related patient data retrieval for the case studies from the pharmacogenomics research network.

    Science.gov (United States)

    Zhu, Qian; Tao, Cui; Ding, Ying; Chute, Christopher G

    2012-11-01

    There are lots of question-based data elements from the pharmacogenomics research network (PGRN) studies. Many data elements contain temporal information. To semantically represent these elements so that they can be machine processable is a challenging problem for the following reasons: (1) the designers of these studies usually do not have the knowledge of any computer modeling and query languages, so that the original data elements usually are represented in spreadsheets in human languages; and (2) the time aspects in these data elements can be too complex to be represented faithfully in a machine-understandable way. In this paper, we introduce our efforts on representing these data elements using semantic web technologies. We have developed an ontology, CNTRO, for representing clinical events and their temporal relations in the web ontology language (OWL). Here we use CNTRO to represent the time aspects in the data elements. We have evaluated 720 time-related data elements from PGRN studies. We adapted and extended the knowledge representation requirements for EliXR-TIME to categorize our data elements. A CNTRO-based SPARQL query builder has been developed to customize users' own SPARQL queries for each knowledge representation requirement. The SPARQL query builder has been evaluated with a simulated EHR triple store to ensure its functionalities.

  20. Predicting Drug Recalls From Internet Search Engine Queries.

    Science.gov (United States)

    Yom-Tov, Elad

    2017-01-01

    Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine, which mentioned one of the 5195 pharmaceutical drugs during 2015 and all recall notifications issued by the Food and Drug Administration (FDA) during that year. By using attributes that quantify the change in query volume at the state level, we attempted to predict if a recall of a specific drug will be ordered by FDA in a time horizon ranging from 1 to 40 days in future. Our results show that future drug recalls can indeed be identified with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as prediction is made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium-risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate in early warning of faulty batches of medicines.

  1. Unemployment Insurance Query (UIQ)

    Data.gov (United States)

    Social Security Administration — The Unemployment Insurance Query (UIQ) provides State Unemployment Insurance agencies real-time online access to SSA data. This includes SSN verification and Title...

  2. Graphs and Homomorphisms

    CERN Document Server

    Hell, Pavol

    2004-01-01

    This is a book about graph homomorphisms. Graph theory is now an established discipline but the study of graph homomorphisms has only recently begun to gain wide acceptance and interest. The subject gives a useful perspective in areas such as graph reconstruction, products, fractional and circular colourings, and has applications in complexity theory, artificial intelligence, telecommunication, and, most recently, statistical physics.Based on the authors' lecture notes for graduate courses, this book can be used as a textbook for a second course in graph theory at 4th year or master's level an

  3. A graph model for opportunistic network coding

    KAUST Repository

    Sorour, Sameh

    2015-08-12

    © 2015 IEEE. Recent advancements in graph-based analysis and solutions of instantly decodable network coding (IDNC) trigger the interest to extend them to more complicated opportunistic network coding (ONC) scenarios, with limited increase in complexity. In this paper, we design a simple IDNC-like graph model for a specific subclass of ONC, by introducing a more generalized definition of its vertices and the notion of vertex aggregation in order to represent the storage of non-instantly-decodable packets in ONC. Based on this representation, we determine the set of pairwise vertex adjacency conditions that can populate this graph with edges so as to guarantee decodability or aggregation for the vertices of each clique in this graph. We then develop the algorithmic procedures that can be applied on the designed graph model to optimize any performance metric for this ONC subclass. A case study on reducing the completion time shows that the proposed framework improves on the performance of IDNC and gets very close to the optimal performance.

  4. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
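
    As a minimal sketch of the kind of backbone RMSD computation the abstract mentions, assuming two equal-length fragments already given as coordinate arrays and no superposition step, the usual formula RMSD = sqrt(mean over atoms of squared distance) can be coded directly; Fragger's exact distance definition may differ.

        # Coordinate RMSD between two fragments of equal length (no superposition).
        import numpy as np

        def rmsd(a, b):
            a, b = np.asarray(a, float), np.asarray(b, float)
            assert a.shape == b.shape            # same number of atoms, 3 coordinates each
            return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

        frag_query = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.2, 0.1]]
        frag_hit   = [[0.1, 0.0, 0.0], [1.4, 0.1, 0.0], [3.1, 0.1, 0.2]]
        print(rmsd(frag_query, frag_hit))        # small value => close structural match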

  5. Path-based Queries on Trajectory Data

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Pelekis, Nikos; Theodoridis, Yannis

    2014-01-01

    In traffic research, management, and planning a number of path-based analyses are heavily used, e.g., for computing turn-times, evaluating green waves, or studying traffic flow. These analyses require retrieving the trajectories that follow the full path being analyzed. Existing path queries cannot sufficiently support such path-based analyses because they retrieve all trajectories that touch any edge in the path. In this paper, we define and formalize the strict path query. This is a novel query type tailored to support path-based analysis, where trajectories must follow all edges in the path ... a specific path by only retrieving data from the first and last edge in the path. To correctly answer strict path queries existing network-constrained trajectory indexes must retrieve data from all edges in the path. An extensive performance study of NETTRA using a very large real-world trajectory data set...

  6. Graphical Language for Data Processing

    Science.gov (United States)

    Alphonso, Keith

    2011-01-01

    A graphical language for processing data allows processing elements to be connected with virtual wires that represent data flows between processing modules. The processing of complex data, such as lidar data, requires many different algorithms to be applied. The purpose of this innovation is to automate the processing of complex data, such as LIDAR, without the need for complex scripting and programming languages. The system consists of a set of user-interface components that allow the user to drag and drop various algorithmic and processing components onto a process graph. By working graphically, the user can completely visualize the process flow and create complex diagrams. This innovation supports the nesting of graphs, such that a graph can be included in another graph as a single step for processing. In addition to the user interface components, the system includes a set of .NET classes that represent the graph internally. These classes provide the internal system representation of the graphical user interface. The system includes a graph execution component that reads the internal representation of the graph (as described above) and executes that graph. The execution of the graph follows the interpreted model of execution in that each node is traversed and executed from the original internal representation. In addition, there are components that allow external code elements, such as algorithms, to be easily integrated into the system, thus making the system infinitely expandable.

  7. Querying on Federated Sensor Networks

    Directory of Open Access Journals (Sweden)

    Zuhal Can

    2016-09-01

    Full Text Available A Federated Sensor Network (FSN) is a network of geographically distributed Wireless Sensor Networks (WSNs) called islands. For querying on an FSN, we introduce the Layered Federated Sensor Network (L-FSN) Protocol. For layered management, L-FSN provides communication among islands by its inter-island querying protocol by which a query packet routing path is determined according to some path selection policies. L-FSN allows autonomous management of each island by island-specific intra-island querying protocols that can be selected according to island properties. We evaluate the applicability of L-FSN and compare the L-FSN protocol with various querying protocols running on the flat federation model. Flat federation is a method to federate islands by running a single querying protocol on an entire FSN without distinguishing communication among and within islands. For flat federation, we select a querying protocol from geometrical, hierarchical cluster-based, hash-based, and tree-based WSN querying protocol categories. We found that a layered federation of islands by L-FSN increases the querying performance with respect to energy-efficiency, query resolving distance, and query resolving latency. Moreover, L-FSN’s flexibility of choosing intra-island querying protocols regarding the island size brings advantages on energy-efficiency and query resolving latency.

  8. Particle transport in breathing quantum graph

    International Nuclear Information System (INIS)

    Matrasulov, D.U.; Yusupov, J.R.; Sabirov, K.K.; Sobirov, Z.A.

    2012-01-01

    Full text: Particle transport in nanoscale networks and discrete structures is of fundamental and practical importance. Usually such systems are modeled by so-called quantum graphs, systems that have attracted much attention in physics and mathematics during the past two decades [1-5]. During the last two decades quantum graphs have found numerous applications in modeling different discrete structures and networks in nanoscale and mesoscopic physics (e.g., see reviews [1-3]). Despite considerable progress made in the study of particle dynamics, most of the problems deal with the unperturbed case, and the case of time-dependent perturbation has not yet been explored. In this work we treat particle dynamics for a quantum star graph with time-dependent bonds. In particular, we consider harmonically breathing quantum star graphs, and the cases of monotonically contracting and expanding graphs. The latter can be solved exactly analytically. Edge boundaries are considered to be time-dependent, while the branching point is assumed to be fixed. Quantum dynamics of a particle in such graphs is studied by solving the Schrödinger equation with time-dependent boundary conditions given on a star graph. The time-dependence of the average kinetic energy is analyzed. The space-time evolution of the Gaussian wave packet is treated for a harmonically breathing star graph. It is found that for certain frequencies the energy is a periodic function of time, while for others it can be a non-monotonically growing function of time. Such a feature can be caused by possible synchronization of the particle's motion and the motions of the moving edges of graph bonds. (authors) References: [1] Tsampikos Kottos and Uzy Smilansky, Ann. Phys., 76, 274 (1999). [2] Sven Gnutzmann and Uzy Smilansky, Adv. Phys. 55, 527 (2006). [3] S. Gnutzmann, J.P. Keating, F. Piotet, Ann. Phys., 325, 2595 (2010). [4] P. Exner, P. Seba, P. Stovicek, J. Phys. A: Math. Gen. 21, 4009 (1988). [5] J. Boman, P. Kurasov, Adv. Appl. Math., 35, 58 (2005)

  9. A Multi-Query Optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    Database systems allow for concurrent use of several applications (and query interfaces). Each application generates an "optimal" plan - a sequence of low-level database operators - for accessing the database. The queries posed by users through the same application can be optimized

  10. A multi-query optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    Database systems allow for concurrent use of several applications (and query interfaces). Each application generates an "optimal" plan - a sequence of low-level database operators - for accessing the database. The queries posed by users through the same application can be optimized

  11. Steiner Distance in Graphs--A Survey

    OpenAIRE

    Mao, Yaping

    2017-01-01

    For a connected graph $G$ of order at least $2$ and $S\subseteq V(G)$, the \emph{Steiner distance} $d_G(S)$ among the vertices of $S$ is the minimum size among all connected subgraphs whose vertex sets contain $S$. In this paper, we summarize the known results on the Steiner distance parameters, including Steiner distance, Steiner diameter, Steiner center, Steiner median, Steiner interval, Steiner distance hereditary graph, Steiner distance stable graph, average Steiner distance, and Steiner ...

  12. Continuous-time quantum walk on an extended star graph: Trapping and superradiance transition

    Science.gov (United States)

    Yalouz, Saad; Pouthier, Vincent

    2018-02-01

    A tight-binding model is introduced for describing the dynamics of an exciton on an extended star graph whose central node is occupied by a trap. On this graph, the exciton dynamics is governed by two kinds of eigenstates: many eigenstates are associated with degenerate real eigenvalues insensitive to the trap, whereas three decaying eigenstates characterized by complex energies contribute to the trapping process. It is shown that the excitonic population absorbed by the trap depends on the size of the graph, only. By contrast, both the size parameters and the absorption rate control the dynamics of the trapping. When these parameters are judiciously chosen, the efficiency of the transfer is optimized resulting in the minimization of the absorption time. Analysis of the eigenstates reveals that such a feature arises around the superradiance transition. Moreover, depending on the size of the network, two situations are highlighted where the transport efficiency is either superoptimized or suboptimized.

  13. Using Graph and Vertex Entropy to Compare Empirical Graphs with Theoretical Graph Models

    Directory of Open Access Journals (Sweden)

    Tomasz Kajdanowicz

    2016-09-01

    Full Text Available Over the years, several theoretical graph generation models have been proposed. Among the most prominent are: the Erdős–Rényi random graph model, Watts–Strogatz small world model, Albert–Barabási preferential attachment model, Price citation model, and many more. Often, researchers working with real-world data are interested in understanding the generative phenomena underlying their empirical graphs. They want to know which of the theoretical graph generation models would most probably generate a particular empirical graph. In other words, they expect some similarity assessment between the empirical graph and graphs artificially created from theoretical graph generation models. Usually, in order to assess the similarity of two graphs, centrality measure distributions are compared. For a theoretical graph model this means comparing the empirical graph to a single realization of a theoretical graph model, where the realization is generated from the given model using an arbitrary set of parameters. The similarity between centrality measure distributions can be measured using standard statistical tests, e.g., the Kolmogorov–Smirnov test of distances between cumulative distributions. However, this approach is both error-prone and leads to incorrect conclusions, as we show in our experiments. Therefore, we propose a new method for graph comparison and type classification by comparing the entropies of centrality measure distributions (degree centrality, betweenness centrality, closeness centrality). We demonstrate that our approach can help assign the empirical graph to the most similar theoretical model using a simple unsupervised learning method.
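
    A minimal sketch of the proposed comparison, restricted to degree centrality, is given below: compute the Shannon entropy of the degree distribution of candidate model graphs (an empirical graph would be treated the same way) and pick the closest model. The networkx generators and their parameters are arbitrary choices for illustration; the paper also uses betweenness and closeness centrality.

        # Shannon entropy of the degree distribution for two theoretical graph models.
        import math
        from collections import Counter
        import networkx as nx

        def degree_entropy(g):
            counts = Counter(d for _, d in g.degree())
            n = g.number_of_nodes()
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        er = nx.erdos_renyi_graph(1000, 0.01, seed=1)    # Erdos-Renyi random graph
        ba = nx.barabasi_albert_graph(1000, 5, seed=1)   # preferential attachment

        print("ER degree entropy:", round(degree_entropy(er), 3))
        print("BA degree entropy:", round(degree_entropy(ba), 3))
        # The model whose entropy is closest to the empirical graph's would be the
        # most plausible generator under this criterion.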

  14. A general approach to query flattening

    NARCIS (Netherlands)

    van Ruth, J.

    The translation of queries from complex data models to simpler data models is a recurring theme in the construction of efficient data management systems. In this paper we propose a general framework to guide the translation from data models with nested types to a flat relational model (query

  15. The value of Retrospective and Concurrent Think Aloud in formative usability testing of a physician data query tool.

    Science.gov (United States)

    Peute, Linda W P; de Keizer, Nicolette F; Jaspers, Monique W M

    2015-06-01

    To compare the performance of the Concurrent (CTA) and Retrospective (RTA) Think Aloud method and to assess their value in a formative usability evaluation of an Intensive Care Registry-physician data query tool designed to support ICU quality improvement processes. Sixteen representative intensive care physicians participated in the usability evaluation study. Subjects were allocated to either the CTA or RTA method by a matched randomized design. Each subject performed six usability-testing tasks of varying complexity in the query tool in a real-working context. Methods were compared with regard to number and type of problems detected. Verbal protocols of CTA and RTA were analyzed in depth to assess differences in verbal output. Standardized measures were applied to assess thoroughness in usability problem detection weighted per problem severity level and method overall effectiveness in detecting usability problems with regard to the time subjects spent per method. The usability evaluation of the data query tool revealed a total of 43 unique usability problems that the intensive care physicians encountered. CTA detected unique usability problems with regard to graphics/symbols, navigation issues, error messages, and the organization of information on the query tool's screens. RTA detected unique issues concerning system match with subjects' language and applied terminology. The in-depth verbal protocol analysis of CTA provided information on intensive care physicians' query design strategies. Overall, CTA performed significantly better than RTA in detecting usability problems. CTA usability problem detection effectiveness was 0.80 vs. 0.62 (pusability problems of a moderate (0.85 vs. 0.7) and severe nature (0.71 vs. 0.57). In this study, the CTA is more effective in usability-problem detection and provided clarification of intensive care physician query design strategies to inform redesign of the query tool. However, CTA does not outperform RTA. The RTA

  16. Generalizing a categorization of students’ interpretations of linear kinematics graphs

    Directory of Open Access Journals (Sweden)

    Laurens Bollen

    2016-02-01

    Full Text Available We have investigated whether and how a categorization of responses to questions on linear distance-time graphs, based on a study of Irish students enrolled in an algebra-based course, could be adopted and adapted to responses from students enrolled in calculus-based physics courses at universities in Flanders, Belgium (KU Leuven) and the Basque Country, Spain (University of the Basque Country). We discuss how we adapted the categorization to accommodate a much more diverse student cohort and explain how the prior knowledge of students may account for many differences in the prevalence of approaches and success rates. Although calculus-based physics students make fewer mistakes than algebra-based physics students, they encounter similar difficulties that are often related to incorrectly dividing two coordinates. We verified that a qualitative understanding of kinematics is an important but not sufficient condition for students to determine a correct value for the speed. When comparing responses to questions on linear distance-time graphs with responses to isomorphic questions on linear water level versus time graphs, we observed that the context of a question influences the approach students use. Neither qualitative understanding nor an ability to find the slope of a context-free graph proved to be a reliable predictor for the approach students use when they determine the instantaneous speed.

  17. Generalizing a categorization of students' interpretations of linear kinematics graphs

    Science.gov (United States)

    Bollen, Laurens; De Cock, Mieke; Zuza, Kristina; Guisasola, Jenaro; van Kampen, Paul

    2016-06-01

    We have investigated whether and how a categorization of responses to questions on linear distance-time graphs, based on a study of Irish students enrolled in an algebra-based course, could be adopted and adapted to responses from students enrolled in calculus-based physics courses at universities in Flanders, Belgium (KU Leuven) and the Basque Country, Spain (University of the Basque Country). We discuss how we adapted the categorization to accommodate a much more diverse student cohort and explain how the prior knowledge of students may account for many differences in the prevalence of approaches and success rates. Although calculus-based physics students make fewer mistakes than algebra-based physics students, they encounter similar difficulties that are often related to incorrectly dividing two coordinates. We verified that a qualitative understanding of kinematics is an important but not sufficient condition for students to determine a correct value for the speed. When comparing responses to questions on linear distance-time graphs with responses to isomorphic questions on linear water level versus time graphs, we observed that the context of a question influences the approach students use. Neither qualitative understanding nor an ability to find the slope of a context-free graph proved to be a reliable predictor for the approach students use when they determine the instantaneous speed.

  18. Collective spatial keyword querying

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2011-01-01

    With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group collectively satisfy a query. We define the problem of retrieving a group of spatial web objects such that the group's keywords cover the query's keywords and such that objects are nearest to the query location and have the lowest inter-object distances. Specifically, we study two variants of this problem, both of which are NP-complete. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems. ...
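    The record's exact and approximate algorithms are more involved than this; the sketch below only conveys the flavour of the problem with a naive greedy heuristic over invented objects (cover as many still-uncovered query keywords as possible, break ties by distance to the query location).

```python
import math

def greedy_collective_query(objects, query_point, query_keywords):
    """Greedy sketch of collective spatial keyword retrieval: repeatedly pick
    the object covering the most uncovered query keywords, preferring objects
    closer to the query location. Illustrative only; it ignores the
    inter-object distance component of the real cost functions."""
    remaining = set(query_keywords)
    group, candidates = [], list(objects)
    while remaining and candidates:
        best = min(candidates,
                   key=lambda o: (-len(remaining & o["keywords"]),
                                  math.dist(query_point, o["location"])))
        if not (remaining & best["keywords"]):
            break                       # nothing left can improve coverage
        group.append(best["id"])
        remaining -= best["keywords"]
        candidates.remove(best)
    return group, remaining

objects = [
    {"id": "o1", "location": (1.0, 1.0), "keywords": {"coffee", "wifi"}},
    {"id": "o2", "location": (0.5, 0.8), "keywords": {"printing"}},
    {"id": "o3", "location": (5.0, 5.0), "keywords": {"coffee", "printing", "wifi"}},
]
print(greedy_collective_query(objects, (0.0, 0.0), {"coffee", "printing"}))
```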

  19. Path Minima Queries in Dynamic Weighted Trees

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao

    2011-01-01

    In the path minima problem on a tree, each edge is assigned a weight and a query asks for the edge with minimum weight on a path between two nodes. For the dynamic version of the problem, where the edge weights can be updated, we give data structures that achieve optimal query time ...
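    The record's data structures answer such queries in optimal time under weight updates; the naive sketch below merely states the problem by walking the unique tree path for a single query.

```python
import networkx as nx

def path_minimum_edge(tree, u, v, weight="weight"):
    """Naive path-minima query: walk the unique u-v path in the tree and
    return its lightest edge. O(n) per query; the cited data structures are
    far faster and also support weight updates."""
    path = nx.shortest_path(tree, u, v)           # unique path in a tree
    edges = list(zip(path, path[1:]))
    return min(edges, key=lambda e: tree.edges[e][weight])

t = nx.Graph()
t.add_weighted_edges_from([(1, 2, 4.0), (2, 3, 1.5), (3, 4, 7.0), (2, 5, 2.5)])
print(path_minimum_edge(t, 1, 4))   # -> (2, 3), the lightest edge on the 1-4 path
```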

  20. Dynamic Planar Convex Hull with Optimal Query Time and O(log n · log log n ) Update Time

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Jakob, Riko

    2000-01-01

    The dynamic maintenance of the convex hull of a set of points in the plane is one of the most important problems in computational geometry. We present a data structure supporting point insertions in amortized O(log n · log log log n) time, point deletions in amortized O(log n · log log n) time, and various queries about the convex hull in optimal O(log n) worst-case time. The data structure requires O(n) space. Applications of the new dynamic convex hull data structure are improved deterministic algorithms for the k-level problem and the red-blue segment intersection problem where all red and all ...

  1. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    … metadata artifacts (e.g., queries) to assist users with the analysis. However, modeling and sharing of most of these artifacts are typically overlooked. Thus, in this paper we focus on the query metadata artifact in the Exploratory OLAP context and propose an RDF-based vocabulary for its representation, sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply the method to a use case of transforming queries from SM4MQ to a vector representation. For the use case, we developed the prototype and performed an evaluation that shows how our approach can significantly ease and support user assistance such as query recommendation.

  2. Using search engine query data to track pharmaceutical utilization: a study of statins.

    Science.gov (United States)

    Schuster, Nathaniel M; Rogers, Mary A M; McMahon, Laurence F

    2010-08-01

    To examine temporal and geographic associations between Google queries for health information and healthcare utilization benchmarks. Retrospective longitudinal study. Using Google Trends and Google Insights for Search data, the search terms Lipitor (atorvastatin calcium; Pfizer, Ann Arbor, MI) and simvastatin were evaluated for change over time and for association with Lipitor revenues. The relationship between query data and community-based resource use per Medicare beneficiary was assessed for 35 US metropolitan areas. Google queries for Lipitor significantly decreased from January 2004 through June 2009 and queries for simvastatin significantly increased (P patent (P global revenues from 2004 to 2008 (P search engine queries for medical information correlate with pharmaceutical revenue and with overall healthcare utilization in a community. This suggests that search query data can track community-wide characteristics in healthcare utilization and have the potential for informing payers and policy makers regarding trends in utilization.

  3. A linear time algorithm for minimum fill-in and treewidth for distance hereditary graphs

    NARCIS (Netherlands)

    Broersma, Haitze J.; Dahlhaus, E.; Kloks, A.J.J.; Kloks, T.

    2000-01-01

    A graph is distance hereditary if it preserves distances in all its connected induced subgraphs. The MINIMUM FILL-IN problem is the problem of finding a chordal supergraph with the smallest possible number of edges. The TREEWIDTH problem is the problem of finding a chordal embedding of the graph
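    A quick way to see the objects involved (a chordal embedding and the width of the resulting tree decomposition) is networkx's min-fill-in heuristic; note that this is a general-purpose approximation, not the linear-time exact algorithm for distance-hereditary graphs that the record describes, and the example graph is just a small illustration.

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_fill_in

# Small example graph: a 4-cycle with a chord (two triangles sharing an edge).
g = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)])

# The heuristic greedily eliminates the vertex whose elimination adds the
# fewest fill-in edges, returning an upper bound on the treewidth together
# with the corresponding tree decomposition.
width, decomposition = treewidth_min_fill_in(g)
print("heuristic treewidth:", width)           # 2 for this chordal graph
print("bags:", list(decomposition.nodes()))    # bags are frozensets of vertices
```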

  4. WSU-IR at TREC 2015 Clinical Decision Support Track: Joint Weighting of Explicit and Latent Medical Query Concepts from Diverse Sources

    Science.gov (United States)

    2015-11-20

    … “Iron Deficiency Anemia” was represented using the Indri query language as follows: #weight( 0.40 #combine( Iron Deficiency Anemia ) 0.35 #combine( #od4( Iron Deficiency ) #od4( Deficiency Anemia ) ) 0.45 #combine( #uw17( Iron Deficiency ) #uw17( Deficiency Anemia ) ) ), where 0.40, 0.35 and 0.45 are the weights of the corresponding concept types. The window sizes for ordered ...

  5. Sensor-Generated Time Series Events: A Definition Language

    Science.gov (United States)

    Anguera, Aurea; Lara, Juan A.; Lizcano, David; Martínez, Maria Aurora; Pazos, Juan

    2012-01-01

    There are now a great many domains where information is recorded by sensors over a limited time period or on a permanent basis. This data flow leads to sequences of data known as time series. In many domains, like seismography or medicine, time series analysis focuses on particular regions of interest, known as events, whereas the remainder of the time series contains hardly any useful information. In these domains, there is a need for mechanisms to identify and locate such events. In this paper, we propose an events definition language that is general enough to be used to easily and naturally define events in time series recorded by sensors in any domain. The proposed language has been applied to the definition of time series events generated within the branch of medicine dealing with balance-related functions in human beings. A device, called posturograph, is used to study balance-related functions. The platform has four sensors that record the pressure intensity being exerted on the platform, generating four interrelated time series. As opposed to the existing ad hoc proposals, the results confirm that the proposed language is valid, that is generally applicable and accurate, for identifying the events contained in the time series.

  6. A Reduction of the Graph Reconstruction Conjecture

    Directory of Open Access Journals (Sweden)

    Monikandan S.

    2014-08-01

    Full Text Available A graph is said to be reconstructible if it is determined up to isomorphism from the collection of all its one-vertex deleted unlabeled subgraphs. The Reconstruction Conjecture (RC) asserts that all graphs on at least three vertices are reconstructible. In this paper, we prove that interval-regular graphs and some new classes of graphs are reconstructible and show that RC is true if and only if all non-geodetic and non-interval-regular blocks G with diam(G) = 2 or diam(Ḡ) = diam(G) = 3 are reconstructible.
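    A brute-force sketch of the notion of a deck used in the conjecture: build all one-vertex-deleted subgraphs of two small graphs and compare the resulting multisets up to isomorphism. This is only feasible for very small graphs and is not related to the record's proof technique.

```python
import networkx as nx

def deck(g):
    """All one-vertex-deleted (unlabeled) subgraphs of g."""
    return [g.subgraph(set(g) - {v}).copy() for v in g]

def iso_classes(cards):
    """Group cards into isomorphism classes as (representative, count) pairs."""
    classes = []
    for card in cards:
        for entry in classes:
            if nx.is_isomorphic(card, entry[0]):
                entry[1] += 1
                break
        else:
            classes.append([card, 1])
    return classes

def same_deck(g, h):
    """True when g and h have identical decks (as multisets of unlabeled graphs)."""
    cards_g, cards_h = deck(g), deck(h)
    if len(cards_g) != len(cards_h):
        return False
    cg, ch = iso_classes(cards_g), iso_classes(cards_h)
    return all(any(nx.is_isomorphic(rep, r) and n == m for r, m in ch) for rep, n in cg)

g = nx.path_graph(4)
print(same_deck(g, nx.path_graph(4)))  # True: identical graphs share a deck
print(same_deck(g, nx.star_graph(3)))  # False: P4 and K1,3 have different decks
```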

  7. Head First jQuery

    CERN Document Server

    Benedetti, Ryan

    2011-01-01

    Want to add more interactivity and polish to your websites? Discover how jQuery can help you build complex scripting functionality in just a few lines of code. With Head First jQuery, you'll quickly get up to speed on this amazing JavaScript library by learning how to navigate HTML documents while handling events, effects, callbacks, and animations. By the time you've completed the book, you'll be incorporating Ajax apps, working seamlessly with HTML and CSS, and handling data with PHP, MySQL and JSON. If you want to learn-and understand-how to create interactive web pages, unobtrusive scrip

  8. Probabilistic Graph Layout for Uncertain Network Visualization.

    Science.gov (United States)

    Schulz, Christoph; Nocaj, Arlind; Goertler, Jochen; Deussen, Oliver; Brandes, Ulrik; Weiskopf, Daniel

    2017-01-01

    We present a novel uncertain network visualization technique based on node-link diagrams. Nodes expand spatially in our probabilistic graph layout, depending on the underlying probability distributions of edges. The visualization is created by computing a two-dimensional graph embedding that combines samples from the probabilistic graph. A Monte Carlo process is used to decompose a probabilistic graph into its possible instances and to continue with our graph layout technique. Splatting and edge bundling are used to visualize point clouds and network topology. The results provide insights into probability distributions for the entire network-not only for individual nodes and edges. We validate our approach using three data sets that represent a wide range of network types: synthetic data, protein-protein interactions from the STRING database, and travel times extracted from Google Maps. Our approach reveals general limitations of the force-directed layout and allows the user to recognize that some nodes of the graph are at a specific position just by chance.

  9. Multi-Dimensional Top-k Dominating Queries

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Mamoulis, Nikos

    2009-01-01

    The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate ...
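    A naive O(n²) sketch of the dominance-count computation behind such queries, assuming smaller values are better in every dimension; the paper's contribution is answering these queries efficiently with index structures, which this sketch does not attempt.

```python
import numpy as np

def top_k_dominating(points, k):
    """Return (index, score) pairs for the k points that dominate the most
    other points. Point p dominates q if p is no worse in every dimension
    and strictly better in at least one (smaller is better here)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    scores = np.zeros(n, dtype=int)
    for i in range(n):
        le = np.all(pts[i] <= pts, axis=1)      # <= in every dimension
        lt = np.any(pts[i] < pts, axis=1)       # < in at least one dimension
        scores[i] = int(np.sum(le & lt))
    order = np.argsort(-scores)[:k]
    return [(int(i), int(scores[i])) for i in order]

data = [(1, 2), (2, 1), (3, 3), (0.5, 4), (4, 0.5), (2.5, 2.5)]
print(top_k_dominating(data, 2))   # the two strongest dominators
```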

  10. GraphStore: A Distributed Graph Storage System for Big Data Networks

    Science.gov (United States)

    Martha, VenkataSwamy

    2013-01-01

    Networks, such as social networks, are a universal solution for modeling complex problems in real time, especially in the Big Data community. While previous studies have attempted to enhance network processing algorithms, none have paved a path for the development of a persistent storage system. The proposed solution, GraphStore, provides an…

  11. The parametric modified limited penetrable visibility graph for constructing complex networks from time series

    Science.gov (United States)

    Li, Xiuming; Sun, Mei; Gao, Cuixia; Han, Dun; Wang, Minggang

    2018-02-01

    This paper presents the parametric modified limited penetrable visibility graph (PMLPVG) algorithm for constructing complex networks from time series. We modify the penetrable visibility criterion of limited penetrable visibility graph (LPVG) in order to improve the rationality of the original penetrable visibility and preserve the dynamic characteristics of the time series. The addition of view angle provides a new approach to characterize the dynamic structure of the time series that is invisible in the previous algorithm. The reliability of the PMLPVG algorithm is verified by applying it to three types of artificial data as well as the actual data of natural gas prices in different regions. The empirical results indicate that PMLPVG algorithm can distinguish the different time series from each other. Meanwhile, the analysis results of natural gas prices data using PMLPVG are consistent with the detrended fluctuation analysis (DFA). The results imply that the PMLPVG algorithm may be a reasonable and significant tool for identifying various time series in different fields.

  12. A Characterization of 2-Tree Probe Interval Graphs

    Directory of Open Access Journals (Sweden)

    Brown David E.

    2014-08-01

    Full Text Available A graph is a probe interval graph if its vertices correspond to some set of intervals of the real line and can be partitioned into sets P and N so that vertices are adjacent if and only if their corresponding intervals intersect and at least one belongs to P. We characterize the 2-trees which are probe interval graphs and extend a list of forbidden induced subgraphs for such graphs created by Pržulj and Corneil in [2-tree probe interval graphs have a large obstruction set, Discrete Appl. Math. 150 (2005) 216-231].

  13. Quantum Experiments and Graphs: Multiparty States as Coherent Superpositions of Perfect Matchings

    Science.gov (United States)

    Krenn, Mario; Gu, Xuemei; Zeilinger, Anton

    2017-12-01

    We show a surprising link between experimental setups to realize high-dimensional multipartite quantum states and graph theory. In these setups, the paths of photons are identified such that the photon-source information is never created. We find that each of these setups corresponds to an undirected graph, and every undirected graph corresponds to an experimental setup. Every term in the emerging quantum superposition corresponds to a perfect matching in the graph. Calculating the final quantum state is in the #P-complete complexity class, thus it cannot be done efficiently. To strengthen the link further, theorems from graph theory—such as Hall's marriage problem—are rephrased in the language of pair creation in quantum experiments. We show explicitly how this link allows one to answer questions about quantum experiments (such as which classes of entangled states can be created) with graph theoretical methods, and how to potentially simulate properties of graphs and networks with quantum experiments (such as critical exponents and phase transitions).

  14. Quantum Experiments and Graphs: Multiparty States as Coherent Superpositions of Perfect Matchings.

    Science.gov (United States)

    Krenn, Mario; Gu, Xuemei; Zeilinger, Anton

    2017-12-15

    We show a surprising link between experimental setups to realize high-dimensional multipartite quantum states and graph theory. In these setups, the paths of photons are identified such that the photon-source information is never created. We find that each of these setups corresponds to an undirected graph, and every undirected graph corresponds to an experimental setup. Every term in the emerging quantum superposition corresponds to a perfect matching in the graph. Calculating the final quantum state is in the #P-complete complexity class, thus it cannot be done efficiently. To strengthen the link further, theorems from graph theory-such as Hall's marriage problem-are rephrased in the language of pair creation in quantum experiments. We show explicitly how this link allows one to answer questions about quantum experiments (such as which classes of entangled states can be created) with graph theoretical methods, and how to potentially simulate properties of graphs and networks with quantum experiments (such as critical exponents and phase transitions).

  15. Efficient dynamic graph construction for inductive semi-supervised learning.

    Science.gov (United States)

    Dornaika, F; Dahbi, R; Bosaghzadeh, A; Ruichek, Y

    2017-10-01

    Most of graph construction techniques assume a transductive setting in which the whole data collection is available at construction time. Addressing graph construction for inductive setting, in which data are coming sequentially, has received much less attention. For inductive settings, constructing the graph from scratch can be very time consuming. This paper introduces a generic framework that is able to make any graph construction method incremental. This framework yields an efficient and dynamic graph construction method that adds new samples (labeled or unlabeled) to a previously constructed graph. As a case study, we use the recently proposed Two Phase Weighted Regularized Least Square (TPWRLS) graph construction method. The paper has two main contributions. First, we use the TPWRLS coding scheme to represent new sample(s) with respect to an existing database. The representative coefficients are then used to update the graph affinity matrix. The proposed method not only appends the new samples to the graph but also updates the whole graph structure by discovering which nodes are affected by the introduction of new samples and by updating their edge weights. The second contribution of the article is the application of the proposed framework to the problem of graph-based label propagation using multiple observations for vision-based recognition tasks. Experiments on several image databases show that, without any significant loss in the accuracy of the final classification, the proposed dynamic graph construction is more efficient than the batch graph construction. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit
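    A minimal sketch of running a query through the official google-cloud-bigquery Python client (this is not code from the book). The project id is a placeholder, valid Google Cloud credentials are assumed, and the table referenced is the public USA-names sample dataset commonly used in Google's documentation.

```python
from google.cloud import bigquery

# Requires application-default credentials; "my-project-id" is hypothetical.
client = bigquery.Client(project="my-project-id")

query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# client.query() submits the job; .result() blocks until rows are available.
for row in client.query(query).result():
    print(row["name"], row["total"])
```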

  17. Sparse geometric graphs with small dilation

    NARCIS (Netherlands)

    Aronov, B.; Berg, de M.; Cheong, O.; Gudmundsson, J.; Haverkort, H.J.; Vigneron, A.; Deng, X.; Du, D.

    2005-01-01

    Given a set S of n points in the plane, and an integer k such that 0 ≤ k < n, we show that a geometric graph with vertex set S, at most n – 1 + k edges, and dilation O(n / (k + 1)) can be computed in time O(n log n). We also construct n-point sets for which any geometric graph with n – 1 + k edges ...

  18. Efficient and Flexible KNN Query Processing in Real-Life Road Networks

    DEFF Research Database (Denmark)

    Lu, Yang; Bui, Bin; Zhao, Jiakui

    2008-01-01

    Along with the developments of mobile services, effectively modeling road networks and efficiently indexing and querying network constrained objects has become a challenging problem. In this paper, we first introduce a road network model which captures real-life road networks better than previous ... are included into the RNG index, which enables the index to support both distance-based and time-based KNN queries and continuous KNN queries. Our work extends previous ones by taking into account more practical scenarios, such as complexities in real-life road networks and time-based KNN queries. Extensive ...

  19. Responsive web design with jQuery

    CERN Document Server

    Carlos, Gilberto

    2013-01-01

    Responsive Web Design with jQuery follows a standard tutorial-based approach, covering various aspects of responsive web design by building a comprehensive website. "Responsive Web Design with jQuery" is aimed at web designers who are interested in building device-agnostic websites. You should have a grasp of standard HTML, CSS, and JavaScript development, and have a familiarity with graphic design. Some exposure to jQuery and HTML5 will be beneficial but isn't essential.

  20. Citation graph based ranking in Invenio

    CERN Document Server

    Marian, Ludmila; Rajman, Martin; Vesely, Martin

    2010-01-01

    Invenio is the web-based integrated digital library system developed at CERN. Within this framework, we present four types of ranking models based on the citation graph that complement the simple approach based on citation counts: time-dependent citation counts, a relevancy ranking which extends the PageRank model, a time-dependent ranking which combines the freshness of citations with PageRank and a ranking that takes into consideration the external citations. We present our analysis and results obtained on two main data sets: Inspire and CERN Document Server. Our main contributions are: (i) a study of the currently available ranking methods based on the citation graph; (ii) the development of new ranking methods that correct some of the identified limitations of the current methods such as treating all citations of equal importance, not taking time into account or considering the citation graph complete; (iii) a detailed study of the key parameters for these ranking methods. (The original publication is ava...
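    As a rough illustration of the "freshness of citations combined with PageRank" idea (not Invenio's actual implementation), the sketch below down-weights each citation edge exponentially with its age before running ordinary PageRank; the decay constant, the reference year, and the toy records are invented.

```python
import math
import networkx as nx

def time_weighted_pagerank(citations, years, current_year=2010, tau=5.0):
    """PageRank over a citation graph whose edges are down-weighted
    exponentially with the age of the citing paper -- a simple stand-in for
    a time-dependent citation ranking."""
    g = nx.DiGraph()
    for citing, cited in citations:
        age = current_year - years[citing]
        g.add_edge(citing, cited, weight=math.exp(-age / tau))
    return nx.pagerank(g, weight="weight")

citations = [("p2", "p1"), ("p3", "p1"), ("p3", "p2"), ("p4", "p3")]
years = {"p1": 2001, "p2": 2004, "p3": 2008, "p4": 2010}
print(time_weighted_pagerank(citations, years))
```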

  1. Interactive exploration of large-scale time-varying data using dynamic tracking graphs

    KAUST Repository

    Widanagamaachchi, W.; Christensen, C.; Bremer, P.-T; Pascucci, Valerio

    2012-01-01

    … that use one spatial dimension to indicate time and show the "tracks" of each feature as it evolves, merges or disappears. However, for practical data sets creating the corresponding optimal graph layouts that minimize the number of intersections can take …

  2. PhyloExplorer: a web server to validate, explore and query phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Auberval Nicolas

    2009-05-01

    Full Text Available Background: Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results: We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion: PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: http://www.ncbi.orthomam.univ-montp2.fr/phyloexplorer/ and the source code can be downloaded from: http://code.google.com/p/taxomanie/.

  3. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim Mansour

    2017-05-07

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas, such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has, generally, focused on finite concept classes or toy problems, with a limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method while the second algorithm is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and to synthesize queries, afterward, via spectral decomposition. Moreover, I also describe how these algorithms can be extended to other settings as well, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distributions provides reliable predictions with a high probability. This algorithm is, then, justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this

  4. DIGITAL WORKFLOWS FOR A 3D SEMANTIC REPRESENTATION OF AN ANCIENT MINING LANDSCAPE

    Directory of Open Access Journals (Sweden)

    G. Hiebel

    2017-08-01

    Full Text Available The ancient mining landscape of Schwaz/Brixlegg in the Tyrol, Austria witnessed mining from prehistoric times to modern times creating a first order cultural landscape when it comes to one of the most important inventions in human history: the production of metal. In 1991 a part of this landscape was lost due to an enormous landslide that reshaped part of the mountain. With our work we want to propose a digital workflow to create a 3D semantic representation of this ancient mining landscape with its mining structures to preserve it for posterity. First, we define a conceptual model to integrate the data. It is based on the CIDOC CRM ontology and CRMgeo for geometric data. To transform our information sources to a formal representation of the classes and properties of the ontology we applied semantic web technologies and created a knowledge graph in RDF (Resource Description Framework). Through the CRMgeo extension coordinate information of mining features can be integrated into the RDF graph and thus related to the detailed digital elevation model that may be visualized together with the mining structures using Geoinformation systems or 3D visualization tools. The RDF network of the triple store can be queried using the SPARQL query language. We created a snapshot of mining, settlement and burial sites in the Bronze Age. The results of the query were loaded into a Geoinformation system and a visualization of known bronze age sites related to mining, settlement and burial activities was created.
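    A minimal rdflib sketch of the pattern described here: assert a few triples and retrieve Bronze Age sites with a SPARQL query. The namespace, property names, and coordinates are illustrative placeholders and deliberately not the CIDOC CRM/CRMgeo vocabulary used in the record.

```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF

EX = Namespace("http://example.org/mining/")   # hypothetical namespace

g = Graph()
g.add((EX.site1, RDF.type, EX.MiningSite))
g.add((EX.site1, EX.period, Literal("Bronze Age")))
g.add((EX.site1, EX.lat, Literal(47.34)))      # made-up coordinates
g.add((EX.site1, EX.lon, Literal(11.86)))
g.add((EX.site2, RDF.type, EX.BurialSite))
g.add((EX.site2, EX.period, Literal("Bronze Age")))

query = """
PREFIX ex: <http://example.org/mining/>
SELECT ?site ?type WHERE {
    ?site a ?type ;
          ex:period "Bronze Age" .
}
"""
for row in g.query(query):
    print(row["site"], row["type"])
```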

  5. MATHEMATICA APPLICATION FOR GRAPH COLORING AT THE INTERSECTION OF JALAN PANGERAN ANTASARI JAKARTA

    Directory of Open Access Journals (Sweden)

    Suwarno Suwarno

    2017-12-01

    Full Text Available This research examines graph coloring using the Welch-Powell algorithm. It begins with an overview of graph coloring and of the algorithm itself. The case study was conducted at the intersection of Pangeran Antasari Street. The resulting graph has 12 vertices representing traffic flows and 16 edges representing traffic paths. The study yields a chromatic number of 4, corresponding to 4 stages of traffic-light arrangement. The paper also explains how the Mathematica software is applied to graph coloring.
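    A sketch of the Welch-Powell (often spelled Welsh–Powell) largest-degree-first colouring on a small, invented conflict graph of traffic streams; the real study models 12 streams and 16 conflicts at the Pangeran Antasari intersection and performs the computation in Mathematica rather than Python.

```python
import networkx as nx

def welsh_powell_coloring(g):
    """Welsh-Powell style colouring: visit vertices in order of decreasing
    degree and, for each colour in turn, give it to every not-yet-coloured
    vertex whose neighbours do not already carry that colour."""
    order = sorted(g.nodes(), key=lambda v: g.degree(v), reverse=True)
    colour, c = {}, 0
    for v in order:
        if v in colour:
            continue
        colour[v] = c
        for u in order:
            if u not in colour and all(colour.get(w) != c for w in g.neighbors(u)):
                colour[u] = c
        c += 1
    return colour

# Toy conflict graph: vertices are traffic streams, edges join streams that
# cannot have a green light at the same time (purely illustrative).
conflicts = nx.Graph([("N-S", "E-W"), ("N-S", "E-N"), ("E-W", "S-E"),
                      ("S-E", "W-N"), ("E-N", "W-N")])
phases = welsh_powell_coloring(conflicts)
print(phases, "-> number of signal phases:", max(phases.values()) + 1)
```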

  6. A characterization of horizontal visibility graphs and combinatorics on words

    Science.gov (United States)

    Gutin, Gregory; Mansour, Toufik; Severini, Simone

    2011-06-01

    A Horizontal Visibility Graph (HVG) is defined in association with an ordered set of non-negative reals. HVGs realize a methodology in the analysis of time series, their degree distribution being a good discriminator between randomness and chaos Luque et al. [B. Luque, L. Lacasa, F. Ballesteros, J. Luque, Horizontal visibility graphs: exact results for random time series, Phys. Rev. E 80 (2009), 046103]. We prove that a graph is an HVG if and only if it is outerplanar and has a Hamilton path. Therefore, an HVG is a noncrossing graph, as defined in algebraic combinatorics Flajolet and Noy [P. Flajolet, M. Noy, Analytic combinatorics of noncrossing configurations, Discrete Math., 204 (1999) 203-229]. Our characterization of HVGs implies a linear time recognition algorithm. Treating ordered sets as words, we characterize subfamilies of HVGs highlighting various connections with combinatorial statistics and introducing the notion of a visible pair. With this technique, we determine asymptotically the average number of edges of HVGs.
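    A direct, quadratic-time construction of the horizontal visibility graph of a short series, illustrating the definition used in the record; it does not implement the linear-time recognition algorithm that the characterization implies, and the sample series is invented.

```python
import networkx as nx

def horizontal_visibility_graph(series):
    """HVG of a sequence of non-negative reals: points i and j (i < j) are
    connected when every value strictly between them is smaller than
    min(series[i], series[j])."""
    g = nx.Graph()
    g.add_nodes_from(range(len(series)))
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            if all(series[k] < min(series[i], series[j]) for k in range(i + 1, j)):
                g.add_edge(i, j)
    return g

hvg = horizontal_visibility_graph([3.0, 1.0, 2.0, 0.5, 4.0, 1.5])
print(sorted(hvg.edges()))
# Consistent with the record's characterization: the result is outerplanar
# and contains the Hamilton path 0-1-2-...-(n-1).
```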

  7. a Novel Approach of Indexing and Retrieving Spatial Polygons for Efficient Spatial Region Queries

    Science.gov (United States)

    Zhao, J. H.; Wang, X. Z.; Wang, F. Y.; Shen, Z. H.; Zhou, Y. C.; Wang, Y. L.

    2017-10-01

    Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get response in real time. Spatial indexes are usually used in this situation. In this paper, based on k-d tree, we introduce a distributed KD-Tree (DKD-Tree) suitable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of Remote Sensing images metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region query, but also has good scalability. Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.

  8. A NOVEL APPROACH OF INDEXING AND RETRIEVING SPATIAL POLYGONS FOR EFFICIENT SPATIAL REGION QUERIES

    Directory of Open Access Journals (Sweden)

    J. H. Zhao

    2017-10-01

    Full Text Available Spatial region queries are more and more widely used in web-based applications. Mechanisms to provide efficient query processing over geospatial data are essential. However, due to the massive geospatial data volume, heavy geometric computation, and high access concurrency, it is difficult to get response in real time. Spatial indexes are usually used in this situation. In this paper, based on k-d tree, we introduce a distributed KD-Tree (DKD-Tree) suitable for polygon data, and a two-step query algorithm. The spatial index construction is recursive and iterative, and the query is an in memory process. Both the index and query methods can be processed in parallel, and are implemented based on HDFS, Spark and Redis. Experiments on a large volume of Remote Sensing images metadata have been carried out, and the advantages of our method are investigated by comparing with spatial region queries executed on PostgreSQL and PostGIS. Results show that our approach not only greatly improves the efficiency of spatial region query, but also has good scalability. Moreover, the two-step spatial range query algorithm can also save cluster resources to support a large number of concurrent queries. Therefore, this method is very useful when building large geographic information systems.
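    To convey the two-step (filter, then refine) idea on a single machine, the sketch below uses a SciPy k-d tree over points rather than polygons and a rectangle as the query region; the record's DKD-Tree additionally handles polygon geometry and distributes the index over HDFS/Spark/Redis.

```python
import numpy as np
from scipy.spatial import cKDTree

def region_query(points, rect):
    """Two-step spatial region query on point data: the k-d tree prunes the
    search to candidates inside a circle covering the query rectangle
    (filter), and an exact containment test keeps only true hits (refine)."""
    xmin, ymin, xmax, ymax = rect
    centre = [(xmin + xmax) / 2.0, (ymin + ymax) / 2.0]
    radius = np.hypot(xmax - xmin, ymax - ymin) / 2.0   # covers the rectangle
    tree = cKDTree(points)
    candidates = tree.query_ball_point(centre, radius)            # filter
    return [i for i in candidates                                  # refine
            if xmin <= points[i][0] <= xmax and ymin <= points[i][1] <= ymax]

pts = np.random.RandomState(0).uniform(0, 100, size=(10_000, 2))
print(len(region_query(pts, (20, 20, 30, 35))), "points in the query region")
```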

  9. PiCO QL: A software library for runtime interactive queries on program data

    Science.gov (United States)

    Fragkoulis, Marios; Spinellis, Diomidis; Louridas, Panos

    PiCO QL is an open source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system's or application's data structures, which is queryable through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. PiCO QL makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source code analyser.

  10. PiCO QL: A software library for runtime interactive queries on program data

    Directory of Open Access Journals (Sweden)

    Marios Fragkoulis

    2016-01-01

    Full Text Available PiCO QL is an open source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system's or application's data structures, which is queryable through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web service requests. Queries execute on the live data structures through the respective relational views. PiCO QL makes a good candidate for ad-hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source code analyser.

  11. Fifth Graders' Additive and Multiplicative Reasoning: Establishing Connections across Conceptual Fields Using a Graph

    Science.gov (United States)

    Caddle, Mary C.; Brizuela, Barbara M.

    2011-01-01

    This paper looks at 21 fifth grade students as they discuss a linear graph in the Cartesian plane. The problem presented to students depicted a graph showing distance as a function of elapsed time for a person walking at a constant rate of 5 miles/h. The question asked students to consider how many more hours, after having already walked 4 h,…

  12. External Memory Graph Algorithms and Range Searching Data Structures

    DEFF Research Database (Denmark)

    Walderveen, Freek van

    Every day larger amounts of data are generated that describe our world in terms of networks or graphs. Think for example about maps of roads or rivers, social networks, or the internet (either as a network of computers or as a network of hyperlinks). Besides this, also surface models ... In order to present (for example geographic) data to a user, it is often necessary to select only a relatively small part of a dataset, such as all post offices in the region visible on the user's screen, and return some statistic about this part, such as the distance between the two furthest post offices in the region, which may help a postal company in determining what delivery time they can guarantee for their customers. Even in non-geometric settings, the part of the data that needs to be selected is often easily described geometrically, for example in database queries asking for records matching multiple ...

  13. Application of Machine Learning Algorithms for the Query Performance Prediction

    Directory of Open Access Journals (Sweden)

    MILICEVIC, M.

    2015-08-01

    Full Text Available This paper analyzes the relationship between the system load/throughput and the query response time in a real Online transaction processing (OLTP) system environment. Although OLTP systems are characterized by short transactions, which normally entail high availability and consistent short response times, the need for operational reporting may jeopardize these objectives. We suggest a new approach to performance prediction for concurrent database workloads, based on the system state vector which consists of 36 attributes. There is no bias to the importance of certain attributes, but the machine learning methods are used to determine which attributes better describe the behavior of the particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, which are selected to represent frequent business processes. The possibility of the accurate response time prediction may be a foundation for automated decision-making for database (DB) query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS) management or real-time dynamic query scheduling (e.g. estimation of the optimal moment for a complex query execution).
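    A hedged sketch of the modelling setup: a regressor is trained to predict query response time from a 36-attribute system state vector and to report which attributes matter most. The data here are synthetic stand-ins; the study's actual attributes, workload, and choice of learning method are its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in: each row is a 36-attribute system state vector sampled
# while a reference query runs; the target is the observed response time.
rng = np.random.RandomState(42)
X = rng.uniform(size=(2000, 36))
y = 0.5 + 3.0 * X[:, 0] + 2.0 * X[:, 5] * X[:, 12] + rng.normal(0, 0.1, 2000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)
print("MAE of predicted response time:", round(mean_absolute_error(y_test, pred), 3))
print("most informative attributes:", np.argsort(model.feature_importances_)[::-1][:3])
```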

  14. Association of Finite-Time Thermodynamics and a Bond-Graph Approach for Modeling an Endoreversible Heat Engine

    Directory of Open Access Journals (Sweden)

    Michel Feidt

    2012-03-01

    Full Text Available In recent decades, the approach known as Finite-Time Thermodynamics has provided a fruitful theoretical framework for the optimization of heat engines operating between a heat source (at a higher temperature) and a heat sink (at a lower temperature). The aim of this paper is to propose a more complete approach based on the association of Finite-Time Thermodynamics and the Bond-Graph approach for modeling endoreversible heat engines. This approach makes it possible for example to find in a simple way the characteristics of the optimal operating point at which the maximum mechanical power of the endoreversible heat engine is obtained with entropy flow rate as control variable. Furthermore it provides the analytical expressions of the optimal operating point of an irreversible heat engine where the energy conversion is accompanied by irreversibilities related to internal heat transfer and heat dissipation phenomena. This original approach, applied to an analysis of the performance of a thermoelectric generator, will be the object of a future publication.
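    For orientation only: the operating point at maximum mechanical power of an endoreversible engine is classically characterized by the Curzon-Ahlborn efficiency. The formula below is a standard textbook result quoted for context, with reservoir temperatures written here as T_h and T_c (the record's own symbols were lost in extraction); it is not a result taken from this paper.

```latex
% Standard endoreversible (Curzon--Ahlborn) result: efficiency at maximum
% power for a heat engine coupled to reservoirs at temperatures T_h > T_c.
\[
  \eta_{\max P} \;=\; 1 - \sqrt{\frac{T_c}{T_h}}
\]
```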

  15. Static dictionaries on AC0 RAMs: query time Θ(√(log n/log log n)) is necessary and sufficient

    DEFF Research Database (Denmark)

    Andersson, Arne; Miltersen, Peter Bro; Riis, Søren

    1996-01-01

    … a tight bound of Θ(√(log n/log log n)) on the time for answering membership queries in a set of size n when reasonable space is used for the data structure storing the set; the upper bound can be obtained using O(n) space, and the lower bound holds even if we allow space 2^(polylog n). Several variations of this result are also obtained. Among others, we show a tradeoff between time and circuit depth under the unit-cost assumption: any RAM instruction set which permits a linear space, constant query time solution to the static dictionary problem must have an instruction of depth Ω(log w/log log w), where w is the word size of the machine (and log ...

  16. Worst-case Throughput Analysis for Parametric Rate and Parametric Actor Execution Time Scenario-Aware Dataflow Graphs

    Directory of Open Access Journals (Sweden)

    Mladen Skelin

    2014-03-01

    Full Text Available Scenario-aware dataflow (SADF) is a prominent tool for modeling and analysis of dynamic embedded dataflow applications. In SADF the application is represented as a finite collection of synchronous dataflow (SDF) graphs, each of which represents one possible application behaviour or scenario. A finite state machine (FSM) specifies the possible orders of scenario occurrences. The SADF model renders the tightest possible performance guarantees, but is limited by its finiteness. This means that from a practical point of view, it can only handle dynamic dataflow applications that are characterized by a reasonably sized set of possible behaviours or scenarios. In this paper we remove this limitation for a class of SADF graphs by means of SADF model parametrization in terms of graph port rates and actor execution times. First, we formally define the semantics of the model relevant for throughput analysis based on (max,+) linear system theory and (max,+) automata. Second, by generalizing some of the existing results, we give the algorithms for worst-case throughput analysis of parametric rate and parametric actor execution time acyclic SADF graphs with a fully connected, possibly infinite state transition system. Third, we demonstrate our approach on a few realistic applications from the digital signal processing (DSP) domain mapped onto an embedded multi-processor architecture.
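    A small (max,+) sketch: iterate the max-plus matrix of a toy dataflow graph to estimate its cycle time, the reciprocal of throughput. The matrix entries and the use of plain power iteration (rather than an exact maximum-cycle-mean computation or the paper's parametric analysis) are assumptions made purely for illustration.

```python
import numpy as np

NEG_INF = -np.inf

def maxplus_matvec(A, x):
    """Max-plus product: (A (x) x)_i = max_j (A[i, j] + x[j])."""
    return np.max(A + x[np.newaxis, :], axis=1)

def cycle_time_estimate(A, iterations=200):
    """Estimate the max-plus eigenvalue (cycle time) by power iteration;
    the estimate converges to the maximum cycle mean of the graph."""
    x = np.zeros(A.shape[0])
    for _ in range(iterations):
        x = maxplus_matvec(A, x)
    return np.max(x) / iterations

# Toy graph: A[i, j] is the delay on the dependency from actor j to actor i,
# and -inf means "no dependency".
A = np.array([[NEG_INF, 2.0,     NEG_INF],
              [1.0,     NEG_INF, 3.0    ],
              [NEG_INF, 4.0,     NEG_INF]])
print("estimated cycle time:", round(cycle_time_estimate(A), 3))   # ~3.5
```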

  17. Using ontology databases for scalable query answering, inconsistency detection, and data integration

    Science.gov (United States)

    Dou, Dejing

    2011-01-01

    An ontology database is a basic relational database management system that models an ontology plus its instances. To reason over the transitive closure of instances in the subsumption hierarchy, for example, an ontology database can either unfold views at query time or propagate assertions using triggers at load time. In this paper, we use existing benchmarks to evaluate our method—using triggers—and we demonstrate that by forward computing inferences, we not only improve query time, but the improvement appears to cost only more space (not time). However, we go on to show that the true penalties were simply opaque to the benchmark, i.e., the benchmark inadequately captures load-time costs. We have applied our methods to two case studies in biomedicine, using ontologies and data from genetics and neuroscience to illustrate two important applications: first, ontology databases answer ontology-based queries effectively; second, using triggers, ontology databases detect instance-based inconsistencies—something not possible using views. Finally, we demonstrate how to extend our methods to perform data integration across multiple, distributed ontology databases. PMID:22163378

  18. What does the structure of its visibility graph tell us about the nature of the time series?

    Science.gov (United States)

    Franke, Jasper G.; Donner, Reik V.

    2017-04-01

    Visibility graphs are a recently introduced method to construct complex network representations based upon univariate time series in order to study their dynamical characteristics [1]. In the last years, this approach has been successfully applied to studying a considerable variety of geoscientific research questions and data sets, including non-trivial temporal patterns in complex earthquake catalogs [2] or time-reversibility in climate time series [3]. It has been shown that several characteristic features of the thus constructed networks differ between stochastic and deterministic (possibly chaotic) processes, which is, however, relatively hard to exploit in the case of real-world applications. In this study, we propose studying two new measures related with the network complexity of visibility graphs constructed from time series, one being a special type of network entropy [4] and the other a recently introduced measure of the heterogeneity of the network's degree distribution [5]. For paradigmatic model systems exhibiting bifurcation sequences between regular and chaotic dynamics, both properties clearly trace the transitions between both types of regimes and exhibit marked quantitative differences for regular and chaotic dynamics. Moreover, for dynamical systems with a small amount of additive noise, the considered properties demonstrate gradual changes prior to the bifurcation point. This finding appears closely related to the subsequent loss of stability of the current state known to lead to a critical slowing down as the transition point is approached. In this spirit, both considered visibility graph characteristics provide alternative tracers of dynamical early warning signals consistent with classical indicators. Our results demonstrate that measures of visibility graph complexity (i) provide a potentially useful means to tracing changes in the dynamical patterns encoded in a univariate time series that originate from increasing autocorrelation and (ii

  19. A graph edit dictionary for correcting errors in roof topology graphs reconstructed from point clouds

    Science.gov (United States)

    Xiong, B.; Oude Elberink, S.; Vosselman, G.

    2014-07-01

    In the task of 3D building model reconstruction from point clouds we face the problem of recovering a roof topology graph in the presence of noise, small roof faces and low point densities. Errors in roof topology graphs will seriously affect the final modelling results. The aim of this research is to automatically correct these errors. We define the graph correction as a graph-to-graph problem, similar to the spelling correction problem (also called the string-to-string problem). The graph correction is more complex than string correction, as the graphs are 2D while strings are only 1D. We design a strategy based on a dictionary of graph edit operations to automatically identify and correct the errors in the input graph. For each type of error the graph edit dictionary stores a representative erroneous subgraph as well as the corrected version. As an erroneous roof topology graph may contain several errors, a heuristic search is applied to find the optimum sequence of graph edits to correct the errors one by one. The graph edit dictionary can be expanded to include entries needed to cope with errors that were previously not encountered. Experiments show that the dictionary with only fifteen entries already properly corrects one quarter of erroneous graphs in about 4500 buildings, and even half of the erroneous graphs in one test area, achieving as high as a 95% acceptance rate of the reconstructed models.

  20. A tree-decomposed transfer matrix for computing exact Potts model partition functions for arbitrary graphs, with applications to planar graph colourings

    International Nuclear Information System (INIS)

    Bedini, Andrea; Jacobsen, Jesper Lykke

    2010-01-01

    Combining tree decomposition and transfer matrix techniques provides a very general algorithm for computing exact partition functions of statistical models defined on arbitrary graphs. The algorithm is particularly efficient in the case of planar graphs. We illustrate it by computing the Potts model partition functions and chromatic polynomials (the number of proper vertex colourings using Q colours) for large samples of random planar graphs with up to N = 100 vertices. In the latter case, our algorithm yields a sub-exponential average running time of ∼ exp(1.516√N), a substantial improvement over the exponential running time ∼exp (0.245N) provided by the hitherto best-known algorithm. We study the statistics of chromatic roots of random planar graphs in some detail, comparing the findings with results for finite pieces of a regular lattice.
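    For orientation, the sketch below computes a chromatic polynomial by naive deletion-contraction with sympy; this takes exponential time and is only practical for tiny graphs, unlike the tree-decomposed transfer matrix of the record, but it produces the same polynomial in Q for small instances.

```python
import networkx as nx
from sympy import symbols, expand

Q = symbols("Q")

def chromatic_polynomial(g):
    """Chromatic polynomial via deletion-contraction:
    P(G, Q) = P(G - e, Q) - P(G / e, Q), with P of an edgeless n-vertex
    graph equal to Q**n. Exponential-time; for small graphs only."""
    g = nx.Graph(g)                    # simple graph; contraction merges parallel edges
    if g.number_of_edges() == 0:
        return Q ** g.number_of_nodes()
    u, v = next(iter(g.edges()))
    deleted = g.copy()
    deleted.remove_edge(u, v)
    contracted = nx.contracted_nodes(g, u, v, self_loops=False)
    return expand(chromatic_polynomial(deleted) - chromatic_polynomial(contracted))

p = chromatic_polynomial(nx.cycle_graph(4))
print(p)                # Q**4 - 4*Q**3 + 6*Q**2 - 3*Q
print(p.subs(Q, 3))     # 18 proper 3-colourings of the 4-cycle
```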