WorldWideScience

Sample records for querying biomedical terminologies

  1. LexGrid: a framework for representing, storing, and querying biomedical terminologies from simple to sublime.

    Science.gov (United States)

    Pathak, Jyotishman; Solbrig, Harold R; Buntrock, James D; Johnson, Thomas M; Chute, Christopher G

    2009-01-01

    Many biomedical terminologies, classifications, and ontological resources such as the NCI Thesaurus (NCIT), International Classification of Diseases (ICD), Systematized Nomenclature of Medicine (SNOMED), Current Procedural Terminology (CPT), and Gene Ontology (GO) have been developed and used to build a variety of IT applications in biology, biomedicine, and health care settings. However, virtually all these resources involve incompatible formats, are based on different modeling languages, and lack appropriate tooling and programming interfaces (APIs), which hinders their wide-scale adoption and usage in a variety of application contexts. The Lexical Grid (LexGrid) project introduced in this paper is an ongoing community-driven initiative, coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, designed to bridge this gap using a common terminology model called the LexGrid model. The key aspect of the model is that it accommodates multiple vocabulary and ontology distribution formats and supports multiple data stores for federated vocabulary distribution. The model provides a foundation for building consistent and standardized APIs to access multiple vocabularies that support lexical search queries, hierarchy navigation, and a rich set of features such as recursive subsumption (e.g., get all the children of the concept penicillin). Existing LexGrid implementations include the LexBIG API as well as a reference implementation of the HL7 Common Terminology Services (CTS) specification, providing programmatic access via Java, Web, and Grid services.
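
    Hierarchy navigation of the kind described above (e.g., recursive subsumption) is easy to picture with a small sketch. The following Python fragment is purely illustrative: it is not the LexBIG or LexGrid API, and the toy concept graph is invented for the example.

      # Illustrative sketch only (not the LexBIG/LexGrid API): recursive subsumption
      # over a toy terminology stored as a parent -> children adjacency map.
      CHILDREN = {
          "penicillin": ["penicillin G", "penicillin V"],
          "penicillin G": ["benzathine penicillin G"],
          "penicillin V": [],
          "benzathine penicillin G": [],
      }

      def descendants(concept, graph=CHILDREN):
          """Return every concept subsumed by `concept` (its transitive children)."""
          result = []
          for child in graph.get(concept, []):
              result.append(child)
              result.extend(descendants(child, graph))
          return result

      print(descendants("penicillin"))
      # ['penicillin G', 'benzathine penicillin G', 'penicillin V']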

  2. Approximating terminological queries

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Van Harmelen, Frank

    2002-01-01

    Current proposals for languages to encode terminological knowledge in intelligent systems support logical reasoning for answering user queries about objects and classes. An application of these languages on the World Wide Web, however, is hampered by the limitations of logical reasoning in terms

  3. Searching for rare diseases in PubMed: a blind comparison of Orphanet expert query and query based on terminological knowledge.

    Science.gov (United States)

    Griffon, N; Schuers, M; Dhombres, F; Merabti, T; Kerdelhué, G; Rollin, L; Darmoni, S J

    2016-08-02

    Despite international initiatives like Orphanet, it remains difficult to find up-to-date information about rare diseases. The aim of this study is to propose an exhaustive set of queries for PubMed based on terminological knowledge and to evaluate it against the expertise-based queries provided by the most frequently used resource in Europe: Orphanet. Four rare disease terminologies (MeSH, OMIM, HPO and HRDO) were manually mapped to each other, permitting the automatic creation of expanded terminological queries for rare diseases. For 30 rare diseases, 30 citations retrieved by the Orphanet expert query and/or the query based on terminological knowledge were assessed for relevance by two independent reviewers unaware of the query's origin. An adjudication procedure was used to resolve any discrepancy. Precision, relative recall and F-measure were all computed. For each Orphanet rare disease (n = 8982), there was a corresponding terminological query, in contrast with only 2284 queries provided by Orphanet. Only 553 citations were evaluated due to queries with 0 or only a few hits. There was no significant difference between the Orpha queries and the terminological queries in terms of precision (0.61 vs. 0.52; p = 0.13). Nevertheless, terminological queries retrieved citations more often than Orpha queries (0.57 vs. 0.33; p = 0.01). Interestingly, Orpha queries seemed to retrieve older citations than terminological queries (p < 0.0001). The terminological queries proposed in this study are now available for all rare diseases. They may be a useful tool for both precision- and recall-oriented literature searches.
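
    The construction of an expanded terminological query and the reported evaluation measures can be sketched in a few lines of Python. The synonym table and PubMed identifiers below are invented placeholders, not the study's actual MeSH/OMIM/HPO/HRDO mappings or results.

      # Hedged sketch of the general idea: build an expanded PubMed-style boolean
      # query from synonyms gathered across mapped terminologies, and score one
      # query against adjudicated relevance judgements. All data here is invented.
      synonyms = {
          "marfan syndrome": ["Marfan syndrome", "Marfan's syndrome", "MFS"],
      }

      def terminological_query(disease):
          """OR together every known synonym of the disease."""
          terms = synonyms.get(disease, [disease])
          return " OR ".join('"{}"[Title/Abstract]'.format(t) for t in terms)

      def precision_recall_f(retrieved, relevant):
          tp = len(retrieved & relevant)
          precision = tp / len(retrieved) if retrieved else 0.0
          recall = tp / len(relevant) if relevant else 0.0
          f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
          return precision, recall, f

      print(terminological_query("marfan syndrome"))
      print(precision_recall_f({"pmid1", "pmid2", "pmid3"}, {"pmid2", "pmid3", "pmid4"}))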

  4. Environmental/Biomedical Terminology Index

    International Nuclear Information System (INIS)

    Huffstetler, J.K.; Dailey, N.S.; Rickert, L.W.; Chilton, B.D.

    1976-12-01

    The Information Center Complex (ICC), a centrally administered group of information centers, provides information support to environmental and biomedical research groups and others within and outside Oak Ridge National Laboratory. In-house data base building and development of specialized document collections are important elements of the ongoing activities of these centers. ICC groups must be concerned with language which will adequately classify and ensure retrievability of document records. Language control problems are compounded when the complexity of modern scientific problem solving demands an interdisciplinary approach. Although there are several word lists, indexes, and thesauri specific to various scientific disciplines usually grouped as Environmental Sciences, no single generally recognized authority can be used as a guide to the terminology of all environmental science. If biomedical terminology for the description of research on environmental effects is also needed, the problem becomes even more complex. The building of a word list which can be used as a general guide to the environmental/biomedical sciences has been a continuing activity of the Information Center Complex. This activity resulted in the publication of the Environmental Biomedical Terminology Index.

  5. Environmental/Biomedical Terminology Index

    Energy Technology Data Exchange (ETDEWEB)

    Huffstetler, J.K.; Dailey, N.S.; Rickert, L.W.; Chilton, B.D.

    1976-12-01

    The Information Center Complex (ICC), a centrally administered group of information centers, provides information support to environmental and biomedical research groups and others within and outside Oak Ridge National Laboratory. In-house data base building and development of specialized document collections are important elements of the ongoing activities of these centers. ICC groups must be concerned with language which will adequately classify and ensure retrievability of document records. Language control problems are compounded when the complexity of modern scientific problem solving demands an interdisciplinary approach. Although there are several word lists, indexes, and thesauri specific to various scientific disciplines usually grouped as Environmental Sciences, no single generally recognized authority can be used as a guide to the terminology of all environmental science. If biomedical terminology for the description of research on environmental effects is also needed, the problem becomes even more complex. The building of a word list which can be used as a general guide to the environmental/biomedical sciences has been a continuing activity of the Information Center Complex. This activity resulted in the publication of the Environmental Biomedical Terminology Index (EBTI).

  6. Customization of biomedical terminologies.

    Science.gov (United States)

    Homo, Julien; Dupuch, Laëtitia; Benbrahim, Allel; Grabar, Natalia; Dupuch, Marie

    2012-01-01

    Within the biomedical area, over one hundred terminologies exist and are merged in the Unified Medical Language System Metathesaurus, which contains over 1 million concepts. When such huge terminological resources are available, the users must deal with them, and specifically they must deal with irrelevant parts of these terminologies. We propose to exploit seed terms and semantic distance algorithms in order to customize the terminologies and to delimit within them a semantically homogeneous space. An evaluation performed by a medical expert indicates that the proposed approach is relevant for the customization of terminologies and that the extracted terms are mostly relevant to the seeds. It also indicates that different algorithms provide similar or identical results within a given terminology; the difference is due to the terminologies exploited. Special attention must be paid to defining the optimal association between the semantic similarity algorithms and the thresholds specific to a given terminology.
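
    A minimal sketch of the customization idea follows, under the assumption that semantic distance is approximated by path length over the terminology graph; the toy concepts and the threshold are illustrative, not the authors' algorithms or data.

      # Minimal sketch (assumed data and distance definition, not the authors'
      # implementation): keep only concepts within a path-length threshold of the
      # seed terms over the terminology's is-a neighbourhood.
      from collections import deque

      EDGES = {
          "diabetes mellitus": ["type 1 diabetes", "type 2 diabetes", "endocrine disease"],
          "type 1 diabetes": ["diabetes mellitus"],
          "type 2 diabetes": ["diabetes mellitus"],
          "endocrine disease": ["diabetes mellitus", "thyroid disease"],
          "thyroid disease": ["endocrine disease"],
      }

      def customize(seeds, max_distance):
          """Breadth-first expansion from the seeds, bounded by max_distance."""
          kept = set(seeds)
          frontier = deque((s, 0) for s in seeds)
          while frontier:
              concept, d = frontier.popleft()
              if d == max_distance:
                  continue
              for neighbour in EDGES.get(concept, []):
                  if neighbour not in kept:
                      kept.add(neighbour)
                      frontier.append((neighbour, d + 1))
          return kept

      print(customize({"diabetes mellitus"}, max_distance=1))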

  7. [Big data, medical language and biomedical terminology systems].

    Science.gov (United States)

    Schulz, Stefan; López-García, Pablo

    2015-08-01

    A variety of rich terminology systems, such as thesauri, classifications, nomenclatures and ontologies support information and knowledge processing in health care and biomedical research. Nevertheless, human language, manifested as individually written texts, persists as the primary carrier of information, in the description of disease courses or treatment episodes in electronic medical records, and in the description of biomedical research in scientific publications. In the context of the discussion about big data in biomedicine, we hypothesize that the abstraction of the individuality of natural language utterances into structured and semantically normalized information facilitates the use of statistical data analytics to distil new knowledge out of textual data from biomedical research and clinical routine. Computerized human language technologies are constantly evolving and are increasingly ready to annotate narratives with codes from biomedical terminology. However, this depends heavily on linguistic and terminological resources. The creation and maintenance of such resources is labor-intensive. Nevertheless, it is sensible to assume that big data methods can be used to support this process. Examples include the learning of hierarchical relationships, the grouping of synonymous terms into concepts and the disambiguation of homonyms. Although clear evidence is still lacking, the combination of natural language technologies, semantic resources, and big data analytics is promising.

  8. Cross-terminology mapping challenges: A demonstration using medication terminological systems

    Science.gov (United States)

    Saitwal, Himali; Qing, David; Jones, Stephen; Bernstam, Elmer; Chute, Christopher G.; Johnson, Todd R.

    2015-01-01

    Standardized terminological systems for biomedical information have provided considerable benefits to biomedical applications and research. However, practical use of this information often requires mapping across terminological systems—a complex and time-consuming process. This paper demonstrates the complexity and challenges of mapping across terminological systems in the context of medication information. It provides a review of medication terminological systems and their linkages, then describes a case study in which we mapped proprietary medication codes from an electronic health record to SNOMED-CT and the UMLS Metathesaurus. The goal was to create a polyhierarchical classification system for querying an i2b2 clinical data warehouse. We found that three methods were required to accurately map the majority of actively prescribed medications. Only 62.5% of source medication codes could be mapped automatically. The remaining codes were mapped using a combination of semi-automated string comparison with expert selection, and a completely manual approach. Compound drugs were especially difficult to map: only 7.5% could be mapped using the automatic method. General challenges to mapping across terminological systems include (1) the availability of up-to-date information to assess the suitability of a given terminological system for a particular use case, and to assess the quality and completeness of cross-terminology links; (2) the difficulty of correctly using complex, rapidly evolving, modern terminologies; (3) the time and effort required to complete and evaluate the mapping; (4) the need to address differences in granularity between the source and target terminologies; and (5) the need to continuously update the mapping as terminological systems evolve. PMID:22750536
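
    The tiered mapping strategy described above (automatic exact match, semi-automated string comparison with expert selection, then fully manual work) can be sketched as follows. The medication labels, target codes, and similarity threshold are made up for illustration and are not the study's data.

      # Sketch of the tiered mapping idea with invented target codes and an
      # illustrative similarity threshold: exact match first, then a string-
      # similarity shortlist for expert review, and a manual queue for the rest.
      import difflib

      TARGET = {
          "C123": "amoxicillin 500 mg oral capsule",
          "C456": "lisinopril 10 mg oral tablet",
      }

      def map_code(source_label, threshold=0.85):
          for code, label in TARGET.items():
              if source_label.lower() == label.lower():
                  return ("automatic", code)
          scored = [(difflib.SequenceMatcher(None, source_label.lower(), label.lower()).ratio(), code)
                    for code, label in TARGET.items()]
          score, best = max(scored)
          if score >= threshold:
              return ("expert-review", best)   # semi-automated: an expert confirms
          return ("manual", None)              # left for fully manual mapping

      print(map_code("Amoxicillin 500mg capsule"))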

  9. Analyzing rare diseases terms in biomedical terminologies

    Directory of Open Access Journals (Sweden)

    Erika Pasceri

    2012-03-01

    Rare disease patients too often face common problems, including the lack of access to correct diagnosis, lack of quality information on the disease, lack of scientific knowledge of the disease, and inequities and difficulties in access to treatment and care. These things could be changed by implementing a comprehensive approach to rare diseases, increasing international cooperation in scientific research, gaining and sharing scientific knowledge about them, and developing tools for extracting and sharing knowledge. A significant aspect to analyze is the organization of knowledge in the biomedical field for the proper management and retrieval of health information. For these purposes, the sources needed have been acquired from the Office of Rare Diseases Research, the National Organization of Rare Disorders and Orphanet, organizations that provide information to patients and physicians and facilitate the exchange of information among the different actors involved in this field. The present paper shows the representation of rare disease terms in biomedical terminologies such as MeSH, ICD-10, SNOMED CT and OMIM, leveraging the fact that these terminologies are integrated in the UMLS. At the first level, the overlap among sources was analyzed; at the second level, the presence of rare disease terms in target sources included in the UMLS was analyzed, working at the term and concept level. We found that MeSH has the best representation of rare disease terms.

  10. Terminology representation guidelines for biomedical ontologies in the semantic web notations.

    Science.gov (United States)

    Tao, Cui; Pathak, Jyotishman; Solbrig, Harold R; Wei, Wei-Qi; Chute, Christopher G

    2013-02-01

    Terminologies and ontologies are increasingly prevalent in healthcare and biomedicine. However, they suffer from inconsistent renderings, distribution formats, and syntax, which makes applications built on common terminology services challenging. To address the problem, one could posit a shared representation syntax, associated schema, and tags. We identified a set of commonly used elements in biomedical ontologies and terminologies based on our experience with the Common Terminology Services 2 (CTS2) Specification as well as the Lexical Grid (LexGrid) project. We propose guidelines for precisely such a shared terminology model, and recommend tags assembled from SKOS, OWL, Dublin Core, RDF Schema, and DCMI meta-terms. We divide these guidelines into lexical information (e.g., synonyms and definitions) and semantic information (e.g., hierarchies); the latter we distinguish for use by informal terminologies vs. formal ontologies. We then evaluate the guidelines with a spectrum of widely used terminologies and ontologies to examine how the lexical guidelines are implemented, and whether our proposed guidelines would enhance interoperability. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. A Review of Auditing Methods Applied to the Content of Controlled Biomedical Terminologies

    Science.gov (United States)

    Zhu, Xinxin; Fan, Jung-Wei; Baorto, David M.; Weng, Chunhua; Cimino, James J.

    2012-01-01

    Although controlled biomedical terminologies have been with us for centuries, it is only in the last couple of decades that close attention has been paid to the quality of these terminologies. The result of this attention has been the development of auditing methods that apply formal methods to assessing whether terminologies are complete and accurate. We have performed an extensive literature review to identify published descriptions of these methods and have created a framework for characterizing them. The framework considers manual, systematic and heuristic methods that use knowledge (within or external to the terminology) to measure quality factors of different aspects of the terminology content (terms, semantic classification, and semantic relationships). The quality factors examined included concept orientation, consistency, non-redundancy, soundness and comprehensive coverage. We reviewed 130 studies that were retrieved based on keyword search on publications in PubMed, and present our assessment of how they fit into our framework. We also identify which terminologies have been audited with the methods and provide examples to illustrate each part of the framework. PMID:19285571

  12. From concepts to clinical reality: an essay on the benchmarking of biomedical terminologies.

    Science.gov (United States)

    Smith, Barry

    2006-06-01

    It is only by fixing on agreed meanings of terms in biomedical terminologies that we will be in a position to achieve that accumulation and integration of knowledge that is indispensable to progress at the frontiers of biomedicine. Standardly, the goal of fixing meanings is seen as being realized through the alignment of terms on what are called 'concepts.' Part I addresses three versions of the concept-based approach--by Cimino, by Wüster, and by Campbell and associates--and surveys some of the problems to which they give rise, all of which have to do with a failure to anchor the terms in terminologies to corresponding referents in reality. Part II outlines a new, realist solution to this anchorage problem, which sees terminology construction as being motivated by the goal of alignment not on concepts but on the universals (kinds, types) in reality and thereby also on the corresponding instances (individuals, tokens). We outline the realist approach and show how on its basis we can provide a benchmark of correctness for terminologies which will at the same time allow a new type of integration of terminologies and electronic health records. We conclude by outlining ways in which the framework thus defined might be exploited for purposes of diagnostic decision-support.

  13. Automatic Detection of Terminology Evolution

    Science.gov (United States)

    Tahmasebi, Nina

    As archives contain documents that span a long period of time, the language used to create these documents and the language used for querying the archive can differ. This difference is due to evolution in both terminology and semantics, and it will cause a significant number of relevant documents to be omitted. A static solution is to use query expansion based on explicit knowledge banks such as thesauri or ontologies. However, as we become able to archive resources with more varied terminology, it becomes infeasible to rely only on explicit knowledge for this purpose: few or no thesauri cover very domain-specific terminologies or the slang used in blogs, for example. In this Ph.D. thesis we focus on automatically detecting terminology evolution in a completely unsupervised manner, as described in this technical paper.

  14. Improving biomedical information retrieval by linear combinations of different query expansion techniques.

    Science.gov (United States)

    Abdulla, Ahmed AbdoAziz Ahmed; Lin, Hongfei; Xu, Bo; Banbhrani, Santosh Kumar

    2016-07-25

    Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs scour unstructured materials such as text documents in large reserves of data that are usually stored on computers. IR is concerned with the representation, storage, and organization of information items, as well as with access to them. One of the main problems in IR is to determine which documents are relevant to the user's needs and which are not. Currently, users cannot construct queries precisely enough to retrieve particular pieces of data from large data reserves, and basic information retrieval systems produce low-quality search results. In this paper we present a new technique that refines information retrieval searches to better represent the user's information need, applying different query expansion techniques and combining them linearly, two expansion results at a time. Query expansion enlarges the search query, for example by finding synonyms and reweighting original terms, and provides significantly more focused, particularized search results than basic search queries do. Retrieval performance is measured by variants of MAP (Mean Average Precision); according to our experimental results, the combination of the best query expansion results improves the retrieved documents and outperforms our baseline by 21.06 %, and it even outperforms a previous study by 7.12 %. We propose several query expansion techniques and their linear combinations to make user queries more cognizable to search engines and to produce higher-quality search results.
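
    A minimal sketch of a linear combination of two query expansion runs: each run scores documents, and the combined score is a weighted sum that is then re-ranked. The weight and the scores below are illustrative, not the paper's tuned values.

      # Minimal sketch of linearly combining two query-expansion runs; the weight
      # alpha and the document scores are illustrative, not the paper's values.
      def linear_combination(scores_a, scores_b, alpha=0.6):
          docs = set(scores_a) | set(scores_b)
          combined = {d: alpha * scores_a.get(d, 0.0) + (1 - alpha) * scores_b.get(d, 0.0)
                      for d in docs}
          return sorted(combined.items(), key=lambda item: item[1], reverse=True)

      synonym_run = {"doc1": 0.9, "doc2": 0.4}     # e.g. synonym-based expansion
      reweight_run = {"doc2": 0.8, "doc3": 0.5}    # e.g. term-reweighting expansion
      print(linear_combination(synonym_run, reweight_run))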

  15. Improve Biomedical Information Retrieval using Modified Learning to Rank Methods.

    Science.gov (United States)

    Xu, Bo; Lin, Hongfei; Lin, Yuan; Ma, Yunlong; Yang, Liang; Wang, Jian; Yang, Zhihao

    2016-06-14

    In recent years, the number of biomedical articles has increased exponentially, making it difficult for biologists to capture all the needed information manually. Information retrieval technologies, as the core of search engines, can deal with the problem automatically, providing users with the needed information. However, applying these technologies directly to biomedical retrieval is a great challenge because of the abundance of domain-specific terminologies. To enhance biomedical retrieval, we propose a novel framework based on learning to rank, a family of state-of-the-art information retrieval techniques that has proved effective in many retrieval tasks. In the proposed framework, we attempt to tackle the abundance of terminologies by constructing ranking models that focus not only on retrieving the most relevant documents but also on diversifying the search results to increase the completeness of the result list for a given query. For model training, we propose two novel document labeling strategies and combine several traditional retrieval models as learning features. We also investigate the usefulness of different learning-to-rank approaches in our framework. Experimental results on TREC Genomics datasets demonstrate the effectiveness of our framework for biomedical information retrieval.

  16. Modeling Large Time Series for Efficient Approximate Query Processing

    DEFF Research Database (Denmark)

    Perera, Kasun S; Hahmann, Martin; Lehner, Wolfgang

    2015-01-01

    query statistics derived from experiments and when running the system. Our approach can also reduce communication load by exchanging models instead of data. To allow seamless integration of model-based querying into traditional data warehouses, we introduce a SQL compatible query terminology. Our...

  17. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Directory of Open Access Journals (Sweden)

    Thompson Paul

    2011-10-01

    Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is

  18. The BioLexicon: a large-scale terminological resource for biomedical text mining

    Science.gov (United States)

    2011-01-01

    Background Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events. Results This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is modelled using the Lexical

  19. Development of a Pediatric Adverse Events Terminology.

    Science.gov (United States)

    Gipson, Debbie S; Kirkendall, Eric S; Gumbs-Petty, Brenda; Quinn, Theresa; Steen, A; Hicks, Amanda; McMahon, Ann; Nicholas, Savian; Zhao-Wong, Anna; Taylor-Zapata, Perdita; Turner, Mark; Herreshoff, Emily; Jones, Charlotte; Davis, Jonathan M; Haber, Margaret; Hirschfeld, Steven

    2017-01-01

    In 2009, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) established the Pediatric Terminology Harmonization Initiative to establish a core library of terms to facilitate the acquisition and sharing of knowledge between pediatric clinical research, practice, and safety reporting. A coalition of partners established a Pediatric Terminology Adverse Event Working Group in 2013 to develop a specific terminology relevant to international pediatric adverse event (AE) reporting. Pediatric specialists with backgrounds in clinical care, research, safety reporting, or informatics, supported by biomedical terminology experts from the National Cancer Institute's Enterprise Vocabulary Services participated. The multinational group developed a working definition of AEs and reviewed concepts (terms, synonyms, and definitions) from 16 pediatric clinical domains. The resulting AE terminology contains >1000 pediatric diseases, disorders, or clinical findings. The terms were tested for proof of concept use in 2 different settings: hospital readmissions and the NICU. The advantages of the AE terminology include ease of adoption due to integration with well-established and internationally accepted biomedical terminologies, a uniquely temporal focus on pediatric health and disease from conception through adolescence, and terms that could be used in both well- and underresourced environments. The AE terminology is available for use without restriction through the National Cancer Institute's Enterprise Vocabulary Services and is fully compatible with, and represented in, the Medical Dictionary for Regulatory Activities. The terminology is intended to mature with use, user feedback, and optimization. Copyright © 2017 by the American Academy of Pediatrics.

  20. Supporting infobuttons with terminological knowledge.

    OpenAIRE

    Cimino, J. J.; Elhanan, G.; Zeng, Q.

    1997-01-01

    We have developed several prototype applications which integrate clinical systems with on-line information resources by using patient data to drive queries in response to user information needs. We refer to these collectively as infobuttons because they are evoked with a minimum of keyboard entry. We make use of knowledge in our terminology, the Medical Entities Dictionary (MED) to assist with the selection of appropriate queries and resources, as well as the translation of patient data to fo...

  1. Where to search top-K biomedical ontologies?

    Science.gov (United States)

    Oliveira, Daniela; Butt, Anila Sahar; Haller, Armin; Rebholz-Schuhmann, Dietrich; Sahay, Ratnesh

    2018-03-20

    Searching for precise terms and terminological definitions in the biomedical data space is problematic, as researchers find overlapping, closely related and even equivalent concepts in a single ontology or across multiple ontologies. Search engines that retrieve ontological resources often suggest an extensive list of search results for a given input term, which leads to the tedious task of selecting the best-fit ontological resource (class or property) for the input term and reduces user confidence in the retrieval engines. A systematic evaluation of these search engines is necessary to understand their strengths and weaknesses under different search requirements. We have implemented seven comparable Information Retrieval ranking algorithms to search through ontologies and compared them against four search engines for ontologies. Free-text queries were performed, the outcomes were judged by experts, and the ranking algorithms and search engines were evaluated against the expert-based ground truth (GT). In addition, we propose a probabilistic GT that is developed automatically, to provide deeper insights into and confidence in the expert-based GT, as well as to evaluate a broader range of search queries. The main outcome of this work is the identification of key search factors for biomedical ontologies, together with search requirements and a set of recommendations that will help biomedical experts and ontology engineers to select the best-suited retrieval mechanism in their search scenarios. We expect that this evaluation will allow researchers and practitioners to apply the current search techniques more reliably and that it will help them to select the right solution for their daily work. The source code (of the seven ranking algorithms), ground truths and experimental results are available at https://github.com/danielapoliveira/bioont-search-benchmark.

  2. Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation.

    Science.gov (United States)

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes), with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities, such as 'CHEMICAL-1 compared to CHEMICAL-2'. With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments on chemical-chemical (CC) and chemical-disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 for the CC and CD tasks, respectively, when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality. These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order.
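
    The pipeline described above (context patterns, LSA projection, similarity in topic space) can be sketched with standard scikit-learn components; the patterns below are invented examples, not the mined PubMed query patterns.

      # Hedged sketch of the described pipeline using scikit-learn: context
      # patterns are vectorised, projected to latent topics with LSA (truncated
      # SVD), and compared in topic space. The patterns are invented examples.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.metrics.pairwise import cosine_similarity

      patterns = [
          "CHEMICAL compared to CHEMICAL",
          "CHEMICAL versus CHEMICAL",
          "CHEMICAL induced DISEASE",
          "DISEASE caused by CHEMICAL",
      ]
      tfidf = TfidfVectorizer().fit_transform(patterns)
      topics = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
      print(cosine_similarity(topics[:1], topics)[0])  # pattern 0 vs. all patterns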

  3. Supporting infobuttons with terminological knowledge.

    Science.gov (United States)

    Cimino, J J; Elhanan, G; Zeng, Q

    1997-01-01

    We have developed several prototype applications which integrate clinical systems with on-line information resources by using patient data to drive queries in response to user information needs. We refer to these collectively as infobuttons because they are evoked with a minimum of keyboard entry. We make use of knowledge in our terminology, the Medical Entities Dictionary (MED) to assist with the selection of appropriate queries and resources, as well as the translation of patient data to forms recognized by the resources. This paper describes the kinds of knowledge in the MED, including literal attributes, hierarchical links and other semantic links, and how this knowledge is used in system integration.

  4. BEST: Next-Generation Biomedical Entity Search Tool for Knowledge Discovery from Biomedical Literature.

    Directory of Open Access Journals (Sweden)

    Sunwon Lee

    As the volume of publications rapidly increases, searching for relevant information from the literature becomes more challenging. To complement standard search engines such as PubMed, it is desirable to have an advanced search tool that directly returns relevant biomedical entities such as targets, drugs, and mutations rather than a long list of articles. Some existing tools submit a query to PubMed and process retrieved abstracts to extract information at query time, resulting in a slow response time and limited coverage of only a fraction of the PubMed corpus. Other tools preprocess the PubMed corpus to speed up the response time; however, they are not constantly updated, and thus produce outdated results. Further, most existing tools cannot process sophisticated queries such as searches for mutations that co-occur with query terms in the literature. To address these problems, we introduce BEST, a biomedical entity search tool. BEST returns, as a result, a list of 10 different types of biomedical entities including genes, diseases, drugs, targets, transcription factors, miRNAs, and mutations that are relevant to a user's query. To the best of our knowledge, BEST is the only system that processes free text queries and returns up-to-date results in real time including mutation information in the results. BEST is freely accessible at http://best.korea.ac.kr.

  5. Facilitating biomedical researchers' interrogation of electronic health record data: Ideas from outside of biomedical informatics.

    Science.gov (United States)

    Hruby, Gregory W; Matsoukas, Konstantina; Cimino, James J; Weng, Chunhua

    2016-04-01

    Electronic health records (EHR) are a vital data resource for research uses, including cohort identification, phenotyping, pharmacovigilance, and public health surveillance. To realize the promise of EHR data for accelerating clinical research, it is imperative to enable efficient and autonomous EHR data interrogation by end users such as biomedical researchers. This paper surveys state-of-the-art approaches and key methodological considerations for this purpose. We adapted a previously published conceptual framework for interactive information retrieval, which defines three entities (user, channel, and source), elaborating on channels for query formulation in the context of helping end users interrogate EHR data. We show that current progress in biomedical informatics mainly lies in support for query execution and information modeling, primarily due to an emphasis on infrastructure development for data integration and data access via self-service query tools, while neglecting the user support needed during the iterative query formulation process, which can be costly and error-prone. In contrast, the information science literature has offered elaborate theories and methods for user modeling and query formulation support. The two bodies of literature are complementary, implying opportunities for cross-disciplinary idea exchange. On this basis, we outline directions for future informatics research to improve our understanding of user needs and requirements for facilitating autonomous interrogation of EHR data by biomedical researchers. We suggest that cross-disciplinary translational research between biomedical informatics and information science can benefit our research in facilitating efficient data access in the life sciences. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Generating and Executing Complex Natural Language Queries across Linked Data.

    Science.gov (United States)

    Hamon, Thierry; Mougin, Fleur; Grabar, Natalia

    2015-01-01

    With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, the SPARQL language, and Natural Language question-answering interfaces provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method was designed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as a Perl module available at http://search.cpan.org/~thhamon/RDF-NLP-SPARQLQuery.
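
    A very small template-based sketch of the natural-language-to-SPARQL idea follows; it is not the authors' RDF-NLP-SPARQLQuery module, and the predicate URIs are placeholders.

      # Template-based sketch (not the authors' Perl module): a question matching
      # a known pattern is rewritten into a SPARQL query; the predicate URIs are
      # placeholders.
      import re

      def question_to_sparql(question):
          m = re.match(r"What drugs treat (.+)\?", question, re.IGNORECASE)
          if m is None:
              raise ValueError("question pattern not recognised")
          disease = m.group(1)
          return ('SELECT ?drug WHERE {{\n'
                  '  ?drug <http://example.org/treats> ?disease .\n'
                  '  ?disease <http://www.w3.org/2000/01/rdf-schema#label> "{}" .\n'
                  '}}').format(disease)

      print(question_to_sparql("What drugs treat cystic fibrosis?"))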

  7. Study of Query Expansion Techniques and Their Application in the Biomedical Information Retrieval

    Directory of Open Access Journals (Sweden)

    A. R. Rivas

    2014-01-01

    retrieval systems. These techniques help to overcome vocabulary mismatch issues by expanding the original query with additional relevant terms and reweighting the terms in the expanded query. In this paper, different text preprocessing and query expansion approaches are combined to improve the documents initially retrieved by a query in a scientific documental database. A corpus belonging to MEDLINE, called Cystic Fibrosis, is used as a knowledge source. Experimental results show that the proposed combinations of techniques greatly enhance the efficiency obtained by traditional queries.

  8. An Approach to Measuring Semantic Relatedness of Geographic Terminologies Using a Thesaurus and Lexical Database Sources

    Directory of Open Access Journals (Sweden)

    Zugang Chen

    2018-03-01

    In geographic information science, semantic relatedness is important for Geographic Information Retrieval (GIR), Linked Geospatial Data, geoparsing, and geo-semantics. But computing the semantic similarity/relatedness of geographic terminology is still an urgent issue to tackle. The thesaurus is a ubiquitous and sophisticated knowledge representation tool existing in various domains. In this article, we combined the generic lexical database (WordNet or HowNet) with the Thesaurus for Geographic Science and proposed a thesaurus–lexical relatedness measure (TLRM) to compute the semantic relatedness of geographic terminology. This measure quantified the relationship between terminologies, interlinked the discrete term trees by using the generic lexical database, and realized the semantic relatedness computation of any two terminologies in the thesaurus. The TLRM was evaluated on a new relatedness baseline, namely, the Geo-Terminology Relatedness Dataset (GTRD), which was built by us, and the TLRM obtained a relatively high cognitive plausibility. Finally, we applied the TLRM on a geospatial data sharing portal to support data retrieval. The application results of the 30 most frequently used queries of the portal demonstrated that using TLRM could improve the recall of geospatial data retrieval in most situations and rank the retrieval results by the matching scores between the query of users and the geospatial dataset.
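
    As a rough illustration of the kind of measure described, the sketch below computes a path-length-based relatedness over a tiny combined term graph; the toy graph and the simple 1/(1+d) formula are assumptions, not the actual TLRM definition.

      # Rough illustration only: path-length relatedness over a tiny combined
      # term graph; the graph and the 1/(1+d) formula are assumptions, not the
      # actual TLRM definition or the Thesaurus for Geographic Science data.
      from collections import deque

      GRAPH = {
          "river": ["stream", "water body"],
          "stream": ["river"],
          "water body": ["river", "lake"],
          "lake": ["water body"],
      }

      def shortest_path_len(a, b):
          seen, frontier = {a}, deque([(a, 0)])
          while frontier:
              node, d = frontier.popleft()
              if node == b:
                  return d
              for n in GRAPH.get(node, []):
                  if n not in seen:
                      seen.add(n)
                      frontier.append((n, d + 1))
          return None

      def relatedness(a, b):
          d = shortest_path_len(a, b)
          return 0.0 if d is None else 1.0 / (1.0 + d)

      print(relatedness("stream", "lake"))  # path length 3 -> relatedness 0.25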

  9. Development of a Model for the Representation of Nanotechnology-Specific Terminology

    Science.gov (United States)

    Bailey, LeeAnn O.; Kennedy, Christopher H.; Fritts, Martin J.; Hartel, Francis W.

    2006-01-01

    Nanotechnology is an important, rapidly-evolving, multidisciplinary field [1]. The tremendous growth in this area necessitates the establishment of a common, open-source terminology to support the diverse biomedical applications of nanotechnology. Currently, the consensus process to define and categorize conceptual entities pertaining to nanotechnology is in a rudimentary stage. We have constructed a nanotechnology-specific conceptual hierarchy that can be utilized by end users to retrieve accurate, controlled terminology regarding emerging nanotechnology and corresponding clinical applications. PMID:17238469

  10. A Domain-Specific Terminology for Retinopathy of Prematurity and Its Applications in Clinical Settings.

    Science.gov (United States)

    Zhang, Yinsheng; Zhang, Guoming

    2018-01-01

    A terminology (or coding system) is a formal set of controlled vocabulary in a specific domain. With a well-defined terminology, each concept in the target domain is assigned with a unique code, which can be identified and processed across different medical systems in an unambiguous way. Though there are lots of well-known biomedical terminologies, there is currently no domain-specific terminology for ROP (retinopathy of prematurity). Based on a collection of historical ROP patients' data in the electronic medical record system, we extracted the most frequent terms in the domain and organized them into a hierarchical coding system-ROP Minimal Standard Terminology, which contains 62 core concepts in 4 categories. This terminology has been successfully used to provide highly structured and semantic-rich clinical data in several ROP-related applications.

  11. A Domain-Specific Terminology for Retinopathy of Prematurity and Its Applications in Clinical Settings

    Directory of Open Access Journals (Sweden)

    Yinsheng Zhang

    2018-01-01

    A terminology (or coding system) is a formal set of controlled vocabulary in a specific domain. With a well-defined terminology, each concept in the target domain is assigned with a unique code, which can be identified and processed across different medical systems in an unambiguous way. Though there are lots of well-known biomedical terminologies, there is currently no domain-specific terminology for ROP (retinopathy of prematurity). Based on a collection of historical ROP patients’ data in the electronic medical record system, we extracted the most frequent terms in the domain and organized them into a hierarchical coding system—ROP Minimal Standard Terminology, which contains 62 core concepts in 4 categories. This terminology has been successfully used to provide highly structured and semantic-rich clinical data in several ROP-related applications.

  12. Terminology development towards harmonizing multiple clinical neuroimaging research repositories.

    Science.gov (United States)

    Turner, Jessica A; Pasquerello, Danielle; Turner, Matthew D; Keator, David B; Alpert, Kathryn; King, Margaret; Landis, Drew; Calhoun, Vince D; Potkin, Steven G; Tallis, Marcelo; Ambite, Jose Luis; Wang, Lei

    2015-07-01

    Data sharing and mediation across disparate neuroimaging repositories requires extensive effort to ensure that the different domains of data types are referred to by commonly agreed upon terms. Within the SchizConnect project, which enables querying across decentralized databases of neuroimaging, clinical, and cognitive data from various studies of schizophrenia, we developed a model for each data domain, identified common usable terms that could be agreed upon across the repositories, and linked them to standard ontological terms where possible. We had the goal of facilitating both the current user experience in querying and future automated computations and reasoning regarding the data. We found that existing terminologies are incomplete for these purposes, even with the history of neuroimaging data sharing in the field; and we provide a model for efforts focused on querying multiple clinical neuroimaging repositories.

  13. Improving information retrieval with multiple health terminologies in a quality-controlled gateway.

    Science.gov (United States)

    Soualmia, Lina F; Sakji, Saoussen; Letord, Catherine; Rollin, Laetitia; Massari, Philippe; Darmoni, Stéfan J

    2013-01-01

    The Catalog and Index of French-language Health Internet resources (CISMeF) is a quality-controlled health gateway, primarily for Web resources in French (n=89,751). Recently, we achieved a major improvement in the structure of the catalogue by setting up multiple terminologies, based on twelve health terminologies available in French, to overcome the potential weaknesses of the MeSH thesaurus, which has been the main and pivotal terminology we use for indexing and retrieval since 1995. The main aim of this study was to estimate the added value of exploiting several terminologies and their semantic relationships to improve Web resource indexing and retrieval in CISMeF, in order to provide additional health resources that meet users' expectations. Twelve terminologies were integrated into the CISMeF information system to set up multiple-terminology indexing and retrieval. The same set of thirty queries was run: (i) by exploiting the hierarchical structure of the MeSH, and (ii) by exploiting the additional twelve terminologies and their semantic links. The two search modes were evaluated and compared. The overall coverage of the multiple-terminology search mode was improved by comparison with the coverage obtained using the MeSH alone (16,283 vs. 14,159 results; +15%). These additional findings were estimated at 56.6% relevant results, 24.7% intermediate results and 18.7% irrelevant. The multiple-terminology approach improved information retrieval. These results suggest that integrating additional health terminologies was able to improve recall. Since performing the study, 21 other terminologies have been added, which should enable us to conduct broader studies in multiple-terminology information retrieval.

  14. TopFed: TCGA tailored federated query processing and linking to LOD.

    Science.gov (United States)

    Saleem, Muhammad; Padmanabhuni, Shanmukha S; Ngomo, Axel-Cyrille Ngonga; Iqbal, Aftab; Almeida, Jonas S; Decker, Stefan; Deus, Helena F

    2014-01-01

    The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer-related molecular analysis, to be exploited by bioinformaticians towards advancing cancer knowledge. However, devising bioinformatics applications to analyse such a large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files, making it difficult to enable virtual data integration in order to collect the critical co-variates necessary for analysis. We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Framework (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and further proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. With the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these endpoints through a TCGA-tailored federated SPARQL query processing engine named TopFed. We compare TopFed with FedX, a well-established federation engine, in terms of source selection and query execution time by using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall), with a query execution time equal to one third of that of FedX. With TopFed, we aim to offer biomedical scientists a single point of access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA, as the amount and diversity of data exceeds the ability of local resources to handle its retrieval and
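
    For readers unfamiliar with federated SPARQL, the sketch below shows the general shape of a query that draws on several endpoints via SERVICE clauses; the endpoint URLs and predicates are placeholders, not the actual TopFed/TCGA configuration.

      # General shape of a federated SPARQL query spanning two endpoints via
      # SERVICE clauses; the endpoint URLs and predicates are placeholders, not
      # the actual TopFed/TCGA setup.
      federated_query = """
      PREFIX ex: <http://example.org/tcga/>
      SELECT ?patient ?mutation ?geneLabel WHERE {
        SERVICE <http://endpoint-one.example.org/sparql> {
          ?patient ex:hasMutation ?mutation .
        }
        SERVICE <http://endpoint-two.example.org/sparql> {
          ?mutation ex:affectsGene ?gene .
          ?gene <http://www.w3.org/2000/01/rdf-schema#label> ?geneLabel .
        }
      }
      """
      print(federated_query)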

  15. The Ontology Lookup Service: more data and better tools for controlled vocabulary queries.

    Science.gov (United States)

    Côté, Richard G; Jones, Philip; Martens, Lennart; Apweiler, Rolf; Hermjakob, Henning

    2008-07-01

    The Ontology Lookup Service (OLS) (http://www.ebi.ac.uk/ols) provides interactive and programmatic interfaces to query, browse and navigate an ever increasing number of biomedical ontologies and controlled vocabularies. The volume of data available for querying has more than quadrupled since it went into production and OLS functionality has been integrated into several high-usage databases and data entry tools. Improvements have been made to both OLS query interfaces, based on user feedback and requirements, to improve usability and service interoperability and provide novel ways to perform queries.

  16. The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries

    Directory of Open Access Journals (Sweden)

    Apweiler Rolf

    2006-02-01

    Background With the vast amounts of biomedical data being generated by high-throughput analysis methods, controlled vocabularies and ontologies are becoming increasingly important to annotate units of information for ease of search and retrieval. Each scientific community tends to create its own locally available ontology. The interfaces to query these ontologies tend to vary from group to group. We saw the need for a centralized location to perform controlled vocabulary queries that would offer both a lightweight web-accessible user interface as well as a consistent, unified SOAP interface for automated queries. Results The Ontology Lookup Service (OLS) was created to integrate publicly available biomedical ontologies into a single database. All modified ontologies are updated daily. A list of currently loaded ontologies is available online. The database can be queried to obtain information on a single term or to browse a complete ontology using AJAX. Auto-completion provides a user-friendly search mechanism. An AJAX-based ontology viewer is available to browse a complete ontology or subsets of it. A programmatic interface is available to query the webservice using SOAP. The service is described by a WSDL descriptor file available online. A sample Java client to connect to the webservice using SOAP is available for download from SourceForge. All OLS source code is publicly available under the open source Apache Licence. Conclusion The OLS provides a user-friendly single entry point for publicly available ontologies in the Open Biomedical Ontology (OBO) format. It can be accessed interactively or programmatically at http://www.ebi.ac.uk/ontology-lookup/.

  17. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Directory of Open Access Journals (Sweden)

    Toni U Wagner

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straight forward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  18. CrossQuery: a web tool for easy associative querying of transcriptome data.

    Science.gov (United States)

    Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred

    2011-01-01

    Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straight forward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.

  19. Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of Cerebrotendinous xanthomatosis.

    Science.gov (United States)

    Taboada, María; Martínez, Diego; Pilo, Belén; Jiménez-Escrig, Adriano; Robinson, Peter N; Sobrido, María J

    2012-07-31

    Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and querying this dataset about relationships between phenotypes and genetic variants, at different levels of abstraction. Due to the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of semantic web technology in the biomedical research domain of cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopt the standard open-world assumption. This work demonstrates how semantic web technologies can be used to support flexible representation and computational inference mechanisms
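
    To make the flavour of such phenotype-genotype queries concrete, here is a small Python/rdflib sketch that runs a SPARQL query over a toy RDF patient graph; the namespace, property names and variant identifier are invented for illustration and do not reproduce the paper's ontology or its SWRL/SQWRL rules.

        # Toy phenotype-genotype query over an RDF patient dataset (rdflib).
        from rdflib import Graph

        ttl = """
        @prefix ex: <http://example.org/ctx#> .
        ex:patient1 ex:hasPhenotype ex:CerebellarAtaxia ;
                    ex:hasVariant   ex:CYP27A1_variant1 .
        ex:patient2 ex:hasPhenotype ex:TendonXanthoma .
        """
        g = Graph()
        g.parse(data=ttl, format="turtle")

        query = """
        PREFIX ex: <http://example.org/ctx#>
        SELECT ?patient ?variant WHERE {
            ?patient ex:hasPhenotype ex:CerebellarAtaxia ;
                     ex:hasVariant   ?variant .
        }
        """
        # Print every patient showing the phenotype, together with the variant carried.
        for patient, variant in g.query(query):
            print(patient, variant)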

  20. Assessing the practice of biomedical ontology evaluation: Gaps and opportunities.

    Science.gov (United States)

    Amith, Muhammad; He, Zhe; Bian, Jiang; Lossio-Ventura, Juan Antonio; Tao, Cui

    2018-04-01

    With the proliferation of heterogeneous health care data in the last three decades, biomedical ontologies and controlled biomedical terminologies play an increasingly important role in knowledge representation and management, data integration, natural language processing, as well as decision support for health information systems and biomedical research. Biomedical ontologies and controlled terminologies are intended to assure interoperability. Nevertheless, the quality of biomedical ontologies has hindered their applicability and subsequent adoption in real-world applications. Ontology evaluation is an integral part of ontology development and maintenance. In the biomedicine domain, ontology evaluation is often conducted by third parties as a quality assurance (or auditing) effort that focuses on identifying modeling errors and inconsistencies. In this work, we first organized four categorical schemes of ontology evaluation methods in the existing literature to create an integrated taxonomy. Further, to understand the ontology evaluation practice in the biomedicine domain, we reviewed a sample of 200 ontologies from the National Center for Biomedical Ontology (NCBO) BioPortal, the largest repository for biomedical ontologies, and observed that only 15 of these ontologies have documented evaluation in their corresponding inception papers. We then surveyed the recent quality assurance approaches for biomedical ontologies and their use. We also mapped these quality assurance approaches to the ontology evaluation criteria. It is our anticipation that ontology evaluation and quality assurance approaches will be more widely adopted in the development life cycle of biomedical ontologies. Copyright © 2018 Elsevier Inc. All rights reserved.

  1. Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine.

    Science.gov (United States)

    Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai

    2017-03-01

    The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.
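
    A minimal sketch of the underlying idea, suggesting semantically interchangeable terms for a user-entered query, is shown below in Python; the synonym table is a toy stand-in for the UMLS/MetaMap-derived synonym sets described in the abstract, not the authors' algorithm.

        # Expand a user-entered query with interchangeable synonyms from a curated map.
        SYNONYMS = {
            "myocardial infarction": ["heart attack", "mi"],
            "hypertension": ["high blood pressure", "htn"],
        }

        def recommend(query: str) -> list[str]:
            """Return alternative query strings with known concepts replaced by synonyms."""
            lowered = query.lower()
            suggestions = []
            for concept, alternatives in SYNONYMS.items():
                if concept in lowered:
                    for alt in alternatives:
                        suggestions.append(lowered.replace(concept, alt))
            return suggestions

        print(recommend("history of myocardial infarction"))
        # ['history of heart attack', 'history of mi']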

  2. Abstraction networks for terminologies: Supporting management of "big knowledge".

    Science.gov (United States)

    Halper, Michael; Gu, Huanying; Perl, Yehoshua; Ochs, Christopher

    2015-05-01

    Large biomedical terminologies comprise tens of thousands to millions of concepts, with attendant complexity. The notion of abstraction network has been introduced as a tool to help overcome this challenge, thus enhancing the usefulness of terminologies. Abstraction networks have been shown to be applicable to a variety of existing biomedical terminologies, and these alternative structural views hold promise for future expanded use with additional terminologies. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Query Log Analysis of an Electronic Health Record Search Engine

    Science.gov (United States)

    Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.

    2011-01-01

    We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in the medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision that there exist significant challenges, along with significant opportunities, in providing intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150

  4. Understanding terminological systems. I: Terminology and typology

    NARCIS (Netherlands)

    de Keizer, N. F.; Abu-Hanna, A.; Zwetsloot-Schonk, J. H.

    2000-01-01

    Terminological systems are an important research issue within the field of medical informatics. For precise understanding of existing terminological systems a referential framework is needed that provides a uniform terminology and typology of terminological systems themselves. In this article a

  5. Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

    Science.gov (United States)

    Alghoson, Abdullah

    2017-01-01

    Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…

  6. KaBOB: ontology-based semantic integration of biomedical databases.

    Science.gov (United States)

    Livingston, Kevin M; Bada, Michael; Baumgartner, William A; Hunter, Lawrence E

    2015-04-23

    The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources. We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license. KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for

  7. Integrating systems biology models and biomedical ontologies.

    Science.gov (United States)

    Hoehndorf, Robert; Dumontier, Michel; Gennari, John H; Wimalaratne, Sarala; de Bono, Bernard; Cook, Daniel L; Gkoutos, Georgios V

    2011-08-11

    Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such an integration of data and are often used to annotate biosimulation models in systems biology. We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models. We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms.

  8. Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

    Science.gov (United States)

    Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

    2018-01-01

    The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie
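
    The finding that boosting the weights of important keywords in a verbose query helps most can be illustrated with a short Python sketch; the IDF values and the boost rule below are invented assumptions, not the challenge submission's actual weighting scheme.

        # Boost the top-k highest-IDF terms of a verbose dataset-retrieval query.
        def boost_query(query: str, idf: dict[str, float], boost: float = 2.0, top_k: int = 3):
            """Return (term, weight) pairs with the k most informative terms up-weighted."""
            terms = query.lower().split()
            ranked = sorted(set(terms), key=lambda t: idf.get(t, 0.0), reverse=True)
            boosted = set(ranked[:top_k])
            return [(t, boost if t in boosted else 1.0) for t in terms]

        idf = {"rna-seq": 4.2, "glioblastoma": 5.1, "human": 0.3, "data": 0.1}
        print(boost_query("human glioblastoma rna-seq expression data", idf))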

  9. Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies.

    Science.gov (United States)

    Fan, Jung-Wei; Friedman, Carol

    2011-10-01

    Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe once the performance issues are addressed, it could serve as an aide in a semi-supervised solution. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Semi-Automated Annotation of Biobank Data Using Standard Medical Terminologies in a Graph Database.

    Science.gov (United States)

    Hofer, Philipp; Neururer, Sabrina; Goebel, Georg

    2016-01-01

    Data describing biobank resources frequently contains unstructured free-text information or insufficient coding standards. (Bio-) medical ontologies like Orphanet Rare Diseases Ontology (ORDO) or the Human Disease Ontology (DOID) provide a high number of concepts, synonyms and entity relationship properties. Such standard terminologies increase quality and granularity of input data by adding comprehensive semantic background knowledge from validated entity relationships. Moreover, cross-references between terminology concepts facilitate data integration across databases using different coding standards. In order to encourage the use of standard terminologies, our aim is to identify and link relevant concepts with free-text diagnosis inputs within a biobank registry. Relevant concepts are selected automatically by lexical matching and SPARQL queries against a RDF triplestore. To ensure correctness of annotations, proposed concepts have to be confirmed by medical data administration experts before they are entered into the registry database. Relevant (bio-) medical terminologies describing diseases and phenotypes were identified and stored in a graph database which was tied to a local biobank registry. Concept recommendations during data input trigger a structured description of medical data and facilitate data linkage between heterogeneous systems.
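
    The lexical-matching step that proposes candidate concepts for curator confirmation can be sketched in a few lines of Python; the concept identifiers and labels below are placeholders, and the fuzzy matcher (difflib) is an assumption standing in for the SPARQL/triplestore pipeline of the paper.

        # Propose ontology concepts for a free-text diagnosis entry (to be confirmed by an expert).
        import difflib

        concept_labels = {  # placeholder identifiers, not real ORDO/DOID codes
            "C0001": "cerebrotendinous xanthomatosis",
            "C0002": "type 2 diabetes mellitus",
            "C0003": "hypertension",
        }

        def propose(free_text: str, cutoff: float = 0.6):
            labels = list(concept_labels.values())
            hits = difflib.get_close_matches(free_text.lower(), labels, n=3, cutoff=cutoff)
            return [(cid, lbl) for cid, lbl in concept_labels.items() if lbl in hits]

        print(propose("Type 2 diabetes"))
        # [('C0002', 'type 2 diabetes mellitus')]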

  11. Development of a Model for the Representation of Nanotechnology-Specific Terminology

    OpenAIRE

    Bailey, LeeAnn O.; Kennedy, Christopher H.; Fritts, Martin J.; Hartel, Francis W.

    2006-01-01

    Nanotechnology is an important, rapidly-evolving, multidisciplinary field [1]. The tremendous growth in this area necessitates the establishment of a common, open-source terminology to support the diverse biomedical applications of nanotechnology. Currently, the consensus process to define and categorize conceptual entities pertaining to nanotechnology is in a rudimentary stage. We have constructed a nanotechnology-specific conceptual hierarchy that can be utilized by end users to retrieve ac...

  12. BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.

    Science.gov (United States)

    Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália

    2016-07-01

    Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as proof-of-concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and topics alike. The keyword-based queries of the users are transformed into concepts and search results are presented and ranked accordingly. The semantic graph view portraits all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations

  13. NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT -AN INFORMATION EXTRACTION TASK

    Directory of Open Access Journals (Sweden)

    N. Kanya

    2016-07-01

    Full Text Available Biomedical Text Mining targets the extraction of significant information from biomedical archives. Bio TM encompasses Information Retrieval (IR) and Information Extraction (IE). The Information Retrieval step retrieves the relevant biomedical literature documents from various repositories like PubMed, MEDLINE etc., based on a search query. The IR process ends with the generation of a corpus of the relevant documents retrieved from the publication databases based on the query. The IE task includes preprocessing of the documents, Named Entity Recognition (NER) from the documents, and relationship extraction. This process draws on natural language processing, data mining techniques and machine learning algorithms. The preprocessing task includes tokenization, stop-word removal, shallow parsing, and part-of-speech tagging. The NER phase involves recognition of well-defined objects such as genes, proteins or cell lines. This leads to the next phase, the extraction of relationships (IE). The work was based on the machine learning algorithm Conditional Random Fields (CRF).
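
    As an illustration of the kind of token features typically fed to a CRF for biomedical NER, here is a small, self-contained Python function; the feature set is generic and is not claimed to be the one used in the paper.

        # Generic per-token features of the sort used with CRF-based NER.
        def token_features(tokens, i):
            tok = tokens[i]
            return {
                "lower": tok.lower(),
                "is_upper": tok.isupper(),
                "is_title": tok.istitle(),
                "has_digit": any(c.isdigit() for c in tok),
                "suffix3": tok[-3:],
                "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
                "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
            }

        sentence = ["BRCA1", "mutations", "increase", "cancer", "risk"]
        print(token_features(sentence, 0))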

  14. INIS: Terminology charts

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1970-08-01

    This document is one in a series of publications known as the INIS Reference Series. It is to be used in conjunction with the INIS indexing manual and the INIS thesaurus for the preparation of input to the INIS database. The thesaurus and terminology charts in their first edition (Rev.0) were produced as the result of an agreement between the International Atomic Energy Agency (IAEA) and the European Atomic Energy Community (Euratom). Except for minor changes, the terminology and the interrelationships between terms are those of the December 1969 edition of the Euratom Thesaurus. The purpose of the terminology charts is to display the descriptors of the thesaurus in the context of their hierarchical and other semantic relationships. Hierarchically related terms are grouped in clusters, each representing one of the principal concepts of a subject field. The descriptors are grouped around or under the broadest term of the cluster, which is printed in upper case. The hierarchical relationships within the clusters are shown by the arrangement of the terms in smaller boxes within the larger boxes circumscribing the clusters. The clusters are connected by lines of varying thickness, representing the other (mostly non-hierarchical) relationships. These connections are equivalent to 'see also' and 'related term' cross-references. The thickness of the lines represents the strength of the semantic relations or, in the practice of a retrieval system, the probability that one term, replacing a connected term in a query, will still yield pertinent references. The figures accompanying the descriptors represent their frequency of assignment to the first 987,000 documents stored in the Euratom system (May 1970). They are presented in order to show the relative importance of the descriptors within the subject field. The asterisks (*) accompanying descriptors in the charts refer to descriptors for which a scope note can be found in the INIS: Thesaurus at the time the charts went

  15. INIS: Terminology charts

    International Nuclear Information System (INIS)

    1970-08-01

    This document is one in a series of publications known as the INIS Reference Series. It is to be used in conjunction with the INIS indexing manual and the INIS thesaurus for the preparation of input to the INIS database. The thesaurus and terminology charts in their first edition (Rev.0) were produced as the result of an agreement between the International Atomic Energy Agency (IAEA) and the European Atomic Energy Community (Euratom). Except for minor changes, the terminology and the interrelationships between terms are those of the December 1969 edition of the Euratom Thesaurus. The purpose of the terminology charts is to display the descriptors of the thesaurus in the context of their hierarchical and other semantic relationships. Hierarchically related terms are grouped in clusters, each representing one of the principal concepts of a subject field. The descriptors are grouped around or under the broadest term of the cluster, which is printed in upper case. The hierarchical relationships within the clusters are shown by the arrangement of the terms in smaller boxes within the larger boxes circumscribing the clusters. The clusters are connected by lines of varying thickness, representing the other (mostly non-hierarchical) relationships. These connections are equivalent to 'see also' and 'related term' cross-references. The thickness of the lines represents the strength of the semantic relations or, in the practice of a retrieval system, the probability that one term, replacing a connected term in a query, will still yield pertinent references. The figures accompanying the descriptors represent their frequency of assignment to the first 987,000 documents stored in the Euratom system (May 1970). They are presented in order to show the relative importance of the descriptors within the subject field. The asterisks (*) accompanying descriptors in the charts refer to descriptors for which a scope note can be found in the INIS: Thesaurus at the time the charts went

  16. Improving accuracy for identifying related PubMed queries by an integrated approach.

    Science.gov (United States)

    Lu, Zhiyong; Wilbur, W John

    2009-10-01

    PubMed is the most widely used tool for searching biomedical literature online. As with many other online search tools, a user often types a series of multiple related queries before retrieving satisfactory results to fulfill a single information need. Meanwhile, it is also a common phenomenon to see a user type queries on unrelated topics in a single session. In order to study PubMed users' search strategies, it is necessary to be able to automatically separate unrelated queries and group together related queries. Here, we report a novel approach combining both lexical and contextual analyses for segmenting PubMed query sessions and identifying related queries and compare its performance with the previous approach based solely on concept mapping. We experimented with our integrated approach on sample data consisting of 1539 pairs of consecutive user queries in 351 user sessions. The prediction results of 1396 pairs agreed with the gold-standard annotations, achieving an overall accuracy of 90.7%. This demonstrates that our approach is significantly better than the previously published method. By applying this approach to a one day query log of PubMed, we found that a significant proportion of information needs involved more than one PubMed query, and that most of the consecutive queries for the same information need are lexically related. Finally, the proposed PubMed distance is shown to be an accurate and meaningful measure for determining the contextual similarity between biological terms. The integrated approach can play a critical role in handling real-world PubMed query log data as is demonstrated in our experiments.
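
    The lexical half of such session segmentation can be approximated very simply: treat consecutive queries as related when their term overlap is high. The Python sketch below uses a Jaccard overlap with an arbitrary threshold; it illustrates the idea only and is not the published integrated method.

        # Decide whether two consecutive queries are lexically related (Jaccard overlap).
        def jaccard(a: str, b: str) -> float:
            sa, sb = set(a.lower().split()), set(b.lower().split())
            return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

        def related(prev_query: str, next_query: str, threshold: float = 0.2) -> bool:
            return jaccard(prev_query, next_query) >= threshold

        print(related("breast cancer brca1", "brca1 mutation screening"))  # True
        print(related("breast cancer brca1", "influenza vaccination"))     # False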

  17. A comparison of two methods for retrieving ICD-9-CM data: the effect of using an ontology-based method for handling terminology changes.

    Science.gov (United States)

    Yu, Alexander C; Cimino, James J

    2011-04-01

    Most existing controlled terminologies can be characterized as collections of terms, wherein the terms are arranged in a simple list or organized in a hierarchy. These kinds of terminologies are considered useful for standardizing terms and encoding data and are currently used in many existing information systems. However, they suffer from a number of limitations that make data reuse difficult. Relatively recently, it has been proposed that formal ontological methods can be applied to some of the problems of terminological design. Biomedical ontologies organize concepts (embodiments of knowledge about biomedical reality) whereas terminologies organize terms (what is used to code patient data at a certain point in time, based on the particular terminology version). However, the application of these methods to existing terminologies is not straightforward. The use of these terminologies is firmly entrenched in many systems, and what might seem to be a simple option of replacing these terminologies is not possible. Moreover, these terminologies evolve over time in order to suit the needs of users. Any methodology must therefore take these constraints into consideration, hence the need for formal methods of managing changes. Along these lines, we have developed a formal representation of the concept-term relation, around which we have also developed a methodology for management of terminology changes. The objective of this study was to determine whether our methodology would result in improved retrieval of data. Comparison of two methods for retrieving data encoded with terms from the International Classification of Diseases (ICD-9-CM), based on their recall when retrieving data for ICD-9-CM terms whose codes had changed but which had retained their original meaning (code change). Recall and interclass correlation coefficient. Statistically significant differences were detected in favor of the ontology-based ICD-9-CM data retrieval method that takes into account the effects of

  18. BOSS: context-enhanced search for biomedical objects

    Directory of Open Access Journals (Sweden)

    Choi Jaehoon

    2012-04-01

    Full Text Available Background There exist many academic search solutions and most of them can be placed at either end of a spectrum: general-purpose search and domain-specific "deep" search systems. The general-purpose search systems, such as PubMed, offer a flexible query interface, but churn out a list of matching documents that users have to go through in order to find the answers to their queries. On the other hand, the "deep" search systems, such as PPI Finder and iHOP, return precompiled results in a structured way. Their results, however, are often found only within some predefined contexts. In order to alleviate these problems, we introduce a new search engine, BOSS, the Biomedical Object Search System. Methods Unlike conventional search systems, BOSS indexes segments, rather than documents. A segment refers to a Maximal Coherent Semantic Unit (MCSU), such as a phrase, clause or sentence, that is semantically coherent in the given context (e.g., biomedical objects or their relations). For a user query, BOSS finds all matching segments, identifies the objects appearing in those segments, and aggregates the segments for each object. Finally, it returns the ranked list of the objects along with their matching segments. Results The working prototype of BOSS is available at http://boss.korea.ac.kr. The current version of BOSS has indexed abstracts of more than 20 million articles published during the last 16 years, from 1996 to 2011, across all science disciplines. Conclusion BOSS fills the gap between the two ends of the spectrum by allowing users to pose context-free queries and by returning a structured set of results. Furthermore, BOSS exhibits good scalability, just as conventional document search engines do, because it is designed to use a standard document-indexing model with minimal modifications. Considering these features, BOSS notches up the technological level of traditional solutions for search on biomedical information.

  19. KoralQuery -- A General Corpus Query Protocol

    DEFF Research Database (Denmark)

    Bingel, Joachim; Diewald, Nils

    2015-01-01

    In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized...

  20. G-Bean: an ontology-graph based web tool for biomedical literature retrieval.

    Science.gov (United States)

    Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S

    2014-01-01

    Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean
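
    The ontology-graph expansion step can be illustrated with networkx's personalized PageRank on a toy concept graph; the nodes, edges and parameters below are invented and do not reproduce G-Bean's merged UMLS graph or its ranking configuration.

        # Personalized PageRank over a tiny concept graph to pick expansion terms.
        import networkx as nx

        g = nx.Graph()
        g.add_edges_from([
            ("myocardial infarction", "troponin"),
            ("myocardial infarction", "chest pain"),
            ("chest pain", "angina"),
            ("troponin", "biomarker"),
        ])

        # Bias the random walk toward the concept(s) found in the user's query.
        personalization = {n: 0.0 for n in g.nodes}
        personalization["myocardial infarction"] = 1.0
        scores = nx.pagerank(g, alpha=0.85, personalization=personalization)

        # The highest-scoring concepts become candidate expansion terms.
        print(sorted(scores, key=scores.get, reverse=True)[:3])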

  1. Secure count query on encrypted genomic data.

    Science.gov (United States)

    Hasan, Mohammad Zahidul; Mahdi, Md Safiur Rahman; Sadat, Md Nazmus; Mohammed, Noman

    2018-05-01

    Human genomic information can yield more effective healthcare by guiding medical decisions. Therefore, genomics research is gaining popularity as it can identify potential correlations between a disease and a certain gene, which improves the safety and efficacy of drug treatment and can also develop more effective prevention strategies [1]. To reduce the sampling error and to increase the statistical accuracy of this type of research projects, data from different sources need to be brought together since a single organization does not necessarily possess required amount of data. In this case, data sharing among multiple organizations must satisfy strict policies (for instance, HIPAA and PIPEDA) that have been enforced to regulate privacy-sensitive data sharing. Storage and computation on the shared data can be outsourced to a third party cloud service provider, equipped with enormous storage and computation resources. However, outsourcing data to a third party is associated with a potential risk of privacy violation of the participants, whose genomic sequence or clinical profile is used in these studies. In this article, we propose a method for secure sharing and computation on genomic data in a semi-honest cloud server. In particular, there are two main contributions. Firstly, the proposed method can handle biomedical data containing both genotype and phenotype. Secondly, our proposed index tree scheme reduces the computational overhead significantly for executing secure count query operation. In our proposed method, the confidentiality of shared data is ensured through encryption, while making the entire computation process efficient and scalable for cutting-edge biomedical applications. We evaluated our proposed method in terms of efficiency on a database of Single-Nucleotide Polymorphism (SNP) sequences, and experimental results demonstrate that the execution time for a query of 50 SNPs in a database of 50,000 records is approximately 5 s, where each record
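
    To convey the idea of counting over encrypted values without revealing them, here is a toy additively homomorphic count using the python-paillier ("phe") library; it is only a conceptual sketch and is not the index-tree scheme or the security model evaluated in the paper.

        # Count records matching a SNP predicate without the server seeing genotypes.
        from phe import paillier

        public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

        # Each record contributes an encrypted 0/1 indicator (normally produced by the data owner).
        genotypes = ["AA", "AG", "GG", "AG"]
        indicators = [public_key.encrypt(1 if "G" in g else 0) for g in genotypes]

        # An untrusted party can add the ciphertexts without decrypting them.
        encrypted_count = sum(indicators[1:], indicators[0])

        # Only the key holder learns the final count.
        print(private_key.decrypt(encrypted_count))  # 3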

  2. Biomedical informatics: we are what we publish.

    Science.gov (United States)

    Elkin, P L; Brown, S H; Wright, G

    2013-01-01

    This article is part of a For-Discussion-Section of Methods of Information in Medicine on "Biomedical Informatics: We are what we publish". It is introduced by an editorial and followed by a commentary paper with invited comments. In subsequent issues the discussion may continue through letters to the editor. Informatics experts have attempted to define the field via consensus projects, which have led to consensus statements by both AMIA and IMIA. We add to the output of this process the results of a study of the PubMed publications with abstracts from the field of Biomedical Informatics. We took the terms from the AMIA consensus document and the terms from the IMIA definitions of the field of Biomedical Informatics and combined them through human review to create the Health Informatics Ontology. We built a terminology server using the Intelligent Natural Language Processor (iNLP). Then we downloaded the entire set of articles in MEDLINE identified by searching the literature for "Medical Informatics" OR "Bioinformatics". The articles were parsed by the joint AMIA / IMIA terminology and then again using SNOMED CT; for the bioinformatics articles they were also parsed using the HGNC Ontology. We identified 153,580 articles using "Medical Informatics" and 20,573 articles using "Bioinformatics". This resulted in 168,298 unique articles and an overlap of 5,855 articles. Of these, 62,244 articles (37%) had titles and abstracts that contained at least one concept from the Health Informatics Ontology. SNOMED CT indexing showed that the field interacts with nearly all clinical fields of medicine. Further defining the field by what we publish can add value to the consensus-driven processes that have been the mainstay of the efforts to date. Next steps should be to extract terms from the literature that are uncovered and create class hierarchies and relationships for this content. We should also examine the highly occurring MeSH terms as markers to define Biomedical Informatics

  3. Biomedical data integration in computational drug design and bioinformatics.

    Science.gov (United States)

    Seoane, Jose A; Aguiar-Pulido, Vanessa; Munteanu, Cristian R; Rivero, Daniel; Rabunal, Juan R; Dorado, Julian; Pazos, Alejandro

    2013-03-01

    In recent years, in the post genomic era, more and more data is being generated by biological high throughput technologies, such as proteomics and transcriptomics. This omics data can be very useful, but the real challenge is to analyze all this data, as a whole, after integrating it. Biomedical data integration enables making queries to different, heterogeneous and distributed biomedical data sources. Data integration solutions can be very useful not only in the context of drug design, but also in biomedical information retrieval, clinical diagnosis, system biology, etc. In this review, we analyze the most common approaches to biomedical data integration, such as federated databases, data warehousing, multi-agent systems and semantic technology, as well as the solutions developed using these approaches in the past few years.

  4. Evaluating the granularity balance of hierarchical relationships within large biomedical terminologies towards quality improvement.

    Science.gov (United States)

    Luo, Lingyun; Tong, Ling; Zhou, Xiaoxi; Mejino, Jose L V; Ouyang, Chunping; Liu, Yongbin

    2017-11-01

    Organizing the descendants of a concept under a particular semantic relationship may be rather arbitrarily carried out during the manual creation processes of large biomedical terminologies, resulting in imbalances in relationship granularity. This work aims to propose scalable models towards systematically evaluating the granularity balance of semantic relationships. We first utilize "parallel concepts set (PCS)" and two features (the length and the strength) of the paths between PCSs to design the general evaluation models, based on which we propose eight concrete evaluation models generated by two specific types of PCSs: single concept set and symmetric concepts set. We then apply those concrete models to the IS-A relationship in FMA and SNOMED CT's Body Structure subset, as well as to the Part-Of relationship in FMA. Moreover, without loss of generality, we conduct two additional rounds of applications on the Part-Of relationship after removing length redundancies and strength redundancies sequentially. At last, we perform automatic evaluation on the imbalances detected after the final round for identifying missing concepts, misaligned relations and inconsistencies. For the IS-A relationship, 34 missing concepts, 80 misalignments and 18 redundancies in FMA as well as 28 missing concepts, 114 misalignments and 1 redundancy in SNOMED CT were uncovered. In addition, 6,801 instances of imbalances for the Part-Of relationship in FMA were also identified, including 3,246 redundancies. After removing those redundancies from FMA, the total number of Part-Of imbalances was dramatically reduced to 327, including 51 missing concepts, 294 misaligned relations, and 36 inconsistencies. Manual curation performed by the FMA project leader confirmed the effectiveness of our method in identifying curation errors. In conclusion, the granularity balance of hierarchical semantic relationship is a valuable property to check for ontology quality assurance, and the scalable evaluation

  5. Hydrocephalus caused by unilateral foramen of Monro obstruction: A review on terminology

    Science.gov (United States)

    Nigri, Flavio; Gobbi, Gabriel Neffa; da Costa Ferreira Pinto, Pedro Henrique; Simões, Elington Lannes; Caparelli-Daquer, Egas Moniz

    2016-01-01

    Background: Hydrocephalus caused by unilateral foramen of Monro (FM) obstruction has been referred to in literature by many different terminologies. Precise terminology describing hydrocephalus confined to just one lateral ventricle has a very important prognostic value and determines whether or not the patient can be shunt free after an endoscopic procedure. Methods: Aiming to define the best term for unilateral FM obstruction, 19 terms were employed on PubMed database (http://www.ncbi.nlm.nih.gov/pubmed) as quoted phrases. Results: A total of 194 articles were found. Four patterns of hydrocephalus were discriminated as a result of our research term query and were divided by types for didactic purpose. Type A - partial dilation of the lateral ventricle; Type B - pure unilateral obstruction of the FM; Type C - previously shunted patients with secondary obstruction of the FM; and Type D - asymmetric lateral ventricles with patent FM. Conclusion: In unilateral FM obstruction hydrocephalus, an in-depth review on terminology application is critical to avoid mistakes that may compromise comparisons among different series. This terminology review suggests that Type B hydrocephalus, i.e., the hydrocephalus confined to just one lateral ventricle with no other sites of cerebrospinal fluid circulation blockage, are best described by the terms unilateral hydrocephalus (UH) and monoventricular hydrocephalus, the first being by far the most popular. Type A hydrocephalus is best represented in the literature by the terms uniloculated hydrocephalus and loculated ventricle; Type C hydrocephalus by the terms isolated lateral ventricle and isolated UH; and Type D hydrocephalus by the term asymmetric hydrocephalus. PMID:27274402

  6. Computer Lexis and Terminology

    Directory of Open Access Journals (Sweden)

    Gintautas Grigas

    2011-04-01

    Full Text Available The computer has become a widely used tool in everyday work and at home. Every computer user sees texts on its screen containing many words that name new concepts. Those words come from the terminology used by specialists. A common vocabulary shared by computer terminology and the lexis of everyday language thus comes into existence. The article deals with the part of computer terminology that enters everyday usage and with the influence of ordinary language on computer terminology. The relation between English and Lithuanian computer terminology and the construction and pronunciation of acronyms are discussed as well.

  7. Finding and accessing diagrams in biomedical publications.

    Science.gov (United States)

    Kuhn, Tobias; Luong, ThaiBinh; Krauthammer, Michael

    2012-01-01

    Complex relationships in biomedical publications are often communicated by diagrams such as bar and line charts, which are a very effective way of summarizing and communicating multi-faceted data sets. Given the ever-increasing amount of published data, we argue that the precise retrieval of such diagrams is of great value for answering specific and otherwise hard-to-meet information needs. To this end, we demonstrate the use of advanced image processing and classification for identifying bar and line charts by the shape and relative location of the different image elements that make up the charts. With recall and precision close to 90% for the detection of relevant figures, we discuss the use of this technology in an existing biomedical image search engine, and outline how it enables new forms of literature queries over biomedical relationships that are represented in these charts.

  8. Knowledge and theme discovery across very large biological data sets using distributed queries: a prototype combining unstructured and structured data.

    Directory of Open Access Journals (Sweden)

    Uma S Mudunuri

    Full Text Available As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.

  9. Query deforestation

    OpenAIRE

    Grust, Torsten; Scholl, Marc H.

    1998-01-01

    The construction of a declarative query engine for a DBMS includes the challenge of compiling algebraic queries into efficient execution plans that can be run on top of the persistent storage. This work pursues the goal of employing foldr-build deforestation for the derivation of efficient streaming programs - programs that do not allocate intermediate data structures to perform their task - from algebraic (combinator) query plans. The query engine is based on the insertion representation of ...

  10. A study on PubMed search tag usage pattern: association rule mining of a full-day PubMed query log.

    Science.gov (United States)

    Mosa, Abu Saleh Mohammad; Yoo, Illhoi

    2013-01-09

    The practice of evidence-based medicine requires efficient biomedical literature search such as PubMed/MEDLINE. Retrieval performance relies highly on the efficient use of search field tags. The purpose of this study was to analyze PubMed log data in order to understand the usage pattern of search tags by the end user in PubMed/MEDLINE search. A PubMed query log file was obtained from the National Library of Medicine containing anonymous user identification, timestamp, and query text. Inconsistent records were removed from the dataset and the search tags were extracted from the query texts. A total of 2,917,159 queries, issued by a total of 613,061 users, were selected for this study. The analysis of frequent co-occurrences and usage patterns of the search tags was conducted using an association mining algorithm. The percentage of search tag usage was low (11.38% of the total queries) and only 2.95% of queries contained two or more tags. Three out of four users used no search tag and about two-thirds of them issued fewer than four queries. Among the queries containing at least one tagged search term, the average number of search tags was almost half of the number of total search terms. Navigational search tags are more frequently used than informational search tags. While no strong association was observed between informational and navigational tags, six (out of 19) informational tags and six (out of 29) navigational tags showed strong associations in PubMed searches. The low percentage of search tag usage implies that PubMed/MEDLINE users do not utilize the features of PubMed/MEDLINE widely, are not aware of such features, or solely depend on the high-recall-focused query translation performed by PubMed's Automatic Term Mapping. Users need further education and interactive search applications for effective use of the search tags in order to fulfill their biomedical information needs from PubMed/MEDLINE.
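
    The co-occurrence analysis behind such association mining can be sketched with plain Python counters, computing support for tag pairs and confidence for the derived rules; the tag sets below are toy data and the simple support/confidence computation stands in for the full association rule mining algorithm used in the study.

        # Pairwise support/confidence for search-tag co-occurrence in queries.
        from collections import Counter
        from itertools import combinations

        queries_tags = [
            {"[au]", "[ti]"},
            {"[au]"},
            {"[au]", "[ti]", "[dp]"},
            {"[mh]"},
        ]

        single = Counter()
        pair = Counter()
        for tags in queries_tags:
            single.update(tags)
            pair.update(frozenset(p) for p in combinations(sorted(tags), 2))

        n = len(queries_tags)
        for p, c in pair.items():
            a, b = sorted(p)
            print(f"{{{a}, {b}}}: support={c / n:.2f}, "
                  f"conf({a}->{b})={c / single[a]:.2f}, conf({b}->{a})={c / single[b]:.2f}")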

  11. Terminological synonyms in Czech and English sports terminologies

    Directory of Open Access Journals (Sweden)

    Michaela Cocca

    2016-11-01

    Full Text Available The following paper deals with the concept and typology of terminological synonyms in English and Czech, focusing on the official sport terms codified in English and/or Czech dictionaries. The analysis focuses on Anglicisms as terminological doublets, hyposynonyms, stylistic synonyms, and false friends. Results show that a high number of synonyms were generated by the process of transshaping or translating English terms into Czech. Our analysis suggests that three types of sports synonyms may be found in English (real, quasi-, and pseudo-synonyms) and four main types in Czech (terminological doublets, Anglicisms as hyposynonyms, false friends, and stylistic synonyms). The use of synonyms is even more evident in modern or newly created sports; mass media and the accessibility of data through the Internet play an essential role, as they mediate an immense input of information to the target population.

  12. National Medical Terminology Server in Korea

    Science.gov (United States)

    Lee, Sungin; Song, Seung-Jae; Koh, Soonjeong; Lee, Soo Kyoung; Kim, Hong-Gee

    Interoperable EHRs (Electronic Health Records) necessitate at least the use of standardized medical terminologies. This paper describes a medical terminology server, LexCare Suite, which houses terminology management applications, such as a terminology editor, and a terminology repository populated with international standard terminology systems such as the Systematized Nomenclature of Medicine (SNOMED). The server is intended to satisfy the need for quality terminology systems in local primary to tertiary hospitals. Our partner general hospitals have used the server to test its applicability. This paper describes the server and the results of the applicability test.

  13. Approximate dictionary queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Gasieniec, Leszek

    1996-01-01

    Given a set of n binary strings of length m each, we consider the problem of answering d-queries. Given a binary query string of length m, a d-query is to report whether there exists a string in the set within Hamming distance d of the query string. We present a data structure of size O(nm) supporting 1-queries in ti...
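
    For readers unfamiliar with the problem statement, the following Python sketch fixes the semantics of a d-query with a naive full scan; the paper's contribution is a data structure that avoids this scan, which is not reproduced here.

        # Naive baseline: does any stored string lie within Hamming distance d of the query?
        def hamming(a: str, b: str) -> int:
            return sum(x != y for x, y in zip(a, b))

        def d_query(dictionary: list[str], q: str, d: int = 1) -> bool:
            return any(len(s) == len(q) and hamming(s, q) <= d for s in dictionary)

        words = ["0101", "1111", "0011"]
        print(d_query(words, "0001"))  # True: "0101" and "0011" are within distance 1
        print(d_query(words, "1000"))  # False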

  14. A Query Cache Tool for Optimizing Repeatable and Parallel OLAP Queries

    Science.gov (United States)

    Santos, Ricardo Jorge; Bernardino, Jorge

    On-line analytical processing against data warehouse databases is a common way of obtaining decision-making information in almost every business field. Decision support information often concerns periodic values based on regular attributes, such as sales amounts, percentages, most-transacted items, etc. This means that many similar OLAP queries are repeated periodically, and often simultaneously, by several decision makers. Our Query Cache Tool takes advantage of previously executed queries, storing their results and the current state of the data which was accessed. Future queries only need to execute against the new data, inserted since the queries were last executed, and join these results with the previous ones. This makes query execution much faster, because we only need to process the most recent data. Our tool also minimizes the execution time and resource consumption for similar queries simultaneously executed by different users, putting the most recent ones on hold until the first finishes and then returning the results for all of them. The stored query results are held until they are considered outdated, then automatically erased. We present an experimental evaluation of our tool using a data warehouse based on a real-world business dataset and use a set of typical decision support queries to discuss the results, showing a very high gain in query execution time.
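
    A bare-bones sketch of the caching idea, combining a stored aggregate with an aggregate over only the rows added since the cache entry was written, is given below in Python; the table layout, cache key and aggregate are hypothetical simplifications of the tool described.

        # Result cache for a repeatable aggregate query over append-only data.
        cache = {}  # query text -> (last_row_id_seen, cached_total)

        def cached_total(query: str, rows: list[tuple[int, float]]) -> float:
            """rows are (row_id, amount) pairs, append-only and ordered by row_id."""
            last_seen, cached = cache.get(query, (-1, 0.0))
            delta = sum(amount for row_id, amount in rows if row_id > last_seen)
            total = cached + delta
            if rows:
                cache[query] = (rows[-1][0], total)
            return total

        data = [(1, 10.0), (2, 5.0)]
        print(cached_total("total sales", data))   # 15.0 (full scan on first run)
        data.append((3, 7.5))
        print(cached_total("total sales", data))   # 22.5 (only the new row is scanned)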

  15. BioFed: federated query processing over life sciences linked open data.

    Science.gov (United States)

    Hasnain, Ali; Mehmood, Qaiser; Sana E Zainab, Syeda; Saleem, Muhammad; Warren, Claude; Zehra, Durre; Decker, Stefan; Rebholz-Schuhmann, Dietrich

    2017-03-15

    Biomedical data, e.g. from knowledge bases and ontologies, is increasingly made available following open linked data principles, at best as RDF triple data. This is a necessary step towards unified access to biological data sets, but this still requires solutions to query multiple endpoints for their heterogeneous data to eventually retrieve all the meaningful information. Suggested solutions are based on query federation approaches, which require the submission of SPARQL queries to endpoints. Due to the size and complexity of available data, these solutions have to be optimised for efficient retrieval times and for users in life sciences research. Last but not least, over time, the reliability of data resources in terms of access and quality has to be monitored. Our solution (BioFed) federates data over 130 SPARQL endpoints in life sciences and tailors query submission according to the provenance information. BioFed has been evaluated against the state-of-the-art solution FedX and forms an important benchmark for the life science domain. The efficient cataloguing approach of the federated query processing system 'BioFed', the triple-pattern-wise source selection and the semantic source normalisation form the core of our solution. It gathers and integrates data from newly identified public endpoints for federated access. Basic provenance information is linked to the retrieved data. Last but not least, BioFed makes use of the latest SPARQL standard (i.e., 1.1) to leverage the full benefits for query federation. The evaluation is based on 10 simple and 10 complex queries, which address data in 10 major and very popular data sources (e.g., DrugBank, SIDER). BioFed is a solution for a single point of access to a large number of SPARQL endpoints providing life science data. It facilitates efficient query generation for data access and provides basic provenance information in combination with the retrieved data. BioFed fully supports SPARQL 1.1 and gives access to the
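
    Submitting a SPARQL query to a single life-science endpoint, the basic building block that a federation engine like BioFed orchestrates across many endpoints, looks roughly as follows with SPARQLWrapper; the endpoint URL and query are illustrative, and BioFed's source selection and provenance handling are not reproduced.

        # Send one SPARQL SELECT query to a public endpoint and print the bindings.
        from SPARQLWrapper import SPARQLWrapper, JSON

        endpoint = SPARQLWrapper("https://sparql.uniprot.org/sparql")  # example endpoint
        endpoint.setQuery("""
            SELECT ?s WHERE { ?s a ?type } LIMIT 5
        """)
        endpoint.setReturnFormat(JSON)

        results = endpoint.query().convert()
        for binding in results["results"]["bindings"]:
            print(binding["s"]["value"])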

  16. Query responses

    Directory of Open Access Journals (Sweden)

    Paweł Łupkowski

    2017-05-01

    Full Text Available In this article we consider the phenomenon of answering a query with a query. Although such answers are common, no large scale, corpus-based characterization exists, with the exception of clarification requests. After briefly reviewing different theoretical approaches to this subject, we present a corpus study of query responses in the British National Corpus and develop a taxonomy for query responses. We point out a variety of response categories that have not been formalized in previous dialogue work, particularly those relevant to adversarial interaction. We show that different response categories have significantly different rates of subsequent answer provision. We provide a formal analysis of the response categories in the framework of KoS.

  17. Recommending Multidimensional Queries

    Science.gov (United States)

    Giacometti, Arnaud; Marcel, Patrick; Negre, Elsa

    Interactive analysis of a datacube, in which a user navigates the cube by launching a sequence of queries, is often tedious since the user may have no idea of what the forthcoming query should be in his current analysis. To better support this process we propose in this paper to apply a Collaborative Work approach that leverages former explorations of the cube to recommend OLAP queries. The system that we have developed adapts Approximate String Matching, a technique popular in Information Retrieval, to match the current analysis with the former explorations and help suggest a query to the user. Our approach has been implemented with the open source Mondrian OLAP server to recommend MDX queries and we have carried out some preliminary experiments that show its efficiency for generating effective query recommendations.
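
    A toy sketch of the collaborative recommendation idea: match the current session against former explorations with an approximate string similarity (difflib here, as a stand-in) and suggest the query that followed the closest match; the MDX statements are invented examples.

      # Toy sketch of session matching for query recommendation: find the former
      # session whose prefix is most similar to the current one (difflib ratio
      # stands in for approximate string matching) and suggest its next query.
      # The MDX statements below are invented examples.
      from difflib import SequenceMatcher

      former_sessions = [
          ["SELECT [Measures].[Sales] ON 0 FROM [Cube]",
           "SELECT [Measures].[Sales] ON 0, [Time].[2003] ON 1 FROM [Cube]",
           "SELECT [Measures].[Profit] ON 0, [Time].[2003] ON 1 FROM [Cube]"],
          ["SELECT [Measures].[Units] ON 0 FROM [Cube]",
           "SELECT [Measures].[Units] ON 0, [Store].[USA] ON 1 FROM [Cube]"],
      ]

      def recommend(current_session):
          best_score, best_next = 0.0, None
          for session in former_sessions:
              for i in range(1, len(session)):
                  score = SequenceMatcher(None, " ".join(current_session),
                                          " ".join(session[:i])).ratio()
                  if score > best_score:
                      best_score, best_next = score, session[i]
          return best_next

      current = ["SELECT [Measures].[Sales] ON 0 FROM [Cube]"]
      print(recommend(current))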

  18. Mining biomarker information in biomedical literature

    Directory of Open Access Journals (Sweden)

    Younesi Erfan

    2012-12-01

    Full Text Available Abstract Background For selection and evaluation of potential biomarkers, inclusion of already published information is of utmost importance. In spite of significant advancements in text- and data-mining techniques, the vast knowledge space of biomarkers in biomedical text has remained unexplored. Existing named entity recognition approaches are not sufficiently selective for the retrieval of biomarker information from the literature. The purpose of this study was to identify textual features that enhance the effectiveness of biomarker information retrieval for different indication areas and diverse end user perspectives. Methods A biomarker terminology was created and further organized into six concept classes. Performance of this terminology was optimized towards balanced selectivity and specificity. The information retrieval performance using the biomarker terminology was evaluated based on various combinations of the terminology's six classes. Further validation of these results was performed on two independent corpora representing two different neurodegenerative diseases. Results The current state of the biomarker terminology contains 119 entity classes supported by 1890 different synonyms. The result of information retrieval shows improved retrieval rate of informative abstracts, which is achieved by including clinical management terms and evidence of gene/protein alterations (e.g. gene/protein expression status or certain polymorphisms) in combination with disease and gene name recognition. When additional filtering through other classes (e.g. diagnostic or prognostic methods) is applied, the typical high number of unspecific search results is significantly reduced. The evaluation results suggest that this approach enables the automated identification of biomarker information in the literature. A demo version of the search engine SCAIView, including the biomarker retrieval, is made available to the public through http

  19. THE TERMINOLOGY OF LIBRARY SCIENCE

    OpenAIRE

    Љиљана Матић

    2014-01-01

    The master’s thesis entitled The Terminology of Library Science presents the general state of the terminology of library science in the Serbian language and analyses the terminological system which was formed in the last couple of decades in relation to library and information science. The terminology of library science is seen as a characteristic of professional language. The research is conducted on a corpus which excludes sources relating exclusively to either library science or information ...

  20. The role of economics in the QUERI program: QUERI Series.

    Science.gov (United States)

    Smith, Mark W; Barnett, Paul G

    2008-04-22

    The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Economics appears to play an important role in QUERI implementation studies only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  1. The role of economics in the QUERI program: QUERI Series

    Directory of Open Access Journals (Sweden)

    Smith Mark W

    2008-04-01

    Full Text Available Abstract Background The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Conclusion Economics appears to play an important role in QUERI implementation studies only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.

  2. SeqWare Query Engine: storing and searching sequence data in the cloud

    Directory of Open Access Journals (Sweden)

    Merriman Barry

    2010-12-01

    Full Text Available Abstract Background Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc.) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). Conclusions The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters

  3. SeqWare Query Engine: storing and searching sequence data in the cloud

    Science.gov (United States)

    2010-01-01

    Background Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). Conclusions The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data

  4. SeqWare Query Engine: storing and searching sequence data in the cloud.

    Science.gov (United States)

    O'Connor, Brian D; Merriman, Barry; Nelson, Stanley F

    2010-12-21

    Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data interface to simplify development of
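
    For illustration, a client-side scan of an HBase-backed variant store might look as follows with the happybase library; this is not the SeqWare Query Engine API, and the table name, column family and row-key layout are assumptions.

      # Sketch of querying an HBase-backed variant store with the happybase client.
      # This is not the SeqWare Query Engine API; the table name, column family,
      # and row-key layout (chromosome:position) are assumptions for illustration.
      import happybase

      connection = happybase.Connection("hbase-host.example.org")  # placeholder host
      table = connection.table("variants")

      # Scan all variants on chromosome 7 and keep those with coverage >= 10.
      for row_key, columns in table.scan(row_prefix=b"chr7:"):
          coverage = int(columns.get(b"info:coverage", b"0"))
          if coverage >= 10:
              print(row_key.decode(),
                    columns.get(b"info:genotype", b"?").decode(),
                    coverage)

      connection.close()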

  5. In-context query reformulation for failing SPARQL queries

    Science.gov (United States)

    Viswanathan, Amar; Michaelis, James R.; Cassidy, Taylor; de Mel, Geeth; Hendler, James

    2017-05-01

    Knowledge bases for decision support systems are growing increasingly complex, through continued advances in data ingest and management approaches. However, humans do not possess the cognitive capabilities to retain a bird's-eye view of such knowledge bases, and may end up issuing unsatisfiable queries to such systems. This work focuses on the implementation of a query reformulation approach for graph-based knowledge bases, specifically designed to support the Resource Description Framework (RDF). The reformulation approach presented is instance- and schema-aware. Thus, in contrast to relaxation techniques found in the state-of-the-art, the presented approach produces in-context query reformulation.
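
    As a point of reference, a plain schema-unaware relaxation step simply generalizes one constant of a failing pattern at a time; the sketch below illustrates that baseline idea, not the authors' instance- and schema-aware reformulation.

      # Generic illustration of query relaxation for a failing graph query: replace
      # one constant at a time with a fresh variable and re-try. This shows the
      # baseline relaxation idea the paper improves on, not the authors'
      # instance- and schema-aware reformulation itself.
      def relax(triple_patterns):
          """Yield variants of a failing query, each with one constant generalized."""
          counter = 0
          for i, (s, p, o) in enumerate(triple_patterns):
              for pos, term in enumerate((s, p, o)):
                  if not term.startswith("?"):          # constants only
                      counter += 1
                      relaxed = [s, p, o]
                      relaxed[pos] = f"?relaxed{counter}"
                      yield (triple_patterns[:i]
                             + [tuple(relaxed)]
                             + triple_patterns[i + 1:])

      failing = [("?paper", "ex:hasTopic", "ex:QueryFederation"),
                 ("?paper", "ex:publishedIn", "ex:2017")]
      for variant in relax(failing):
          print(variant)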

  6. Google BigQuery analytics

    CERN Document Server

    Tigani, Jordan

    2014-01-01

    How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addit
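
    A minimal example of issuing a query through the google-cloud-bigquery Python client, assuming credentials and a billing project are already configured; it reads a public sample table.

      # Minimal example of running a query with the google-cloud-bigquery client.
      # Assumes Google Cloud credentials and a billing project are configured in
      # the environment; the public sample table is used read-only.
      from google.cloud import bigquery

      client = bigquery.Client()  # picks up project/credentials from the environment
      sql = """
          SELECT word, SUM(word_count) AS n
          FROM `bigquery-public-data.samples.shakespeare`
          GROUP BY word
          ORDER BY n DESC
          LIMIT 5
      """
      for row in client.query(sql).result():   # waits for the job, then iterates rows
          print(row.word, row.n)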

  7. Biomedical information retrieval across languages.

    Science.gov (United States)

    Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger

    2007-06-01

    This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
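
    A toy illustration of subword segmentation by greedy longest match against a small lexicon; the fragments below are invented and merely stand in for the project's multilingual subword lexicon.

      # Toy illustration of subword segmentation: greedy longest match against a
      # small lexicon of morphologically meaningful fragments. The entries below
      # are invented, not the actual multilingual subword lexicon.
      SUBWORDS = {"gastr", "o", "enter", "itis", "cardi", "hepat", "logy"}

      def segment(word, lexicon=SUBWORDS):
          """Split a term into known subwords, longest match first."""
          word, pieces = word.lower(), []
          while word:
              for length in range(len(word), 0, -1):
                  if word[:length] in lexicon:
                      pieces.append(word[:length])
                      word = word[length:]
                      break
              else:                       # no known subword: emit one character
                  pieces.append(word[0])
                  word = word[1:]
          return pieces

      print(segment("Gastroenteritis"))   # ['gastr', 'o', 'enter', 'itis']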

  8. Private and Efficient Query Processing on Outsourced Genomic Databases.

    Science.gov (United States)

    Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

    2017-09-01

    Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic applications. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing a genomic sequence is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations, and thus, not available for public usage. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases in a centralized cloud server to ease the access of their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records takes around 100 and 150 s, respectively.

  9. Merge of terminological resources

    DEFF Research Database (Denmark)

    Henriksen, Lina; Braasch, Anna

    2012-01-01

    In our globalized world, the amount of cross-national communication increases rapidly, which also calls for easy access to multi-lingual high quality terminological resources. Sharing of terminology resources is currently becoming common practice, and efficient strategies for integration – or merging – of terminology resources are strongly needed. This paper discusses prerequisites for successful merging with the focus on identification of candidate duplicates of a subject domain found in the resources to be merged, and it describes automatic merging strategies to be applied to such duplicates in electronic terminology resources. Further, some perspectives of manual, supplementary assessment methods supporting the automatic procedures are sketched. Our considerations are primarily based on experience gained in the IATE and EuroTermBank projects, as merging was a much discussed issue in both projects.

  10. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung

    2013-08-26

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model, query language, and query execution engine presented earlier. Deco's objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco's query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco's query optimizer include a cost model distinguishing between "free" existing data versus paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco's query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

  11. Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships.

    Science.gov (United States)

    Zong, Nansu; Lee, Sungin; Ahn, Jinhyun; Kim, Hong-Gee

    2017-08-01

    The keyword-based entity search restricts the search space based on the preference of search. When the given keywords and preferences are not related to the same biomedical topic, existing biomedical Linked Data search engines fail to deliver satisfactory results. This research aims to tackle this issue by supporting an inter-topic search: improving search with inputs (keywords and preferences) under different topics. This study developed an effective algorithm in which the relations between biomedical entities were used in tandem with a keyword-based entity search, Siren. The algorithm, PERank, which is an adaptation of Personalized PageRank (PPR), uses a pair of inputs: (1) search preferences, and (2) entities from a keyword-based entity search with a keyword query, to formalize the search results on-the-fly based on the index of the precomputed Individual Personalized PageRank Vectors (IPPVs). Our experiments were performed over ten linked life datasets for two query sets, one with keyword-preference topic correspondence (intra-topic search), and the other without (inter-topic search). The experiments showed that the proposed method achieved better search results, for example a 14% increase in precision for the inter-topic search over the baseline keyword-based search engine. The proposed method improved the keyword-based biomedical entity search by supporting the inter-topic search without affecting the intra-topic search, based on the relations between different entities. Copyright © 2017 Elsevier Ltd. All rights reserved.
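
    Personalized PageRank itself is easy to illustrate with networkx; the tiny entity graph and the preference node below are invented, and PERank additionally precomputes per-entity vectors (IPPVs) and combines them with keyword-search hits.

      # Minimal Personalized PageRank illustration with networkx. The tiny entity
      # graph and the preference node are invented; PERank additionally relies on
      # precomputed per-entity PPR vectors (IPPVs) combined with keyword hits.
      import networkx as nx

      g = nx.DiGraph()
      g.add_edges_from([
          ("aspirin", "cyclooxygenase"), ("cyclooxygenase", "inflammation"),
          ("inflammation", "arthritis"), ("aspirin", "pain"),
          ("pain", "arthritis"), ("arthritis", "inflammation"),
      ])

      # Bias the random walk toward the user's search preference ("arthritis").
      preference = {node: (1.0 if node == "arthritis" else 0.0) for node in g}
      scores = nx.pagerank(g, alpha=0.85, personalization=preference)

      for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
          print(f"{entity:16s} {score:.3f}")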

  12. Enhancing biomedical text summarization using semantic relation extraction.

    Directory of Open Access Journals (Sweden)

    Yue Shang

    Full Text Available Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization.

  13. Enhancing biomedical text summarization using semantic relation extraction.

    Science.gov (United States)

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization.

  14. Legal terminology in African languages | Alberts | Lexikos

    African Journals Online (AJOL)

    Various aspects regarding the present project (such as financing, time-schedule, training and terminological problems encountered) are treated. Keywords: legal terminology, sociolinguistic factors, terminology development, african languages, indigenous languages, multilingualism, subject fields, terminology, translation, ...

  15. The Janus Head Article - How Much Terminology Theory Can Practical Terminology Management Use?

    Directory of Open Access Journals (Sweden)

    Petra Drewer

    2007-03-01

    Full Text Available The god Janus in Roman mythology was a two-faced god; each face had its own view of the world. Our idea behind the Janus Head article is to give you two different and maybe even contradicting views on a certain topic. This issue’s Janus Head Article, however, features not two but three different views on terminology work, as researchers, professionals and students (the professionals of tomorrow) discuss “How Much Terminology Theory Can Practical Terminology Management Use?” at DaimlerChrysler AG.

  16. Should Terminology Principles be re-examined?

    OpenAIRE

    Roche, Christophe

    2016-01-01

    Operationalization of terminology for IT applications has revived the Wüsterian approach. The conceptual dimension once more prevails after taking a back seat to specialised lexicography. This is demonstrated by the emergence of ontology in terminology. While the Terminology Principles as defined in Felber's manual and the ISO standards remain at the core of traditional terminology, their computational implementation raises some issues. In this article, while reiteratin...

  17. Querying on Federated Sensor Networks

    Directory of Open Access Journals (Sweden)

    Zuhal Can

    2016-09-01

    Full Text Available A Federated Sensor Network (FSN is a network of geographically distributed Wireless Sensor Networks (WSNs called islands. For querying on an FSN, we introduce the Layered Federated Sensor Network (L-FSN Protocol. For layered management, L-FSN provides communication among islands by its inter-island querying protocol by which a query packet routing path is determined according to some path selection policies. L-FSN allows autonomous management of each island by island-specific intra-island querying protocols that can be selected according to island properties. We evaluate the applicability of L-FSN and compare the L-FSN protocol with various querying protocols running on the flat federation model. Flat federation is a method to federate islands by running a single querying protocol on an entire FSN without distinguishing communication among and within islands. For flat federation, we select a querying protocol from geometrical, hierarchical cluster-based, hash-based, and tree-based WSN querying protocol categories. We found that a layered federation of islands by L-FSN increases the querying performance with respect to energy-efficiency, query resolving distance, and query resolving latency. Moreover, L-FSN’s flexibility of choosing intra-island querying protocols regarding the island size brings advantages on energy-efficiency and query resolving latency.

  18. Terminologi og oversigtsplaner

    DEFF Research Database (Denmark)

    Roesdahl, Else; Sindbæk, Søren Michael

    2014-01-01

    Key to terminology used in the Aggersborg book relating to features of the rural settlement and the circular fortress, and information on excavation documentation and on the plans published in the book.

  19. A journey to Semantic Web query federation in the life sciences.

    Science.gov (United States)

    Cheung, Kei-Hoi; Frost, H Robert; Marshall, M Scott; Prud'hommeaux, Eric; Samwald, Matthias; Zhao, Jun; Paschke, Adrian

    2009-10-01

    As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and Vocabulary of Interlinked Datasets have emerged in recent years. In addition to the data warehouse construction, these technological approaches can be used to support dynamic query federation. As a community effort, the BioRDF task force, within the Semantic Web for Health Care and Life Sciences Interest Group, is exploring how these emerging approaches can be utilized to execute distributed queries across different neuroscience data sources. We have created two health care and life science knowledge bases. We have explored a variety of Semantic Web approaches to describe, map, and dynamically query multiple datasets. We have demonstrated several federation approaches that integrate diverse types of information about neurons and receptors that play an important role in basic, clinical, and translational neuroscience research. Particularly, we have created a prototype receptor explorer which uses OWL mappings to provide an integrated list of receptors and executes individual queries against different SPARQL endpoints. We have also employed the AIDA Toolkit, which is directed at groups of knowledge workers who cooperatively search, annotate, interpret, and enrich large collections of heterogeneous documents from diverse locations. We have explored a tool called "FeDeRate", which enables a global SPARQL query to be decomposed into subqueries against the remote databases offering either SPARQL or SQL query interfaces. Finally, we have explored how to use the vocabulary of interlinked Datasets (voiD) to create metadata for describing datasets exposed as Linked Data URIs or SPARQL endpoints. We have demonstrated the use of a set of novel and state-of-the-art Semantic Web technologies in support of a neuroscience query

  20. The Janus Head Article - How Much Terminology Theory Can Practical Terminology Management Use?

    Directory of Open Access Journals (Sweden)

    Petra Drewer

    2012-08-01

    Full Text Available The god Janus in Roman mythology was a two-faced god; each face had its own view of the world. Our idea behind the Janus Head article is to give you two different and maybe even contradicting views on a certain topic. This issue’s Janus Head Article, however, features not two but three different views on terminology work, as researchers, professionals and students (the professionals of tomorrow) discuss “How Much Terminology Theory Can Practical Terminology Management Use?” at DaimlerChrysler AG.

  1. Partitioning an object-oriented terminology schema.

    Science.gov (United States)

    Gu, H; Perl, Y; Halper, M; Geller, J; Kuo, F; Cimino, J J

    2001-07-01

    Controlled medical terminologies are increasingly becoming strategic components of various healthcare enterprises. However, the typical medical terminology can be difficult to exploit due to its extensive size and high density. The schema of a medical terminology offered by an object-oriented representation is a valuable tool in providing an abstract view of the terminology, enhancing comprehensibility and making it more usable. However, schemas themselves can be large and unwieldy. We present a methodology for partitioning a medical terminology schema into manageably sized fragments that promote increased comprehension. Our methodology has a refinement process for the subclass hierarchy of the terminology schema. The methodology is carried out by a medical domain expert in conjunction with a computer. The expert is guided by a set of three modeling rules, which guarantee that the resulting partitioned schema consists of a forest of trees. This makes it easier to understand and consequently use the medical terminology. The application of our methodology to the schema of the Medical Entities Dictionary (MED) is presented.

  2. Terminology for Achilles tendon related disorders

    NARCIS (Netherlands)

    van Dijk, C. N.; van Sterkenburg, M. N.; Wiegerinck, J. I.; Karlsson, J.; Maffulli, N.

    2011-01-01

    The terminology of Achilles tendon pathology has become inconsistent and confusing throughout the years. For proper research, assessment and treatment, a uniform and clear terminology is necessary. A new terminology is proposed; the definitions hereof encompass the anatomic location, symptoms,

  3. INTERVIEW: Knowledge and Terminology Management at Crisplant

    DEFF Research Database (Denmark)

    Møller, Margrethe H.; Toft, Birthe

    2012-01-01

    Crisplant has been acquired by the German BEUMER group, which means that the terminological resources of the two enterprises are in the process of being integrated. The challenges presented by this process demonstrate the importance of adhering to terminological principles when recording terminology resources, while at the same time reminding us what an essential discipline terminology management really is, in enterprise practice as well as in education.

  4. Collective spatial keyword querying

    DEFF Research Database (Denmark)

    Cao, Xin; Cong, Gao; Jensen, Christian S.

    2011-01-01

    With the proliferation of geo-positioning and geo-tagging, spatial web objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group collectively satisfy a query. We define the problem of retrieving a group of spatial web objects such that the group's keywords cover the query's keywords and such that objects are nearest to the query location and have the lowest inter-object distances. Specifically, we study two variants of this problem, both of which are NP-complete. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems.
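
    A greedy heuristic conveys the flavour of the collective variant: repeatedly pick the object that covers the most uncovered keywords per unit of distance. The objects below are invented, and the sketch is not the paper's exact approximation algorithms.

      # Greedy sketch of collective spatial keyword retrieval on invented data:
      # repeatedly pick the object that covers uncovered query keywords at the
      # lowest distance cost. Illustrates the problem, not the paper's algorithms.
      import math

      objects = [
          {"id": "cafe",    "xy": (1.0, 1.0), "kw": {"coffee", "wifi"}},
          {"id": "library", "xy": (2.0, 0.5), "kw": {"wifi", "books"}},
          {"id": "bakery",  "xy": (5.0, 5.0), "kw": {"coffee", "cake"}},
      ]

      def dist(a, b):
          return math.hypot(a[0] - b[0], a[1] - b[1])

      def greedy_group(query_xy, query_kw):
          remaining, group = set(query_kw), []
          while remaining:
              best = max(
                  (o for o in objects if o["kw"] & remaining),
                  key=lambda o: len(o["kw"] & remaining) / (1.0 + dist(query_xy, o["xy"])),
                  default=None)
              if best is None:            # some keyword cannot be covered
                  break
              group.append(best["id"])
              remaining -= best["kw"]
          return group

      print(greedy_group((0.0, 0.0), {"coffee", "wifi", "books"}))   # ['cafe', 'library']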

  5. Medical terminology: Its size and typology.

    Science.gov (United States)

    Kucharz, Eugeniusz Józef

    2015-01-01

    Medical terminology is one of the largest specialized terminologies and is estimated to contain over 250,000 items. Classification of medical terminology into six categories is proposed. The categories are the following: (A) medical terms that are part of the general basic lexicon of an average native speaker (0.02-0.03 % of all terms), (B) specialized medical terms known by an average physician (about 45 % of all terms), (C) highly-specialized terms of subspecialties (about 15 % of all terms), (D) medical terms that primarily belong to other terminologies (e.g. biological, chemical, physical, statistical) (about 20 % of all terms), (E) medical slang (0.04-0.05 % of all terms), and (F) pharmaceutical terminology (about 20 % of all terms).

  6. Normalizing biomedical terms by minimizing ambiguity and variability

    Directory of Open Access Journals (Sweden)

    McNaught John

    2008-04-01

    Full Text Available Abstract Background One of the difficulties in mapping biomedical named entities, e.g. genes, proteins, chemicals and diseases, to their concept identifiers stems from the potential variability of the terms. Soft string matching is a possible solution to the problem, but its inherent heavy computational cost discourages its use when the dictionaries are large or when real time processing is required. A less computationally demanding approach is to normalize the terms by using heuristic rules, which enables us to look up a dictionary in a constant time regardless of its size. The development of good heuristic rules, however, requires extensive knowledge of the terminology in question and thus is the bottleneck of the normalization approach. Results We present a novel framework for discovering a list of normalization rules from a dictionary in a fully automated manner. The rules are discovered in such a way that they minimize the ambiguity and variability of the terms in the dictionary. We evaluated our algorithm using two large dictionaries: a human gene/protein name dictionary built from BioThesaurus and a disease name dictionary built from UMLS. Conclusions The experimental results showed that automatically discovered rules can perform comparably to carefully crafted heuristic rules in term mapping tasks, and the computational overhead of rule application is small enough that a very fast implementation is possible. This work will help improve the performance of term-concept mapping tasks in biomedical information extraction especially when good normalization heuristics for the target terminology are not fully known.
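
    The rules in the paper are discovered automatically; the sketch below only hand-codes a few typical normalization rules (case folding, hyphen and whitespace removal, trailing punctuation) over an invented toy dictionary, to show how term variants collapse to a single lookup key.

      # Hand-written illustration of term normalization rules. The paper learns
      # such rules automatically so that ambiguity and variability are minimized;
      # the toy dictionary and concept identifiers below are invented.
      import re
      from collections import defaultdict

      RULES = [
          lambda t: t.lower(),
          lambda t: re.sub(r"[-\s]+", "", t),
          lambda t: t.rstrip(".,;"),
      ]

      def normalize(term):
          for rule in RULES:
              term = rule(term)
          return term

      dictionary = {"IL-2": "CUI:0021756", "Il 2": "CUI:0021756", "NF-kB": "CUI:0079904"}

      index = defaultdict(set)
      for surface, concept in dictionary.items():
          index[normalize(surface)].add(concept)

      print(index)                       # variants of IL-2 collapse to one key
      print(index[normalize("il-2")])    # constant-time lookup of a query variant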

  7. Querying Workflow Logs

    Directory of Open Access Journals (Sweden)

    Yan Tang

    2018-01-01

    Full Text Available A business process or workflow is an assembly of tasks that accomplishes a business goal. Business process management is the study of the design, configuration/implementation, enactment and monitoring, analysis, and re-design of workflows. The traditional methodology for the re-design and improvement of workflows relies on the well-known sequence of extract, transform, and load (ETL), data/process warehousing, and online analytical processing (OLAP) tools. In this paper, we study the ad hoc querying of process enactments for (data-centric) business processes, bypassing the traditional methodology for more flexibility in querying. We develop an algebraic query language based on “incident patterns” with four operators inspired by the Business Process Model and Notation (BPMN) representation, allowing the user to formulate ad hoc queries directly over workflow logs. A formal semantics of this query language, a preliminary query evaluation algorithm, and a group of elementary properties of the operators are provided.

  8. Sport supporting act: terminology issues

    Directory of Open Access Journals (Sweden)

    Petr Vlček

    2013-01-01

    Full Text Available BACKGROUND: The text deals with terminology issues from an interdisciplinary point of view. It is based on two different disciplines, law and kinanthropology, in an area of their overlap. AIM: The aim of the author is to point out some possible legislative problems, which could arise due to the current reading of the sport supporting act (Act no. 115/2001). The second aim of the author is to contribute to the discussion of kinanthropologists (possibly also the educational researchers and lawyers) and to stress the importance of the systematic approach to terminology formulation. METHODS: The author uses the method of language interpretation. We also use the basic analytical methods, induction and deduction, while we stress the systematic approach to the term formulation. RESULTS: The analysis of the sport supporting act terminology shows some specific legislative problems, which could arise due to the definition of sport in the sport supporting act. The author discusses a possible alternative solution. CONCLUSION: According to the opinion of the author, clear, obvious and unified terminology of kinanthropologists as specialists in their discipline should represent a source, from which other sciences could derive their terminology. Defined and inexpert terminology used in other disciplines should not be used as an argument for its adoption in kinanthropology.

  9. Building a biomedical ontology recommender web service

    Directory of Open Access Journals (Sweden)

    Jonquet Clement

    2010-06-01

    Full Text Available Abstract Background Researchers in biomedical informatics use ontologies and terminologies to annotate their data in order to facilitate data integration and translational discoveries. As the use of ontologies for annotation of biomedical datasets has risen, a common challenge is to identify ontologies that are best suited to annotating specific datasets. The number and variety of biomedical ontologies is large, and it is cumbersome for a researcher to figure out which ontology to use. Methods We present the Biomedical Ontology Recommender web service. The system uses textual metadata or a set of keywords describing a domain of interest and suggests appropriate ontologies for annotating or representing the data. The service makes a decision based on three criteria. The first one is coverage, or the ontologies that provide most terms covering the input text. The second is connectivity, or the ontologies that are most often mapped to by other ontologies. The final criterion is size, or the number of concepts in the ontologies. The service scores the ontologies as a function of scores of the annotations created using the National Center for Biomedical Ontology (NCBO) Annotator web service. We used all the ontologies from the UMLS Metathesaurus and the NCBO BioPortal. Results We compare and contrast our Recommender by an exhaustive functional comparison to previously published efforts. We evaluate and discuss the results of several recommendation heuristics in the context of three real world use cases. The best recommendation heuristics, rated ‘very relevant’ by expert evaluators, are the ones based on coverage and connectivity criteria. The Recommender service (alpha version) is available to the community and is embedded into BioPortal.
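
    The coverage criterion on its own can be sketched as follows; the ontology names and term sets are invented, and the real service scores NCBO Annotator output and also weighs connectivity and size.

      # Toy sketch of the coverage criterion: score each ontology by how many of
      # the input keywords its terms cover. Ontology names and term sets are
      # invented; the real service also uses connectivity and size criteria.
      ontologies = {
          "OntologyA": {"melanoma", "carcinoma", "neoplasm"},
          "OntologyB": {"melanoma", "skin", "epidermis", "lesion"},
      }

      def coverage_scores(keywords):
          keywords = {k.lower() for k in keywords}
          return sorted(
              ((name, len(terms & keywords) / len(keywords))
               for name, terms in ontologies.items()),
              key=lambda pair: -pair[1])

      for name, score in coverage_scores(["melanoma", "skin", "lesion"]):
          print(f"{name}: {score:.2f}")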

  10. [Establishment of anatomical terminology in Japan].

    Science.gov (United States)

    Shimada, Kazuyuki

    2008-12-01

    The history of anatomical terminology in Japan began with the publication of Waran Naikei Ihan-teimō in 1805 and Chōtei Kaitai Shinsho in 1826. Although the establishment of Japanese anatomical terminology became necessary during the Meiji era when many western anatomy books imported into Japan were translated, such terminology was not unified during this period and varied among translators. In 1871, Tsukumo Ono's Kaibōgaku Gosen was published by the Ministry of Education. Although this book is considered to be the first glossary of anatomical terms in Japan, its contents were incomplete. Overseas, the German Anatomical Society established a unified anatomical terminology in 1895 called the Basle Nomina Anatomica (B.N.A.). Based on this development, Kaibōgaku Meishū, which follows the B.N.A., by Buntarō Suzuki was published in 1905. With the subsequent establishment in 1935 of the Jena Nomina Anatomica (J.N.A.), the unification of anatomical terminology was also accelerated in Japan, leading to the further development of terminology.

  11. Characteristics desired in clinical data warehouse for biomedical research.

    Science.gov (United States)

    Shin, Soo-Yong; Kim, Woo Sung; Lee, Jae-Ho

    2014-04-01

    Due to the unique characteristics of clinical data, clinical data warehouses (CDWs) have not been successful so far. Specifically, the use of CDWs for biomedical research has been relatively unsuccessful thus far. The characteristics necessary for the successful implementation and operation of a CDW for biomedical research have not been clearly defined yet. Three examples of CDWs were reviewed: a multipurpose CDW in a hospital, a CDW for independent multi-institutional research, and a CDW for research use in an institution. After reviewing the three CDW examples, we propose some key characteristics needed in a CDW for biomedical research. A CDW for research should include an honest broker system and an Institutional Review Board approval interface to comply with governmental regulations. It should also include a simple query interface, an anonymized data review tool, and a data extraction tool. Also, it should be a biomedical research platform for data repository use as well as data analysis. The proposed characteristics desired in a CDW may have limited transfer value to organizations in other countries. However, these analysis results are still valid in Korea, and we have developed a clinical research data warehouse based on these desiderata.

  12. The BioIntelligence Framework: a new computational platform for biomedical knowledge computing.

    Science.gov (United States)

    Farley, Toni; Kiefer, Jeff; Lee, Preston; Von Hoff, Daniel; Trent, Jeffrey M; Colbourn, Charles; Mousses, Spyro

    2013-01-01

    Breakthroughs in molecular profiling technologies are enabling a new data-intensive approach to biomedical research, with the potential to revolutionize how we study, manage, and treat complex diseases. The next great challenge for clinical applications of these innovations will be to create scalable computational solutions for intelligently linking complex biomedical patient data to clinically actionable knowledge. Traditional database management systems (DBMS) are not well suited to representing complex syntactic and semantic relationships in unstructured biomedical information, introducing barriers to realizing such solutions. We propose a scalable computational framework for addressing this need, which leverages a hypergraph-based data model and query language that may be better suited for representing complex multi-lateral, multi-scalar, and multi-dimensional relationships. We also discuss how this framework can be used to create rapid learning knowledge base systems to intelligently capture and relate complex patient data to biomedical knowledge in order to automate the recovery of clinically actionable information.
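
    A hypergraph data model can be sketched very simply: each hyperedge links an arbitrary set of nodes, so multi-lateral relationships need no reification. The entities below are invented, and this is not the framework's actual model or query language.

      # Minimal sketch of a hypergraph data model and a query over it. Entities
      # and edge labels are invented; this is not the BioIntelligence Framework's
      # actual model or query language.
      hyperedges = {
          "e1": {"label": "variant-drug-response",
                 "nodes": {"patient_42", "BRAF_V600E", "vemurafenib", "response"}},
          "e2": {"label": "pathway-membership",
                 "nodes": {"BRAF_V600E", "MAPK_pathway"}},
      }

      def edges_containing(node):
          """Return all hyperedges in which a node participates."""
          return {eid: e for eid, e in hyperedges.items() if node in e["nodes"]}

      def neighbors(node):
          """All nodes co-occurring with `node` in some hyperedge."""
          out = set()
          for e in edges_containing(node).values():
              out |= e["nodes"] - {node}
          return out

      print(edges_containing("BRAF_V600E").keys())
      print(neighbors("BRAF_V600E"))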

  13. jQuery Pocket Reference

    CERN Document Server

    Flanagan, David

    2010-01-01

    "As someone who uses jQuery on a regular basis, it was surprising to discover how much of the library I'm not using. This book is indispensable for anyone who is serious about using jQuery for non-trivial applications."-- Raffaele Cecco, longtime developer of video games, including Cybernoid, Exolon, and Stormlord jQuery is the "write less, do more" JavaScript library. Its powerful features and ease of use have made it the most popular client-side JavaScript framework for the Web. This book is jQuery's trusty companion: the definitive "read less, learn more" guide to the library. jQuery P

  14. jQuery UI cookbook

    CERN Document Server

    Boduch, Adam

    2013-01-01

    Filled with a practical collection of recipes, jQuery UI Cookbook is full of clear, step-by-step instructions that will help you harness the powerful UI framework in jQuery. Depending on your needs, you can dip in and out of the Cookbook and its recipes, or follow the book from start to finish.If you are a jQuery UI developer looking to improve your existing applications, extract ideas for your new application, or to better understand the overall widget architecture, then jQuery UI Cookbook is a must-have for you. The reader should at least have a rudimentary understanding of what jQuery UI is

  15. Instant jQuery selectors

    CERN Document Server

    De Rosa, Aurelio

    2013-01-01

    Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. Instant jQuery Selectors follows a simple how-to format with recipes aimed at making you well versed with the wide range of selectors that jQuery has to offer through a myriad of examples.Instant jQuery Selectors is for web developers who want to delve into jQuery from its very starting point: selectors. Even if you're already familiar with the framework and its selectors, you could find several tips and tricks that you aren't aware of, especially about performance and how jQuery ac

  16. SkyQuery - A Prototype Distributed Query and Cross-Matching Web Service for the Virtual Observatory

    Science.gov (United States)

    Thakar, A. R.; Budavari, T.; Malik, T.; Szalay, A. S.; Fekete, G.; Nieto-Santisteban, M.; Haridas, V.; Gray, J.

    2002-12-01

    We have developed a prototype distributed query and cross-matching service for the VO community, called SkyQuery, which is implemented with hierarchical Web Services. SkyQuery enables astronomers to run combined queries on existing distributed heterogeneous astronomy archives. SkyQuery provides a simple, user-friendly interface to run distributed queries over the federation of registered astronomical archives in the VO. The SkyQuery client connects to the portal Web Service, which farms the query out to the individual archives, which are also Web Services called SkyNodes. The cross-matching algorithm is run recursively on each SkyNode. Each archive is a relational DBMS with a HTM index for fast spatial lookups. The results of the distributed query are returned as an XML DataSet that is automatically rendered by the client. SkyQuery also returns the image cutout corresponding to the query result. SkyQuery finds not only matches between the various catalogs, but also dropouts - objects that exist in some of the catalogs but not in others. This is often as important as finding matches. We demonstrate the utility of SkyQuery with a brown-dwarf search between SDSS and 2MASS, and a search for radio-quiet quasars in SDSS, 2MASS and FIRST. The importance of a service like SkyQuery for the worldwide astronomical community cannot be overstated: data on the same objects in various archives is mapped in different wavelength ranges and looks very different due to different errors, instrument sensitivities and other peculiarities of each archive. Our cross-matching algorithm performs a fuzzy spatial join across multiple catalogs. This type of cross-matching is currently often done by eye, one object at a time. A static cross-identification table for a set of archives would become obsolete by the time it was built - the exponential growth of astronomical data means that a dynamic cross-identification mechanism like SkyQuery is the only viable option. SkyQuery was funded by a
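
    The fuzzy spatial join at the heart of cross-matching can be sketched as a tolerance test on angular separation; the coordinates below are invented, and SkyQuery itself relies on HTM indexes and recursive matching across SkyNodes rather than a nested loop.

      # Toy sketch of a fuzzy spatial join between two catalogs: objects match
      # when their angular separation falls under a tolerance. Coordinates are
      # invented; SkyQuery uses HTM indexes, not this nested loop.
      import math

      def ang_sep_deg(ra1, dec1, ra2, dec2):
          """Great-circle separation in degrees between two (RA, Dec) positions."""
          r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
          cos_sep = (math.sin(d1) * math.sin(d2)
                     + math.cos(d1) * math.cos(d2) * math.cos(r1 - r2))
          return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

      catalog_a = [("a1", 150.001, 2.200), ("a2", 150.500, 2.300)]
      catalog_b = [("b1", 150.0012, 2.2001), ("b2", 151.000, 2.900)]

      tolerance = 2.0 / 3600.0          # 2 arcseconds, expressed in degrees
      matches = [(ida, idb)
                 for ida, ra_a, dec_a in catalog_a
                 for idb, ra_b, dec_b in catalog_b
                 if ang_sep_deg(ra_a, dec_a, ra_b, dec_b) <= tolerance]
      dropouts = [ida for ida, _, _ in catalog_a
                  if ida not in {m[0] for m in matches}]
      print(matches, dropouts)          # matched pairs, plus objects missing in B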

  17. INTERVIEW: Knowledge and Terminology Management at Crisplant

    DEFF Research Database (Denmark)

    Møller, Margrethe H.; Toft, Birthe

    2012-01-01

    Margrethe H. Møller interviews Lisbeth Kjeldgaard Almsten (translator/coauthor: Birthe Toft) “If you think that terminology work is simply a matter of buying terminology management software and getting started, you are in for trouble.” At Crisplant, we have been doing terminology management for the past 20 years. Today, term bases are used not just for terminology-oriented term management. Recording other types of master data needed by all kinds of professionals in the enterprise is equally important. Within the past year, Crisplant has been acquired by the German BEUMER group, which means that the terminological resources of the two enterprises are in the process of being integrated.

  18. Image BOSS: a biomedical object storage system

    Science.gov (United States)

    Stacy, Mahlon C.; Augustine, Kurt E.; Robb, Richard A.

    1997-05-01

    Researchers using biomedical images have data management needs which are oriented perpendicular to clinical PACS. The ImageBOSS system is designed to permit researchers to organize and select images based on research topic, image metadata, and a thumbnail of the image. Image information is captured from existing images in a Unix-based filesystem, stored in an object-oriented database, and presented to the user in a familiar laboratory notebook metaphor. In addition, the ImageBOSS is designed to provide an extensible infrastructure for future content-based queries directly on the images.

  19. Medical terminology in online patient-patient communication: evidence of high health literacy?

    Science.gov (United States)

    Fage-Butler, Antoinette M; Nisbeth Jensen, Matilde

    2016-06-01

    Health communication research and guidelines often recommend that medical terminology be avoided when communicating with patients due to their limited understanding of medical terms. However, growing numbers of e-patients use the Internet to equip themselves with specialized biomedical knowledge that is couched in medical terms, which they then share on participatory media, such as online patient forums. Given possible discrepancies between preconceptions about the kind of language that patients can understand and the terms they may actually know and use, the purpose of this paper was to investigate medical terminology used by patients in online patient forums. Using data from online patient-patient communication where patients communicate with each other without expert moderation or intervention, we coded two data samples from two online patient forums dedicated to thyroid issues. Previous definitions of medical terms (dichotomized into technical and semi-technical) proved too rudimentary to encapsulate the types of medical terms the patients used. Therefore, using an inductive approach, we developed an analytical framework consisting of five categories of medical terms: dictionary-defined medical terms, co-text-defined medical terms, medical initialisms, medication brand names and colloquial technical terms. The patients in our data set used many medical terms from all of these categories. Our findings suggest the value of a situated, condition-specific approach to health literacy that recognizes the vertical kind of knowledge that patients with chronic diseases may have. We make cautious recommendations for clinical practice, arguing for an adaptive approach to medical terminology use with patients. © 2015 The Authors. Health Expectations Published by John Wiley & Sons Ltd.

  20. A Comparative Study of Legal Terminologies in French and Romanian. The Translation of International Contract Law Terminologies

    Directory of Open Access Journals (Sweden)

    Adriana SFERLE

    2012-01-01

    Full Text Available Our article is a comparative study investigating the main aspects of legal terminology in French and Romanian. In this context, the analysis aims at translating the terminologies of international commercial contracts from French into Romanian and from Romanian into French. With this study we intend to improve the knowledge of legal terminology in Romanian. Romania has lately been faced, particularly since January 1st 2007, when it joined the European Union, with a real need for terminological studies, for dictionaries and databases in all fields relating to translation and interpreting.

  1. Learning semantic query suggestions

    NARCIS (Netherlands)

    Meij, E.; Bron, M.; Hollink, L.; Huurnink, B.; de Rijke, M.

    2009-01-01

    An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query and most popular search engines provide

  2. Indexing for summary queries

    DEFF Research Database (Denmark)

    Yi, Ke; Wang, Lu; Wei, Zhewei

    2014-01-01

…of a particular attribute of these records. Aggregation queries are especially useful in business intelligence and data analysis applications where users are interested not in the actual records, but in some statistics of them. They can also be executed much more efficiently than reporting queries, by embedding … returned by reporting queries. In this article, we design indexing techniques that allow for extracting a statistical summary of all the records in the query. The summaries we support include frequent items, quantiles, and various sketches, all of which are of central importance in massive data analysis. Our indexes require linear space and extract a summary with the optimal or near-optimal query cost. We illustrate the efficiency and usefulness of our designs through extensive experiments and a system demonstration.
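
    A minimal sketch of what a summary query returns, assuming records are kept as a key-sorted list: it simply scans the key range and computes quartiles and frequent items with the standard library. The paper's contribution is index structures that return such summaries in optimal or near-optimal time without this scan.

      # Brute-force illustration of a summary query: instead of reporting all matching
      # records, return statistics (quartiles and frequent items) over them. The paper's
      # index structures return such summaries without scanning the whole key range.
      from bisect import bisect_left, bisect_right
      from collections import Counter
      from statistics import quantiles

      def summary_query(records, lo, hi, k=3):
          """records: list of (key, value) pairs sorted by key.
          Summarise the values whose keys fall in [lo, hi]."""
          keys = [key for key, _ in records]
          start, end = bisect_left(keys, lo), bisect_right(keys, hi)
          values = [value for _, value in records[start:end]]
          if not values:
              return {"count": 0}
          return {
              "count": len(values),
              "quartiles": quantiles(values, n=4) if len(values) > 1 else values,
              "frequent_items": Counter(values).most_common(k),
          }

      data = sorted((i, i % 5) for i in range(100))
      print(summary_query(data, 10, 60))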

  3. Managing terminology assets in Electronic Health Records.

    Science.gov (United States)

    Abrams, Kelly; Schneider, Sue; Scichilone, Rita

    2009-01-01

Electronic Health Record (EHR) systems rely on standard terminologies and classification systems that require both Information Technology (IT) and Information Management (IM) skills. Convergence of perspectives is necessary for effective terminology asset management, including evaluation for use, maintenance and intersection with software applications. Multiple terminologies are necessary for patient care communication and data capture within EHRs and other information management tasks. Terminology asset management encompasses workflow and operational context as well as IT specifications and software application run-time requirements. This paper identifies the tasks, skills and collaboration of IM and IT approaches for terminology asset management.

  4. The CMS DBS query language

    International Nuclear Information System (INIS)

    Kuznetsov, Valentin; Riley, Daniel; Afaq, Anzar; Sekhri, Vijay; Guo Yuyi; Lueking, Lee

    2010-01-01

The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to the underlying database. We will describe the design of the query system, provide details of the language components, and give an overview of how this component fits into the overall data discovery system architecture.
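
    The core idea of hiding join conditions can be pictured with a small sketch, not the actual DBS implementation: the table names, columns and join keys below are hypothetical, and the join path between the tables mentioned in a query is discovered by breadth-first search over a schema graph before the SQL text is assembled.

      # Illustrative sketch, not the actual DBS code: join conditions are discovered by
      # breadth-first search over a schema graph, then assembled into SQL. The table and
      # column names below are hypothetical.
      from collections import deque

      SCHEMA = {  # (table_a, table_b) -> join condition
          ("dataset", "block"): "dataset.id = block.dataset_id",
          ("block", "file"): "block.id = file.block_id",
      }
      GRAPH = {}
      for (a, b), cond in SCHEMA.items():
          GRAPH.setdefault(a, []).append((b, cond))
          GRAPH.setdefault(b, []).append((a, cond))

      def join_path(src, dst):
          """Return the chain of join conditions linking two tables."""
          queue, seen = deque([(src, [])]), {src}
          while queue:
              table, conds = queue.popleft()
              if table == dst:
                  return conds
              for nxt, cond in GRAPH.get(table, []):
                  if nxt not in seen:
                      seen.add(nxt)
                      queue.append((nxt, conds + [cond]))
          raise ValueError(f"no join path from {src} to {dst}")

      def build_sql(select_col, where_col, value):
          src, dst = select_col.split(".")[0], where_col.split(".")[0]
          joins = join_path(src, dst)
          tables = sorted({src, dst} | {c.split(".")[0] for jc in joins for c in jc.split(" = ")})
          where = joins + [f"{where_col} = '{value}'"]
          return f"SELECT {select_col} FROM {', '.join(tables)} WHERE {' AND '.join(where)}"

      # e.g. "find the files of a given dataset" without knowing any join keys
      print(build_sql("file.name", "dataset.name", "/Primary/Processed/TIER"))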

  5. Unification of Sinonasal Anatomical Terminology

    Directory of Open Access Journals (Sweden)

    Voegels, Richard Louis

    2015-07-01

Full Text Available The advent of endoscopy and computed tomography at the beginning of the 1980s brought to rhinology a revival of anatomy and physiology study. In 1994, the International Conference of Sinus Disease was conceived because the official “Terminologia Anatomica”[1] had little information on detailed sinonasal anatomy. In addition, there was a lack of uniformity of terminology and definitions. After 20 years, a new conference has been held. The need to use the same terminology led to the publication by the European Society of Rhinology of the “European Position Paper on the Anatomical Terminology of the Internal Nose and Paranasal Sinuses,” which can be accessed freely at www.rhinologyjournal.com. Professor Valerie Lund et al[2] wrote this document reviewing the anatomical terms and comparing them with the official “Terminologia Anatomica”, in order to define the structures without eponyms while respecting embryological development and, above all, universalizing and simplifying the terms. A must-read! The text's purpose goes beyond a review of anatomical terminology: it aims to universalize the language used to refer to structures of the nasal and paranasal cavities. Information about the anatomy, based on an extensive review of the current literature, is arranged in just over 50 pages, which are direct and to the point. The publication may be pleasant reading for learners and teachers of rhinology. This text can be a starting point for seeking a universal terminology to be used in Brazil, converging with this new European proposal for a nomenclature that helps us communicate with our peers in Brazil and the rest of the world. The original text of the European Society of Rhinology provides English terms that avoid the use of Latin and thus bypass the many individual national translations. It would be admirable if we created our own cross-cultural adaptation of this newly suggested anatomical terminology.

  6. The Changes in Architecture Terminology

    Directory of Open Access Journals (Sweden)

    Francois Tran

    2012-10-01

Full Text Available The intention of this research is to inspire a discussion about the changes in architecture terminology brought about by the revolution in communication and representation forms that has resulted from digitalisation. The blurred boundary between the virtual and the analogue worlds, and the misunderstandings and confusion that appear when these two worlds interact, nowadays form the major problems facing architectural design, education and research. Researchers in this field are focused on the interface, the meeting and transformation point between the digital and the analogue worlds, in order to prevent those problems and confusions. One of the main reasons for this ambiguity is the architectural terminology that changes according to the changing status of architectural representation, i.e. new forms of representation and new forms of communication, and hence the new role of the architect and the researcher. Whenever and wherever specialised information and knowledge is created, communicated or transformed, terminology is involved in one way or another. An absence of terminology goes hand in hand with an absence of an understanding of concepts. Therefore, with the new information and communication technologies and with new and developing subject areas, the existence of terminology and its updating is indispensable. Thus the changing status of the terminology must be analysed. As architecture terminology is essential to improve today's challenging, multidisciplinary communication, and in order to clarify the problems of ambiguity and unawareness (as a result of the shift of specific architectural vocabulary), it is necessary to analyse the changes in architectural terminology, which form the discussion point of the following paper. As this paper is the beginning step of a research project which started on the occasion of the conference proposed by EAAE/ARCC, we will here present only the objectives of this research, its general problematics, the methods that we wish to develop and some provisional

  7. Mastering jQuery mobile

    CERN Document Server

    Lambert, Chip

    2015-01-01

You've started down the path of jQuery Mobile; now begin mastering some of jQuery Mobile's higher-level topics. Go beyond jQuery Mobile's documentation and master one of the hottest mobile technologies out there. Previous JavaScript and PHP experience can help you get the most out of this book.

  8. CUFID-query: accurate network querying through random walk based network flow estimation.

    Science.gov (United States)

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2017-12-28

    Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive
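
    A toy sketch of the seed-and-extension idea under a simpler criterion than the paper's: the seed is grown greedily by the neighbour that most reduces conductance, whereas CUFID-query scores candidates with steady-state network flow and prunes with personalized PageRank. The graph and node identifiers below are purely illustrative.

      # Toy illustration of greedy seed-and-extension guided by conductance. CUFID-query
      # itself scores candidates with steady-state network flow and prunes the result
      # with personalized PageRank; none of that machinery is reproduced here.
      def conductance(graph, nodes):
          """graph: dict node -> set of neighbours; nodes: candidate module."""
          cut = sum(1 for u in nodes for v in graph[u] if v not in nodes)
          vol = sum(len(graph[u]) for u in nodes)
          vol_rest = sum(len(graph[u]) for u in graph) - vol
          denom = min(vol, vol_rest)
          return cut / denom if denom else 1.0

      def extend_seed(graph, seed, max_size=10):
          module = set(seed)
          while len(module) < max_size:
              frontier = {v for u in module for v in graph[u]} - module
              if not frontier:
                  break
              best = min(frontier, key=lambda v: conductance(graph, module | {v}))
              if conductance(graph, module | {best}) >= conductance(graph, module):
                  break  # stop when no neighbour lowers the conductance
              module.add(best)
          return module

      # Two loosely connected triangles; extending from node 1 recovers its triangle.
      g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
      print(extend_seed(g, {1}))  # {1, 2, 3}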

  9. jQuery cookbook

    CERN Document Server

    2010-01-01

    jQuery simplifies building rich, interactive web frontends. Getting started with this JavaScript library is easy, but it can take years to fully realize its breadth and depth; this cookbook shortens the learning curve considerably. With these recipes, you'll learn patterns and practices from 19 leading developers who use jQuery for everything from integrating simple components into websites and applications to developing complex, high-performance user interfaces. Ideal for newcomers and JavaScript veterans alike, jQuery Cookbook starts with the basics and then moves to practical use cases w

  10. The genre tutorial and social networks terminology

    Directory of Open Access Journals (Sweden)

    Márcio Sales Santiago

    2014-02-01

    Full Text Available This paper analyzes the terminology in the Internet social networks tutorials. A tutorial is a specialized text, full of terms, aiming to teach an individual or group of individuals who need some guidelines to operationalize a computerized tool, such as a social network. It is necessary to identify linguistic and terminological characteristics from the specialized lexical units in this digital genre. Social networks terminology is described and exemplified here. The results show that it is possible to refer to two specific terminologies in tutorials which help to determine the terminological profile of the thematic area, specifically from the point of view of denomination.

  11. User perspectives on query difficulty

    DEFF Research Database (Denmark)

    Lioma, Christina; Larsen, Birger; Schütze, Hinrich

    2011-01-01

    be difficult for the system to address? (2) Are users aware of specific features in their query (e.g., domain-specificity, vagueness) that may render their query difficult for an IR system to address? A study of 420 queries from a Web search engine query log that are pre-categorised as easy, medium, hard...

  12. How to Manage and Plan Terminology: Creating Management TDBs

    Directory of Open Access Journals (Sweden)

    Gordana Jakić

    2016-09-01

Full Text Available Scientific and technical terminology represents a very topical issue in economically and technologically dependent countries with small languages such as Serbian. The current terminological problems in the Serbian language, especially in specialized areas that are experiencing dynamic development, are: Anglicization of the language for special purposes, underdeveloped and unstable terminology, and a lack of adequate and modern terminological and lexical resources. On the one hand, the terminological problems listed above are of concern to subject-field specialists, since inadequate or non-existent terminology significantly affects the representation, transfer and management of specialized knowledge and information. On the other hand, terminology and language planners point to the growing need for immediate and systematic intervention aimed at terminology harmonization, consolidation and standardization. Despite this awareness, there is no systematic approach to solving terminological problems in Serbian. In addition, practical activities regarding the collection and organization of terminology are few and reduced to individual initiatives. Under the paradigm of language-planning (LP) oriented terminology management, this paper addresses a practical activity of terminology management: the creation of a Serbian management terminology database (TDB) with equivalent terms in English. The paper will discuss the methodology of terminology work, potential obstacles in termbase creation, as well as the potential benefits that such a resource would have for all its potential users: management specialists and practitioners, professional translators, and language and terminology planners. A particular focus will be placed on the potential significance that this kind of database would have for terminology policy and planning in the Serbian language, on the one hand, and for knowledge transfer and management, on the other hand.

  13. Terminology management at the national language service | Alberts ...

    African Journals Online (AJOL)

    Through the use of correct, standardised terminology, effective scientific and technical communication skills are developed. A brief overview is given of terminology development in South Africa, with special emphasis on the work of the Terminology Division of the National Language Service. Aspects of present terminology ...

  14. Heuristic query optimization for query multiple table and multiple clausa on mobile finance application

    Science.gov (United States)

    Indrayana, I. N. E.; P, N. M. Wirasyanti D.; Sudiartha, I. KG

    2018-01-01

Mobile applications allow many users to access data without being limited by place and time. Over time the data population of such an application will increase, and data access time becomes a problem once the data have reached tens of thousands to millions of records. The objective of this research is to maintain query execution performance for large numbers of records. One effort to maintain data access time performance is to apply a query optimization method; the optimization used in this research is the heuristic query optimization method. The application built is a mobile-based financial application using a MySQL database with stored procedures. The application is used by more than one business entity in one database, thus enabling rapid data growth. Within these stored procedures, queries are optimized using the heuristic method. Query optimization is performed on a "Select" query that involves more than one table and multiple clauses. Evaluation is done by calculating the average access time of optimized and unoptimized queries, and is repeated as the data population in the database grows. The evaluation results show that data execution time with heuristic query optimization is relatively faster than execution time without query optimization.
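
    A classic heuristic of the kind such studies apply is selection pushdown: evaluate each table's own restrictions before the join so fewer rows participate in it. The sketch below, with hypothetical table and column names, illustrates the rewrite on in-memory data; the study itself rewrites "Select" statements inside MySQL stored procedures.

      # Selection pushdown on in-memory rows: filter each table on its own clauses before
      # joining, so fewer rows take part in the join. Table and column names are
      # hypothetical; the study rewrites "Select" statements inside stored procedures.
      def join(left, right, lkey, rkey):
          index = {}
          for row in right:
              index.setdefault(row[rkey], []).append(row)
          return [dict(l, **r) for l in left for r in index.get(l[lkey], [])]

      def naive_plan(accounts, transactions, entity_id):
          # join first, filter afterwards: every row participates in the join
          joined = join(accounts, transactions, "account_id", "account_id")
          return [row for row in joined if row["entity_id"] == entity_id]

      def pushed_down_plan(accounts, transactions, entity_id):
          # heuristic rewrite: apply the selection before the join
          filtered = [a for a in accounts if a["entity_id"] == entity_id]
          return join(filtered, transactions, "account_id", "account_id")

      accounts = [{"account_id": i, "entity_id": i % 3} for i in range(9)]
      transactions = [{"account_id": i % 9, "amount": 10 * i} for i in range(30)]
      assert sorted(map(str, naive_plan(accounts, transactions, 1))) == \
             sorted(map(str, pushed_down_plan(accounts, transactions, 1)))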

  15. Morphing Terminology Study

    Energy Technology Data Exchange (ETDEWEB)

    Rose, Stuart J.; Brockman, Fred J.; Hart, Michelle L.; Engel, David W.; Valentine, Nancy B.; Calapristi, Augustin J.

    2010-06-28

    This study investigates methods of automatically identifying and characterizing significant transitions in term usage over time. Within scientific literature, the occurrence of terms reflects the use of technologies and techniques as well as the study of specific species and materials. Transitions in terminology usage may be a result of vocabulary standardization or specialization in which terms are replaced with their shorter form. They may also be a result of new applications, combinations, alternatives, or interests that result in the appearance of new or existing terminology in unexpected contexts.

  16. Standard Terminology Relating to Photovoltaic Solar Energy Conversion

    CERN Document Server

    American Society for Testing and Materials. Philadelphia

    2005-01-01

    1.1 This terminology pertains to photovoltaic (radiant-to-electrical energy conversion) device performance measurements and is not a comprehensive list of terminology for photovoltaics in general. 1.2 Additional terms used in this terminology and of interest to solar energy may be found in Terminology E 772.

  17. Mastering jQuery

    CERN Document Server

    Libby, Alex

    2015-01-01

    If you are a developer who is already familiar with using jQuery and wants to push your skill set further, then this book is for you. The book assumes an intermediate knowledge level of jQuery, JavaScript, HTML5, and CSS.

  18. Pediatric Terminology

    Science.gov (United States)

    The National Institute of Child Health and Human Development (NICHD) works with NCI Enterprise Vocabulary Services (EVS) to provide standardized terminology for coding pediatric clinical trials and other research activities.

  19. Smart Query Answering for Marine Sensor Data

    Directory of Open Access Journals (Sweden)

    Paulo de Souza

    2011-03-01

Full Text Available We review existing query answering systems for sensor data. We then propose an extended query answering approach, termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses a query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query benefits the building of data and information systems for marine sensor networks.

  20. Smart query answering for marine sensor data.

    Science.gov (United States)

    Shahriar, Md Sumon; de Souza, Paulo; Timms, Greg

    2011-01-01

We review existing query answering systems for sensor data. We then propose an extended query answering approach, termed smart query, specifically for marine sensor data. The smart query answering system integrates pattern queries and continuous queries. The proposed smart query system considers both streaming data and historical data from marine sensor networks. The smart query also uses a query relaxation technique and semantics from domain knowledge as a recommender system. The proposed smart query benefits the building of data and information systems for marine sensor networks.
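
    The query-relaxation component can be pictured with a toy sketch: if a strict query over sensor readings returns nothing, the time window and thresholds are widened step by step, and the relaxation that finally produced results is reported alongside them. The field names and relaxation steps below are assumptions, not the published system's rules.

      # Toy sketch of query relaxation: if the strict query over readings returns nothing,
      # widen the time window and the threshold step by step and report the relaxation
      # that produced results. Field names and relaxation steps are assumptions.
      def run_query(readings, t_from, t_to, min_temp):
          return [r for r in readings
                  if t_from <= r["t"] <= t_to and r["temp"] >= min_temp]

      def smart_query(readings, t_from, t_to, min_temp,
                      widen_steps=(0, 1, 2, 4), temp_steps=(0.0, 0.5, 1.0)):
          for dt in widen_steps:            # relax the time window first
              for dtemp in temp_steps:      # then relax the temperature threshold
                  hits = run_query(readings, t_from - dt, t_to + dt, min_temp - dtemp)
                  if hits:
                      return {"relaxation": {"time": dt, "temp": dtemp}, "results": hits}
          return {"relaxation": None, "results": []}

      readings = [{"t": 3, "temp": 18.2}, {"t": 9, "temp": 19.1}]
      # The strict query (t in [4, 6], temp >= 19.5) matches nothing; the relaxed one answers.
      print(smart_query(readings, 4, 6, 19.5))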

  1. Medical radiology terminology

    International Nuclear Information System (INIS)

    1986-01-01

Standardization achievements in the field of radiology induced the IEC to compile the terminology used in its safety and application standards and present it in publication 788 (1984 issue), entitled 'Medical radiology terminology'. The objective pursued is to foster the use of standard terminology in the radiology standards. The value of publication 788 lies in the fact that it presents definitions of terms used in the French and English versions of IEC standards in the field of radiology, and thus facilitates adequate translation of these terms into other languages. In the glossary at hand, German-language definitions have been adopted from the DIN standards in cases where the French or English versions of definitions are identical with the German wording or meaning. The numbers of DIN standards or sections are then given without brackets, ahead of the text of the definition. In cases where correspondence of the various texts is not so good, or reference should be made to a term in a DIN standard, the numbers are given in brackets. (orig./HP) [de

  2. hMuLab: A Biomedical Hybrid MUlti-LABel Classifier Based on Multiple Linear Regression.

    Science.gov (United States)

    Wang, Pu; Ge, Ruiquan; Xiao, Xuan; Zhou, Manli; Zhou, Fengfeng

    2017-01-01

Many biomedical classification problems are multi-label by nature, e.g., a gene involved in a variety of functions and a patient with multiple diseases. The majority of existing classification algorithms assume that each sample has only one class label, and the multi-label classification problem remains a challenge for biomedical researchers. This study proposes a novel multi-label learning algorithm, hMuLab, by integrating both feature-based and neighbor-based similarity scores. The multiple linear regression modeling techniques make hMuLab capable of producing multiple label assignments for a query sample. The comparison results over six commonly-used multi-label performance measurements suggest that hMuLab performs accurately and stably for the biomedical datasets, and may serve as a complement to the existing literature.
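
    A minimal sketch of the general recipe, not the published hMuLab model: one least-squares regression score per label (the feature-based part) is averaged with a k-nearest-neighbour label vote (the neighbour-based part) and thresholded to produce the multi-label assignment.

      # Minimal sketch of the general recipe, not the published model: a least-squares
      # regression score per label plus a k-nearest-neighbour label vote, averaged and
      # thresholded to obtain the multi-label assignment.
      import numpy as np

      def fit_predict(X, Y, query, k=3, threshold=0.5):
          """X: (n, d) features; Y: (n, L) 0/1 label matrix; query: (d,) sample."""
          Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # add a bias column
          W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)      # one linear model per label
          feature_score = np.append(query, 1.0) @ W
          dist = np.linalg.norm(X - query, axis=1)
          neighbour_score = Y[np.argsort(dist)[:k]].mean(axis=0)
          combined = (feature_score + neighbour_score) / 2.0
          return (combined >= threshold).astype(int), combined

      rng = np.random.default_rng(0)
      X = rng.normal(size=(40, 5))
      Y = np.stack([(X[:, 0] > 0).astype(int), (X[:, 1] > 0).astype(int)], axis=1)
      labels, scores = fit_predict(X, Y, X[0])
      print(labels, scores)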

  3. jQuery For Dummies

    CERN Document Server

    Beighley, Lynn

    2010-01-01

Learn how jQuery can make your Web page or blog stand out from the crowd! jQuery is free, open source software that allows you to extend and customize Joomla!, Drupal, AJAX, and WordPress via plug-ins. Assuming no previous programming experience, Lynn Beighley takes you through the basics of jQuery from the very start. You'll discover how the jQuery library separates itself from other JavaScript libraries through its ease of use, compactness, and friendliness if you're a beginner programmer. Written in the easy-to-understand style of the For Dummies brand, this book demonstrates how you can a

  4. Web development with jQuery

    CERN Document Server

    York, Richard

    2015-01-01

    Newly revised and updated resource on jQuery's many features and advantages Web Development with jQuery offers a major update to the popular Beginning JavaScript and CSS Development with jQuery from 2009. More than half of the content is new or updated, and reflects recent innovations with regard to mobile applications, jQuery mobile, and the spectrum of associated plugins. Readers can expect thorough revisions with expanded coverage of events, CSS, AJAX, animation, and drag and drop. New chapters bring developers up to date on popular features like jQuery UI, navigation, tables, interacti

  5. Optimizing Temporal Queries

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2003-01-01

    Recent research in the area of temporal databases has proposed a number of query languages that vary in their expressive power and the semantics they provide to users. These query languages represent a spectrum of solutions to the tension between clean semantics and efficient evaluation. Often, t...

  6. SYNONYMY IN TERMINOLOGY OF SPORT

    Directory of Open Access Journals (Sweden)

    Nenad Zivanovic

    2009-11-01

Full Text Available Synonyms name the same thing, but they connect it with different names and, in this way, through the name they uncover different features of the same thing. Synonyms are considered to be words which identify one unique concept, words which are the same or similar in their meaning, which are in some way interlocked in the language and serve to enhance detail and to distinguish fine nuances of a concept's meaning. Different terms for the same concepts in a terminology usually come from different sources of term derivation. In particular, many terms in a terminology developed spontaneously, above all in less organized terminologies, because the process of organizing a terminology tends to push out synonyms. At the time each science is constituted, that is, when the concepts related to it are being constituted, there is no systematic approach to selecting their denotations; they are accepted as they enter the language.

  7. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research.

    Science.gov (United States)

    Tenenbaum, Jessica D; Whetzel, Patricia L; Anderson, Kent; Borromeo, Charles D; Dinov, Ivo D; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D; Becich, Michael J; Ginsburg, Geoffrey S; Musen, Mark A; Smith, Kevin A; Tarantal, Alice F; Rubin, Daniel L; Lyster, Peter

    2011-02-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. Copyright © 2010 Elsevier Inc. All rights reserved.

  8. jQuery Mobile

    CERN Document Server

    Reid, Jon

    2011-01-01

    Native apps have distinct advantages, but the future belongs to mobile web apps that function on a broad range of smartphones and tablets. Get started with jQuery Mobile, the touch-optimized framework for creating apps that look and behave consistently across many devices. This concise book provides HTML5, CSS3, and JavaScript code examples, screen shots, and step-by-step guidance to help you build a complete working app with jQuery Mobile. If you're already familiar with the jQuery JavaScript library, you can use your existing skills to build cross-platform mobile web apps right now. This b

  9. Terminology standardisation in the nuclear engineering field

    International Nuclear Information System (INIS)

    Kraut, A.

    1987-01-01

Terminological standardisation serves the purpose of unambiguous understanding, at least among experts in a given field of knowledge. The author explains a number of criteria and aspects to be taken into account in the process of standardisation by referring to the work of the Terminology Committee on Nuclear Engineering. He discusses word formation in a technical language and the features of standardised terminology. Accepted terminology is a main factor in all procedures concerning design, testing, and approval and licensing of nuclear facilities, and is also of importance in terms of economics. (HP) [de

  10. KISTI at TREC 2014 Clinical Decision Support Track: Concept-based Document Re-ranking to Biomedical Information Retrieval

    Science.gov (United States)

    2014-11-01

Given a document D, the method builds a concept vector c = {c1, c2, …, cn} over UMLS concepts grouped by semantic type (e.g., Injury or Poisoning, inpo, T037; Anatomical Abnormality, anab, T190) and uses it for concept-based document re-ranking in biomedical information retrieval.
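
    A hedged sketch consistent with the fragments above: the query and each retrieved document are represented as vectors over UMLS concept identifiers (as produced by an annotator such as MetaMap) and the retrieved documents are re-ordered by cosine similarity. The concept identifiers shown are placeholders, not values from the paper.

      # Hedged sketch of concept-based re-ranking: query and documents are bags of UMLS
      # concept identifiers (as produced by an annotator such as MetaMap) and retrieved
      # documents are re-ordered by cosine similarity. The CUIs below are placeholders.
      from collections import Counter
      from math import sqrt

      def cosine(a: Counter, b: Counter) -> float:
          dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
          norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
          return dot / norm if norm else 0.0

      def rerank(query_concepts, docs):
          """docs: list of (doc_id, list_of_concepts); returns ids sorted by similarity."""
          q = Counter(query_concepts)
          scored = [(cosine(q, Counter(concepts)), doc_id) for doc_id, concepts in docs]
          return [doc_id for _, doc_id in sorted(scored, reverse=True)]

      docs = [("d1", ["C0003811", "C0027051", "C0003811"]),   # placeholder CUIs
              ("d2", ["C0020538", "C0027051"])]
      print(rerank(["C0003811", "C0027051"], docs))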

  11. [Project HRANAFINA--Croatian anatomical and physiological terminology].

    Science.gov (United States)

    Vodanović, Marin

    2012-01-01

HRANAFINA--Croatian Anatomical and Physiological Terminology is a project of the University of Zagreb School of Dental Medicine funded by the Croatian Science Foundation. It is performed in cooperation with other Croatian universities with medical schools. The project has a two-pronged aim: firstly, to build Croatian anatomical and physiological terminology and, secondly, to popularize its use among health professionals, medical students, scientists and translators. Internationally recognized experts from Croatian universities with medical faculties, together with linguistics experts, are involved in the project. All project activities are coordinated in agreement with the National Coordinator for Development of Croatian Professional Terminology. The project enhances Croatian professional terminology and the Croatian language in general, increases the competitiveness of Croatian scientists at the international level and facilitates the involvement of Croatian scientists, health care providers and medical students in European projects.

  12. Incremental Query Rewriting with Resolution

    Science.gov (United States)

    Riazanov, Alexandre; Aragão, Marcelo A. T.

    We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.

  13. A framework for evaluating and utilizing medical terminology mappings.

    Science.gov (United States)

    Hussain, Sajjad; Sun, Hong; Sinaci, Anil; Erturkmen, Gokce Banu Laleci; Mead, Charles; Gray, Alasdair J G; McGuinness, Deborah L; Prud'Hommeaux, Eric; Daniel, Christel; Forsberg, Kerstin

    2014-01-01

The use of medical terminologies and mappings across them is considered a crucial prerequisite for achieving interoperable eHealth applications. Built upon the outcomes of several research projects, we introduce a framework for evaluating and utilizing terminology mappings that offers a platform for (i) performing various mapping strategies, (ii) representing terminology mappings together with their provenance information, and (iii) enabling terminology reasoning for inferring both new and erroneous mappings. We present results of the introduced framework from the SALUS project, where we evaluated the quality of both existing and inferred terminology mappings among standard terminologies.

  14. From Ambiguities to Insights: Query-based Comparisons of High-Dimensional Data

    Science.gov (United States)

    Kowalski, Jeanne; Talbot, Conover; Tsai, Hua L.; Prasad, Nijaguna; Umbricht, Christopher; Zeiger, Martha A.

    2007-11-01

Genomic technologies will revolutionize drug discovery and development; that much is universally agreed upon. The high dimension of data from such technologies has challenged available data analytic methods; that much is apparent. To date, large-scale data repositories have not been utilized in ways that permit their wealth of information to be efficiently processed for knowledge, presumably due in large part to inadequate analytical tools to address numerous comparisons of high-dimensional data. In candidate gene discovery, expression comparisons are often made between two features (e.g., cancerous versus normal), such that the enumeration of outcomes is manageable. With multiple features, the setting becomes more complex, in terms of comparing expression levels of tens of thousands of transcripts across hundreds of features. In this case, the number of outcomes, while enumerable, becomes rapidly large and unmanageable, and scientific inquiries become more abstract, such as "which one of these (compounds, stimuli, etc.) is not like the others?" We develop analytical tools that promote more extensive, efficient, and rigorous utilization of the public data resources generated by the massive support of genomic studies. Our work innovates by enabling access to such metadata with logically formulated scientific inquiries that define, compare and integrate query-comparison pair relations for analysis. We demonstrate our computational tool's potential to address an outstanding biomedical informatics issue of identifying reliable molecular markers in thyroid cancer. Our proposed query-based comparison (QBC) facilitates access to and efficient utilization of metadata through logically formed inquiries expressed as query-based comparisons by organizing and comparing results from biotechnologies to address applications in biomedicine.

  15. ''Hazardous'' terminology

    International Nuclear Information System (INIS)

    Powers, J.

    1991-01-01

A number of terms (e.g., ''hazardous chemicals,'' ''hazardous materials,'' ''hazardous waste,'' and similar nomenclature) refer to substances that are subject to regulation under one or more federal environmental laws. State laws and regulations also provide additional, similar, or identical terminology that may be confused with the federally defined terms. Many of these terms appear synonymous, and it is easy to use them interchangeably. However, in a regulatory context, inappropriate use of narrowly defined terms can lead to confusion about the substances referred to, the statutory provisions that apply, and the regulatory requirements for compliance under the applicable federal statutes. This Information Brief provides regulatory definitions, a brief discussion of compliance requirements, and references for the precise terminology that should be used when referring to ''hazardous'' substances regulated under federal environmental laws. A companion CERCLA Information Brief (EH-231-004/0191) addresses ''toxic'' nomenclature

  16. Range-clustering queries

    NARCIS (Netherlands)

    Abrahamsen, M.; de Berg, M.T.; Buchin, K.A.; Mehr, M.; Mehrabi, A.D.

    2017-01-01

In a geometric k-clustering problem the goal is to partition a set of points in R^d into k subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set S: given a query box Q and an integer k > 2, compute
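
    A naive baseline makes the query semantics concrete: filter the points inside the query box and cluster only those with a few Lloyd iterations of k-means. The paper's data structures answer such queries without scanning the whole point set; this sketch only illustrates the problem being solved.

      # Naive baseline for a range-clustering query: filter the points inside the query
      # box, then cluster only those with a few Lloyd iterations of k-means. The paper's
      # data structures answer such queries without scanning the whole point set.
      import random

      def range_clustering(points, box, k, iters=10, seed=0):
          (x1, y1), (x2, y2) = box
          inside = [p for p in points if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]
          if len(inside) <= k:
              return [[p] for p in inside]
          rng = random.Random(seed)
          centers = rng.sample(inside, k)
          for _ in range(iters):
              clusters = [[] for _ in range(k)]
              for p in inside:
                  i = min(range(k), key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2)
                  clusters[i].append(p)
              centers = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centers[i]
                         for i, c in enumerate(clusters)]
          return clusters

      pts = [(random.random() * 10, random.random() * 10) for _ in range(200)]
      print([len(c) for c in range_clustering(pts, ((2, 2), (8, 8)), k=3)])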

  17. A Semantics-Based Approach to Retrieving Biomedical Information

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Zambach, Sine

    2011-01-01

This paper describes an approach to representing, organising, and accessing conceptual content of biomedical texts using a formal ontology. The ontology is based on UMLS resources supplemented with domain ontologies developed in the project. The approach introduces the notion of ‘generative ontologies’, i.e., ontologies providing increasingly specialised concepts reflecting the phrase structure of natural language. Furthermore, we propose a novel so-called ontological semantics which maps noun phrases from texts and queries into nodes in the generative ontology. This enables an advanced form of data mining of texts, identifying paraphrases and concept relations and measuring distances between key concepts in texts. Thus, the project is distinct in its attempt to provide a formal underpinning of conceptual similarity or relatedness of meaning.

  18. SPARK: Adapting Keyword Query to Semantic Search

    Science.gov (United States)

    Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong

Semantic search promises to provide more accurate results than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.
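
    The three steps can be pictured with a much-simplified sketch: keywords are mapped to candidate ontology terms through a small lexicon, triple patterns are instantiated, and the candidate SPARQL queries are ranked by the product of the mapping confidences. The lexicon, URIs and ranking below are placeholders, far simpler than SPARK's probabilistic model.

      # Much-simplified sketch of keyword-to-SPARQL translation: keywords are mapped to
      # candidate ontology terms through a small lexicon, triple patterns are instantiated
      # and candidate queries are ranked by the product of mapping confidences. The
      # lexicon, URIs and ranking are placeholders, far simpler than SPARK's model.
      from itertools import product

      LEXICON = {  # keyword -> list of (triple pattern, confidence)
          "protein": [("?x a :Protein", 0.9)],
          "kinase":  [("?x :hasFunction :KinaseActivity", 0.8),
                      ("?x a :Kinase", 0.6)],
      }

      def keyword_to_sparql(keywords):
          candidates = []
          options = [LEXICON.get(k, []) for k in keywords]
          for combo in product(*options):
              if not combo:
                  continue
              score, patterns = 1.0, []
              for pattern, conf in combo:
                  score *= conf
                  patterns.append(pattern + " .")
              candidates.append((score, "SELECT ?x WHERE { " + " ".join(patterns) + " }"))
          return sorted(candidates, reverse=True)

      for score, q in keyword_to_sparql(["protein", "kinase"]):
          print(f"{score:.2f}  {q}")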

  19. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-01

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. PMID:27733503
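
    Programmatic access through the SPARQL interface might look like the following sketch; the endpoint URL and the result-format parameter are assumptions to be checked against Ontobee's own documentation, while the query itself is plain SPARQL.

      # Hedged example of programmatic access through a SPARQL endpoint such as the one
      # Ontobee exposes. The endpoint URL and the result-format parameter are assumptions
      # to verify against Ontobee's documentation; the query itself is plain SPARQL.
      import requests

      ENDPOINT = "http://sparql.hegroup.org/sparql"   # assumed Ontobee SPARQL endpoint

      QUERY = """
      SELECT ?term ?label WHERE {
        ?term <http://www.w3.org/2000/01/rdf-schema#label> ?label .
        FILTER(CONTAINS(LCASE(STR(?label)), "penicillin"))
      } LIMIT 10
      """

      def search_terms():
          resp = requests.get(ENDPOINT, timeout=30,
                              params={"query": QUERY, "format": "application/sparql-results+json"})
          resp.raise_for_status()
          for binding in resp.json()["results"]["bindings"]:
              print(binding["term"]["value"], "-", binding["label"]["value"])

      if __name__ == "__main__":
          search_terms()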

  20. Querying and Mining Strings Made Easy

    KAUST Repository

    Sahli, Majed

    2017-10-13

    With the advent of large string datasets in several scientific and business applications, there is a growing need to perform ad-hoc analysis on strings. Currently, strings are stored, managed, and queried using procedural codes. This limits users to certain operations supported by existing procedural applications and requires manual query planning with limited tuning opportunities. This paper presents StarQL, a generic and declarative query language for strings. StarQL is based on a native string data model that allows StarQL to support a large variety of string operations and provide semantic-based query optimization. String analytic queries are too intricate to be solved on one machine. Therefore, we propose a scalable and efficient data structure that allows StarQL implementations to handle large sets of strings and utilize large computing infrastructures. Our evaluation shows that StarQL is able to express workloads of application-specific tools, such as BLAST and KAT in bioinformatics, and to mine Wikipedia text for interesting patterns using declarative queries. Furthermore, the StarQL query optimizer shows an order of magnitude reduction in query execution time.

  1. Secure Skyline Queries on Cloud Platform.

    Science.gov (United States)

    Liu, Jinfei; Yang, Juncheng; Xiong, Li; Pei, Jian

    2017-04-01

    Outsourcing data and computation to cloud server provides a cost-effective way to support large scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can be also used as a building block for other queries. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions.
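
    The skyline semantics the protocol computes can be shown in plaintext: a record is in the skyline if no other record dominates it, i.e. is at least as good on every criterion and strictly better on one. The sketch below only illustrates that dominance test; none of the encryption or secure-protocol machinery of the paper appears here.

      # Plaintext sketch of skyline semantics: a record is in the skyline if no other
      # record dominates it (at least as good on every criterion, strictly better on one).
      # The paper's contribution, evaluating this over encrypted data, is not shown here.
      def dominates(a, b):
          """Lower values are better on every dimension."""
          return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

      def skyline(points):
          return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

      # e.g. (price, distance) of hotels: keep the options no other hotel beats on both
      hotels = [(100, 5), (80, 9), (120, 2), (90, 9)]
      print(skyline(hotels))   # [(100, 5), (80, 9), (120, 2)]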

  2. Using SNOMED CT to represent two interface terminologies.

    Science.gov (United States)

    Rosenbloom, S Trent; Brown, Steven H; Froehling, David; Bauer, Brent A; Wahner-Roedler, Dietlind L; Gregg, William M; Elkin, Peter L

    2009-01-01

Interface terminologies are designed to support interactions between humans and structured medical information. In particular, many interface terminologies have been developed for structured computer based documentation systems. Experts and policy-makers have recommended that interface terminologies be mapped to reference terminologies. The goal of the current study was to evaluate how well the reference terminology SNOMED CT could map to and represent two interface terminologies, MEDCIN and the Categorical Health Information Structured Lexicon (CHISL). Automated mappings between SNOMED CT and 500 terms from each of the two interface terminologies were evaluated by human reviewers, who also searched SNOMED CT to identify better mappings when this was judged to be necessary. Reviewers judged whether they believed the interface terms to be clinically appropriate, whether the terms were covered by SNOMED CT concepts and whether the terms' implied semantic structure could be represented by SNOMED CT. Outcomes included concept coverage by SNOMED CT for study terms and their implied semantics. Agreement statistics and compositionality measures were calculated. The SNOMED CT terminology contained concepts to represent 92.4% of MEDCIN and 95.9% of CHISL terms. Semantic structures implied by study terms were less well covered, with some complex compositional expressions requiring semantics not present in SNOMED CT. Among sampled terms, those from MEDCIN were more complex than those from CHISL, containing an average of 3.8 versus 1.8 atomic concepts, respectively.

  3. The value of Retrospective and Concurrent Think Aloud in formative usability testing of a physician data query tool.

    Science.gov (United States)

    Peute, Linda W P; de Keizer, Nicolette F; Jaspers, Monique W M

    2015-06-01

To compare the performance of the Concurrent (CTA) and Retrospective (RTA) Think Aloud methods and to assess their value in a formative usability evaluation of an Intensive Care Registry-physician data query tool designed to support ICU quality improvement processes. Sixteen representative intensive care physicians participated in the usability evaluation study. Subjects were allocated to either the CTA or the RTA method by a matched randomized design. Each subject performed six usability-testing tasks of varying complexity in the query tool in a real working context. Methods were compared with regard to the number and type of problems detected. Verbal protocols of CTA and RTA were analyzed in depth to assess differences in verbal output. Standardized measures were applied to assess thoroughness in usability problem detection, weighted per problem severity level, and overall method effectiveness in detecting usability problems with regard to the time subjects spent per method. The usability evaluation of the data query tool revealed a total of 43 unique usability problems that the intensive care physicians encountered. CTA detected unique usability problems with regard to graphics/symbols, navigation issues, error messages, and the organization of information on the query tool's screens. RTA detected unique issues concerning system match with subjects' language and applied terminology. The in-depth verbal protocol analysis of CTA provided information on intensive care physicians' query design strategies. Overall, CTA performed significantly better than RTA in detecting usability problems: CTA usability problem detection effectiveness was 0.80 vs. 0.62, with higher thoroughness for usability problems of a moderate (0.85 vs. 0.7) and severe (0.71 vs. 0.57) nature. In this study, CTA was more effective in usability-problem detection and provided clarification of intensive care physicians' query design strategies to inform redesign of the query tool. However, CTA does not outperform RTA. The RTA

  4. Lost in translation? A multilingual Query Builder improves the quality of PubMed queries: a randomised controlled trial.

    Science.gov (United States)

    Schuers, Matthieu; Joulakian, Mher; Kerdelhué, Gaetan; Segas, Léa; Grosjean, Julien; Darmoni, Stéfan J; Griffon, Nicolas

    2017-07-03

MEDLINE is the most widely used medical bibliographic database in the world. Most of its citations are in English, and this can be an obstacle for some researchers accessing the information the database contains. We created a multilingual query builder to facilitate access to the PubMed subset using a language other than English. The aim of our study was to assess the impact of this multilingual query builder on the quality of PubMed queries for non-native English speaking physicians and medical researchers. A randomised controlled study was conducted among French speaking general practice residents. We designed a multilingual query builder to facilitate information retrieval, based on available MeSH translations and providing users with both an interface and a controlled vocabulary in their own language. Participating residents were randomly allocated either the French or the English version of the query builder. They were asked to translate 12 short medical questions into MeSH queries. The main outcome was the quality of the query. Two librarians, blind to the arm, independently evaluated each query using a modified published classification that differentiated eight types of errors. Twenty residents used the French version of the query builder and 22 used the English version. 492 queries were analysed. There were significantly more perfect queries in the French group than in the English group (37.9% vs. 17.9%, respectively). The multilingual query builder thus improves the quality of PubMed queries, in particular for researchers whose first language is not English.

  5. Multi-Dimensional Path Queries

    DEFF Research Database (Denmark)

    Bækgaard, Lars

    1998-01-01

We present the path-relationship model that supports multi-dimensional data modeling and querying. A path-relationship database is composed of sets of paths and sets of relationships. A path is a sequence of related elements (atoms, paths, and sets of paths). A relationship is a binary path… to create nested path structures. We present an SQL-like query language that is based on path expressions and we show how to use it to express multi-dimensional path queries that are suited for advanced data analysis in decision support environments like data warehousing environments.

  6. There Is No Knowledge Without Terminology. How Terminological Methods and Tools Can Help to Manage Monolingual and Multilingual Knowledge and Communication

    Directory of Open Access Journals (Sweden)

    Gabriele Sauberer

    2011-04-01

Full Text Available The paper presents “10 good reasons for terminology” in any expert field and any language(s) by discussing the areas of application in the public and the private sector as well as in science and education. After a short introduction on the history of terminology, the term “ontology” will be discussed as one of the key terms in current knowledge engineering and terminology. The paper gives an overview of means and methods of assuring and improving the quality of knowledge generation, communication and management through terminology. It also introduces the main standards, players and experts in the terminology community, such as the International Network for Terminology (www.termnet.org).

  7. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2017-01-01

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthe- sis, where the learning agent constructs artificial queries from scratch in order

  8. CDISC Terminology

    Science.gov (United States)

    Clinical Data Interchange Standards Consortium (CDISC) is an international, non-profit organization that develops and supports global data standards for medical research. CDISC is working actively with EVS to develop and support controlled terminology in several areas, notably CDISC's Study Data Tabulation Model (SDTM).

  9. Truth Space Method for Caching Database Queries

    Directory of Open Access Journals (Sweden)

    S. V. Mosin

    2015-01-01

Full Text Available We propose a new method of client-side data caching for relational databases with a central server and distant clients. Data are loaded into the client cache based on queries executed on the server. Every query has a corresponding DB table – the result of the query execution. These queries have a special form called "universal relational query", based on three fundamental relational algebra operations: selection, projection and natural join. Such a form is the closest one to natural language, and the majority of database search queries can be expressed in this way. Besides, this form allows us to analyze query correctness by checking the lossless-join property. A subsequent query may be executed in a client's local cache if we can determine that the query result is entirely contained in the cache. For this we compare the truth spaces of the logical restrictions in the new user query and in the queries whose results are already held in the cache. Such a comparison can be performed analytically, without the need for additional database queries. The method may also be used to identify data missing from the cache and to execute the query on the server only for those data; here, too, the analytical approach is used, which distinguishes our work from existing technologies. We propose four theorems for testing the required conditions. The conditions of the first and third theorems allow us to establish whether the required data exist in the cache. The second and fourth theorems state conditions under which queries can be executed against the cache only. The problem of cache data actualization is not discussed in this paper; however, it can be solved by cataloguing queries on the server and serving them with triggers in background mode. The article is published in the author's wording.
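
    For the special case of simple range restrictions, the containment test can be sketched as interval containment per attribute: a new query is answerable from the cache when its truth space lies inside that of a cached query. The paper treats general universal relational queries analytically; this shows only the core idea.

      # Toy containment test for simple range restrictions: a new query can be answered
      # from the local cache when its truth space (one interval per attribute) lies inside
      # the truth space of a cached query. The paper handles general universal relational
      # queries analytically; this shows only the core idea.
      def contained(new_q, cached_q):
          """Each query: dict attribute -> (low, high); a missing attribute is unrestricted."""
          for attr, (lo, hi) in new_q.items():
              if attr not in cached_q:
                  continue                      # cache unrestricted on attr: still contained
              c_lo, c_hi = cached_q[attr]
              if lo < c_lo or hi > c_hi:
                  return False                  # the new query needs rows outside the cache
          # an attribute restricted only in the cache shrinks it below the new query
          return all(attr in new_q for attr in cached_q)

      cached = {"age": (18, 65)}
      print(contained({"age": (30, 40)}, cached))                   # True  -> use the cache
      print(contained({"age": (30, 70)}, cached))                   # False -> go to the server
      print(contained({"age": (30, 40), "dept": (1, 5)}, cached))   # True  -> subset of the cache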

  10. TERMINOLOGY MANAGEMENT FRAMEWORK DEVIATIONS IN PROJECTS

    Directory of Open Access Journals (Sweden)

    Олена Борисівна ДАНЧЕНКО

    2015-05-01

Full Text Available The article reviews new approaches to managing deviations (risks, changes, problems) in projects. By offering integrated control of these project parameters, and by analogy with medical terminological systems, it builds a new system for managing terminological variations in projects. Using an improved method of definition triads, the medical terms that make up the terminological basis are analyzed. Using the method of analogy, new definitions for managing deviations in projects are proposed. Using the integrity of the triads, a new system of triads in project management is built, which will subsequently be used, again by analogy, to develop a new methodology for managing deviations in projects.

  11. Using SNOMED CT to Represent Two Interface Terminologies

    Science.gov (United States)

    Rosenbloom, S. Trent; Brown, Steven H.; Froehling, David; Bauer, Brent A.; Wahner-Roedler, Dietlind L.; Gregg, William M.; Elkin, Peter L.

    2009-01-01

Objective Interface terminologies are designed to support interactions between humans and structured medical information. In particular, many interface terminologies have been developed for structured computer based documentation systems. Experts and policy-makers have recommended that interface terminologies be mapped to reference terminologies. The goal of the current study was to evaluate how well the reference terminology SNOMED CT could map to and represent two interface terminologies, MEDCIN and the Categorical Health Information Structured Lexicon (CHISL). Design Automated mappings between SNOMED CT and 500 terms from each of the two interface terminologies were evaluated by human reviewers, who also searched SNOMED CT to identify better mappings when this was judged to be necessary. Reviewers judged whether they believed the interface terms to be clinically appropriate, whether the terms were covered by SNOMED CT concepts and whether the terms' implied semantic structure could be represented by SNOMED CT. Measurements Outcomes included concept coverage by SNOMED CT for study terms and their implied semantics. Agreement statistics and compositionality measures were calculated. Results The SNOMED CT terminology contained concepts to represent 92.4% of MEDCIN and 95.9% of CHISL terms. Semantic structures implied by study terms were less well covered, with some complex compositional expressions requiring semantics not present in SNOMED CT. Among sampled terms, those from MEDCIN were more complex than those from CHISL, containing an average of 3.8 versus 1.8 atomic concepts, respectively. PMID:18952944

  12. Optimizing Temporal Queries: Efficient Handling of Duplicates

    DEFF Research Database (Denmark)

    Toman, David; Bowman, Ivan Thomas

    2001-01-01

…these query languages are implemented by translating temporal queries into standard relational queries. However, the compiled queries are often quite cumbersome and expensive to execute even using state-of-the-art relational products. This paper presents an optimization technique that produces more efficient translated SQL queries by taking into account the properties of the encoding used for temporal attributes. For concreteness, this translation technique is presented in the context of SQL/TP; however, these techniques are also applicable to other temporal query languages.

  13. E-terminology*

    African Journals Online (AJOL)

    rbr

    knowledge continue to increase in both quantity and quality. In the lively scientific ... making human–computer interfaces with environments available. Tools to sup- ... equivalents can then be pasted into the document being translated. Compound ..... Terminology can be used for artificial intelligence purposes (e.g. speech.

  14. Interoperable Archetypes With a Three Folded Terminology Governance.

    Science.gov (United States)

    Pederson, Rune; Ellingsen, Gunnar

    2015-01-01

    The use of openEHR archetypes increases the interoperability of clinical terminology and, in doing so, improves the availability of clinical terminology for both primary and secondary purposes. Where clinical terminology is employed in the EPR system, research reports conflicting results on the use of structuring and standardization as measures of success. In order to elucidate this concept, this paper focuses on the effort to establish a national repository for openEHR-based archetypes in Norway, where clinical terminology could be included, with benefits for interoperability, under a three-folded terminology governance.

  15. A comparison study on algorithms of detecting long forms for short forms in biomedical text

    Directory of Open Access Journals (Sweden)

    Wu Cathy H

    2007-11-01

    Full Text Available Abstract Motivation With more and more research dedicated to literature mining in the biomedical domain, more and more systems are available for people to choose from when building literature mining applications. In this study, we focus on one specific kind of literature mining task, i.e., detecting definitions of acronyms, abbreviations, and symbols in biomedical text. We denote acronyms, abbreviations, and symbols as short forms (SFs) and their corresponding definitions as long forms (LFs). The study was designed to answer the following questions: (i) how well a system performs in detecting LFs from novel text, (ii) what the coverage is for various terminological knowledge bases in including SFs as synonyms of their LFs, and (iii) how to combine results from various SF knowledge bases. Method We evaluated the following three publicly available detection systems in detecting LFs for SFs: (i) a handcrafted pattern/rule-based system by Ao and Takagi, ALICE, (ii) a machine learning system by Chang et al., and (iii) a simple alignment-based program by Schwartz and Hearst. In addition, we investigated the conceptual coverage of two terminological knowledge bases: (i) the UMLS (the Unified Medical Language System), and (ii) the BioThesaurus (a thesaurus of names for all UniProt protein records). We also implemented a web interface that provides a virtual integration of various SF knowledge bases. Results We found that detection systems agree with each other in most cases, and the existing terminological knowledge bases have a good coverage of synonymous relationships for frequently defined LFs. The web interface allows people to detect SF definitions from text and to search several SF knowledge bases. Availability The web site is http://gauss.dbb.georgetown.edu/liblab/SFThesaurus.
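
    As a concrete illustration of the third system above, the following is a small re-implementation of the core long-form search step of the Schwartz and Hearst alignment algorithm: it scans backwards through the text preceding a parenthesised short form, matching each short-form character, and requires the first character to start a word. Function and variable names are mine, and edge cases handled by the published program are omitted.

```python
def find_best_long_form(short_form: str, candidate: str):
    """Scan `candidate` right-to-left, matching each character of `short_form`."""
    s_index = len(short_form) - 1
    c_index = len(candidate) - 1
    while s_index >= 0:
        s_char = short_form[s_index].lower()
        if not s_char.isalnum():          # skip punctuation in the short form
            s_index -= 1
            continue
        # Move left until the current short-form character matches; the first
        # character of the short form must additionally start a word.
        while c_index >= 0 and (candidate[c_index].lower() != s_char or
                                (s_index == 0 and c_index > 0
                                 and candidate[c_index - 1].isalnum())):
            c_index -= 1
        if c_index < 0:
            return None                   # no valid long form in this candidate
        s_index -= 1
        c_index -= 1
    return candidate[c_index + 1:].strip()

# "long form (SF)" pattern: the words before the parenthesis are the candidate.
print(find_best_long_form("UMLS", "researchers use the Unified Medical Language System"))
# -> "Unified Medical Language System"
```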

  16. Pro PHP and jQuery

    CERN Document Server

    Lengstorf, Jason

    2010-01-01

    This book is for intermediate programmers interested in building AJAX web applications using jQuery and PHP. Along with teaching some advanced PHP techniques, it will teach you how to take your dynamic applications to the next level by adding a JavaScript layer with jQuery. * Learn to utilize built-in PHP functions to build calendar tools. * Learn how jQuery can be used for AJAX, animation, client-side validation, and more. What you'll learn: * Use PHP to build a calendar application that allows users to post, view, edit, and delete events. * Use jQuery to allow the calendar app to be viewed and edited.

  17. Query recommendation for children

    NARCIS (Netherlands)

    Duarte Torres, Sergio; Hiemstra, Djoerd; Weber, Ingmar; Serdyukov, Pavel

    2012-01-01

    One of the biggest problems that children experience while searching the web occurs during the query formulation process. Children have been found to struggle formulating queries based on keywords given their limited vocabulary and their difficulty to choose the right keywords. In this work we

  18. Integration of relational and textual biomedical sources. A pilot experiment using a semi-automated method for logical schema acquisition.

    Science.gov (United States)

    García-Remesal, M; Maojo, V; Billhardt, H; Crespo, J

    2010-01-01

    Bringing together structured and text-based sources is an exciting challenge for biomedical informaticians, since most relevant biomedical sources belong to one of these categories. In this paper we evaluate the feasibility of integrating relational and text-based biomedical sources using: i) an original logical schema acquisition method for textual databases developed by the authors, and ii) OntoFusion, a system originally designed by the authors for the integration of relational sources. We conducted an integration experiment involving a test set of seven differently structured sources covering the domain of genetic diseases. We used our logical schema acquisition method to generate schemas for all textual sources. The sources were integrated using the methods and tools provided by OntoFusion. The integration was validated using a test set of 500 queries. A panel of experts answered a questionnaire to evaluate i) the quality of the extracted schemas, ii) the query processing performance of the integrated set of sources, and iii) the relevance of the retrieved results. The results of the survey show that our method extracts coherent and representative logical schemas. Experts' feedback on the performance of the integrated system and the relevance of the retrieved results was also positive. Regarding the validation of the integration, the system successfully provided correct results for all queries in the test set. The results of the experiment suggest that text-based sources including a logical schema can be regarded as equivalent to structured databases. Using our method, previous research and existing tools designed for the integration of structured databases can be reused - possibly subject to minor modifications - to integrate differently structured sources.

  19. Nuclear terminology during forty years

    International Nuclear Information System (INIS)

    Sjoestrand, N.G.

    1994-04-01

    In Sweden terminology work in the field of nuclear technology started in the early 1950s. Three dictionaries were completed in 1962, 1975 and 1990, respectively, mainly through cooperation between Swedish Mechanical Standards Institution (SMS) and Swedish Centre for Technical Terminology (TNC). In parallel to this, international work has been performed through the International Organization for Standardization (ISO). In conclusion, problems concerning some special terms are discussed. 17 refs

  20. Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories

    Science.gov (United States)

    Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

    2017-01-01

    To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution. PMID:29854239

  2. Towards Verbalizing SPARQL Queries in Arabic

    Directory of Open Access Journals (Sweden)

    I. Al Agha

    2016-04-01

    Full Text Available With the widespread adoption of Linked Open Data and Semantic Web technologies, a large amount of data has been published on the Web in the RDF and OWL formats. This data can be queried using SPARQL, the Semantic Web query language. SPARQL cannot be understood by ordinary users and is not directly accessible to humans, so they cannot check whether the retrieved answers truly correspond to the intended information need. Driven by this challenge, natural language generation from SPARQL has recently attracted considerable attention. However, most existing solutions for verbalizing SPARQL in natural language have focused on English and Latin-based languages; little effort has been made for Arabic, which has different characteristics and morphology. This work aims to help Arabic-speaking users perceive SPARQL queries on the Semantic Web by translating SPARQL into Arabic. It proposes an approach that takes a SPARQL query as input and generates a query expressed in Arabic as output. The translation process combines morpho-syntactic analysis and language dependencies to generate a legible and understandable Arabic query. The approach was preliminarily assessed with a sample query set, and results indicated that 75% of the queries were correctly translated into Arabic.
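
    To make the input/output relationship concrete, here is a deliberately naive verbalizer for basic graph patterns. It produces English rather than Arabic, uses simple templates rather than the paper's morpho-syntactic analysis and dependency handling, and the example query and URIs are illustrative only.

```python
import re

def local_name(term: str) -> str:
    """Strip URI brackets or variable markers and keep the last path segment."""
    term = term.strip("<>?")
    return re.split(r"[/#:]", term)[-1].replace("_", " ")

def verbalize(sparql: str) -> str:
    where = re.search(r"WHERE\s*{(.*)}", sparql, re.S | re.I).group(1)
    # Split triple patterns on " . " separators (naive; would break on literals with dots).
    patterns = [p.strip() for p in re.split(r"\s\.\s", where) if p.strip()]
    sentences = []
    for p in patterns:
        s, pred, o = p.split(None, 2)
        sentences.append(f"{local_name(s)} has {local_name(pred)} {local_name(o)}")
    return "Find results where " + " and ".join(sentences) + "."

query = """SELECT ?city WHERE {
  ?city <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/Egypt> .
  ?city <http://dbpedia.org/ontology/populationTotal> ?population .
}"""
print(verbalize(query))
# Find results where city has country Egypt and city has populationTotal population.
```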

  3. Terminology and methodology in modelling for water quality management

    DEFF Research Database (Denmark)

    Carstensen, J.; Vanrolleghem, P.; Rauch, W.

    1997-01-01

    There is a widespread need for a common terminology in modelling for water quality management. This paper points out sources of confusion in communication between researchers due to the misuse of existing terminology or the use of unclear terminology. The paper attempts to clarify the context...... of the most widely used terms for characterising models and within the process of model building. It is essential to the ever-growing community of researchers within water quality management that communication is eased by establishing a common terminology. This should not be done by giving broader definitions...... of the terms, but by stressing the use of a stringent terminology. Therefore, the goal of the paper is to advocate the use of such a well-defined and clear terminology. (C) 1997 IAWQ. Published by Elsevier Science Ltd....

  4. A Framework for WWW Query Processing

    Science.gov (United States)

    Wu, Binghui Helen; Wharton, Stephen (Technical Monitor)

    2000-01-01

    Query processing is the most common operation in a DBMS. Sophisticated query processing has mainly targeted a single enterprise environment providing centralized control over data and metadata. Queries submitted by anonymous users on the web are different in that load balancing and DBMS access control become the key issues. This paper provides a solution by introducing a framework for WWW query processing. The success of this framework lies in the utilization of query optimization techniques and an ontological approach. This methodology has proved to be cost effective at the NASA Goddard Space Flight Center Distributed Active Archive Center (GDAAC).

  5. Learning via Query Synthesis

    KAUST Repository

    Alabdulmohsin, Ibrahim Mansour

    2017-05-07

    Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has generally focused on finite concept classes or toy problems, with limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method, while the second algorithm is a Markovian method that relies on Khachiyan's classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and to synthesize queries, afterward, via spectral decomposition. Moreover, I also describe how these algorithms can be extended to other settings, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distribution provides reliable predictions with a high probability. This algorithm is then justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this

  6. jQuery Tools UI Library

    CERN Document Server

    Libby, Alex

    2012-01-01

    A practical tutorial with powerful yet simple projects that are quick to implement. This book is aimed at developers who have prior jQuery knowledge, but may not have any prior experience with jQuery Tools. It is possible that they may have started with the basics of jQuery Tools, but want to learn more about how it can be used, as well as get ideas for future projects.

  7. Joint Top-K Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Wu, Dingming; Yiu, Man Lung; Cong, Gao

    2012-01-01

    Web users and content are increasingly being geopositioned, and increased focus is being given to serving local content in response to web queries. This development calls for spatial keyword queries that take into account both the locations and textual descriptions of content. We study the efficient, joint processing of multiple top-k spatial keyword queries. Such joint processing is attractive during high query loads and also occurs when multiple queries are used to obfuscate a user's true query. We propose a novel algorithm and index structure for the joint processing of top-k spatial keyword queries. Empirical studies show that the proposed solution is efficient on real data sets. We also offer analytical studies on synthetic data sets to demonstrate the efficiency of the proposed solution.

  8. Federal Medication Terminologies

    Science.gov (United States)

    The Federal Medication (FedMed) collaboration of 8 partner agencies agreed on a set of standard, comprehensive, freely and easily accessible terminologies (the FMT) to improve the exchange and public availability of medication information.

  9. Medical Terminology.

    Science.gov (United States)

    Mercer County Community Coll., Trenton, NJ.

    This document is one of a series of student workbooks developed for workplace skill development courses or workshops by Mercer County Community College (New Jersey) and its partners. Designed to help employees of medical establishments learn medical terminology, this course provides information on basic word structure, body parts, suffixes and…

  10. Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.

    Science.gov (United States)

    Campagne, Fabien

    2008-02-29

    The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79-0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86-0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently. Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute
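
    The focused-search variant of the protocol can be sketched very simply: each record's title is used as a query against title-stripped abstracts, and the engine is credited when the record's own abstract is ranked first. The scoring function below is a placeholder for whatever engine is being evaluated, and the records and variable names are illustrative.

```python
# A minimal sketch of the focused-search "No Title Evaluation" idea: issue each Medline
# title as a query against title-stripped abstracts, and credit the engine when the
# abstract of the same record is ranked first. The term-overlap score is a placeholder,
# not the paper's evaluation machinery.
def score(query: str, document: str) -> float:
    q, d = set(query.lower().split()), set(document.lower().split())
    return len(q & d) / (len(q) or 1)

def nt_evaluate(records: dict[str, tuple[str, str]]) -> float:
    """records maps PMID -> (title, abstract); returns the fraction ranked first."""
    hits = 0
    for pmid, (title, _) in records.items():
        ranked = sorted(records, key=lambda p: score(title, records[p][1]), reverse=True)
        hits += ranked[0] == pmid
    return hits / len(records)

records = {
    "1": ("Gene expression in yeast", "We measured gene expression profiles in yeast cells."),
    "2": ("Protein folding dynamics", "Folding dynamics of a small protein were simulated."),
}
print(nt_evaluate(records))  # 1.0 for this toy corpus
```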

  11. Research Issues in Mobile Querying

    DEFF Research Database (Denmark)

    Breunig, M.; Jensen, Christian Søndergaard; Klein, M.

    2004-01-01

    This document reports on key aspects of the discussions conducted within the working group. In particular, the document aims to offer a structured and somewhat digested summary of the group's discussions. The document first offers concepts that enable characterization of "mobile queries" as well...... as the types of systems that enable such queries. It explores the notion of context in mobile queries. The document ends with a few observations, mainly regarding challenges....

  12. Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration.

    Science.gov (United States)

    Ong, Edison; Xiang, Zuoshuang; Zhao, Bin; Liu, Yue; Lin, Yu; Zheng, Jie; Mungall, Chris; Courtot, Mélanie; Ruttenberg, Alan; He, Yongqun

    2017-01-04

    Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
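
    A hedged sketch of using the SPARQL interface mentioned above from Python is shown below. The endpoint URL is an assumption for illustration (the abstract does not give it), so it should be checked against the Ontobee site; the query itself is ordinary SPARQL issued through the SPARQLWrapper library.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Assumed endpoint address for Ontobee's SPARQL interface; verify before use.
ONTOBEE_SPARQL = "http://sparql.hegroup.org/sparql"

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?term ?label
WHERE {
  ?term rdfs:label ?label .
  FILTER regex(str(?label), "asthma", "i")
}
LIMIT 10
"""

sparql = SPARQLWrapper(ONTOBEE_SPARQL)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Print each matching term URI with its label.
for binding in results["results"]["bindings"]:
    print(binding["term"]["value"], "-", binding["label"]["value"])
```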

  13. Terminology for pregnancy loss prior to viability

    DEFF Research Database (Denmark)

    Kolte, A M; Bernardi, L A; Christiansen, O B

    2015-01-01

    Pregnancy loss prior to viability is common and research in the field is extensive. Unfortunately, terminology in the literature is inconsistent. The lack of consensus regarding nomenclature and classification of pregnancy loss prior to viability makes it difficult to compare study results from...... different centres. In our opinion, terminology and definitions should be based on clinical findings, and when possible, transvaginal ultrasound. With this Early Pregnancy Consensus Statement, it is our goal to provide clear and consistent terminology for pregnancy loss prior to viability....

  14. On tractable query evaluation for SPARQL

    OpenAIRE

    Mengel, Stefan; Skritek, Sebastian

    2017-01-01

    Despite much work within the last decade on foundational properties of SPARQL - the standard query language for RDF data - rather little is known about the exact limits of tractability for this language. In particular, this is the case for SPARQL queries that contain the OPTIONAL-operator, even though it is one of the most intensively studied features of SPARQL. The aim of our work is to provide a more thorough picture of tractable classes of SPARQL queries. In general, SPARQL query evaluatio...

  15. Man vs. Machine: Differences in SPARQL Queries

    NARCIS (Netherlands)

    Rietveld, L.; Hoekstra, R.

    2014-01-01

    Server-side SPARQL query logs have been a topic of study for some time now. The USEWOD collection of query logs is currently the primary source of information for researchers. A recurring problem is that these logs leave application queries and queries created by humans indistinguishable. In this

  16. Development and evaluation of a biomedical search engine using a predicate-based vector space model.

    Science.gov (United States)

    Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

    2013-10-01

    Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query, using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex data structures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf and boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher for the predicate-based approach than for the keyword-based approach, laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.
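
    The following sketch shows the general idea of a predicate-based vector space model: documents and queries are bags of (subject, relation, object) triples compared by cosine similarity over tf-idf weights. The adjusted tf-idf and boost function developed in the paper are not reproduced; the triples and weighting below are illustrative only.

```python
import math
from collections import Counter

Triple = tuple[str, str, str]

def doc_freq(docs: list[list[Triple]]) -> Counter:
    """Number of documents each triple occurs in."""
    return Counter(t for doc in docs for t in set(doc))

def vectorize(doc: list[Triple], df: Counter, n: int) -> dict[Triple, float]:
    """tf-idf weights over triples; unseen triples default to document frequency 1."""
    tf = Counter(doc)
    return {t: tf[t] * math.log(n / df.get(t, 1)) for t in tf}

def cosine(a: dict, b: dict) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

docs = [
    [("aspirin", "treats", "headache"), ("aspirin", "inhibits", "cox-1")],
    [("tp53", "regulates", "apoptosis"), ("tp53", "associated_with", "cancer")],
]
query = [("tp53", "regulates", "apoptosis")]

df, n = doc_freq(docs), len(docs)
doc_vectors = [vectorize(d, df, n) for d in docs]
query_vector = vectorize(query, df, n)
print([round(cosine(query_vector, dv), 3) for dv in doc_vectors])  # second document wins
```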

  17. How Good Are Query Optimizers, Really?

    NARCIS (Netherlands)

    Leis, Viktor; Gubichev, Andrey; Mirchev, Atanas; Boncz, Peter; Kemper, Alfons; Neumann, Thomas

    2016-01-01

    Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the

  18. Reconciliation of ontology and terminology to cope with linguistics.

    Science.gov (United States)

    Baud, Robert H; Ceusters, Werner; Ruch, Patrick; Rassinoux, Anne-Marie; Lovis, Christian; Geissbühler, Antoine

    2007-01-01

    To discuss the relationships between ontologies, terminologies and language in the context of Natural Language Processing (NLP) applications in order to show the negative consequences of confusing them. The viewpoints of the terminologist and (computational) linguist are developed separately, and then compared, leading to the presentation of reconciliation among these points of view, with consideration of the role of the ontologist. In order to encourage appropriate usage of terminologies, guidelines are presented advocating the simultaneous publication of pragmatic vocabularies supported by terminological material based on adequate ontological analysis. Ontologies, terminologies and natural languages each have their own purpose. Ontologies support machine understanding, natural languages support human communication, and terminologies should form the bridge between them. Therefore, future terminology standards should be based on sound ontology and do justice to the diversities in natural languages. Moreover, they should support local vocabularies, in order to be easily adaptable to local needs and practices.

  19. Querying XML Data with SPARQL

    Science.gov (United States)

    Bikakis, Nikos; Gioldasis, Nektarios; Tsinaraki, Chrisa; Christodoulakis, Stavros

    SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries which are used to access the XML databases. We present the algorithms and the implementation of SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.

  20. Superfund Query

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Superfund Query allows users to retrieve data from the Comprehensive Environmental Response, Compensation, and Liability Information System (CERCLIS) database.

  1. User Experimentation with Terminological Ontologies

    DEFF Research Database (Denmark)

    Pram Nielsen, Louise

    This paper outlines work-in-progress research suggesting that domain-specific knowledge in terminological resources can be transferred efficiently to end-users across different levels of expertise and by means of different information modes including articles (written mode) and concept diagrams...... (graph mode). An experimental approach is applied in an eye-tracking laboratory, where a natural user situation is replicated for Danish professional potential end-users of a terminology and knowledge bank in a chosen pilot domain (taxation).

  2. Terminology and definitions on groin pain in athletes

    DEFF Research Database (Denmark)

    Weir, Adam; Hölmich, Per; Schache, Anthony G

    2015-01-01

    BACKGROUND: Groin pain in athletes occurs frequently and can be difficult to treat, which may partly be due to the lack of agreement on diagnostic terminology. OBJECTIVE: To perform a short Delphi survey on terminology agreement for groin pain in athletes by a group of experts. METHODS: A selected...... taxonomy reflects only a slight agreement between the various diagnostic terms provided by the selected experts. CONCLUSIONS: This short Delphi survey of two 'typical, straightforward' cases demonstrated major inconsistencies in the diagnostic terminology used by experts for groin pain in athletes....... These results underscore the need for consensus on definitions and terminology on groin pain in athletes....

  3. Optimizing queries in distributed systems

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2006-01-01

    Full Text Available This research presents the main elements of query optimization in distributed systems. First, the data architecture is presented in relation to the system-level architecture of a distributed environment. Then the architecture of a distributed database management system (DDBMS) is described at the conceptual level, followed by a presentation of the distributed query execution steps in these information systems. The research ends with a presentation of some aspects of distributed database query optimization and the strategies used for it.

  4. The National Terminology Services: A New Paradigm

    African Journals Online (AJOL)

    terminologies which are vital for meaningful interaction, thus protecting the ... taken that this situation does not counteract the goal and promotion of multi- .... touch with the changing terminology needs of society in a fast changing environment ...

  5. Advanced Query Formulation in Deductive Databases.

    Science.gov (United States)

    Niemi, Timo; Jarvelin, Kalervo

    1992-01-01

    Discusses deductive databases and database management systems (DBMS) and introduces a framework for advanced query formulation for end users. Recursive processing is described, a sample extensional database is presented, query types are explained, and criteria for advanced query formulation from the end user's viewpoint are examined. (31…

  6. Dynamic Planar Range Maxima Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Tsakalidis, Konstantinos

    2011-01-01

    We consider the dynamic two-dimensional maxima query problem. Let P be a set of n points in the plane. A point is maximal if it is not dominated by any other point in P. We describe two data structures that support the reporting of the t maximal points that dominate a given query point, and allow...... for insertions and deletions of points in P. In the pointer machine model we present a linear space data structure with O(log n + t) worst case query time and O(log n) worst case update time. This is the first dynamic data structure for the planar maxima dominance query problem that achieves these bounds...... are integers in the range U = {0, …, 2^w − 1}. We present a linear space data structure that supports 3-sided range maxima queries in O(log n / log log n + t) worst case time and updates in O(log n / log log n) worst case time. These are the first sublogarithmic worst case bounds for all operations in the RAM model....
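
    The query being supported can be illustrated with a brute-force sketch: report the maximal points of P that dominate a query point q. The data structures above answer this in logarithmic time per query with updates; the quadratic version below only fixes the semantics, and the sample points are arbitrary.

```python
# A brute-force sketch of the maxima dominance query: report the maximal points of P
# that dominate a query point q (a point dominates q if it is >= q in both coordinates).
def dominates(a: tuple[float, float], b: tuple[float, float]) -> bool:
    return a[0] >= b[0] and a[1] >= b[1] and a != b

def maxima_dominating(points, q):
    # Candidates are the points dominating q; keep those not dominated by another candidate.
    candidates = [p for p in points if p[0] >= q[0] and p[1] >= q[1]]
    return [p for p in candidates if not any(dominates(other, p) for other in candidates)]

P = [(1, 9), (3, 7), (5, 5), (7, 3), (9, 1), (2, 2)]
print(maxima_dominating(P, (2, 2)))  # [(3, 7), (5, 5), (7, 3)]
```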

  7. CDAPubMed: a browser extension to retrieve EHR-based biomedical literature

    Directory of Open Access Journals (Sweden)

    Perez-Rey David

    2012-04-01

    Full Text Available Abstract Background Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. Results We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. Conclusions CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard
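
    The overall pipeline of steps (i) to (iii) can be sketched as follows: pull coded terms out of a CDA XML document and turn them into a PubMed query. The XPath, the use of the displayName attribute, and the sample document are simplifying assumptions for illustration; the esearch URL is the standard NCBI E-utilities endpoint, but the extension's own term selection and query building are richer than this.

```python
import urllib.parse
import xml.etree.ElementTree as ET

CDA_NS = {"cda": "urn:hl7-org:v3"}

def extract_terms(cda_xml: str) -> list[str]:
    """Collect displayName attributes of coded entries (assumed element layout)."""
    root = ET.fromstring(cda_xml)
    codes = root.findall(".//cda:code[@displayName]", CDA_NS)
    return sorted({c.get("displayName") for c in codes})

def pubmed_query_url(terms: list[str]) -> str:
    # Standard NCBI E-utilities esearch endpoint; terms are ANDed together.
    term = " AND ".join(f'"{t}"' for t in terms)
    return ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term="
            + urllib.parse.quote(term))

sample = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><section>
    <code displayName="Diabetes Mellitus"/>
    <code displayName="Hypertension"/>
  </section></component>
</ClinicalDocument>"""
print(pubmed_query_url(extract_terms(sample)))
```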

  8. Nearest Neighbor Queries in Road Networks

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach

    2003-01-01

    in road networks. Such queries may be of use in many services. Specifically, we present an easily implementable data model that serves well as a foundation for such queries. We also present the design of a prototype system that implements the queries based on the data model. The algorithm used...

  9. Fingerprinting Keywords in Search Queries over Tor

    Directory of Open Access Journals (Sweden)

    Oh Se Eun

    2017-10-01

    Full Text Available Search engine queries contain a great deal of private and potentially compromising information about users. One technique to prevent search engines from identifying the source of a query, and Internet service providers (ISPs) from identifying the contents of queries, is to query the search engine over an anonymous network such as Tor.

  10. Adding Query Privacy to Robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2011-01-01

    intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... of obtaining query privacy over robust DHTs. Finally, we compare the performance of our privacy-preserving protocols with their more privacy-invasive counterparts. We observe that there is no increase in the message complexity and only a small overhead in the computational complexity....

  11. QUERY SUPPORT FOR GMZ

    Directory of Open Access Journals (Sweden)

    A. Khandelwal

    2017-07-01

    Full Text Available Generic text-based compression models are simple and fast but there are two issues that need to be addressed. They cannot leverage the structure that exists in data to achieve better compression and there is an unnecessary decompression step before the user can actually use the data. To address these issues, we came up with GMZ, a lossless compression model aimed at achieving high compression ratios. The decision to design GMZ (Khandelwal and Rajan, 2017) exclusively for GML's Simple Features Profile (SFP) seems fair because of the high use of SFP in WFS and because it facilitates high optimisation of the compression model. This is an extension of our work on GMZ. In a typical server-client model such as Web Feature Service, the server is the primary creator and provider of GML, and therefore requires compression and query capabilities. On the other hand, the client is the primary consumer of GML, and therefore requires decompression and visualisation capabilities. In the first part of our work, we demonstrated compression using a python script that can be plugged into a server architecture, and decompression and visualisation in a web browser using a Firefox addon. The focus of this work is to develop the already existing tools to provide query capability on the server. Our model provides the ability to decompress individual features in isolation, which is an essential requirement for realising queries in the compressed state. We construct an R-Tree index for spatial data and a custom index for non-spatial data and store these in a separate index file to prevent altering the compression model. This facilitates independent use of the compressed GMZ file, where the index can be constructed when required. The focus of this work is the bounding-box or range query commonly used in webGIS, with provision for other spatial and non-spatial queries. The decrement in compression ratios due to the new index file is in the range of 1–3 percent which is trivial considering
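
    The role of the spatial index can be sketched with the rtree Python package: features are indexed by their bounding boxes so a range query touches (and, in GMZ, decompresses) only the intersecting features. The feature geometries below are made up for illustration and unrelated to any particular GML/GMZ file.

```python
# Bounding-box (range) query over an R-tree index, using the `rtree` package.
from rtree import index

features = {
    1: (10.0, 20.0, 11.0, 21.0),   # id -> (min_x, min_y, max_x, max_y)
    2: (30.0, 40.0, 31.5, 41.5),
    3: (10.5, 20.5, 12.0, 22.0),
}

idx = index.Index()
for fid, bbox in features.items():
    idx.insert(fid, bbox)

query_window = (9.0, 19.0, 12.5, 22.5)
hits = sorted(idx.intersection(query_window))
print(hits)  # [1, 3]: only these features would be decompressed and returned
```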

  12. Ranking Queries on Uncertain Data

    CERN Document Server

    Hua, Ming

    2011-01-01

    Uncertain data is inherent in many important applications, such as environmental surveillance, market analysis, and quantitative economics research. Due to the importance of those applications and the rapidly increasing amounts of uncertain data collected and accumulated, analyzing large collections of uncertain data has become an important task. Ranking queries (also known as top-k queries) are often natural and useful in analyzing uncertain data. Ranking Queries on Uncertain Data discusses the motivations/applications, challenging problems, the fundamental principles, and the evaluation algorithms.

  13. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between...... the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  14. Predecessor queries in dynamic integer sets

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting

    1997-01-01

    We consider the problem of maintaining a set of n integers in the range 0..2^w − 1 under the operations of insertion, deletion, predecessor queries, minimum queries and maximum queries on a unit cost RAM with word size w bits. Let f(n) be an arbitrary nondecreasing smooth function satisfying n...
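
    The operations involved are easy to state with a plain sorted-list sketch: binary search gives O(log n) predecessor queries at the cost of O(n) updates, whereas the structures in the paper obtain far better bounds on the word RAM. The class below only illustrates the interface, not the technique.

```python
import bisect

class IntegerSet:
    """Sorted-list baseline for insert, delete, predecessor, minimum and maximum."""
    def __init__(self):
        self._items = []          # kept sorted, no duplicates

    def insert(self, x: int):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def delete(self, x: int):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            self._items.pop(i)

    def predecessor(self, x: int):
        """Largest element strictly smaller than x, or None."""
        i = bisect.bisect_left(self._items, x)
        return self._items[i - 1] if i > 0 else None

    def minimum(self):
        return self._items[0] if self._items else None

    def maximum(self):
        return self._items[-1] if self._items else None

s = IntegerSet()
for x in (5, 1, 9, 7):
    s.insert(x)
print(s.predecessor(8), s.minimum(), s.maximum())  # 7 1 9
```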

  15. NCBO Ontology Recommender 2.0: an enhanced approach for biomedical ontology recommendation.

    Science.gov (United States)

    Martínez-Romero, Marcos; Jonquet, Clement; O'Connor, Martin J; Graybeal, John; Pazos, Alejandro; Musen, Mark A

    2017-06-07

    Ontologies and controlled terminologies have become increasingly important in biomedical research. Researchers use ontologies to annotate their data with ontology terms, enabling better data integration and interoperability across disparate datasets. However, the number, variety and complexity of current biomedical ontologies make it cumbersome for researchers to determine which ones to reuse for their specific needs. To overcome this problem, in 2010 the National Center for Biomedical Ontology (NCBO) released the Ontology Recommender, which is a service that receives a biomedical text corpus or a list of keywords and suggests ontologies appropriate for referencing the indicated terms. We developed a new version of the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a novel recommendation approach that evaluates the relevance of an ontology to biomedical text data according to four different criteria: (1) the extent to which the ontology covers the input data; (2) the acceptance of the ontology in the biomedical community; (3) the level of detail of the ontology classes that cover the input data; and (4) the specialization of the ontology to the domain of the input data. Our evaluation shows that the enhanced recommender provides higher quality suggestions than the original approach, providing better coverage of the input data, more detailed information about their concepts, increased specialization for the domain of the input data, and greater acceptance and use in the community. In addition, it provides users with more explanatory information, along with suggestions of not only individual ontologies but also groups of ontologies to use together. It also can be customized to fit the needs of different ontology recommendation scenarios. Ontology Recommender 2.0 suggests relevant ontologies for annotating biomedical text data. It combines the strengths of its predecessor with a range of adjustments and new features that improve its reliability

  16. Flexible Query Answering Systems 2006

    DEFF Research Database (Denmark)

    -computer interaction. The overall theme of the FQAS conferences is innovative query systems aimed at providing easy, flexible, and intuitive access to information. Such systems are intended to facilitate retrieval from information repositories such as databases, libraries, and the World-Wide Web. These repositories......This volume constitutes the proceedings of the Seventh International Conference on Flexible Query Answering Systems, FQAS 2006, held in Milan, Italy, on June 7--10, 2006. FQAS is the premier conference for researchers and practitioners concerned with the vital task of providing easy, flexible...... are typically equipped with standard query systems which are often inadequate, and the focus of FQAS is the development of query systems that are more expressive, informative, cooperative, and productive. These proceedings contain contributions from invited speakers and 53 original papers out of about 100...

  17. Ontology mapping and data discovery for the translational investigator.

    Science.gov (United States)

    Wynden, Rob; Weiner, Mark G; Sim, Ida; Gabriel, Davera; Casale, Marco; Carini, Simona; Hastings, Shannon; Ervin, David; Tu, Samson; Gennari, John H; Anderson, Nick; Mobed, Ketty; Lakshminarayanan, Prakash; Massary, Maggie; Cucina, Russ J

    2010-03-01

    An integrated data repository (IDR) containing aggregations of clinical, biomedical, economic, administrative, and public health data is a key component of an overall translational research infrastructure. But most available data repositories are designed using standard data warehouse architecture that employs arbitrary data encoding standards, making queries across disparate repositories difficult. In response to these shortcomings we have designed a Health Ontology Mapper (HOM) that translates terminologies into formal data encoding standards without altering the underlying source data. We believe the HOM system promotes inter-institutional data sharing and research collaboration, and will ultimately lower the barrier to developing and using an IDR.

  18. Spatio-temporal databases complex motion pattern queries

    CERN Document Server

    Vieira, Marcos R

    2013-01-01

    This brief presents several new query processing techniques, called complex motion pattern queries, specifically designed for very large spatio-temporal databases of moving objects. The brief begins with the definition of flexible pattern queries, which are powerful because of the integration of variables and motion patterns. This is followed by a summary of the expressive power of patterns and flexibility of pattern queries. The brief then present the Spatio-Temporal Pattern System (STPS) and density-based pattern queries. STPS databases contain millions of records with information about mobi

  19. Management and Internal Standardization of Chemistry Terminology ...

    African Journals Online (AJOL)

    This in turn implies the development, consolidation and especially ... This article describes the terminological processing of a technical source text prior to translation, ... functions, i.e. languages of learning and teaching, and also of scientific dis- ... tronic terminology management systems or translation memory systems.

  20. Multi-Dimensional Top-k Dominating Queries

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Mamoulis, Nikos

    2009-01-01

    The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate...
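
    The query semantics can be pinned down with a brute-force sketch: score each object by how many others it dominates (assuming smaller values are preferred in every dimension) and keep the k best. The paper is about answering this efficiently with index structures, which the quadratic version below does not attempt; the hotel data are invented for illustration.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def top_k_dominating(objects, k):
    scored = [(sum(dominates(o, other) for other in objects), o) for o in objects]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]

hotels = [(100, 1.0), (120, 2.0), (150, 3.0), (90, 4.0), (200, 5.0)]  # (price, distance)
print(top_k_dominating(hotels, 2))
# [(3, (100, 1.0)), (2, (120, 2.0))]: dominance counts and the hotels themselves
```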

  1. Query optimization over crowdsourced data

    KAUST Repository

    Park, Hyunjung; Widom, Jennifer

    2013-01-01

    Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on-demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model

  2. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim

    2017-05-18

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.

  3. jQuery UI 1.10 the user interface library for jQuery

    CERN Document Server

    Libby, Alex

    2013-01-01

    This book consists of an easy-to-follow, example-based approach that leads you step-by-step through the implementation and customization of each library component.This book is for frontend designers and developers who need to learn how to use jQuery UI quickly. To get the most out of this book, you should have a good working knowledge of HTML, CSS, and JavaScript, and should ideally be comfortable using jQuery.

  4. Optimal Planar Orthogonal Skyline Counting Queries

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Larsen, Kasper Green

    2014-01-01

    counting queries, i.e. given a query rectangle R to report the size of the skyline of P ∩ R. We present a data structure for storing n points with integer coordinates having query time O(lg n / lg lg n) and space usage O(n). The model of computation is a unit cost RAM with logarithmic word size. We prove...

  5. Terminology: A necessary tool for the Specialized Translator

    Directory of Open Access Journals (Sweden)

    Aura E. Navarro

    2016-03-01

    Full Text Available Language disciplines, including Terminology and Specialized Translation, have made great strides since the second half of the twentieth century. This development, related to the technological growth and international communication that occurred during this period, has resulted in a considerable increase in concepts. Thus, experts have become more and more aware of the importance of naming these new concepts. Specialized translators were among the first language professionals to recognize the need to master the terminology of specialized fields in order to perform their duties well (Antia et coll., 2005). In this work, we study the very close relationship between Terminology and Specialized Translation. We also study the theoretical and practical knowledge of Terminology that a specialized translator should have.

  6. Terminology in South Africa Terminologie in Suid-Afrika

    Directory of Open Access Journals (Sweden)

    Mariëtta Alberts

    2012-09-01

    Full Text Available

    This article deals with terminology and terminography in South Africa. It gives the different meanings attached to the term terminology and describes points of difference between terminology and terminography. It focuses on the dimensions of terminology, namely the cognitive, linguistic and communicative dimension. Since terminologists need to consult with subject specialists, linguists, language users and mother-tongue speakers during different phases of the terminography process, the role of consultation in terminology work is stressed. Various aspects such as cultural differences that need to be taken care of, are discussed. The current South African terminology and terminography situation regarding terminology work undertaken by the National Language Service is examined. Emphasis is placed on the database system being used and the National Termbank. Terminology training also receives attention.

    Keywords: terminology, terminography, terminologist, terminographer, cognitive dimension, linguistic dimension, communicative dimension, technical dictionary, subject specialist, subject field, subject-oriented, concept-oriented, language-oriented, standardisation, primary term formation, secondary term formation, loan words, borrowing, transliteration, neologism, extension of meaning, total embedding, transference

  7. Using lexical and logical methods for the alignment of medical terminologies

    NARCIS (Netherlands)

    Klein, Michel; Aleksovski, Zharko

    2005-01-01

    Standardized medical terminologies are often used for the registration of patient data. In several situations there is a need to align these terminologies to other terminologies. Even when the terminologies cover the same domain, this is often a non-trivial task. The task is even more complicated

  8. Collections for terminology in chemistry

    International Nuclear Information System (INIS)

    1974-08-01

    This book describes terminology in chemistry and is divided into seven chapters. The contents cover element names; names of inorganic compounds such as ions, radicals and polyacids; general principles and names for organic compounds; general terminology 1 and 2; units and notation, including unit symbols, numbers and pH; the Korean rendering of chemists' names; and a summary of IUPAC nomenclature for organic compounds, covering hydrocarbons, fused polycyclic hydrocarbons, bridged hydrocarbons, cyclic hydrocarbons with side chains, terpene hydrocarbons, fundamental heterocyclic systems and heterocyclic spiro compounds.

  9. Evaluating standard terminologies for encoding allergy information.

    Science.gov (United States)

    Goss, Foster R; Zhou, Li; Plasek, Joseph M; Broverman, Carol; Robinson, George; Middleton, Blackford; Rocha, Roberto A

    2013-01-01

    Allergy documentation and exchange are vital to ensuring patient safety. This study aims to analyze and compare various existing standard terminologies for representing allergy information. Five terminologies were identified, including the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), National Drug File-Reference Terminology (NDF-RT), Medical Dictionary for Regulatory Activities (MedDRA), Unique Ingredient Identifier (UNII), and RxNorm. A qualitative analysis was conducted to compare desirable characteristics of each terminology, including content coverage, concept orientation, formal definitions, multiple granularities, vocabulary structure, subset capability, and maintainability. A quantitative analysis was also performed to compare the content coverage of each terminology for (1) common food, drug, and environmental allergens and (2) descriptive concepts for common drug allergies, adverse reactions (AR), and no known allergies. Our qualitative results show that SNOMED CT fulfilled the greatest number of desirable characteristics, followed by NDF-RT, RxNorm, UNII, and MedDRA. Our quantitative results demonstrate that RxNorm had the highest concept coverage for representing drug allergens, followed by UNII, SNOMED CT, NDF-RT, and MedDRA. For food and environmental allergens, UNII demonstrated the highest concept coverage, followed by SNOMED CT. For representing descriptive allergy concepts and adverse reactions, SNOMED CT and NDF-RT showed the highest coverage. Only SNOMED CT was capable of representing unique concepts for encoding no known allergies. The proper terminology for encoding a patient's allergy is complex, as multiple elements need to be captured to form a fully structured clinical finding. Our results suggest that while gaps still exist, a combination of SNOMED CT and RxNorm can satisfy most criteria for encoding common allergies and provide sufficient content coverage.

  10. PAQ: Persistent Adaptive Query Middleware for Dynamic Environments

    Science.gov (United States)

    Rajamani, Vasanth; Julien, Christine; Payton, Jamie; Roman, Gruia-Catalin

    Pervasive computing applications often entail continuous monitoring tasks, issuing persistent queries that return continuously updated views of the operational environment. We present PAQ, a middleware that supports applications' needs by approximating a persistent query as a sequence of one-time queries. PAQ introduces an integration strategy abstraction that allows composition of one-time query responses into streams representing sophisticated spatio-temporal phenomena of interest. A distinguishing feature of our middleware is the realization that the suitability of a persistent query's result is a function of the application's tolerance for accuracy weighed against the associated overhead costs. In PAQ, programmers can specify an inquiry strategy that dictates how information is gathered. Since network dynamics impact the suitability of a particular inquiry strategy, PAQ associates an introspection strategy with a persistent query, that evaluates the quality of the query's results. The result of introspection can trigger application-defined adaptation strategies that alter the nature of the query. PAQ's simple API makes developing adaptive querying systems easily realizable. We present the key abstractions, describe their implementations, and demonstrate the middleware's usefulness through application examples and evaluation.
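
    A hedged sketch of the pattern described above is shown below: a persistent query approximated by repeated one-time queries, with pluggable integration, introspection and adaptation strategies. All function names and the toy strategies are invented for illustration and are not PAQ's actual API.

```python
import time
from typing import Callable, Iterable, List

def persistent_query(one_time_query: Callable[[], Iterable],
                     integrate: Callable[[List, Iterable], List],
                     introspect: Callable[[List], float],
                     rounds: int = 5,
                     interval: float = 1.0) -> List:
    view = []
    for _ in range(rounds):
        snapshot = one_time_query()            # inquiry strategy: issue a one-time query
        view = integrate(view, snapshot)       # integration strategy: update the view
        quality = introspect(view)             # introspection strategy: judge the result
        if quality < 0.5:
            interval = max(0.1, interval / 2)  # adaptation strategy: poll more often
        time.sleep(interval)
    return view

# Toy usage: "sense" a reading each round, keep the latest few, judge quality by count.
readings = iter(range(100))
result = persistent_query(
    one_time_query=lambda: [next(readings)],
    integrate=lambda view, snap: (view + list(snap))[-10:],
    introspect=lambda view: min(1.0, len(view) / 10),
    rounds=3, interval=0.01)
print(result)  # [0, 1, 2]
```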

  11. Pareto-depth for multiple-query image retrieval.

    Science.gov (United States)

    Hsiao, Ko-Jen; Calder, Jeff; Hero, Alfred O

    2015-02-01

    Most content-based image retrieval systems consider either one single query, or multiple queries that include the same object or represent the same semantic information. In this paper, we consider the content-based image retrieval problem for multiple query images corresponding to different image semantics. We propose a novel multiple-query information retrieval algorithm that combines the Pareto front method with efficient manifold ranking. We show that our proposed algorithm outperforms state of the art multiple-query retrieval algorithms on real-world image databases. We attribute this performance improvement to concavity properties of the Pareto fronts, and prove a theoretical result that characterizes the asymptotic concavity of the fronts.
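
    As a rough illustration of the Pareto front idea used above (not the authors' algorithm, which also relies on efficient manifold ranking), the sketch below computes the first Pareto front over per-query dissimilarity scores; the image names and scores are invented.

        # Minimal sketch: first Pareto front over two per-query dissimilarity scores.
        # 'items' maps an image id to its (dissimilarity to query 1, dissimilarity to
        # query 2) pair; lower is better on both axes. Data is invented.

        def dominates(a, b):
            """True if a is at least as good as b on both axes and better on at least one."""
            return a[0] <= b[0] and a[1] <= b[1] and a != b

        def first_pareto_front(items):
            return [name for name, point in items.items()
                    if not any(dominates(other, point) for other in items.values())]

        items = {
            "img_a": (0.10, 0.90),
            "img_b": (0.40, 0.30),
            "img_c": (0.50, 0.50),   # dominated by img_b
            "img_d": (0.85, 0.05),
        }
        print(first_pareto_front(items))   # ['img_a', 'img_b', 'img_d']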

  12. EquiX-A Search and Query Language for XML.

    Science.gov (United States)

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  13. QUERY RESPONSE TIME COMPARISON NOSQLDB MONGODB WITH SQLDB ORACLE

    Directory of Open Access Journals (Sweden)

    Humasak T. A. Simanjuntak

    2015-01-01

    Full Text Available Data storage nowadays comes in two forms: relational databases and non-relational databases. The two kinds of DBMS (Database Management System) differ in several aspects, such as query execution performance, scalability, reliability and data storage structure. This study aims to compare the DBMS performance of Oracle, as a relational database, and MongoDB, as a non-relational database, in processing structured data. Experiments were carried out to compare the performance of the two DBMSs for insert, select, update and delete operations, using both simple and complex queries on the Northwind database. To achieve the goal of the experiment, 18 queries were executed: 2 insert queries, 10 select queries, 2 update queries and 2 delete queries. The queries were executed through a .Net application built as an intermediary between the user and the database. Experiments were carried out on tables with and without relations in Oracle, and on embedded and non-embedded documents in MongoDB. The response time of each query execution was compared using statistical methods. The experiments show that the response times of select, insert and update queries are faster on MongoDB than on Oracle: MongoDB is 64.8% faster for select queries, 72.8% faster for insert queries and 33.9% faster for update queries. For delete queries, Oracle is 96.8% faster than MongoDB on tables with relations, but MongoDB is 83.8% faster than Oracle on tables without relations. For complex queries, Map Reduce on MongoDB is 97.6% slower than the equivalent complex query using aggregate functions on Oracle.
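
    A minimal Python sketch of the response-time measurement described above, assuming an already-open DB-API connection ora_conn (for example via cx_Oracle) and a pymongo database handle mongo_db; the study itself used a .Net application, and the table, collection and column names below are placeholders.

        # Illustrative response-time measurement for one equivalent query on each store.
        # 'ora_conn' and 'mongo_db' are assumed, already-open handles; the schema is invented.
        import statistics
        import time

        def mean_response_time(run_query, repeats=30):
            samples = []
            for _ in range(repeats):
                start = time.perf_counter()
                run_query()
                samples.append(time.perf_counter() - start)
            return statistics.mean(samples)

        def oracle_select():
            cur = ora_conn.cursor()
            cur.execute("SELECT customer_id, company_name FROM customers WHERE country = :c",
                        {"c": "Germany"})
            cur.fetchall()

        def mongo_select():
            list(mongo_db.customers.find({"country": "Germany"},
                                         {"customer_id": 1, "company_name": 1}))

        # print("Oracle :", mean_response_time(oracle_select))
        # print("MongoDB:", mean_response_time(mongo_select))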

  14. Comparison of Japanese notation and meanings among three terminologies in radiological technology domain

    International Nuclear Information System (INIS)

    Yagahara, Ayako; Tsuji, Shintaro; Fukuda, Akihisa; Nishimoto, Naoki; Ogasawara, Katsuhiko

    2016-01-01

    The purpose of this study is to investigate the differences in the notation of technical terms and their meanings among three terminologies in Japanese radiology-related societies. The three terminologies compared in this study were 'radiological technology terminology' and its supplement published by the Japan Society of Radiological Technology, 'medical physics terminology' published by the Japan Society of Medical Physics, and 'electric radiation terminology' published by the Japan Radiological Society. Terms were entered into spreadsheets and classified into the following three categories: Japanese notation, English notation, and meanings. In the English notation, terms were matched to character strings in the three terminologies and were extracted and compared. The Japanese notations were compared among the three terminologies, and the differences in meaning between two of the terminologies, radiological technology terminology and electric radiation terminology, were compared. There were a total of 14,982 terms in the three terminologies. In English character strings, 2,735 terms were matched to more than two terminologies, with 801 of these terms matched to all three terminologies. Of those terms in English character strings matched to three terminologies, 752 matched to Japanese character strings. Of the terms in English character strings matched to two terminologies, 1,240 matched to Japanese character strings. With regard to the meanings category, eight terms had mismatched meanings between the two terminologies. For these terms, there were common concepts underlying the two different meanings, and it was considered that the derived concepts were described based on the domain. (author)

  15. Learning jQuery

    CERN Document Server

    Chaffer, Jonathan

    2013-01-01

    Step through each of the core concepts of the jQuery library, building an overall picture of its capabilities. Once you have thoroughly covered the basics, the book returns to each concept to cover more advanced examples and techniques. This book is for web designers who want to create interactive elements for their designs, and for developers who want to create the best user interface for their web applications. Basic JavaScript programming and knowledge of HTML and CSS is required. No knowledge of jQuery is assumed, nor is experience with any other JavaScript libraries.

  16. Knowledge Query Language (KQL)

    Science.gov (United States)

    2016-02-12

    described as a sparse, distributed multidimensional sorted map. Unlike a relational database, BigTable has no multicolumn primary keys or constraints. [Figure 3: address expression-based querying.] The implementation described in later sections stores the instance of the registry ontology in JSON files.

  17. A novel biomedical image indexing and retrieval system via deep preference learning.

    Science.gov (United States)

    Pang, Shuchao; Orgun, Mehmet A; Yu, Zhezhou

    2018-05-01

    The traditional biomedical image retrieval methods as well as content-based image retrieval (CBIR) methods originally designed for non-biomedical images either only consider using pixel and low-level features to describe an image or use deep features to describe images but still leave a lot of room for improving both accuracy and efficiency. In this work, we propose a new approach, which exploits deep learning technology to extract the high-level and compact features from biomedical images. The deep feature extraction process leverages multiple hidden layers to capture substantial feature structures of high-resolution images and represent them at different levels of abstraction, leading to an improved performance for indexing and retrieval of biomedical images. We exploit the current popular and multi-layered deep neural networks, namely, stacked denoising autoencoders (SDAE) and convolutional neural networks (CNN) to represent the discriminative features of biomedical images by transferring the feature representations and parameters of pre-trained deep neural networks from another domain. Moreover, in order to index all the images for finding the similarly referenced images, we also introduce preference learning technology to train and learn a kind of a preference model for the query image, which can output the similarity ranking list of images from a biomedical image database. To the best of our knowledge, this paper introduces preference learning technology for the first time into biomedical image retrieval. We evaluate the performance of two powerful algorithms based on our proposed system and compare them with those of popular biomedical image indexing approaches and existing regular image retrieval methods with detailed experiments over several well-known public biomedical image databases. Based on different criteria for the evaluation of retrieval performance, experimental results demonstrate that our proposed algorithms outperform the state-of-the-art approaches.
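
    As a simplified stand-in for the indexing and retrieval step described above (the paper additionally learns a preference model; here plain cosine ranking over precomputed feature vectors is used instead), a minimal sketch with invented data:

        # Rank database images by cosine similarity of their feature vectors to the query.
        # Feature vectors would normally come from a pre-trained CNN or SDAE; here they
        # are random stand-ins, and the learned preference model is omitted.
        import numpy as np

        def cosine(u, v):
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

        def retrieve(query_vec, db_features, top_k=3):
            scored = [(name, cosine(query_vec, vec)) for name, vec in db_features.items()]
            return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

        rng = np.random.default_rng(0)
        db_features = {f"img_{i}": rng.normal(size=128) for i in range(100)}
        query_vec = db_features["img_7"] + 0.05 * rng.normal(size=128)
        print(retrieve(query_vec, db_features))   # img_7 should rank first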

  18. Enhancing Recall in Semantic Querying

    DEFF Research Database (Denmark)

    Rouces, Jacobo

    2013-01-01

    lexically and structurally different, which we will introduce in the next section. As RDF graphs from different sources are expected to be linked, the modeling heterogeneities will make the federated graph become sparser and inconsistent. This is detrimental to the recall of SPARQL queries, as the query...

  19. Terminology tools: state of the art and practical lessons.

    Science.gov (United States)

    Cimino, J J

    2001-01-01

    As controlled medical terminologies evolve from simple code-name-hierarchy arrangements into rich, knowledge-based ontologies of medical concepts, increased demands are placed on both the developers and users of the terminologies. In response, researchers have begun developing tools to address their needs. The aims of this article are to review previous work done to develop these tools and then to describe work done at Columbia University and New York Presbyterian Hospital (NYPH). Researchers working with the Systematized Nomenclature of Medicine (SNOMED), the Unified Medical Language System (UMLS), and NYPH's Medical Entities Dictionary (MED) have created a wide variety of terminology browsers, editors and servers to facilitate creation, maintenance and use of these terminologies. Although much work has been done, no generally available tools have yet emerged. Consensus on requirements for tool functions, especially terminology servers, is emerging. Tools at NYPH have been used successfully to support the integration of clinical applications and the merger of health care institutions. Significant advancement has occurred over the past fifteen years in the development of sophisticated controlled terminologies and the tools to support them. The tool set at NYPH provides a case study to demonstrate one feasible architecture.

  20. Location-Dependent Query Processing Under Soft Real-Time Constraints

    Directory of Open Access Journals (Sweden)

    Zoubir Mammeri

    2009-01-01

    Full Text Available In recent years, mobile devices and applications have developed rapidly. In the database field, this development required methods to handle new query types such as location-dependent queries (i.e., the query results depend on the location of the query issuer). Although several studies have addressed problems related to location-dependent query processing, few works considered timing requirements that may be associated with queries (i.e., the query results must be delivered to mobile clients on time). The main objective of this paper is to propose a solution for location-dependent query processing under soft real-time constraints. Hence, we propose methods to take client location-dependency into account and to maximize the percentage of queries respecting their deadlines. We validate our proposal by implementing a prototype based on the Oracle DBMS. Performance evaluation results show that the proposed solution optimizes the percentage of queries meeting their deadlines and the communication cost.

  1. Beyond teaching language: Towards terminological primacy in learners’ geometric conceptualisation

    Directory of Open Access Journals (Sweden)

    Humphrey U. Atebe

    2010-07-01

    Full Text Available This paper reports on a specific aspect of a broader geometry conceptualisation study that sought to explore and explicate learners’ knowledge of basic geometric terminology in selected Nigerian and South African high schools. It is framed by the notion that students’ acquisition of the correct terminology in school geometry is important for their success in the subject. The original study further aimed to determine the relationship that might exist between a learner’s ability in verbal geometry terminology tasks and his/her ability in visual geometry terminology tasks. A total of 144 learners (72 each from South Africa and Nigeria) were selected for the study, using both the stratified and the fish-bowl sampling techniques. A questionnaire consisting of a sixty-item multiple-choice objective test provided the data for the study. An overall percentage mean score of 44.17% obtained in the test indicated that learners in this study had only a limited knowledge of basic geometric terminology. The Nigerian subsample in the study had a weaker understanding of basic geometric terminology than their South African counterparts. Importantly, there were high positive correlations between participants’ ability in verbal geometry terminology tasks and their ability in visual geometry terminology tasks. These results are consistent with those of several earlier studies, and provide a reasonably firm basis for certain recommendations to be made.

  2. SCRY: Enabling quantitative reasoning in SPARQL queries

    NARCIS (Netherlands)

    Meroño-Peñuela, A.; Stringer, Bas; Loizou, Antonis; Abeln, Sanne; Heringa, Jaap

    2015-01-01

    The inability to include quantitative reasoning in SPARQL queries slows down the application of Semantic Web technology in the life sciences. SCRY, our SPARQL compatible service layer, improves this by executing services at query time and making their outputs query-accessible, generating RDF data on

  3. Answering SPARQL queries modulo RDF Schema with paths

    OpenAIRE

    Alkhateeb, Faisal; Euzenat, Jérôme

    2013-01-01

    SPARQL is the standard query language for RDF graphs. In its strict instantiation, it only offers querying according to the RDF semantics and would thus ignore the semantics of data expressed with respect to (RDF) schemas or (OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or considering external ontologies. We introduce a general framework which allows for expressing query ans...

  4. NOBLE - Flexible concept recognition for large-scale biomedical natural language processing.

    Science.gov (United States)

    Tseytlin, Eugene; Mitchell, Kevin; Legowski, Elizabeth; Corrigan, Julia; Chavan, Girish; Jacobson, Rebecca S

    2016-01-14

    Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and clinical corpus. NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator. We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems. NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.
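
    A minimal sketch of greedy longest-match term-to-concept matching of the kind described above; the toy vocabulary, concept identifiers and tokenization are invented and do not reflect NOBLE Coder's configurable matching options.

        # Greedy longest-match concept recognition over a toy vocabulary.
        # 'vocabulary' maps lower-cased terms to concept ids; real systems such as
        # NOBLE Coder support many more matching strategies (this is only a sketch).
        vocabulary = {
            "myocardial infarction": "C0027051",
            "infarction": "C0021308",
            "aspirin": "C0004057",
        }
        MAX_TERM_LEN = max(len(term.split()) for term in vocabulary)

        def recognize(text):
            tokens = text.lower().replace(",", " ").split()
            matches, i = [], 0
            while i < len(tokens):
                for span in range(min(MAX_TERM_LEN, len(tokens) - i), 0, -1):
                    candidate = " ".join(tokens[i:i + span])
                    if candidate in vocabulary:
                        matches.append((candidate, vocabulary[candidate]))
                        i += span          # greedy: consume the longest match
                        break
                else:
                    i += 1
            return matches

        print(recognize("History of myocardial infarction, treated with aspirin"))
        # [('myocardial infarction', 'C0027051'), ('aspirin', 'C0004057')]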

  5. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Suzuki Motoyuki

    2009-01-01

    Full Text Available Abstract We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the "query relevance." Combining these two ideas, we can alleviate bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29% was improved to 60.38%, which was 1.13 point better than the result of the conventional unsupervised adaptation method (59.25%.

  6. Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

    Directory of Open Access Journals (Sweden)

    Akinori Ito

    2009-01-01

    Full Text Available We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from selected keyword candidates so that the misrecognized words and correct words do not fall into one query. The second idea is that the number of Web documents downloaded for each query is determined according to the “query relevance.” Combining these two ideas, we can alleviate bad effect of misrecognized keywords by decreasing the number of downloaded Web documents from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine almost the optimum number of iterations. In the final experiment, the word accuracy without adaptation (55.29% was improved to 60.38%, which was 1.13 point better than the result of the conventional unsupervised adaptation method (59.25%.
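
    A minimal sketch of the two ideas described above, spreading keyword candidates over several queries and sizing each query's download budget by its relevance; the keywords, relevance scores and budget below are invented and are not the authors' exact formulation.

        # Sketch: compose several queries from keyword candidates and allocate the number
        # of documents downloaded per query in proportion to its relevance score.
        def compose_queries(keywords, per_query=2):
            """Spread candidates over several queries so one misrecognized keyword
            can only spoil a single query."""
            return [keywords[i:i + per_query] for i in range(0, len(keywords), per_query)]

        def allocate_downloads(query_relevance, total_docs=100):
            total = sum(query_relevance.values()) or 1.0
            return {q: round(total_docs * r / total) for q, r in query_relevance.items()}

        keywords = ["parliament", "election", "elephant", "budget"]   # 'elephant' is a misrecognition
        queries = compose_queries(keywords)
        relevance = {" ".join(q): score for q, score in zip(queries, [0.8, 0.2])}
        print(queries)                        # [['parliament', 'election'], ['elephant', 'budget']]
        print(allocate_downloads(relevance))  # the low-relevance query gets fewer documents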

  7. Recent Advances of Graphene-based Hybrids with Magnetic Nanoparticles for Biomedical Applications.

    Science.gov (United States)

    Alegret, Nuria; Criado, Alejandro; Prato, Maurizio

    2017-01-01

    The utilization of graphene-based nanomaterials combined with magnetic nanoparticles offers key benefits in modern biomedicine. In this minireview, we focus on the most recent advances in hybrids of magnetic graphene derivatives for biomedical applications. We initially analyze the methodologies employed for the preparation of graphene-based composites with magnetic nanoparticles, more specifically the kind of linkage between the two components. In the last section, we focus on the biomedical applications where these magnetic-graphene hybrids are essential and pay special attention to how the addition of graphene improves the resulting devices in magnetic resonance imaging, controlled drug delivery, magnetic photothermal therapy and cellular separation and isolation. Finally, we highlight the use of these magnetic hybrids as a multifunctional material that will lead to a next generation of theranostics.

  8. Alkemio: association of chemicals with biomedical topics by text and data mining.

    Science.gov (United States)

    Gijón-Correas, José A; Andrade-Navarro, Miguel A; Fontaine, Jean F

    2014-07-01

    The PubMed® database of biomedical citations allows the retrieval of scientific articles studying the function of chemicals in biology and medicine. Mining millions of available citations to search reported associations between chemicals and topics of interest would require substantial human time. We have implemented the Alkemio text mining web tool and SOAP web service to help in this task. The tool uses biomedical articles discussing chemicals (including drugs), predicts their relatedness to the query topic with a naïve Bayesian classifier and ranks all chemicals by P-values computed from random simulations. Benchmarks on seven human pathways showed good retrieval performance (areas under the receiver operating characteristic curves ranged from 73.6 to 94.5%). Comparison with existing tools to retrieve chemicals associated with eight diseases showed the higher precision and recall of Alkemio when considering the top 10 candidate chemicals. Alkemio is a high-performing web tool ranking chemicals for any biomedical topic and it is free to non-commercial users. http://cbdm.mdc-berlin.de/~medlineranker/cms/alkemio.
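
    A minimal sketch of ranking by an empirical P-value obtained from random simulations, as described above; the scoring function and observed score are toy stand-ins rather than the tool's naïve Bayesian classifier.

        # Empirical P-value for a chemical's topic-relatedness score via random simulation.
        # 'observed' and the random score distribution are invented placeholders.
        import random

        def empirical_p_value(observed, score_chemical, n_sim=10000, seed=42):
            """Fraction of simulated scores that reach the observed score."""
            rng = random.Random(seed)
            hits = sum(1 for _ in range(n_sim) if score_chemical(rng) >= observed)
            return (hits + 1) / (n_sim + 1)          # add-one to avoid a P-value of zero

        def random_score(rng):
            # stand-in for scoring a chemical against randomly drawn articles
            return rng.gauss(mu=0.0, sigma=1.0)

        observed = 2.5
        print(f"P = {empirical_p_value(observed, random_score):.4f}")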

  9. 42 CFR 405.512 - Carriers' procedural terminology and coding systems.

    Science.gov (United States)

    2010-10-01

    Determining Reasonable Charges. § 405.512 Carriers' procedural terminology and coding systems. (a) General. Procedural terminology and coding systems are designed to provide physicians and third party payers with a...

  10. Implementation of Quantum Private Queries Using Nuclear Magnetic Resonance

    International Nuclear Information System (INIS)

    Wang Chuan; Hao Liang; Zhao Lian-Jie

    2011-01-01

    We present a modified protocol for the realization of a quantum private query process on a classical database. Using one-qubit query and CNOT operation, the query process can be realized in a two-mode database. In the query process, the data privacy is preserved as the sender would not reveal any information about the database besides her query information, and the database provider cannot retain any information about the query. We implement the quantum private query protocol in a nuclear magnetic resonance system. The density matrix of the memory registers are constructed. (general)

  11. Modeling and mining term association for improving biomedical information retrieval performance.

    Science.gov (United States)

    Hu, Qinmin; Huang, Jimmy Xiangji; Hu, Xiaohua

    2012-06-11

    The growth of the biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but more difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. We propose a term association approach to discover term associations among the keywords from a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated; the sentence-based index produces the best results at the document level, the word-based index gives the best results at the passage level, and the paragraph-based index gives the best results at the passage2 level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. First, modelling term association for improving biomedical information retrieval using factor analysis is one of the major contributions of our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than the baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations. These latent

  12. SPARQL Query Re-writing Using Partonomy Based Transformation Rules

    Science.gov (United States)

    Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.

    Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of a SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and the knowledge base. Our experiments were performed on completely third party datasets and queries. Evaluations were performed on the Geonames dataset using questions from the National Geographic Bee serialized into SPARQL and on the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.
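
    A minimal sketch of a partonomy-based rewrite of the kind described above: a direct containment constraint is relaxed so that features located in any part of the target region also match. The prefixes, property names (ex:locatedIn, ex:partOf) and the single rule are illustrative assumptions, not the authors' rule set.

        # Build a relaxed query that accepts features located in `region` or in any of
        # its parts (one ex:partOf hop). The original, stricter constraint would have
        # been the single triple pattern "?feature ex:locatedIn ex:<region>".
        def rewrite_with_partonomy(region):
            return f"""
            SELECT ?feature WHERE {{
              {{ ?feature ex:locatedIn ex:{region} . }}
              UNION
              {{ ?feature ex:locatedIn ?part .
                 ?part ex:partOf ex:{region} . }}
            }}
            """

        print(rewrite_with_partonomy("Yellowstone"))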

  13. Mobile Information Access with Spoken Query Answering

    DEFF Research Database (Denmark)

    Brøndsted, Tom; Larsen, Henrik Legind; Larsen, Lars Bo

    2006-01-01

    window focused over the part which most likely contains an answer to the query. The two systems are integrated into a full spoken query answering system. The prototype can answer queries and questions within the chosen football (soccer) test domain, but the system has the flexibility for being ported...

  14. On the formulation of performant sparql queries

    NARCIS (Netherlands)

    Loizou, A.; Angles, R.; Groth, P.T.

    2014-01-01

    The combination of the flexibility of RDF and the expressiveness of SPARQL provides a powerful mechanism to model, integrate and query data. However, these properties also mean that it is nontrivial to write performant SPARQL queries. Indeed, it is quite easy to create queries that tax even

  15. Terminology extraction from medical texts in Polish.

    Science.gov (United States)

    Marciniak, Małgorzata; Mykowiecka, Agnieszka

    2014-01-01

    Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in a specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children's hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were
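
    A minimal sketch of the ordering criterion mentioned above, ranking candidate phrases by frequency of use and by the variety of their contexts; the bigram-only candidates, the toy English sentences and the simple score product are invented simplifications of the combined linguistic and statistical procedure.

        # Rank candidate bigrams by frequency multiplied by the number of distinct
        # following-word contexts; a toy stand-in for the procedure in the abstract.
        from collections import defaultdict

        def rank_bigrams(sentences):
            freq, contexts = defaultdict(int), defaultdict(set)
            for sentence in sentences:
                tokens = sentence.lower().split()
                for i in range(len(tokens) - 1):
                    bigram = (tokens[i], tokens[i + 1])
                    freq[bigram] += 1
                    follower = tokens[i + 2] if i + 2 < len(tokens) else "<end>"
                    contexts[bigram].add(follower)
            scored = {b: freq[b] * len(contexts[b]) for b in freq}
            return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

        docs = [
            "chronic bronchitis confirmed by examination",
            "chronic bronchitis treated with antibiotics",
            "patient discharged after treatment",
        ]
        print(rank_bigrams(docs)[:3])   # ('chronic', 'bronchitis') ranks first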

  16. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications.

    Science.gov (United States)

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.

  17. Assessment of incidental learning of medical terminology in a veterinary curriculum.

    Science.gov (United States)

    Ainsworth, A Jerald; Hardin, Laura; Robertson, Stanley

    2007-01-01

    The objective of this study was to determine whether students in a veterinary curriculum at Mississippi State University would gain an understanding of medical terminology, as they matriculate through their courses, comparable to that obtained during a focused medical terminology unit of study. Evaluation of students' incidental learning related to medical terminology during the 2004/2005 and 2005/2006 academic years indicated that 88.7% and 81.9% of students, respectively, scored above 70% on a medical terminology exam by the end of the first year of the curriculum. For the 2004/2005 academic year, 67.6% of students increased their percentage of correct answers above 70% from the first medical terminology exam to the third. For the 2005/2006 academic year, 61.1% of students increased their score above 70% from the first to the third exam. Our data indicate that students can achieve comprehension of medical terminology in the absence of a formal terminology course.

  18. Evaluation of Sub Query Performance in SQL Server

    Science.gov (United States)

    Oktavia, Tanty; Sujarwo, Surya

    2014-03-01

    The paper explores several sub query methods used in a query and their impact on query performance. The study uses an experimental approach to evaluate the performance of each sub query method combined with an indexing strategy. The sub query methods consist of IN, EXISTS, a relational operator, and a relational operator combined with the TOP operator. The experiments show that using a relational operator combined with an indexing strategy in a sub query gives better performance than the same method without an indexing strategy, and also better than the other methods. In summary, for applications that emphasize the performance of retrieving data from the database, it is better to use a relational operator combined with an indexing strategy. This study was done on Microsoft SQL Server 2012.
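
    A minimal sketch of the compared sub query forms written as SQL Server-style strings, together with a generic timing helper that works with any open DB-API connection (for example via pyodbc); the orders/customers schema and the connection object conn are invented placeholders, not the study's test setup.

        # The sub query variants compared in the study, over an invented schema, plus a
        # simple timer; 'conn' is an assumed, already-open DB-API connection.
        import time

        VARIANTS = {
            "in": """SELECT o.order_id FROM orders o
                     WHERE o.customer_id IN (SELECT c.customer_id FROM customers c
                                             WHERE c.country = 'Germany')""",
            "exists": """SELECT o.order_id FROM orders o
                         WHERE EXISTS (SELECT 1 FROM customers c
                                       WHERE c.customer_id = o.customer_id
                                         AND c.country = 'Germany')""",
            "operator": """SELECT o.order_id FROM orders o
                           WHERE o.freight > (SELECT AVG(f.freight) FROM orders f)""",
            "operator_top": """SELECT o.order_id FROM orders o
                               WHERE o.freight >= (SELECT TOP 1 f.freight FROM orders f
                                                   ORDER BY f.freight DESC)""",
        }

        def mean_time(conn, sql, repeats=10):
            cur = conn.cursor()
            start = time.perf_counter()
            for _ in range(repeats):
                cur.execute(sql)
                cur.fetchall()
            return (time.perf_counter() - start) / repeats

        # for name, sql in VARIANTS.items():
        #     print(name, mean_time(conn, sql))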

  19. Responsive web design with jQuery

    CERN Document Server

    Carlos, Gilberto

    2013-01-01

    Responsive Web Design with jQuery follows a standard tutorial-based approach, covering various aspects of responsive web design by building a comprehensive website. "Responsive Web Design with jQuery" is aimed at web designers who are interested in building device-agnostic websites. You should have a grasp of standard HTML, CSS, and JavaScript development, and have a familiarity with graphic design. Some exposure to jQuery and HTML5 will be beneficial but isn't essential.

  20. Adding query privacy to robust DHTs

    DEFF Research Database (Denmark)

    Backes, Michael; Goldberg, Ian; Kate, Aniket

    2012-01-01

    intermediate peers that (help to) route the queries towards their destinations. In this paper, we satisfy this requirement by presenting an approach for providing privacy for the keys in DHT queries. We use the concept of oblivious transfer (OT) in communication over DHTs to preserve query privacy without...... privacy over robust DHTs. Finally, we compare the performance of our privacy-preserving protocols with their more privacy-invasive counterparts. We observe that there is no increase in the message complexity...

  1. SPARQL Assist language-neutral query composer

    Science.gov (United States)

    2012-01-01

    Background SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. Results We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. Conclusions To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources. PMID:22373327

  2. SPARQL assist language-neutral query composer.

    Science.gov (United States)

    McCarthy, Luke; Vandervalk, Ben; Wilkinson, Mark

    2012-01-25

    SPARQL query composition is difficult for the lay-person, and even the experienced bioinformatician in cases where the data model is unfamiliar. Moreover, established best-practices and internationalization concerns dictate that the identifiers for ontological terms should be opaque rather than human-readable, which further complicates the task of synthesizing queries manually. We present SPARQL Assist: a Web application that addresses these issues by providing context-sensitive type-ahead completion during SPARQL query construction. Ontological terms are suggested using their multi-lingual labels and descriptions, leveraging existing support for internationalization and language-neutrality. Moreover, the system utilizes the semantics embedded in ontologies, and within the query itself, to help prioritize the most likely suggestions. To ensure success, the Semantic Web must be easily available to all users, regardless of locale, training, or preferred language. By enhancing support for internationalization, and moreover by simplifying the manual construction of SPARQL queries through the use of controlled-natural-language interfaces, we believe we have made some early steps towards simplifying access to Semantic Web resources.

  3. The role of local terminologies in electronic health records. The HEGP experience.

    Science.gov (United States)

    Daniel-Le Bozec, Christel; Steichen, Olivier; Dart, Thierry; Jaulent, Marie-Christine

    2007-01-01

    Despite decades of work, there is no universally accepted standard medical terminology and no generally usable terminological tools have yet emerged. The local dictionary of concepts of the Georges Pompidou European Hospital (HEGP) is a Terminological System (TS) designed to support clinical data entry. It covers 93 data entry forms and contains definitions and synonyms of more than 5000 concepts, sometimes linked to reference terminologies such as ICD-10. In this article, we evaluate to what extent SNOMED CT could fully replace, or rather be mapped to, the local terminology system. We first describe the local dictionary of concepts of HEGP according to a published TS characterization framework. Then we discuss the specific role that a local terminology system plays with regard to reference terminologies.

  4. Terminology Standardization in Education and the Construction of Resources: The Welsh Experience

    Directory of Open Access Journals (Sweden)

    Tegau Andrews

    2016-01-01

    Full Text Available This paper describes developments in Welsh-language terminology within the education system in Wales. Following an outline of historical terminology work, it concentrates on the consolidation of terminology standardization at the Language Technologies Unit, Bangor University, with particular reference to two projects, one concerned with terminology for school-age and further education, the second concerned with higher education. The developments described include the adoption of international standards in terminology standardization and their incorporation in an online terminology standardization environment and dissemination platform that enable access to the centralized terminological dictionaries via a number of sophisticated websites, portals and mobile apps featuring rich dictionary entries. Some of the issues in managing large term collections are explored, and usage statistics are presented for the resources described.

  5. Query Optimizations over Decentralized RDF Graphs

    KAUST Repository

    Abdelaziz, Ibrahim; Mansour, Essam; Ouzzani, Mourad; Aboulnaga, Ashraf; Kalnis, Panos

    2017-01-01

    Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query

  6. A TOOL FOR QUERY OPTIMIZATION ON ORACLE USING SQL RESTRUCTURING

    Directory of Open Access Journals (Sweden)

    Darlis Heru Murti

    2006-07-01

    Full Text Available A query is the part of the SQL (Structured Query Language) programming language used to retrieve (read) data in a DBMS (Database Management System), including Oracle [3]. In Oracle, query execution proceeds in three stages: Parsing, Execute and Fetch. Before the execute stage runs, Oracle first builds an execution plan that serves as the scenario for the execute stage. Several factors influence query performance during execution, among them the access path (how data is read from a table) and the join operation (how data from two tables is combined). Obtaining a query with optimal performance requires careful consideration of these factors. Query optimization is a way to obtain a query with performance as close to optimal as possible, especially with respect to execution time. Many optimization methods exist, but in this study the authors built an application that optimizes queries by restructuring the SQL statement. In this method, the object analyzed is the structure of the clauses that make up a query. The application takes one input and produces five kinds of output. The input is a query; the five outputs are the optimized query, improvement suggestions, suggestions for new indexes, the execution plan and statistical data. The application works in four stages: decomposing the query into sub queries, parsing the query clause by clause, determining the access path and join operation, and restructuring the query. A series of tests conducted by the authors showed that the application runs as intended, namely producing queries with optimal performance. Keywords: Query, SQL, DBMS, Oracle, Parsing, Execute, Fetch, Execution Plan, Access Path, Join Operation, SQL statement restructuring.
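
    A minimal sketch of one clause-level restructuring rule of the kind such a tool might apply, rewriting an IN sub query into a correlated EXISTS; the regular expression is deliberately naive and the rule is an illustrative assumption, not the application's actual rule set.

        # One illustrative restructuring rule: rewrite "alias.col IN (SELECT a2.col2 FROM t a2)"
        # into a correlated EXISTS. The regex only handles this simple shape.
        import re

        IN_SUBQUERY = re.compile(
            r"(\w+)\.(\w+)\s+IN\s*\(\s*SELECT\s+(\w+)\.(\w+)\s+FROM\s+(\w+)\s+(\w+)\s*\)",
            re.IGNORECASE,
        )

        def restructure(sql):
            def to_exists(m):
                outer_alias, outer_col, _in_alias, in_col, table, alias = m.groups()
                return (f"EXISTS (SELECT 1 FROM {table} {alias} "
                        f"WHERE {alias}.{in_col} = {outer_alias}.{outer_col})")
            return IN_SUBQUERY.sub(to_exists, sql)

        query = ("SELECT o.order_id FROM orders o "
                 "WHERE o.customer_id IN (SELECT c.customer_id FROM customers c)")
        print(restructure(query))
        # ... WHERE EXISTS (SELECT 1 FROM customers c WHERE c.customer_id = o.customer_id)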

  7. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen; Abdelaziz, Ibrahim; Kalnis, Panos; Mamoulis, Nikos

    2015-01-01

    In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems become online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-seconds. In this demonstration, audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.

  8. Concept Systems and Ontologies: Recommendations for Basic Terminology

    Science.gov (United States)

    Klein, Gunnar O.; Smith, Barry

    This essay concerns the problems surrounding the use of the term "concept" in current ontology and terminology research. It is based on the constructive dialogue between realist ontology on the one hand and the world of formal standardization of health informatics on the other, but its conclusions are not restricted to the domain of medicine. The term "concept" is one of the most misused even in literature and technical standards which attempt to bring clarity. In this paper we propose to use the term "concept" in the context of producing defined professional terminologies with one specific and consistent meaning which we propose for adoption as the agreed meaning of the term in future terminological research, and specifically in the development of formal terminologies to be used in computer systems. We also discuss and propose new definitions of a set of cognate terms. We describe the relations governing the realm of concepts, and compare these to the richer and more complex set of relations obtaining between entities in the real world. On this basis we also summarize an associated terminology for ontologies as representations of the real world and a partial mapping between the world of concepts and the world of reality.

  9. An RDF/OWL knowledge base for query answering and decision support in clinical pharmacogenetics.

    Science.gov (United States)

    Samwald, Matthias; Freimuth, Robert; Luciano, Joanne S; Lin, Simon; Powers, Robert L; Marshall, M Scott; Adlassnig, Klaus-Peter; Dumontier, Michel; Boyce, Richard D

    2013-01-01

    Genetic testing for personalizing pharmacotherapy is bound to become an important part of clinical routine. To address associated issues with data management and quality, we are creating a semantic knowledge base for clinical pharmacogenetics. The knowledge base is made up of three components: an expressive ontology formalized in the Web Ontology Language (OWL 2 DL), a Resource Description Framework (RDF) model for capturing detailed results of manual annotation of pharmacogenomic information in drug product labels, and an RDF conversion of relevant biomedical datasets. Our work goes beyond the state of the art in that it makes both automated reasoning as well as query answering as simple as possible, and the reasoning capabilities go beyond the capabilities of previously described ontologies.

  10. Evaluating SPARQL queries on massive RDF datasets

    KAUST Repository

    Al-Harbi, Razen

    2015-08-01

    Distributed RDF systems partition data across multiple computer nodes. Partitioning is typically based on heuristics that minimize inter-node communication and it is performed in an initial, data pre-processing phase. Therefore, the resulting partitions are static and do not adapt to changes in the query workload; as a result, existing systems are unable to consistently avoid communication for queries that are not favored by the initial data partitioning. Furthermore, for very large RDF knowledge bases, the partitioning phase becomes prohibitively expensive, leading to high startup costs. In this paper, we propose AdHash, a distributed RDF system which addresses the shortcomings of previous work. First, AdHash initially applies lightweight hash partitioning, which drastically minimizes the startup cost, while favoring the parallel processing of join patterns on subjects, without any data communication. Using a locality-aware planner, queries that cannot be processed in parallel are evaluated with minimal communication. Second, AdHash monitors the data access patterns and adapts dynamically to the query load by incrementally redistributing and replicating frequently accessed data. As a result, the communication cost for future queries is drastically reduced or even eliminated. Our experiments with synthetic and real data verify that AdHash (i) starts faster than all existing systems, (ii) processes thousands of queries before other systems become online, and (iii) gracefully adapts to the query load, being able to evaluate queries on billion-scale RDF data in sub-seconds. In this demonstration, audience can use a graphical interface of AdHash to verify its performance superiority compared to state-of-the-art distributed RDF systems.
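
    A minimal sketch of the lightweight hash-partitioning idea described above: triples are assigned to workers by hashing their subject, so join patterns on a shared subject stay on one worker. The worker count, hash choice and sample triples are invented and do not reproduce AdHash's adaptive redistribution.

        # Lightweight hash partitioning of RDF triples by subject: all triples sharing a
        # subject land on the same worker, so subject-subject joins need no communication.
        from collections import defaultdict
        from zlib import crc32

        def partition_by_subject(triples, n_workers=4):
            partitions = defaultdict(list)
            for s, p, o in triples:
                worker = crc32(s.encode("utf-8")) % n_workers
                partitions[worker].append((s, p, o))
            return partitions

        triples = [
            ("ex:alice", "ex:knows", "ex:bob"),
            ("ex:alice", "ex:worksAt", "ex:kaust"),
            ("ex:bob", "ex:knows", "ex:carol"),
        ]
        for worker, part in sorted(partition_by_subject(triples).items()):
            print(worker, part)   # both ex:alice triples end up on the same worker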

  11. Code query by example

    Science.gov (United States)

    Vaucouleur, Sebastien

    2011-02-01

    We introduce code query by example for customisation of evolvable software products in general and of enterprise resource planning systems (ERPs) in particular. The concept is based on an initial empirical study on practices around ERP systems. We motivate our design choices based on those empirical results, and we show how the proposed solution helps with respect to the infamous upgrade problem: the conflict between the need for customisation and the need for upgrade of ERP systems. We further show how code query by example can be used as a form of lightweight static analysis, to detect automatically potential defects in large software products. Code query by example as a form of lightweight static analysis is particularly interesting in the context of ERP systems: it is often the case that programmers working in this field are not computer science specialists but more of domain experts. Hence, they require a simple language to express custom rules.

  12. Anatomical eponyms - unloved names in medical terminology.

    Science.gov (United States)

    Burdan, F; Dworzański, W; Cendrowska-Pinkosz, M; Burdan, M; Dworzańska, A

    2016-01-01

    Uniform international terminology is a fundamental issue of medicine. Names of various organs or structures have developed since early human history. The first proper anatomical books were written by Hippocrates, Aristotle and Galen. For this reason the modern terms originated from Latin or Greek. In modern times the terminology was improved in particular by Vesalius, Fabricius and Harvey. Presently each known structure has an internationally approved term that is explained in the anatomical or histological terminology. However, some elements received eponyms, terms that incorporate the surname of the people who usually described them for the first time or studied them (e.g., circle of Willis, follicle of Graaf, fossa of Sylvius, foramen of Monro, Adamkiewicz artery). Literary and historical heroes also influenced medical vocabulary (e.g. Achilles tendon and Atlas). According to various scientists, all the eponyms bring colour to medicine and embed medical traditions and culture in our history, but lack accuracy, lead to confusion, and hamper scientific discussion. The current article presents a wide list of anatomical eponyms with their proper anatomical term or description according to international anatomical terminology. However, since different eponyms are used in various countries, the list could be expanded.

  13. Rita Temmerman. Towards New Ways of Terminology Description: The Sociocognitive Approach

    Directory of Open Access Journals (Sweden)

    Rosemarie Gläser

    2011-10-01

    Full Text Available This book appeared as Volume 3 in the Series Terminology and Lexicography Research and Practice edited by Helmi Sonneveld and Sue-Ellen Wright. The author, Rita Temmerman, presently working at the Erasmus Hogeschool, Brussels and specialising in problems of terminology in various domains of the life sciences, presents a polemical, stimulating and innovative monograph which continues and deepens her previous research work. Her doctoral dissertation (Louvain 1998) focused on Terminology Beyond Standardisation: Language and Categorisation in the Life Sciences. The aim of the book under review, Towards New Ways of Terminology Description: The Sociocognitive Approach, is to elaborate a new theory, method and application of terminology research which seeks to overcome the obvious limitations of traditional terminology as chiefly represented by the Vienna School (Eugen Wüster, Helmut Felber, Infoterm and associated institutions).

  14. Alternative Concepts and Terminologies for Teaching African Art.

    Science.gov (United States)

    Chanda, Jacqueline

    1992-01-01

    Considers concepts and terminologies that focus on generalizations concerning traditional African art and cultures. Argues that alternative concepts and terminologies should be used in developing curriculum and in teaching non-Western art. Discusses traditional African religious beliefs, primitivism, and the function of African art objects. (KM)

  15. Efficient Approximate OLAP Querying Over Time Series

    DEFF Research Database (Denmark)

    Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang

    2016-01-01

    The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP...... queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume...... of data grows. This is a particular problem when querying time series data, which generally contains multiple measures recorded at fine time granularities. Usually, this issue is addressed either by scaling up hardware or by employing workload based query optimization techniques. However, these solutions...

  16. CrowdMapping: A Crowdsourcing-Based Terminology Mapping Method for Medical Data Standardization.

    Science.gov (United States)

    Mao, Huajian; Chi, Chenyang; Huang, Boyu; Meng, Haibin; Yu, Jinghui; Zhao, Dongsheng

    2017-01-01

    Standardized terminology is a prerequisite for data exchange in the analysis of clinical processes. However, data from different electronic health record systems are based on idiosyncratic terminology systems, especially when the data is from different hospitals and healthcare organizations. Terminology standardization is necessary for medical data analysis. We propose a crowdsourcing-based terminology mapping method, CrowdMapping, to standardize the terminology in medical data. CrowdMapping uses a confidential model to determine how terminologies are mapped to a standard system, like ICD-10. The model uses mappings from different health care organizations and evaluates the diversity of the mapping to determine a more sophisticated mapping rule. Further, the CrowdMapping model enables users to rate the mapping result and interact with the model evaluation. CrowdMapping is a work-in-progress system; we present initial results of mapping terminologies.
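
    A minimal sketch of the crowd-based mapping idea: mappings proposed by several organizations for one local term are pooled, and the target code with the strongest agreement is kept together with its agreement fraction as a simple confidence score. The codes, the 0.5 threshold and the plain voting rule are invented placeholders, not the CrowdMapping model itself.

        # Pool candidate mappings for a local term and keep the code with the strongest
        # agreement; the agreement fraction acts as a simple confidence score.
        from collections import Counter

        def crowd_map(term, proposals, min_confidence=0.5):
            votes = Counter(proposals)
            code, count = votes.most_common(1)[0]
            confidence = count / len(proposals)
            return (code, confidence) if confidence >= min_confidence else (None, confidence)

        proposals = ["I21.9", "I21.9", "I25.2", "I21.9"]   # codes suggested by 4 hospitals
        print(crowd_map("heart attack", proposals))         # ('I21.9', 0.75)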

  17. What Does Anonymization Mean? DataSHIELD and the Need for Consensus on Anonymization Terminology.

    Science.gov (United States)

    Wallace, Susan E

    2016-06-01

    Anonymization is a recognized process by which identifiers can be removed from identifiable data to protect an individual's confidentiality and is used as a standard practice when sharing data in biomedical research. However, a plethora of terms, such as coding, pseudonymization, unlinked, and deidentified, have been and continue to be used, leading to confusion and uncertainty. This article shows that this is a historic problem and argues that such continuing uncertainty regarding the levels of protection given to data risks damaging initiatives designed to assist researchers conducting cross-national studies and sharing data internationally. DataSHIELD and the creation of a legal template are used as examples of initiatives that rely on anonymization, but where the inconsistency in terminology could hinder progress. More broadly, this article argues that there is a real possibility that there could be possible damage to the public's trust in research and the institutions that carry it out by relying on vague notions of the anonymization process. Research participants whose lack of clear understanding of the research process is compensated for by trusting those carrying out the research may have that trust damaged if the level of protection given to their data does not match their expectations. One step toward ensuring understanding between parties would be consistent use of clearly defined terminology used internationally, so that all those involved are clear on the level of identifiability of any particular set of data and, therefore, how that data can be accessed and shared.

  18. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 10th International Conference on Flexible Query Answering Systems, FQAS 2013, held in Granada, Spain, in September 2013. The 59 full papers included in this volume were carefully reviewed and selected from numerous submissions. The papers are organized in a general session track and a parallel special session track. The general session track covers the following topics: querying-answering systems; semantic technology; patterns and classification; personalization and recommender systems; searching and ranking; and Web and human...

  19. Algebraic Optimization of Recursive Database Queries

    DEFF Research Database (Denmark)

    Hansen, Michael Reichhardt

    1988-01-01

    Queries are expressed by relational algebra expressions including a fixpoint operation. A condition is presented under which a natural join commutes with a fixpoint operation. This condition is a simple check of attribute sets of sub-expressions of the query. The work may be considered a generalization of Aho and Ullman (1979). The result is interpreted in function-free logic database terms as a transformation of the recursively defined predicate involving: (a) elimination of an argument, and (b) propagation of selections (instantiations) to the extensionally defined predicates. A collection...
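    The following toy sketch illustrates, under simplifying assumptions, the flavor of transformation the abstract describes: a transitive-closure fixpoint over an extensional edge relation, and the same query with the selection on the start node propagated into the fixpoint so that one argument is effectively eliminated. It is an illustration of the general idea only, not the paper's algebraic commutation condition.

        def transitive_closure(edges):
            """Naive fixpoint: T := E union (E join T) until no change."""
            closure = set(edges)
            changed = True
            while changed:
                changed = False
                for (a, b) in list(closure):
                    for (c, d) in edges:
                        if b == c and (a, d) not in closure:
                            closure.add((a, d))
                            changed = True
            return closure

        def reachable_from(edges, start):
            """Selection pushed into the fixpoint: only expand pairs whose
            first argument is the selected start node (argument eliminated)."""
            frontier = {d for (s, d) in edges if s == start}
            reached = set(frontier)
            while frontier:
                frontier = {d for (s, d) in edges if s in frontier} - reached
                reached |= frontier
            return reached

        edges = {("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")}
        selected = {d for (s, d) in transitive_closure(edges) if s == "a"}
        assert selected == reachable_from(edges, "a") == {"b", "c", "d"}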

  20. SATORI: a system for ontology-guided visual exploration of biomedical data repositories.

    Science.gov (United States)

    Lekschas, Fritz; Gehlenborg, Nils

    2018-04-01

    The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. We developed SATORI, an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. nils@hms.harvard.edu. Supplementary data are available at Bioinformatics online.

  1. Information retrieval and terminology extraction in online resources for patients with diabetes.

    Science.gov (United States)

    Seljan, Sanja; Baretić, Maja; Kucis, Vlasta

    2014-06-01

    Terminology use, as a means for information retrieval or document indexing, plays an important role in health literacy. Specific types of users, i.e. patients with diabetes, need access to various online resources (in a foreign and/or native language) searching for information on self-education of basic diabetic knowledge, on self-care activities regarding the importance of dietetic food, medications, physical exercises and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or in document indexing. Specific terminology lists represent an intermediate step between free text search and controlled vocabulary, between user's demands and existing online resources in native and foreign language. The research, aiming to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and is divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic statistically-based terminology extraction on English and Croatian texts; iii) comparison and evaluation of extracted terminology performed on an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a professional medical person, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. Results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts and on the comparison of statistical and hybrid extraction methods in English text. Evaluation of automatic and semi-automatic terminology extraction methods is performed by recall ...
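    A minimal sketch of the statistically-based extraction step evaluated in the study: count candidate word n-grams in a text, discard candidates that begin or end with a stopword, and rank the rest by frequency. The stopword list, thresholds, and sample text are placeholders rather than the resources used in the paper.

        import re
        from collections import Counter

        STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "for", "with"}

        def term_candidates(text, max_len=3, min_freq=2):
            """Rank n-grams (n <= max_len) that occur at least min_freq times
            and neither start nor end with a stopword."""
            tokens = re.findall(r"[a-z]+", text.lower())
            counts = Counter()
            for n in range(1, max_len + 1):
                for i in range(len(tokens) - n + 1):
                    gram = tokens[i:i + n]
                    if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                        continue
                    counts[" ".join(gram)] += 1
            return [(t, c) for t, c in counts.most_common() if c >= min_freq]

        sample = ("Patients with type 2 diabetes should monitor blood glucose daily. "
                  "An insulin pump helps keep blood glucose within the target range. "
                  "Type 2 diabetes management also includes diet and exercise.")
        print(term_candidates(sample)[:5])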

  2. The effect of query complexity on Web searching results

    Directory of Open Access Journals (Sweden)

    B.J. Jansen

    2000-01-01

    Full Text Available This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc.) and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.

  3. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    ... metadata artifacts (e.g., queries) to assist users with the analysis. However, modeling and sharing of most of these artifacts are typically overlooked. Thus, in this paper we focus on the query metadata artifact in the Exploratory OLAP context and propose an RDF-based vocabulary for its representation, sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply the method to a use case of transforming queries from SM4MQ to a vector representation. For the use case, we developed the prototype and performed an evaluation that shows how our approach can significantly ease and support user assistance such as query recommendation.

  4. Mining the SDSS SkyServer SQL queries log

    Science.gov (United States)

    Hirota, Vitor M.; Santos, Rafael; Raddick, Jordan; Thakar, Ani

    2016-05-01

    SkyServer, the Internet portal for the Sloan Digital Sky Survey (SDSS) astronomic catalog, provides a set of tools that allows data access for astronomers and scientific education. One of SkyServer data access interfaces allows users to enter ad-hoc SQL statements to query the catalog. SkyServer also presents some template queries that can be used as basis for more complex queries. This interface has logged over 330 million queries submitted since 2001. It is expected that analysis of this data can be used to investigate usage patterns, identify potential new classes of queries, find similar queries, etc. and to shed some light on how users interact with the Sloan Digital Sky Survey data and how scientists have adopted the new paradigm of e-Science, which could in turn lead to enhancements on the user interfaces and experience in general. In this paper we review some approaches to SQL query mining, apply the traditional techniques used in the literature and present lessons learned, namely, that the general text mining approach for feature extraction and clustering does not seem to be adequate for this type of data, and, most importantly, we find that this type of analysis can result in very different queries being clustered together.

  5. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
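    A toy sketch of the two query facilities described above, ranking fragments by backbone RMSD to a query and constraining the matching sequences with a regular expression. It assumes the fragments are already superimposed on the query (Fragger itself handles alignment and indexing), and the coordinates and names are made up.

        import re
        import numpy as np

        def rmsd(a, b):
            """Root-mean-square deviation between two pre-aligned coordinate sets."""
            a, b = np.asarray(a, float), np.asarray(b, float)
            return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

        def pick_fragments(query_coords, database, seq_pattern=".*", max_dist=2.0):
            """Return (name, distance) pairs ranked by RMSD to the query, keeping
            only fragments whose one-letter sequence matches the regex."""
            hits = []
            for name, seq, coords in database:
                if not re.fullmatch(seq_pattern, seq):
                    continue
                d = rmsd(query_coords, coords)
                if d <= max_dist:
                    hits.append((name, d))
            return sorted(hits, key=lambda h: h[1])

        # Made-up 3-residue CA traces (one coordinate per residue).
        db = [
            ("frag1", "GAS", [[0, 0, 0], [1.5, 0, 0], [3.0, 0, 0]]),
            ("frag2", "GPS", [[0, 0, 0], [1.4, 0.2, 0], [2.9, 0.1, 0]]),
            ("frag3", "KLM", [[0, 0, 0], [0, 1.5, 0], [0, 3.0, 0]]),
        ]
        query = [[0, 0, 0], [1.5, 0, 0], [3.0, 0, 0]]
        print(pick_fragments(query, db, seq_pattern="G.S"))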

  6. GMB: An Efficient Query Processor for Biological Data

    Directory of Open Access Journals (Sweden)

    Taha Kamal

    2011-06-01

    Full Text Available Bioinformatics applications manage complex biological data stored in distributed and often heterogeneous databases and require large computing power. These databases are too big and complicated to be rapidly queried every time a user submits a query, due to the overhead involved in decomposing the queries, sending the decomposed queries to remote databases, and composing the results. There are also considerable communication costs involved. This study addresses the mentioned problems in a Grid-based environment for bioinformatics. We propose a Grid middleware called GMB that alleviates these problems by caching the results of Frequently Used Queries (FUQ). Queries are classified based on their types and frequencies. FUQ are answered from the middleware, which improves their response time. GMB acts as a gateway to the TeraGrid Grid: it resides between users’ applications and the TeraGrid Grid. We evaluate GMB experimentally.
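    A minimal sketch of the frequently-used-query (FUQ) caching idea, assuming a simple frequency threshold decides when a normalized query is admitted to the middleware cache; the backend callable and the threshold are placeholders, not GMB's actual classification scheme.

        from collections import defaultdict

        class QueryCache:
            """Answer frequently used queries (FUQ) from a local cache."""

            def __init__(self, backend, freq_threshold=3):
                self.backend = backend          # callable that really runs the query
                self.freq_threshold = freq_threshold
                self.freq = defaultdict(int)
                self.cache = {}

            def run(self, query):
                key = " ".join(query.lower().split())   # crude normalization
                self.freq[key] += 1
                if key in self.cache:
                    return self.cache[key]              # answered by the middleware
                result = self.backend(query)            # forwarded to the remote databases
                if self.freq[key] >= self.freq_threshold:
                    self.cache[key] = result
                return result

        # Placeholder backend standing in for the distributed biological databases.
        gmb = QueryCache(backend=lambda q: f"results for: {q}", freq_threshold=2)
        for _ in range(3):
            print(gmb.run("SELECT * FROM genes WHERE symbol = 'TP53'"))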

  7. The Data Cyclotron query processing scheme

    NARCIS (Netherlands)

    Goncalves, R.; Kersten, M.

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  8. Approximate furthest neighbor with application to annulus query

    DEFF Research Database (Denmark)

    Pagh, Rasmus; Silvestri, Francesco; Sivertsen, Johan von Tangen

    2016-01-01

    ...-dimensional Euclidean space. The method builds on the technique of Indyk (SODA 2003), storing random projections to provide sublinear query time for AFN. However, we introduce a different query algorithm, improving on Indyk's approximation factor and reducing the running time by a logarithmic factor. We also present ... The query-dependent approach is used for deriving a data structure for the approximate annulus query problem, which is defined as follows: given an input set S and two parameters r>0 and w≥1, construct a data structure that returns for each query point q a point p∈S such that the distance between p and q ...
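    A simplified sketch of the projection idea behind such approximate furthest neighbor (AFN) structures: keep, for a handful of random directions, the points with extreme projections, and answer a query by returning the farthest of those few candidates. This illustrates the general technique only; it is not the data structure or the approximation guarantee of the paper.

        import numpy as np

        class ApproxFurthest:
            """Approximate furthest neighbor via extremes of random projections."""

            def __init__(self, points, n_projections=8, seed=0):
                rng = np.random.default_rng(seed)
                self.points = np.asarray(points, float)
                dims = self.points.shape[1]
                dirs = rng.normal(size=(n_projections, dims))
                dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
                proj = self.points @ dirs.T          # shape: (n_points, n_projections)
                # Candidate set: points extreme along at least one direction.
                self.candidates = np.unique(np.concatenate([proj.argmax(0), proj.argmin(0)]))

            def query(self, q):
                q = np.asarray(q, float)
                cand = self.points[self.candidates]
                dists = np.linalg.norm(cand - q, axis=1)
                return self.candidates[int(dists.argmax())]

        pts = np.random.default_rng(1).normal(size=(1000, 16))
        afn = ApproxFurthest(pts)
        print("approximate furthest point index:", afn.query(np.zeros(16)))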

  9. Manchester visual query language

    Science.gov (United States)

    Oakley, John P.; Davis, Darryl N.; Shann, Richard T.

    1993-04-01

    We report a database language for visual retrieval which allows queries on image feature information which has been computed and stored along with images. The language is novel in that it provides facilities for dealing with feature data which has actually been obtained from image analysis. Each line in the Manchester Visual Query Language (MVQL) takes a set of objects as input and produces another, usually smaller, set as output. The MVQL constructs are mainly based on proven operators from the field of digital image analysis. An example is the Hough-group operator which takes as input a specification for the objects to be grouped, a specification for the relevant Hough space, and a definition of the voting rule. The output is a ranked list of high scoring bins. The query could be directed towards one particular image or an entire image database, in the latter case the bins in the output list would in general be associated with different images. We have implemented MVQL in two layers. The command interpreter is a Lisp program which maps each MVQL line to a sequence of commands which are used to control a specialized database engine. The latter is a hybrid graph/relational system which provides low-level support for inheritance and schema evolution. In the paper we outline the language and provide examples of useful queries. We also describe our solution to the engineering problems associated with the implementation of MVQL.

  10. A structural query system for Han characters

    DEFF Research Database (Denmark)

    Skala, Matthew

    2016-01-01

    The IDSgrep structural query system for Han character dictionaries is presented. This dictionary search system represents the spatial structure of Han characters using Extended Ideographic Description Sequences (EIDSes), a data model and syntax based on the Unicode IDS concept. It includes a query language for EIDS databases, with a freely available implementation and format translation from popular third-party IDS and XML character databases. The system is designed to suit the needs of font developers and foreign language learners. The search algorithm includes a bit vector index inspired by Bloom filters to support faster query operations. Experimental results are presented, evaluating the effect of the indexing on query performance.
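    A small sketch of a Bloom-filter-style bit-vector prefilter of the kind mentioned above: hash each character's components into a fixed-width bit vector and skip any entry whose bits do not cover the query's bits. The hashing scheme, vector width, and toy decompositions are illustrative, not IDSgrep's actual index.

        def signature(components, bits=64):
            """Set one bit per (hashed) component, like a tiny Bloom filter."""
            sig = 0
            for c in components:
                sig |= 1 << (hash(c) % bits)
            return sig

        # Hypothetical decompositions of Han characters into components.
        dictionary = {
            "好": ["女", "子"],
            "妈": ["女", "马"],
            "骂": ["口", "口", "马"],
        }
        signatures = {ch: signature(parts) for ch, parts in dictionary.items()}

        def candidates(query_components):
            """Entries whose signature covers the query signature may match;
            only these need the expensive structural comparison (false
            positives are possible, false negatives are not)."""
            q = signature(query_components)
            return [ch for ch, s in signatures.items() if q & ~s == 0]

        print(candidates(["女"]))   # characters that could contain the 女 component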

  11. Enabling Incremental Query Re-Optimization.

    Science.gov (United States)

    Liu, Mengmeng; Ives, Zachary G; Loo, Boon Thau

    2016-01-01

    As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations.

  12. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications.

    Science.gov (United States)

    Whetzel, Patricia L; Noy, Natalya F; Shah, Nigam H; Alexander, Paul R; Nyulas, Csongor; Tudorache, Tania; Musen, Mark A

    2011-07-01

    The National Center for Biomedical Ontology (NCBO) is one of the National Centers for Biomedical Computing funded under the NIH Roadmap Initiative. Contributing to the national computing infrastructure, NCBO has developed BioPortal, a web portal that provides access to a library of biomedical ontologies and terminologies (http://bioportal.bioontology.org) via the NCBO Web services. BioPortal enables community participation in the evaluation and evolution of ontology content by providing features to add mappings between terms, to add comments linked to specific ontology terms and to provide ontology reviews. The NCBO Web services (http://www.bioontology.org/wiki/index.php/NCBO_REST_services) enable this functionality and provide a uniform mechanism to access ontologies from a variety of knowledge representation formats, such as Web Ontology Language (OWL) and Open Biological and Biomedical Ontologies (OBO) format. The Web services provide multi-layered access to the ontology content, from getting all terms in an ontology to retrieving metadata about a term. Users can easily incorporate the NCBO Web services into software applications to generate semantically aware applications and to facilitate structured data collection.

  13. Spatial Keyword Query Processing

    DEFF Research Database (Denmark)

    Chen, Lisi; Jensen, Christian S.; Wu, Dingming

    2013-01-01

    Geo-textual indices play an important role in spatial keyword querying. The existing geo-textual indices have not been compared systematically under the same experimental framework. This makes it difficult to determine which indexing technique best supports specific functionality. We provide an all-around survey of 12 state-of-the-art geo-textual indices. We propose a benchmark that enables the comparison of the spatial keyword query performance. We also report on the findings obtained when applying the benchmark to the indices, thus uncovering new insights that may guide index ...

  14. RDF-GL: A SPARQL-Based Graphical Query Language for RDF

    Science.gov (United States)

    Hogenboom, Frederik; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    This chapter presents RDF-GL, a graphical query language (GQL) for RDF. The GQL is based on the textual query language SPARQL and mainly focuses on SPARQL SELECT queries. The advantage of a GQL over textual query languages is that complexity is hidden through the use of graphical symbols. RDF-GL is supported by a Java-based editor, SPARQLinG, which is presented as well. The editor does not only allow for RDF-GL query creation, but also converts RDF-GL queries to SPARQL queries and is able to subsequently execute these. Experiments show that using the GQL in combination with the editor makes RDF querying more accessible for end users.

  15. The Data Cyclotron query processing scheme.

    NARCIS (Netherlands)

    R.A. Goncalves (Romulo); M.L. Kersten (Martin)

    2011-01-01

    A grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot set, minimize query response time, and maximize throughput without single point global coordination. The Data Cyclotron

  16. A Multi-Query Optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    Database systems allow for concurrent use of several applications (and query interfaces). Each application generates an "optimal" plan---a sequence of low-level database operators---for accessing the database. The queries posed by users through the same application can be optimized

  17. A multi-query optimizer for Monet

    NARCIS (Netherlands)

    S. Manegold (Stefan); A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    2000-01-01

    Database systems allow for concurrent use of several applications (and query interfaces). Each application generates an "optimal" plan---a sequence of low-level database operators---for accessing the database. The queries posed by users through the same application can be optimized

  18. Path-based Queries on Trajectory Data

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Pelekis, Nikos; Theodoridis, Yannis

    2014-01-01

    In traffic research, management, and planning a number of path-based analyses are heavily used, e.g., for computing turn-times, evaluating green waves, or studying traffic flow. These analyses require retrieving the trajectories that follow the full path being analyzed. Existing path queries cannot sufficiently support such path-based analyses because they retrieve all trajectories that touch any edge in the path. In this paper, we define and formalize the strict path query. This is a novel query type tailored to support path-based analysis, where trajectories must follow all edges in the path ... a specific path by only retrieving data from the first and last edge in the path. To correctly answer strict path queries, existing network-constrained trajectory indexes must retrieve data from all edges in the path. An extensive performance study of NETTRA using a very large real-world trajectory data set ...

  19. A standard for terminology in chronic pelvic pain syndromes

    DEFF Research Database (Denmark)

    Doggweiler, Regula; Whitmore, Kristene E; Meijlink, Jane M

    2017-01-01

    AIMS: Terms used in the field of chronic pelvic pain (CPP) are poorly defined and often confusing. An International Continence Society (ICS) Standard for Terminology in chronic pelvic pain syndromes (CPPS) has been developed with the aim of improving diagnosis and treatment of patients affected ... domain from 1980 to 2014. Existing ICS Standards for terminology were utilized where appropriate to ensure transparency, accessibility, flexibility, and evolution. Consensus was based on majority agreement. RESULTS: The multidisciplinary CPPS Standard reports updated consensus terminology in nine domains ...

  20. Result Diversification Based on Query-Specific Cluster Ranking

    NARCIS (Netherlands)

    J. He (Jiyin); E. Meij; M. de Rijke (Maarten)

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking,

  1. Visual Querying in Chemical Databases using SMARTS Patterns

    OpenAIRE

    Šípek, Vojtěch

    2014-01-01

    The purpose of this thesis is to create a framework for visual querying in chemical databases, which will be implemented as a web application. By using a graphical editor, which is part of the client side, the user creates queries that are translated into the chemical query language SMARTS. The query is parsed on the application server, which is connected to the chemical database. The framework also contains tooling for creating the database and the index structure above it.

  2. Result diversification based on query-specific cluster ranking

    NARCIS (Netherlands)

    He, J.; Meij, E.; de Rijke, M.

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification

  3. What is in a name? Understanding the implications of participant terminology.

    Science.gov (United States)

    Bibace, Roger; Clegg, Joshua W; Valsiner, Jaan

    2009-03-01

    The authors discuss the history of research terminology in American psychology with respect to the various labels given to those upon whom we conduct research ("observer"-"subject"-"participant"-"client"). This history is supplemented with an analysis of participant terminology in APA manuals from four historical eras, from the 1950s to the present. The general trend in participant terminology reflects the overall trends in American psychology, beginning with a complex lexicon that admitted both the passive and the active research participant, followed by a dominance of the passive term 'subject' and ending with the terminological ambiguity and multiplicity reflected in contemporary psychology. This selective history serves to contextualize a discussion of the meaning, functions, and implications of the transformations in, and debates over, participant terminology.

  4. Medical Terminology: Root Words. Health Occupations Education Module.

    Science.gov (United States)

    Temple Univ., Philadelphia, PA. Div. of Vocational Education.

    This module on medical terminology (root words) is one of 17 modules designed for individualized instruction in health occupations education programs at both the secondary and postsecondary levels. This module consists of an introduction to root words, a list of resources needed, procedures for using the module, a list of terminology used in the…

  5. Cumulative query method for influenza surveillance using search engine data.

    Science.gov (United States)

    Seo, Dong-Woo; Jo, Min-Woo; Sohn, Chang Hwan; Shin, Soo-Yong; Lee, JaeHo; Yu, Maengsoo; Kim, Won Young; Lim, Kyoung Soo; Lee, Sang-Il

    2014-12-16

    Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development Set 2 and 2011/12 for validation Set 2). Pearson's correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created cumulative query methods, with n representing the number of cumulative combined queries in descending order of the correlation coefficient. In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, but 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, but 6 of 15 combined queries had an r value of ≥.7. The cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation sets.
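    A compact sketch of the cumulative step, assuming weekly query counts and ILI rates are already available as arrays: rank candidate query series by Pearson correlation with ILI, then correlate the running sum of the top-n series with ILI for increasing n. The data below are synthetic; only the ≥ .7 selection threshold comes from the abstract.

        import numpy as np

        def pearson(x, y):
            return float(np.corrcoef(x, y)[0, 1])

        def cumulative_query_method(query_series, ili, threshold=0.7):
            """query_series: dict name -> weekly counts; ili: weekly ILI rates."""
            ranked = sorted(((pearson(s, ili), name) for name, s in query_series.items()),
                            reverse=True)
            ranked = [(r, name) for r, name in ranked if r >= threshold]
            results = []
            running = np.zeros_like(ili, dtype=float)
            for n, (r, name) in enumerate(ranked, start=1):
                running = running + np.asarray(query_series[name], float)
                results.append((n, pearson(running, ili)))
            return results   # correlation of the cumulative method for each n

        rng = np.random.default_rng(0)
        ili = np.sin(np.linspace(0, 3 * np.pi, 52)) + 1.5
        queries = {f"q{i}": ili + rng.normal(0, 0.3 + 0.1 * i, 52) for i in range(5)}
        for n, r in cumulative_query_method(queries, ili):
            print(f"top-{n} cumulative queries: r = {r:.3f}")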

  6. Croatian Analytical Terminology

    Directory of Open Access Journals (Sweden)

    Kastelan-Macan; M.

    2008-04-01

    Full Text Available Results of analytical research are necessary in all human activities. They are inevitable in making decisions in environmental chemistry, agriculture, forestry, veterinary medicine, the pharmaceutical industry, and biochemistry. Without analytical measurements the quality of materials and products cannot be assessed, so that analytical chemistry is an essential part of technical sciences and disciplines. The language of Croatian science, and analytical chemistry within it, was one of the goals of our predecessors. Due to the political situation, they did not succeed entirely, but for the scientists in independent Croatia this is a duty, because language is one of the most important features of the Croatian identity. The awareness of the need to introduce Croatian terminology was systematically developed in the second half of the 19th century, along with the founding of scientific societies and the wish of scientists to write their scientific works in Croatian, so that the results of their research may be applied in the economy. Many authors of textbooks from the 19th and the first half of the 20th century contributed to Croatian analytical terminology (F. Rački, B. Šulek, P. Žulić, G. Pexidr, J. Domac, G. Janeček, F. Bubanović, V. Njegovan and others). M. Deželić published the first systematic chemical terminology in 1940, adjusted to the IUPAC recommendations. In the second half of the 20th century, textbooks in classic analytical chemistry were written by V. Marjanović-Krajovan, M. Gyiketta-Ogrizek, S. Žilić and others. I. Filipović wrote the General and Inorganic Chemistry textbook and the Laboratory Handbook (in collaboration with P. Sabioncello) and contributed greatly to establishing the terminology in instrumental analytical methods. The sources of Croatian nomenclature in modern analytical chemistry today are translated textbooks by Skoog, West and Holler, as well as by Günzler and Gremlich, and original textbooks by S. Turina, Z. ...

  7. Dynamic tables: an architecture for managing evolving, heterogeneous biomedical data in relational database management systems.

    Science.gov (United States)

    Corwin, John; Silberschatz, Avi; Miller, Perry L; Marenco, Luis

    2007-01-01

    Data sparsity and schema evolution issues affecting clinical informatics and bioinformatics communities have led to the adoption of vertical or object-attribute-value-based database schemas to overcome limitations posed when using conventional relational database technology. This paper explores these issues and discusses why biomedical data are difficult to model using conventional relational techniques. The authors propose a solution to these obstacles based on a relational database engine using a sparse, column-store architecture. The authors provide benchmarks comparing the performance of queries and schema-modification operations using three different strategies: (1) the standard conventional relational design; (2) past approaches used by biomedical informatics researchers; and (3) their sparse, column-store architecture. The performance results show that their architecture is a promising technique for storing and processing many types of data that are not handled well by the other two semantic data models.
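    A tiny illustration of the entity-attribute-value (vertical) representation that such sparse schemas rely on: observations are stored as (entity, attribute, value) triples and pivoted into a wide record only when needed. This shows the data-model trade-off in plain Python; it is not the authors' sparse, column-store engine.

        from collections import defaultdict

        # Vertical (EAV) storage: only the attributes actually observed are stored.
        eav_rows = [
            ("patient1", "heart_rate", 72),
            ("patient1", "hba1c", 6.8),
            ("patient2", "heart_rate", 90),
            ("patient2", "genotype_rs123", "AG"),   # attribute most patients lack
        ]

        def pivot(rows):
            """Reassemble sparse triples into per-entity records (a wide view)."""
            records = defaultdict(dict)
            for entity, attribute, value in rows:
                records[entity][attribute] = value
            return dict(records)

        def query_attribute(rows, attribute, predicate):
            """Select entities whose value for `attribute` satisfies `predicate`."""
            return [e for e, a, v in rows if a == attribute and predicate(v)]

        print(pivot(eav_rows)["patient2"])
        print(query_attribute(eav_rows, "heart_rate", lambda v: v > 80))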

  8. Query Health: standards-based, cross-platform population health surveillance.

    Science.gov (United States)

    Klann, Jeffrey G; Buck, Michael D; Brown, Jeffrey; Hadley, Marc; Elmore, Richard; Weber, Griffin M; Murphy, Shawn N

    2014-01-01

    Understanding population-level health trends is essential to effectively monitor and improve public health. The Office of the National Coordinator for Health Information Technology (ONC) Query Health initiative is a collaboration to develop a national architecture for distributed, population-level health queries across diverse clinical systems with disparate data models. Here we review Query Health activities, including a standards-based methodology, an open-source reference implementation, and three pilot projects. Query Health defined a standards-based approach for distributed population health queries, using an ontology based on the Quality Data Model and Consolidated Clinical Document Architecture, Health Quality Measures Format (HQMF) as the query language, the Query Envelope as the secure transport layer, and the Quality Reporting Document Architecture as the result language. We implemented this approach using Informatics for Integrating Biology and the Bedside (i2b2) and hQuery for data analytics and PopMedNet for access control, secure query distribution, and response. We deployed the reference implementation at three pilot sites: two public health departments (New York City and Massachusetts) and one pilot designed to support Food and Drug Administration post-market safety surveillance activities. The pilots were successful, although improved cross-platform data normalization is needed. This initiative resulted in a standards-based methodology for population health queries, a reference implementation, and revision of the HQMF standard. It also informed future directions regarding interoperability and data access for ONC's Data Access Framework initiative. Query Health was a test of the learning health system that supplied a functional methodology and reference implementation for distributed population health queries that has been validated at three sites.

  9. Terminology supported archiving and publication of environmental science data in PANGAEA.

    Science.gov (United States)

    Diepenbroek, Michael; Schindler, Uwe; Huber, Robert; Pesant, Stéphane; Stocker, Markus; Felden, Janine; Buss, Melanie; Weinrebe, Matthias

    2017-11-10

    Exemplified on the information system PANGAEA, we describe the application of terminologies for archiving and publishing environmental science data. A terminology catalogue (TC) was embedded into the system, with interfaces allowing to replicate and to manually work on terminologies. For data ingest and archiving, we show how the TC can improve structuring and harmonizing lineage and content descriptions of data sets. Key is the conceptualization of measurement and observation types (parameters) and methods, for which we have implemented a basic syntax and rule set. For data access and dissemination, we have improved findability of data through enrichment of metadata with TC terms. Semantic annotations, e.g. adding term concepts (including synonyms and hierarchies) or mapped terms of different terminologies, facilitate comprehensive data retrievals. The PANGAEA thesaurus of classifying terms, which is part of the TC is used as an umbrella vocabulary that links the various domains and allows drill downs and side drills with various facets. Furthermore, we describe how TC terms can be linked to nominal data values. This improves data harmonization and facilitates structural transformation of heterogeneous data sets to a common schema. Technical developments are complemented by work on the metadata content. Over the last 20 years, more than 100 new parameters have been defined on average per week. Recently, PANGAEA has increasingly been submitting new terms to various terminology services. Matching terms from terminology services with our parameter or method strings is supported programmatically. However, the process ultimately needs manual input by domain experts. The quality of terminology services is an additional limiting factor, and varies with respect to content, editorial, interoperability, and sustainability. Good quality terminology services are the building blocks for the conceptualization of parameters and methods. In our view, they are essential for data

  10. [German influences on Romanian medical terminology].

    Science.gov (United States)

    Răcilă, R G; Răileanu, Irena; Rusu, V

    2008-01-01

    The medical terminology plays a key part both in the study of medicine as well as in its practice. Moreover, understanding the medical terms is important not only for the doctor but also for the patients who want to learn more about their condition. For these reasons we believe that the study of medical terminology is one of great interest. The aim of our paper was to evaluate the German linguistic and medical influences on the evolution of the Romanian medical terminology. Since the Romanian-German cultural contacts date back to the 12th century we had reasons to believe that the number of German medical words in Romanian would be significant. To our surprise, the Romanian language has very few German words and even fewer medical terms of German origin. However, when we searched the list of diseases coined after famous medical personalities, we found out that 26% of them bore the names of German doctors and scientists. Taken together this proves that the German medical school played an important role in the evolution of Romanian medicine despite the fact that the Romanian vocabulary was only slightly influenced by the German language. We explain this fact by the structural differences between the Romanian and German languages, which make it hard for German loans to be integrated in the Romanian lexis. In conclusion we state that the German influence on the Romanian medical terminology is weak despite the important contribution of the German medical school to the development of medical education and healthcare in Romania.

  11. A general approach to query flattening

    NARCIS (Netherlands)

    van Ruth, J.

    The translation of queries from complex data models to simpler data models is a recurring theme in the construction of efficient data management systems. In this paper we propose a general framework to guide the translation from data models with nested types to a flat relational model (query

  12. Exploiting External Collections for Query Expansion

    NARCIS (Netherlands)

    Weerkamp, W.; Balog, K.; de Rijke, M.

    2012-01-01

    A persisting challenge in the field of information retrieval is the vocabulary mismatch between a user’s information need and the relevant documents. One way of addressing this issue is to apply query modeling: to add terms to the original query and reweigh the terms. In social media, where

  13. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit; Harrison, Rob; Pawar, Ankita; Birkner, Rü diger; Canini, Marco; Feamster, Nick; Rexford, Jennifer; Willinger, Walter

    2017-01-01

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.
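    A toy dataflow over packet records in the spirit of the query interface described (filter, map, reduce over a traffic stream). It is a plain-Python emulation for illustration; the record layout and function names are assumptions, and nothing here reflects Sonata's actual API or its switch/stream partitioning.

        from collections import Counter

        # Synthetic packet records: (src_ip, dst_ip, dst_port, tcp_flags)
        packets = [
            ("10.0.0.1", "10.0.0.9", 80, "S"),
            ("10.0.0.2", "10.0.0.9", 80, "S"),
            ("10.0.0.2", "10.0.0.9", 80, "SA"),
            ("10.0.0.3", "10.0.0.9", 22, "S"),
            ("10.0.0.2", "10.0.0.9", 80, "S"),
        ]

        def newly_opened_connections(stream, threshold=2):
            """Telemetry-style query: count TCP SYNs per destination and report
            destinations that exceed a threshold (e.g., a possible SYN flood)."""
            syns = (p for p in stream if p[3] == "S")          # filter (could run on the switch)
            keyed = ((dst, 1) for _, dst, _, _ in syns)        # map to (key, 1)
            counts = Counter()
            for dst, one in keyed:                             # reduce by key
                counts[dst] += one
            return {dst: n for dst, n in counts.items() if n >= threshold}

        print(newly_opened_connections(packets))   # {'10.0.0.9': 4}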

  14. Sonata: Query-Driven Network Telemetry

    KAUST Repository

    Gupta, Arpit

    2017-05-02

    Operating networks depends on collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from analysis, producing either too much data to answer a general question or too little data to answer a detailed question. In this paper, we present Sonata, a network telemetry system that uses a uniform query interface to drive the joint collection and analysis of network traffic. Sonata takes advantage of two emerging technologies---streaming analytics platforms and programmable network devices---to facilitate joint collection and analysis. Sonata allows operators to more directly express network traffic analysis tasks in terms of a high-level language. The underlying runtime partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, iteratively refines the query to efficiently capture only the traffic that pertains to the operator's query, and exploits sketches to reduce state in switches in exchange for more approximate results. Through an evaluation of a prototype implementation, we demonstrate that Sonata can support a wide range of network telemetry tasks with less state in the network, and lower data rates to streaming analytics systems, than current approaches can achieve.

  15. jQuery Mobile Up and Running

    CERN Document Server

    Firtman, Maximiliano

    2012-01-01

    Would you like to build one mobile web application that works on iPad and Kindle Fire as well as iPhone and Android smartphones? This introductory guide to jQuery Mobile shows you how. Through a series of hands-on exercises, you'll learn the best ways to use this framework's many interface components to build customizable, multiplatform apps. You don't need any programming skills or previous experience with jQuery to get started. By the time you finish this book, you'll know how to create responsive, Ajax-based interfaces that work on a variety of smartphones and tablets, using jQuery Mobile

  16. jQuery for designers beginner's guide

    CERN Document Server

    MacLees, Natalie

    2014-01-01

    A step-by-step guide that spices up your web pages and designs them in the way you want using the most widely used JavaScript library, jQuery. The beginner-friendly and easy-to-understand approach of the book will help get to grips with jQuery in no time. If you know the fundamentals of HTML and CSS, and want to extend your knowledge by learning to use JavaScript, then this is just the book for you. jQuery makes JavaScript straightforward and approachable - you'll be surprised at how easy it can be to add animations and special effects to your beautifully designed pages.

  17. Querying Business Process Models with VMQL

    DEFF Research Database (Denmark)

    Störrle, Harald; Acretoaie, Vlad

    2013-01-01

    The Visual Model Query Language (VMQL) has been invented with the objectives (1) to make it easier for modelers to query models effectively, and (2) to be universally applicable to all modeling languages. In previous work, we have applied VMQL to UML, and validated the first of these two claims. ...

  18. Does query expansion limit our learning? A comparison of social-based expansion to content-based expansion for medical queries on the internet.

    Science.gov (United States)

    Pentoney, Christopher; Harwell, Jeff; Leroy, Gondy

    2014-01-01

    Searching for medical information online is a common activity. While it has been shown that forming good queries is difficult, Google's query suggestion tool, a type of query expansion, aims to facilitate query formation. However, it is unknown how this expansion, which is based on what others searched for, affects the information gathering of the online community. To measure the impact of social-based query expansion, this study compared it with content-based expansion, i.e., what is really in the text. We used 138,906 medical queries from the AOL User Session Collection and expanded them using Google's Autocomplete method (social-based) and the content of the Google Web Corpus (content-based). We evaluated the specificity and ambiguity of the expansion terms for trigram queries. We also looked at the impact on the actual results using domain diversity and expansion edit distance. Results showed that the social-based method provided more precise expansion terms as well as terms that were less ambiguous. Expanded queries do not differ significantly in diversity when expanded using the social-based method (6.72 different domains returned in the first ten results, on average) vs. content-based method (6.73 different domains, on average).
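    A small sketch of the domain-diversity measure used to compare the two expansion methods: the number of distinct hostnames among the top-10 result URLs for a query. The URLs below are fabricated placeholders.

        from urllib.parse import urlparse

        def domain_diversity(result_urls, k=10):
            """Number of distinct domains among the top-k results for one query."""
            domains = {urlparse(u).netloc.lower() for u in result_urls[:k]}
            return len(domains)

        top10 = [
            "https://www.mayoclinic.org/diseases/diabetes",
            "https://medlineplus.gov/diabetes.html",
            "https://www.cdc.gov/diabetes/basics/",
            "https://www.mayoclinic.org/symptoms/hyperglycemia",
            "https://en.wikipedia.org/wiki/Diabetes",
        ]
        print(domain_diversity(top10))   # 4 distinct domains in this fabricated example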

  19. Research in Mobile Database Query Optimization and Processing

    Directory of Open Access Journals (Sweden)

    Agustinus Borgy Waluyo

    2005-01-01

    Full Text Available The emergence of mobile computing provides the ability to access information at any time and place. However, as mobile computing environments have inherent factors like power, storage, asymmetric communication cost, and bandwidth limitations, efficient query processing and minimum query response time are definitely of great interest. This survey groups a variety of query optimization and processing mechanisms in mobile databases into two main categories, namely: (i) query processing strategy, and (ii) caching management strategy. Query processing includes both pull and push operations (broadcast mechanisms). We further classify push operation into on-demand broadcast and periodic broadcast. Push operation (on-demand broadcast) relates to designing techniques that enable the server to accommodate multiple requests so that the request can be processed efficiently. Push operation (periodic broadcast) corresponds to data dissemination strategies. In this scheme, several techniques to improve the query performance by broadcasting data to a population of mobile users are described. A caching management strategy defines a number of methods for maintaining cached data items in clients' local storage. This strategy considers critical caching issues such as caching granularity, caching coherence strategy and caching replacement policy. Finally, this survey concludes with several open issues relating to mobile query optimization and processing strategy.

  20. A comprehensive SWOT audit of the role of the biomedical physicist in the education of healthcare professionals in Europe.

    Science.gov (United States)

    Caruana, C J; Wasilewska-Radwanska, M; Aurengo, A; Dendy, P P; Karenauskaite, V; Malisan, M R; Meijer, J H; Mihov, D; Mornstein, V; Rokita, E; Vano, E; Weckstrom, M; Wucherer, M

    2010-04-01

    Although biomedical physicists provide educational services to the healthcare professions in the majority of universities in Europe, their precise role with respect to the education of the healthcare professions has not been studied systematically. To address this issue we are conducting a research project to produce a strategic development model for the role using the well-established SWOT (Strengths, Weaknesses, Opportunities, Threats) methodology. SWOT based strategic planning is a two-step process: one first carries out a SWOT position audit and then uses the identified SWOT themes to construct the strategic development model. This paper reports the results of a SWOT audit for the role of the biomedical physicist in the education of the healthcare professions in Europe. Internal Strengths and Weaknesses of the role were identified through a qualitative survey of biomedical physics departments and biomedical physics curricula delivered to healthcare professionals across Europe. External environmental Opportunities and Threats were identified through a systematic survey of the healthcare, healthcare professional education and higher education literature and categorized under standard PEST (Political, Economic, Social-Psychological, Technological-Scientific) categories. The paper includes an appendix of terminology. Defined terms are marked with an asterisk in the text. Copyright 2009 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  1. Lignocellulosic Biomass Derived Functional Materials: Synthesis and Applications in Biomedical Engineering.

    Science.gov (United States)

    Zhang, Lei; Peng, Xinwen; Zhong, Linxin; Chua, Weitian; Xiang, Zhihua; Sun, Runcang

    2017-09-18

    The pertinent issue of resources shortage arising from global climate change in recent years has accentuated the importance of materials that are environmentally friendly. Despite the merits of current materials such as cellulose, the most abundant natural polysaccharide on earth, the incorporation of lignocellulosic biomass has the potential to add value to the recent development of cellulose derivatives in drug delivery systems. Lignocellulosic biomass has a hierarchical structure comprising cellulose, hemicellulose and lignin. As an excellent substrate that is renewable, biodegradable, biocompatible and chemically accessible for modified materials, lignocellulosic biomass sets forth a myriad of applications. To date, materials derived from lignocellulosic biomass have been extensively explored for new technological development and applications, such as biomedical, green electronics and energy products. In this review, the chemical constituents of lignocellulosic biomass are first discussed before we critically examine the potential alternatives in the field of biomedical application. In addition, the pretreatment methods for extracting cellulose, hemicellulose and lignin from lignocellulosic biomass as well as their biological applications, including drug delivery, biosensors, tissue engineering, etc., will be reviewed. It is anticipated there will be increasing interest and research findings in cellulose, hemicellulose and lignin from natural resources, which will help provide important directions for the development of biomedical applications. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  2. Terminology in South Africa*

    African Journals Online (AJOL)

    This paper was presented at the Third International Conference of the African ... Various aspects relating to principles and methods of terminology and ... Standardization. Research and Development. Marketing. Communications ... Example 8). This is an attempt at conveying to the user the meaning attached to the term.

  3. Reformulating XQuery queries using GLAV mapping and complex unification

    Directory of Open Access Journals (Sweden)

    Saber Benharzallah

    2016-01-01

    Full Text Available This paper describes an algorithm for the reformulation of XQuery queries. The mediation is based on an essential component called the mediator. Its main role is to reformulate a user query, written in terms of the global schema, into queries written in terms of the source schemas. Our algorithm is based on the principle of logical equivalence and on simple and complex unification to obtain a better reformulation. It takes an XQuery query, the global schema (written in XML Schema), and GLAV mappings as input parameters and provides a resultant query written in terms of the source schemas. The results of the implementation show the proper functioning of the algorithm.

  4. Towards Optimal Multi-Dimensional Query Processing with BitmapIndices

    Energy Technology Data Exchange (ETDEWEB)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2005-09-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. This paper studies strategies for minimizing the access costs for processing multi-dimensional queries using bitmap indices with binning. Innovative features of our algorithm include (a) optimally placing the bin boundaries and (b) dynamically reordering the evaluation of the query terms. In addition, we derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
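    A compact sketch of a binned bitmap index: each bin gets one bitmap, a range query ORs the bitmaps of fully covered bins, and only rows from the partially covered edge bins are re-checked against the raw values (the candidate check). The equi-width bin boundaries here are illustrative; the paper is about placing the boundaries optimally and reordering query-term evaluation.

        import bisect

        class BinnedBitmapIndex:
            def __init__(self, values, boundaries):
                """boundaries: ascending bin edges; bin i holds values in [b_i, b_{i+1})."""
                self.values = list(values)
                self.boundaries = list(boundaries)
                self.bitmaps = [[False] * len(self.values) for _ in range(len(boundaries) - 1)]
                for row, v in enumerate(self.values):
                    b = bisect.bisect_right(self.boundaries, v) - 1
                    self.bitmaps[b][row] = True

            def range_query(self, lo, hi):
                """Rows with lo <= value < hi."""
                first = max(bisect.bisect_right(self.boundaries, lo) - 1, 0)
                last = min(bisect.bisect_right(self.boundaries, hi) - 1, len(self.bitmaps) - 1)
                hits = [False] * len(self.values)
                for b in range(first, last + 1):
                    fully_covered = self.boundaries[b] >= lo and self.boundaries[b + 1] <= hi
                    for row, bit in enumerate(self.bitmaps[b]):
                        if not bit:
                            continue
                        # Edge bins need a candidate check against the raw values.
                        if fully_covered or lo <= self.values[row] < hi:
                            hits[row] = True
                return [row for row, h in enumerate(hits) if h]

        idx = BinnedBitmapIndex([3, 7, 12, 18, 25, 33], boundaries=[0, 10, 20, 30, 40])
        print(idx.range_query(5, 28))   # rows holding 7, 12, 18, 25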

  5. The Medical Terminology Course--Its Necessity and the Solution.

    Science.gov (United States)

    Phillips, J. H.

    1981-01-01

    Addresses difficulties faced by medical students in the acquisition of a technical terminology largely based on Greek or Latin, and explains how in recent years undergraduate Classics departments have met the challenge by offering a Medical Terminology course. Discusses course development and currently available instruction materials. (MES)

  6. Similarity-based recommendation of new concepts to a terminology

    NARCIS (Netherlands)

    Chandar, Praveen; Yaman, Anil; Hoxha, Julia; He, Zhe; Weng, Chunhua

    2015-01-01

    Terminologies can suffer from poor concept coverage due to delays in addition of new concepts. This study tests a similarity-based approach to recommending concepts from a text corpus to a terminology. Our approach involves extraction of candidate concepts from a given text corpus, which are

  7. Terminology in South Africa*

    African Journals Online (AJOL)

    needs to facilitate international communication. Various aspects ... terminology would therefore form part of the special language of a particular ... teaching (see Figure 1). 2.2 .... a cognitive one, which relates the linguistic forms to their conceptual .... "to bear a burden, keep in custody", from bajulus "porter, load carrier".

  8. From Data to Knowledge through Concept-oriented Terminologies

    Science.gov (United States)

    Cimino, James J.

    2000-01-01

    Knowledge representation involves enumeration of conceptual symbols and arrangement of these symbols into some meaningful structure. Medical knowledge representation has traditionally focused more on the structure than the symbols. Several significant efforts are under way, at local, national, and international levels, to address the representation of the symbols though the creation of high-quality terminologies that are themselves knowledge based. This paper reviews these efforts, including the Medical Entities Dictionary (MED) in use at Columbia University and the New York Presbyterian Hospital. A decade's experience with the MED is summarized to serve as a proof-of-concept that knowledge-based terminologies can support the use of coded patient data for a variety of knowledge-based activities, including the improved understanding of patient data, the access of information sources relevant to specific patient care problems, the application of expert systems directly to the care of patients, and the discovery of new medical knowledge. The terminological knowledge in the MED has also been used successfully to support clinical application development and maintenance, including that of the MED itself. On the basis of this experience, current efforts to create standard knowledge-based terminologies appear to be justified. PMID:10833166

  9. AQBE — QBE Style Queries for Archetyped Data

    Science.gov (United States)

    Sachdeva, Shelly; Yaginuma, Daigo; Chu, Wanming; Bhalla, Subhash

    Large-scale adoption of electronic healthcare applications requires semantic interoperability. Recent proposals put forward an advanced (multi-level) DBMS architecture for repository services for health records of patients. These also require query interfaces at multiple levels and at the level of semi-skilled users. In this regard, a high-level user interface for querying the new form of standardized Electronic Health Records system has been examined in this study. It proposes a step-by-step graphical query interface to allow semi-skilled users to write queries. Its aim is to decrease user effort and communication ambiguities, and increase user friendliness.

  10. Group-by Skyline Query Processing in Relational Engines

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Luk, Ming-Hay; Lo, Eric

    2009-01-01

    The skyline operator was first proposed in 2001 for retrieving interesting tuples from a dataset. Since then, 100+ skyline-related papers have been published; however, we discovered that one of the most intuitive and practical types of skyline queries, namely, group-by skyline queries, remains ... the missing cost model for the BBS algorithm. Experimental results show that our techniques are able to devise the best query plans for a variety of group-by skyline queries. Our focus is on algorithms that can be directly implemented in today's commercial database systems without the addition of new access ...
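    A small sketch of the query type itself, assuming a minimizing skyline over two attributes: tuples are partitioned by the group-by attribute and a simple block-nested-loops pass keeps the non-dominated tuples of each group. This illustrates the semantics only, not the relational-engine query plans the paper devises.

        from collections import defaultdict

        def dominates(a, b):
            """a dominates b if it is no worse in every dimension and better in one (minimizing)."""
            return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

        def group_by_skyline(rows, group_key, skyline_attrs):
            groups = defaultdict(list)
            for row in rows:
                groups[row[group_key]].append(row)
            result = {}
            for g, members in groups.items():
                skyline = []
                for cand in members:                       # block-nested-loops skyline
                    c = tuple(cand[a] for a in skyline_attrs)
                    if any(dominates(tuple(s[a] for a in skyline_attrs), c) for s in skyline):
                        continue
                    skyline = [s for s in skyline
                               if not dominates(c, tuple(s[a] for a in skyline_attrs))]
                    skyline.append(cand)
                result[g] = skyline
            return result

        hotels = [
            {"city": "Granada", "price": 80, "dist": 2.0},
            {"city": "Granada", "price": 120, "dist": 0.5},
            {"city": "Granada", "price": 150, "dist": 1.5},   # dominated within its group
            {"city": "Aalborg", "price": 90, "dist": 1.0},
        ]
        for city, sky in group_by_skyline(hotels, "city", ("price", "dist")).items():
            print(city, [(h["price"], h["dist"]) for h in sky])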

  11. Head First jQuery

    CERN Document Server

    Benedetti, Ryan

    2011-01-01

    Want to add more interactivity and polish to your websites? Discover how jQuery can help you build complex scripting functionality in just a few lines of code. With Head First jQuery, you'll quickly get up to speed on this amazing JavaScript library by learning how to navigate HTML documents while handling events, effects, callbacks, and animations. By the time you've completed the book, you'll be incorporating Ajax apps, working seamlessly with HTML and CSS, and handling data with PHP, MySQL and JSON. If you want to learn-and understand-how to create interactive web pages, unobtrusive scrip

  12. Standard Terminology Relating to Wear and Erosion

    CERN Document Server

    American Society for Testing and Materials. Philadelphia

    2010-01-01

    1.1 The terms and their definitions given herein represent terminology relating to wear and erosion of solid bodies due to mechanical interactions such as occur with cavitation, impingement by liquid jets or drops or by solid particles, or relative motion against contacting solid surfaces or fluids. This scope interfaces with but generally excludes those processes where material loss is wholly or principally due to chemical action and other related technical fields as, for instance, lubrication. 1.2 This terminology is not exhaustive; the absence of any particular term from this collection does not necessarily imply that its use within this scope is discouraged. However, the terms given herein are the recommended terms for the concepts they represent unless otherwise noted. 1.3 Certain general terms and definitions may be restricted and interpreted, if necessary, to make them particularly applicable to the scope as defined herein. 1.4 The purpose of this terminology is to encourage uniformity and accuracy ...

  13. ALGORITMA RC4 DALAM PROTEKSI TRANSMISI DAN HASIL QUERY UNTUK ORDBMS POSTGRESQL

    Directory of Open Access Journals (Sweden)

    Yuri Ariyanto

    2009-01-01

    This research examines how the RC4 cryptographic algorithm can be implemented to protect queries and query results: both are encrypted and decrypted while in transit on the network. The implementation consists of software built on the client side that accesses a database located on the server side. The software provides facilities for encrypting and decrypting the queries and query results sent from client to server and vice versa, so that their transmission is secured. The protection is considered successful if the software encrypts the transmitted queries and query results in such a way that an eavesdropper who intercepts them cannot understand their content. The conclusion of this research is that the software that was built successfully encrypts the queries and query results transmitted between the client application and the database server.

  14. Relative aggregation operator in database fuzzy querying

    Directory of Open Access Journals (Sweden)

    Luminita DUMITRIU

    2005-12-01

    Fuzzy selection criteria for querying relational databases include vague terms; these usually refer to linguistic values from the attributes' linguistic domains, defined as fuzzy sets. Generally, when a vague query is processed, the definitions of the vague terms must already exist in a knowledge base. But there are also cases in which vague terms must be defined dynamically, when a particular operation is used to aggregate simple criteria into a complex selection. The paper presents a new aggregation operator and the corresponding algorithm to evaluate such fuzzy queries.

  15. Query-Time Optimization Techniques for Structured Queries in Information Retrieval

    Science.gov (United States)

    Cartright, Marc-Allen

    2013-01-01

    The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…

  16. Improving Web Search for Difficult Queries

    Science.gov (United States)

    Wang, Xuanhui

    2009-01-01

    Search engines have now become essential tools in all aspects of our life. Although a variety of information needs can be served very successfully, there are still a lot of queries that search engines can not answer very effectively and these queries always make users feel frustrated. Since it is quite often that users encounter such "difficult…

  17. Matching health information seekers' queries to medical terms.

    Science.gov (United States)

    Soualmia, Lina F; Prieur-Gaston, Elise; Moalla, Zied; Lecroq, Thierry; Darmoni, Stéfan J

    2012-01-01

    The Internet is a major source of health information but most seekers are not familiar with medical vocabularies. Hence, their searches fail due to bad query formulation. Several methods have been proposed to improve information retrieval: query expansion, syntactic and semantic techniques or knowledge-based methods. However, it would be useful to clean those queries which are misspelled. In this paper, we propose a simple yet efficient method in order to correct misspellings of queries submitted by health information seekers to a medical online search tool. In addition to query normalizations and exact phonetic term matching, we tested two approximate string comparators: the similarity score function of Stoilos and the normalized Levenshtein edit distance. We propose here to combine them to increase the number of matched medical terms in French. We first took a sample of query logs to determine the thresholds and processing times. In the second run, at a greater scale, we tested different combinations of query normalizations before or after misspelling correction with the retained thresholds in the first run. According to the total number of suggestions (around 163, the number of the first sample of queries), at a threshold comparator score of 0.3, the normalized Levenshtein edit distance gave the highest F-Measure (88.15%) and at a threshold comparator score of 0.7, the Stoilos function gave the highest F-Measure (84.31%). By combining Levenshtein and Stoilos, the highest F-Measure (80.28%) is obtained with 0.2 and 0.7 thresholds respectively. However, queries are composed of several terms that may be combinations of medical terms. The process of query normalization and segmentation is thus required. The highest F-Measure (64.18%) is obtained when this process is realized before spelling-correction. Despite the widely known high performance of the normalized edit distance of Levenshtein, we show in this paper that its combination with the Stoilos algorithm improved
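
    For illustration, a minimal Python sketch of the normalized Levenshtein comparator and of one possible way to combine it with a second similarity function is given below. It is not the authors' implementation: the Stoilos similarity function is only passed in as a placeholder parameter, the suggestion logic (accepting a term if either comparator passes its threshold) is an assumption, and the 0.2/0.7 defaults merely echo the thresholds quoted in the abstract.

        def levenshtein(a, b):
            # classic dynamic-programming edit distance
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, start=1):
                cur = [i]
                for j, cb in enumerate(b, start=1):
                    cur.append(min(prev[j] + 1,                    # deletion
                                   cur[j - 1] + 1,                 # insertion
                                   prev[j - 1] + (ca != cb)))      # substitution
                prev = cur
            return prev[-1]

        def normalized_levenshtein(a, b):
            # edit distance scaled to [0, 1]; 0.0 means the strings are identical
            if not a and not b:
                return 0.0
            return levenshtein(a, b) / max(len(a), len(b))

        def suggest(term, vocabulary, stoilos_similarity, max_dist=0.2, min_sim=0.7):
            # hypothetical combination: accept a candidate if either comparator passes
            return [v for v in vocabulary
                    if normalized_levenshtein(term, v) <= max_dist
                    or stoilos_similarity(term, v) >= min_sim]

        # stubbed Stoilos comparator (always 0.0) just to make the sketch runnable
        print(suggest("diabette", ["diabete", "diarrhee"], lambda a, b: 0.0))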

  18. Processing SPARQL queries with regular expressions in RDF databases

    Science.gov (United States)

    2011-01-01

    Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns. PMID:21489225

  19. Processing SPARQL queries with regular expressions in RDF databases.

    Science.gov (United States)

    Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

    2011-03-29

    As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users' requests for extracting information from the RDF data as well as the lack of users' knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.
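
    The kind of query this framework targets can be illustrated with a plain SPARQL FILTER regex evaluated by the rdflib Python library over a tiny invented graph; the sketch below shows only the query pattern and has none of the indexing or cost-based optimizations described in the paper.

        from rdflib import Graph, Literal, Namespace, RDF

        EX = Namespace("http://example.org/")
        g = Graph()
        g.add((EX.p53, RDF.type, EX.Protein))
        g.add((EX.p53, EX.label, Literal("cellular tumor antigen p53")))
        g.add((EX.brca1, RDF.type, EX.Protein))
        g.add((EX.brca1, EX.label, Literal("breast cancer type 1 susceptibility protein")))

        # SPARQL query with a regular-expression pattern over the label values
        query = """
        PREFIX ex: <http://example.org/>
        SELECT ?protein ?label WHERE {
            ?protein a ex:Protein ;
                     ex:label ?label .
            FILTER regex(?label, "tumor|cancer", "i")
        }
        """
        for row in g.query(query):
            print(row.protein, row.label)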

  20. Next generation terminology infrastructure to support interprofessional care planning.

    Science.gov (United States)

    Collins, Sarah; Klinkenberg-Ramirez, Stephanie; Tsivkin, Kira; Mar, Perry L; Iskhakova, Dina; Nandigam, Hari; Samal, Lipika; Rocha, Roberto A

    2017-11-01

    Develop a prototype of an interprofessional terminology and information model infrastructure that can enable care planning applications to facilitate patient-centered care, learn care plan linkages and associations, provide decision support, and enable automated, prospective analytics. The study followed a three-step approach: (1) process model and clinical scenario development, (2) requirements analysis, and (3) development and validation of information and terminology models. Components of the terminology model include: Health Concerns, Goals, Decisions, Interventions, Assessments, and Evaluations. A terminology infrastructure should: (A) Include discrete care plan concepts; (B) Include sets of profession-specific concerns, decisions, and interventions; (C) Communicate rationales, anticipatory guidance, and guidelines that inform decisions among the care team; (D) Define semantic linkages across clinical events and professions; (E) Define sets of shared patient goals and sub-goals, including patient stated goals; (F) Capture evaluation toward achievement of goals. These requirements were mapped to the AHRQ Care Coordination Measures Framework. This study used a constrained set of clinician-validated clinical scenarios. Terminology models for goals and decisions are unavailable in SNOMED CT, limiting the ability to evaluate these aspects of the proposed infrastructure. Defining and linking subsets of care planning concepts appears to be feasible, but also essential to model interprofessional care planning for common co-occurring conditions and chronic diseases. We recommend the creation of goal dynamics and decision concepts in SNOMED CT to further enable the necessary models. Systems with flexible terminology management infrastructure may enable intelligent decision support to identify conflicting and aligned concerns, goals, decisions, and interventions in shared care plans, ultimately decreasing documentation effort and cognitive burden for clinicians and

  1. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    Science.gov (United States)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.
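
    The general idea, encoding a join order as a permutation and letting a genetic algorithm search for a cheap one, can be sketched as follows. The cost function, the order crossover, the swap mutation and all parameters here are illustrative placeholders, not the actual RCQ-GA operators or the two-phase benchmark from the paper.

        import random

        def evolve_join_order(num_joins, cost, pop_size=30, generations=100,
                              mutation_rate=0.2, seed=None):
            """Search for a low-cost join order (a permutation of join indices)."""
            rng = random.Random(seed)
            population = [rng.sample(range(num_joins), num_joins) for _ in range(pop_size)]

            def crossover(p1, p2):
                # order crossover: copy a slice from p1, fill the rest in p2's order
                i, j = sorted(rng.sample(range(num_joins), 2))
                child = [None] * num_joins
                child[i:j] = p1[i:j]
                rest = [g for g in p2 if g not in child]
                for k in range(num_joins):
                    if child[k] is None:
                        child[k] = rest.pop(0)
                return child

            def mutate(order):
                if rng.random() < mutation_rate:
                    a, b = rng.sample(range(num_joins), 2)
                    order[a], order[b] = order[b], order[a]
                return order

            for _ in range(generations):
                population.sort(key=cost)
                survivors = population[:pop_size // 2]          # simple truncation selection
                children = [mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
                            for _ in range(pop_size - len(survivors))]
                population = survivors + children
            return min(population, key=cost)

        # toy cost model (placeholder): weight each join's selectivity by its position
        selectivity = [0.9, 0.1, 0.5, 0.3, 0.7]
        best = evolve_join_order(len(selectivity),
                                 cost=lambda order: sum(selectivity[j] * (pos + 1)
                                                        for pos, j in enumerate(order)),
                                 seed=42)
        print(best)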

  2. A Streams-Based Framework for Defining Location-Based Queries

    DEFF Research Database (Denmark)

    Jensen, Christian Søndergaard; Xuegang, Huang

    2007-01-01

    An infrastructure is emerging that supports the delivery of on-line, location-enabled services to mobile users. Such services involve novel database queries, and the database research community is quite active in proposing techniques for the efficient processing of such queries. In parallel to this, the management of data streams has become an active area of research. While most research in mobile services concerns performance issues, this paper aims to establish a formal framework for defining the semantics of queries encountered in mobile services, most notably the so-called continuous queries that are particularly relevant in this context. Rather than inventing an entirely new framework, the paper proposes a framework that builds on concepts from data streams and temporal databases. Definitions of example queries demonstrate how the framework enables clear formulation of query semantics and the comparison...

  3. Terminology in the Making

    DEFF Research Database (Denmark)

    Ørum, Tania

    2017-01-01

    Pop art seems to be a more prevalent term in Sweden, whereas in Denmark the dominant term was minimalism. However, some of the problems of developing a terminology and agreeing on a description of the new art movements in the 1960s seem to exist in the American context as well....

  4. Query Expansion: Is It Necessary In Textual Case-Based Reasoning ...

    African Journals Online (AJOL)

    Query expansion (QE) is the process of transforming a seed query to improve retrieval performance in information retrieval operations. It is often intended to overcome a vocabulary mismatch between the query and the document collection. Query expansion is known to improve retrieval effectiveness of some information ...

  5. Determinacy in Static Analysis of jQuery

    DEFF Research Database (Denmark)

    Andreasen, Esben; Møller, Anders

    2014-01-01

    Static analysis for JavaScript can potentially help programmers find errors early during development. Although much progress has been made on analysis techniques, a major obstacle is the prevalence of libraries, in particular jQuery, which apply programming patterns that have detrimental consequences... We present a static dataflow analysis for JavaScript that infers and exploits determinacy information on-the-fly, to enable analysis of some of the most complex parts of jQuery. The techniques are implemented in the TAJS analysis tool and evaluated on a collection of small programs that use jQuery. Our...

  6. Considerations regarding nuclear medicine terminology

    International Nuclear Information System (INIS)

    Als, C.

    2008-01-01

    Through a number of examples, this article shows the value of careful use of terminology in nuclear medicine. Everyone stands to benefit from it, from patients to physicians in the various disciplines. (N.C.)

  7. Experimental quantum private queries with linear optics

    International Nuclear Information System (INIS)

    De Martini, Francesco; Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo; Nagali, Eleonora; Sansoni, Linda; Sciarrino, Fabio

    2009-01-01

    The quantum private query is a quantum cryptographic protocol to recover information from a database, preserving both user and data privacy: the user can test whether someone has retained information on which query was asked and the database provider can test the amount of information released. Here we discuss a variant of the quantum private query algorithm that admits a simple linear optical implementation: it employs the photon's momentum (or time slot) as address qubits and its polarization as bus qubit. A proof-of-principle experimental realization is implemented.

  8. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2006-01-01

    In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine as well as the query evaluation techniques. Performance experiments with a prototypical implementation suggest...

  9. Analysis of PubMed User Sessions Using a Full-Day PubMed Query Log: A Comparison of Experienced and Nonexperienced PubMed Users

    Science.gov (United States)

    2015-01-01

    Background PubMed is the largest biomedical bibliographic information source on the Internet. PubMed has been considered one of the most important and reliable sources of up-to-date health care evidence. Previous studies examined the effects of domain expertise/knowledge on search performance using PubMed. However, very little is known about PubMed users’ knowledge of information retrieval (IR) functions and their usage in query formulation. Objective The purpose of this study was to shed light on how experienced/nonexperienced PubMed users perform their search queries by analyzing a full-day query log. Our hypotheses were that (1) experienced PubMed users who use system functions quickly retrieve relevant documents and (2) nonexperienced PubMed users who do not use them have longer search sessions than experienced users. Methods To test these hypotheses, we analyzed PubMed query log data containing nearly 3 million queries. User sessions were divided into two categories: experienced and nonexperienced. We compared experienced and nonexperienced users per number of sessions, and experienced and nonexperienced user sessions per session length, with a focus on how fast they completed their sessions. Results To test our hypotheses, we measured how successful information retrieval was (at retrieving relevant documents), represented as the decrease rates of experienced and nonexperienced users from a session length of 1 to 2, 3, 4, and 5. The decrease rate (from a session length of 1 to 2) of the experienced users was significantly larger than that of the nonexperienced groups. Conclusions Experienced PubMed users retrieve relevant documents more quickly than nonexperienced PubMed users in terms of session length. PMID:26139516

  10. Parallelizing Federated SPARQL Queries in Presence of Replicated Data

    DEFF Research Database (Denmark)

    Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala

    2017-01-01

    Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication aware federated query engines mainly focus on pruning sources during the source selection and query decomposition in order to reduce intermediate result...

  11. A study of medical and health queries to web search engines.

    Science.gov (United States)

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i). comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii). comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii). medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i). a small percentage of web queries are medical or health related, (ii). the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii). over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggests some implications for the use of the general web search engines when seeking medical/health information.

  12. Morphological assimilation of borrowed terminology (on the example of terminological units, borrowed from French

    Directory of Open Access Journals (Sweden)

    Kaneeva Anna Vitalievna

    2015-06-01

    The study of morphological assimilation is an important problem: how comfortably native speakers can use a borrowed term in the flow of speech, without stumbling over its grammatical forms, largely determines its subsequent semantic assimilation and its incorporation into a particular terminology system. The analysis clearly shows the place of French borrowings in the morphological system of the Russian language and helps to identify the most significant differences between the structures of the two languages. At the same time, it suggests that many French terminological borrowings have assimilated fairly well morphologically; certain groups arose in the Russian language that morphologically transformed the French elements entering Russian. This makes the borrowing smoother, and a loan word, provided that it actually meets the needs of the receptor language, adapts quickly in the host language and is easily absorbed by native speakers

  13. Processing SPARQL queries with regular expressions in RDF databases

    Directory of Open Access Journals (Sweden)

    Cho Hune

    2011-03-01

    Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.

  14. Macromolecular query language (MMQL): prototype data model and implementation.

    Science.gov (United States)

    Shindyalov, I N; Chang, W; Pu, C; Bourne, P E

    1994-11-01

    Macromolecular query language (MMQL) is an extensible interpretive language in which to pose questions concerning the experimental or derived features of the 3-D structure of biological macromolecules. MMQL is intended to be intuitive, with a simple syntax, so that from a user's perspective complex queries are easily written. A number of basic queries and a more complex query--determination of structures containing a five-strand Greek key motif--are presented to illustrate the strengths and weaknesses of the language. The predominant features of MMQL are a filter and pattern grammar which are combined to express a wide range of interesting biological queries. Filters permit the selection of object attributes, for example, compound name and resolution, whereas the patterns currently implemented query primary sequence, close contacts, hydrogen bonding, secondary structure, conformation and amino acid properties (volume, polarity, isoelectric point, hydrophobicity and different forms of exposure). MMQL queries are processed by MMQLlib, a C++ class library, to which new query methods and pattern types are easily added. The prototype implementation described uses PDBlib, another C++-based class library for representing the features of biological macromolecules at the level of detail parsable from a PDB file. Since PDBlib can represent data stored in relational and object-oriented databases, as well as PDB files, once these data are loaded they too can be queried by MMQL. Performance metrics are given for queries of PDB files for which all derived data are calculated at run time and compared to a preliminary version of OOPDB, a prototype object-oriented database with a schema based on a persistent version of PDBlib which offers more efficient data access and the potential to maintain derived information. MMQLlib, PDBlib and associated software are available via anonymous ftp from cuhhca.hhmi.columbia.edu.

  15. RDF-GL : a SPARQL-based graphical query language for RDF

    NARCIS (Netherlands)

    Hogenboom, F.P.; Milea, D.V.; Frasincar, F.; Kaymak, U.; Chbeir, R.; Badr, Y.; Abraham, A.; Hassanien, A.-E.

    2010-01-01

    This chapter presents RDF-GL, a graphical query language (GQL) for RDF. The GQL is based on the textual query language SPARQL and mainly focuses on SPARQL SELECT queries. The advantage of a GQL over textual query languages is that complexity is hidden through the use of graphical symbols. RDF-GL is

  16. Calibration of personal dosimeters: Quantities and terminology

    International Nuclear Information System (INIS)

    Aleinikov, V.E.

    1999-01-01

    The numerical results obtained in the interpretation of individual monitoring of external radiation depend not only on the accurate calibration of the radiation measurement instruments involved, but also on the definition of the quantities in terms of which these instruments are calibrated. The absence of uniformity in terminology not only makes it difficult to understand properly the scientific and technical literature but can also lead to incorrect interpretation of particular concepts and recommendations. In this paper, brief consideration is given to the definition of radiation quantities and the terminology used in calibration procedures. (author)

  17. Executing SPARQL Queries over the Web of Linked Data

    Science.gov (United States)

    Hartig, Olaf; Bizer, Christian; Freytag, Johann-Christoph

    The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.
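
    The core idea, dereferencing URIs discovered during query evaluation and adding the retrieved triples to the queried dataset, can be sketched naively with rdflib as below. This blocking, breadth-first version has none of the iterator pipeline, ordering heuristics, or non-blocking extensions of the paper, and it assumes the URIs are dereferenceable to RDF over HTTP.

        from rdflib import Graph, URIRef

        def traverse_and_query(seed_uris, sparql, hops=1):
            """Naive link traversal: fetch seeds, follow discovered URIs, then query."""
            g = Graph()
            frontier = set(seed_uris)
            fetched = set()
            for _ in range(hops + 1):
                for uri in list(frontier - fetched):
                    try:
                        g.parse(uri)        # HTTP dereference; parses returned RDF into g
                    except Exception:
                        pass                # skip URIs that do not resolve to parsable RDF
                    fetched.add(uri)
                # every URI mentioned in the data gathered so far becomes a candidate
                frontier = {str(t) for s, p, o in g for t in (s, o)
                            if isinstance(t, URIRef)} - fetched
            return g.query(sparql)

        # e.g. traverse_and_query(["http://dbpedia.org/resource/Berlin"],
        #                         "SELECT ?p ?o WHERE { ?s ?p ?o } LIMIT 10")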

  18. A Fuzzy Query Mechanism for Human Resource Websites

    Science.gov (United States)

    Lai, Lien-Fu; Wu, Chao-Chin; Huang, Liang-Tsung; Kuo, Jung-Chih

    Users' preferences often contain imprecision and uncertainty that are difficult for traditional human resource websites to deal with. In this paper, we apply the fuzzy logic theory to develop a fuzzy query mechanism for human resource websites. First, a storing mechanism is proposed to store fuzzy data into conventional database management systems without modifying DBMS models. Second, a fuzzy query language is proposed for users to make fuzzy queries on fuzzy databases. User's fuzzy requirement can be expressed by a fuzzy query which consists of a set of fuzzy conditions. Third, each fuzzy condition associates with a fuzzy importance to differentiate between fuzzy conditions according to their degrees of importance. Fourth, the fuzzy weighted average is utilized to aggregate all fuzzy conditions based on their degrees of importance and degrees of matching. Through the mutual compensation of all fuzzy conditions, the ordering of query results can be obtained according to user's preference.
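
    A minimal sketch of the aggregation step described above: every fuzzy condition yields a degree of matching in [0, 1] via a membership function, and the fuzzy weighted average combines those degrees according to the conditions' importances to rank the results. The membership functions, weights and candidate data below are invented for illustration and do not reproduce the paper's storage mechanism or query language.

        def trapezoid(x, a, b, c, d):
            # standard trapezoidal membership function over [a, d] with core [b, c]
            if x <= a or x >= d:
                return 0.0
            if b <= x <= c:
                return 1.0
            return (x - a) / (b - a) if x < b else (d - x) / (d - c)

        def weighted_average(degrees_and_weights):
            total = sum(w for _, w in degrees_and_weights)
            return sum(d * w for d, w in degrees_and_weights) / total if total else 0.0

        # fuzzy query: "about 5 years of experience" (importance 0.7),
        #              "salary expectation around 50k" (importance 0.3)
        candidates = [
            {"name": "Ann", "experience": 4, "salary": 52000},
            {"name": "Bob", "experience": 8, "salary": 45000},
        ]

        def score(c):
            return weighted_average([
                (trapezoid(c["experience"], 2, 4, 6, 9), 0.7),
                (trapezoid(c["salary"], 35000, 45000, 55000, 65000), 0.3),
            ])

        for c in sorted(candidates, key=score, reverse=True):
            print(c["name"], round(score(c), 3))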

  19. Evaluating Trajectory Queries over Imprecise Location Data

    DEFF Research Database (Denmark)

    Xie, Scott, Xike; Cheng, Reynold; Yiu, Man Lung

    2012-01-01

    Trajectory queries, which retrieve nearby objects for every point of a given route, can be used to identify alerts of potential threats along a vessel route, or to monitor the rescuers adjacent to a travel path. However, the locations of these objects (e.g., threats, succours) may not be precisely obtained due to hardware limitations of measuring devices, as well as the constantly-changing nature of the external environment. Ignoring data uncertainty can render low query quality, and cause undesirable consequences such as missing alerts of threats and poor response time in rescue operations. Also, the query is quite time-consuming, since all the points on the trajectory are considered. In this paper, we study how to efficiently evaluate trajectory queries over imprecise location data, by proposing a new concept called the u-bisector. In general, the u-bisector is an extension of the bisector to handle...

  20. Semantic querying of data guided by Formal Concept Analysis

    OpenAIRE

    Codocedo , Victor; Lykourentzou , Ioanna; Napoli , Amedeo

    2012-01-01

    In this paper we present a novel approach to handle querying over a concept lattice of documents and annotations. We focus on the problem of "non-matching documents", which are those that, despite being semantically relevant to the user query, do not contain the query's elements and hence cannot be retrieved by typical string matching approaches. In order to find these documents, we modify the initial user query using the concept lattice as a guide. We achieve this by ...

  1. Unemployment Insurance Query (UIQ)

    Data.gov (United States)

    Social Security Administration — The Unemployment Insurance Query (UIQ) provides State Unemployment Insurance agencies real-time online access to SSA data. This includes SSN verification and Title...

  2. Linguistic aspects of eponymic professional endocrinologic terminology

    Directory of Open Access Journals (Sweden)

    N.I. Bytsko

    2017-03-01

    Background. Special linguistic research into the terminological units of different branches of medicine makes it possible to analyze in detail how systems of clinical terminology are created, from historical, scientific, cultural, linguistic and semantic perspectives. Within the general medical terminological system there is a wide area of terminology related to clinical and experimental endocrinology. The purpose of the study: to demonstrate the structure of eponymic endocrine medical terms through the prism of a systematization of methodological research on eponymic vocabulary. Materials and methods. The material was obtained by an exhaustive selection of eponyms (296 terms) from the "Reference dictionary for endocrinologist" — compiled by scientists of the V. Danilevsky Institute of Endocrine Pathology Problems and the Kharkiv Medical Academy of Postgraduate Education (A.V. Kozakov, N.A. Kravchun, I.M. Ilyina, M.I. Zubko, O.A. Goncharova, I.V. Cherniavska), who organized 10,000 medical terms of clinical and experimental endocrinology into the dictionary. Exhaustive selection of terms from the professional literature, the descriptive method and the distributive method were used in the study, which allowed the lexical and semantic features of eponymic terms in endocrinology to be distinguished. Results. The results point to the timeliness of studies in the field of clinical and experimental endocrinology: this is one of the oldest terminologies, through which it is possible to trace the ways terms are formed, developed and improved, the realization of semantic processes, and certain trends, ways and means of word formation. Conclusions. The results of the research on this sublanguage of clinical medicine, at the level of linguistic observations of its functioning in dictionaries and scientific works, will

  3. Varieties of propositional and dictum motivation analytical terms of scientific-technical terminology

    OpenAIRE

    Garashchenko, Liliya

    2014-01-01

    The article focuses on the cognitive-onomasiological analysis of analytical terms in scientific-technical terminology. The motivational features of the predicate-thematic and hyperonymic varieties of terminological constructions are defined.

  4. Development of terminology for mammographic techniques for radiological technologists.

    Science.gov (United States)

    Yagahara, Ayako; Yokooka, Yuki; Tsuji, Shintaro; Nishimoto, Naoki; Uesugi, Masahito; Muto, Hiroshi; Ohba, Hisateru; Kurowarabi, Kunio; Ogasawara, Katsuhiko

    2011-07-01

    We are developing a mammographic ontology to share knowledge of the mammographic domain for radiologic technologists, with the aim of improving mammographic techniques. As a first step in constructing the ontology, we used mammography reference books to establish mammographic terminology for identifying currently available knowledge. This study proceeded in three steps: (1) determination of the domain and scope of the terminology, (2) lexical extraction, and (3) construction of hierarchical structures. We extracted terms mainly from three reference books and constructed the hierarchical structures manually. We compared features of the terms extracted from the three reference books. We constructed a terminology consisting of 440 subclasses grouped into 19 top-level classes: anatomic entity, image quality factor, findings, material, risk, breast, histological classification of breast tumors, role, foreign body, mammographic technique, physics, purpose of mammography examination, explanation of mammography examination, image development, abbreviation, quality control, equipment, interpretation, and evaluation of clinical imaging. The number of terms that occurred in the subclasses varied depending on which reference book was used. We developed a terminology of mammographic techniques for radiologic technologists consisting of 440 terms.

  5. Thoughts on ISO and the development of terminologies in Southern ...

    African Journals Online (AJOL)

    The implications of language policy decisions on the development of technical languages, terminologies, general lexicography and the dissemination of information will also require special attention in a new South Africa. Keywords: standardisation, terminology, technical language, termbank, lexical data, networks, ...

  6. A Relational Algebra Query Language for Programming Relational Databases

    Science.gov (United States)

    McMaster, Kirby; Sambasivam, Samuel; Anderson, Nicole

    2011-01-01

    In this paper, we describe a Relational Algebra Query Language (RAQL) and Relational Algebra Query (RAQ) software product we have developed that allows database instructors to teach relational algebra through programming. Instead of defining query operations using mathematical notation (the approach commonly taken in database textbooks), students…
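
    The teaching idea, treating relational algebra operators as composable functions rather than mathematical notation, can be illustrated in a few lines of Python (this is only a sketch of the approach, not the RAQL/RAQ software itself): relations are lists of dictionaries, and select, project and natural join are ordinary functions that can be nested like algebra expressions.

        def select(relation, predicate):
            return [row for row in relation if predicate(row)]

        def project(relation, attributes):
            return [{a: row[a] for a in attributes} for row in relation]

        def natural_join(r, s):
            # join on the attributes the two relations share
            common = set(r[0]) & set(s[0]) if r and s else set()
            return [{**x, **y} for x in r for y in s
                    if all(x[a] == y[a] for a in common)]

        students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Alan"}]
        enrolled = [{"sid": 1, "course": "DB"}, {"sid": 1, "course": "AI"}]

        # pi_{name, course}(sigma_{course='DB'}(students |x| enrolled))
        print(project(select(natural_join(students, enrolled),
                             lambda row: row["course"] == "DB"),
                      ["name", "course"]))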

  7. Functioning of the English tourism terminology in the guides to Ukraine

    OpenAIRE

    Прима, В. В.

    2015-01-01

    The article outlines the main aspects of the study of English tourism terminology, in particular the functional aspect. The general features of English-language guides and the peculiarities of how tourism terms function in them are reviewed. An analysis of the theoretical and practical aspects of terminology research in the works of contemporary scholars made it possible to identify a general tendency to consider terminological vocabulary from the points of view of semantics, struct...

  8. Instant MDX queries for SQL Server 2012

    CERN Document Server

    Emond, Nicholas

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. This short, focused guide is a great way to get started with writing MDX queries. New developers can use this book as a reference for how to use functions and the syntax of a query as well as how to use Calculated Members and Named Sets. This book is great for new developers who want to learn the MDX query language from scratch and install SQL Server 2012 with Analysis Services

  9. Keyword Query Expansion Paradigm Based on Recommendation and Interpretation in Relational Databases

    Directory of Open Access Journals (Sweden)

    Yingqi Wang

    2017-01-01

    Due to the ambiguity and impreciseness of keyword queries over relational databases, research on keyword query expansion has attracted wide attention. Existing query expansion methods expose users' query intention to a certain extent, but most of them cannot balance precision and recall. To address this problem, a novel two-step query expansion approach is proposed based on query recommendation and query interpretation. First, a probabilistic recommendation algorithm is put forward by constructing a term similarity matrix and a Viterbi model. Second, by using the translation algorithm for triples and the construction algorithm for query subgraphs, query keywords are translated into query subgraphs with structural and semantic information. Finally, experimental results on a real-world dataset demonstrate the effectiveness and rationality of the proposed method.
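
    As a loose illustration of the recommendation step (not the paper's probabilistic algorithm, which also uses a Viterbi model, and not its interpretation step), the sketch below builds a simple term-term similarity function from co-occurrence counts in a document collection and recommends the most similar terms as expansion candidates; the similarity score and the data are invented for the example.

        from collections import Counter
        from itertools import combinations
        import math

        def cooccurrence_similarity(documents):
            """Build a simple term-term similarity function from co-occurrence counts."""
            term_freq = Counter()
            pair_freq = Counter()
            for doc in documents:
                terms = set(doc.lower().split())
                term_freq.update(terms)
                pair_freq.update(frozenset(p) for p in combinations(sorted(terms), 2))

            def sim(a, b):
                pair = frozenset((a, b))
                if a == b or pair not in pair_freq:
                    return 0.0
                # normalized co-occurrence (Ochiai-style score), purely illustrative
                return pair_freq[pair] / math.sqrt(term_freq[a] * term_freq[b])
            return sim

        def expand(query_terms, vocabulary, sim, k=2):
            """Recommend up to k related terms per query keyword."""
            expansion = set()
            for q in query_terms:
                related = sorted((t for t in vocabulary if t not in query_terms),
                                 key=lambda t: sim(q, t), reverse=True)[:k]
                expansion.update(t for t in related if sim(q, t) > 0)
            return list(query_terms) + sorted(expansion)

        docs = ["database keyword query", "keyword search over relational database",
                "fuzzy query over relational database"]
        sim = cooccurrence_similarity(docs)
        vocab = {t for d in docs for t in d.split()}
        print(expand(["query"], vocab, sim))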

  10. A usability evaluation of a SNOMED CT based compositional interface terminology for intensive care.

    Science.gov (United States)

    Bakhshi-Raiez, F; de Keizer, N F; Cornet, R; Dorrepaal, M; Dongelmans, D; Jaspers, M W M

    2012-05-01

    To evaluate the usability of a large compositional interface terminology based on SNOMED CT and the terminology application for registration of the reasons for intensive care admission in a Patient Data Management System. Observational study with user-based usability evaluations before and 3 months after the system was implemented and routinely used. Usability was defined by five aspects: effectiveness, efficiency, learnability, overall user satisfaction, and experienced usability problems. Qualitative (the Think-Aloud user testing method) and quantitative (the System Usability Scale questionnaire and Time-on-Task analyses) methods were used to examine these usability aspects. The results of the evaluation study revealed that the usability of the interface terminology fell short (SUS scores of 47.2 and 37.5 out of 100 before and after implementation, respectively). The qualitative measurements revealed a high number (n=35) of distinct usability problems, leading to ineffective and inefficient registration of reasons for admission. The effectiveness and efficiency of the system did not change over time. About 14% (n=5) of the revealed usability problems were related to the terminology content based on SNOMED CT, while the remaining 86% (n=30) was related to the terminology application. The problems related to the terminology content were more severe than the problems related to the terminology application. This study provides a detailed insight into how clinicians interact with a controlled compositional terminology through a terminology application. The extensiveness, complexity of the hierarchy, and the language usage of an interface terminology are defining for its usability. Carefully crafted domain-specific subsets and a well-designed terminology application are needed to facilitate the use of a complex compositional interface terminology based on SNOMED CT. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  11. Enabling Semantic Queries Against the Spatial Database

    Directory of Open Access Journals (Sweden)

    PENG, X.

    2012-02-01

    The spatial database built upon an object-relational database management system (ORDBMS) has the merits of a clear data model, good operability and high query efficiency, which is why it has been widely used in spatial data organization and management. However, it cannot express the semantic relationships among geospatial objects, so the query results often do not meet the user's requirements well. Therefore, this paper represents an attempt to combine Semantic Web technology with the spatial database so as to make up for the traditional database's disadvantages. In this way, on the one hand, users can take advantage of the ORDBMS to store and manage spatial data; on the other hand, if the spatial database is published in the form of the Semantic Web, users can describe a query more concisely with a cognitive pattern similar to that of daily life. As a consequence, this methodology makes the benefits of both the Semantic Web and the object-relational database (ORDB) available. The paper discusses systematically the semantically enriched spatial database's architecture, key technologies and implementation. Subsequently, we demonstrate the function of spatial semantic queries via a practical prototype system. The query results indicate that the method used in this study is feasible.

  12. Extracting Rankings for Spatial Keyword Queries from GPS Data

    DEFF Research Database (Denmark)

    Keles, Ilkcan; Jensen, Christian Søndergaard; Saltenis, Simonas

    2018-01-01

    Studies suggest that many search engine queries have local intent. We consider the evaluation of ranking functions important for such queries. The key challenge is to be able to determine the “best” ranking for a query, as this enables evaluation of the results of ranking functions. We propose...

  13. An Object-Oriented Approach of Keyword Querying over Fuzzy XML

    Directory of Open Access Journals (Sweden)

    Ting Li

    2016-09-01

    As fuzzy data management has become one of the main research topics and directions, the question of how to obtain useful information from fuzzy XML documents by means of keyword queries is increasingly in need of investigation. Among the keyword query methods for crisp XML documents, smallest lowest common ancestor (SLCA) semantics is one of the most widely accepted. When users pose keyword queries on fuzzy XML documents with SLCA semantics, the query results are often incomplete, have low precision, and return no possibility values. Moreover, most keyword query semantics on XML documents only consider query results matching all keywords, yet users may also be interested in results matching only some of the keywords. To overcome these limitations, in this paper we investigate how to obtain more comprehensive and meaningful results of keyword querying on fuzzy XML documents. We propose a semantics of object-oriented keyword querying on fuzzy XML documents. First, we introduce the concept of "object tree", analyze different types of matching result object trees, and find the "minimum result object trees" which contain all keywords and the "result object trees" which contain partial keywords. Then an object-oriented keyword query algorithm, ROstack, is proposed to obtain the root nodes of these matching result object trees, together with their possibilities. Finally, experiments are conducted to verify the effectiveness and efficiency of the proposed algorithm.

  14. An architecture for standardized terminology services by wrapping and integration of existing applications

    NARCIS (Netherlands)

    Cornet, Roland; Prins, Antoon K.

    2003-01-01

    Research on terminology services has resulted in development of applications and definition of standards, but has not yet led to widespread use of (standardized) terminology services in practice. Current terminology services offer functionality both for concept representation and lexical knowledge

  15. Intellectualization through Terminology Development | Khumalo ...

    African Journals Online (AJOL)

    The article will propose an improved model to cater for AnyTime Access, which is convenient for student needs between lectures, and improve the harvesting mechanism in the existing model. Keywords: Intellectualization, Terminology Development, Harvesting, Crowdsourcing, Consultation, Verification, Authentication, ...

  16. Genetic algorithms for RDF chain query optimization

    NARCIS (Netherlands)

    Hogenboom, A.C.; Milea, D.V.; Frasincar, F.; Kaymak, U.; Calders, T.; Tuyls, K.; Pechenizkiy, M.

    2009-01-01

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are required for efficient real-time querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL

  17. Error Checking for Chinese Query by Mining Web Log

    Directory of Open Access Journals (Sweden)

    Jianyong Duan

    2015-01-01

    Mis-typed queries are a common phenomenon for search engines. This paper uses a web query log as the training set for query error checking. Queries are analyzed and checked with an n-gram language model trained on the web log. Features including the query words and their number are introduced into the model. At the same time, a data smoothing algorithm is used to address the data sparseness problem, which improves the overall accuracy of the n-gram model. The experimental results show that the approach is effective.
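
    A toy version of the approach, scoring a candidate query with an n-gram language model trained on logged queries and using add-one smoothing against sparse counts, might look like the sketch below; the actual system works on Chinese text, uses a web-scale log, and incorporates further features, so this is only schematic. A low score flags a query as a likely error.

        from collections import Counter
        import math

        class BigramModel:
            def __init__(self, logged_queries):
                self.unigrams = Counter()
                self.bigrams = Counter()
                for q in logged_queries:
                    words = ["<s>"] + q.split() + ["</s>"]
                    self.unigrams.update(words)
                    self.bigrams.update(zip(words, words[1:]))
                self.vocab_size = len(self.unigrams)

            def log_prob(self, query):
                words = ["<s>"] + query.split() + ["</s>"]
                score = 0.0
                for prev, word in zip(words, words[1:]):
                    # add-one (Laplace) smoothing against unseen word pairs
                    p = (self.bigrams[(prev, word)] + 1) / (self.unigrams[prev] + self.vocab_size)
                    score += math.log(p)
                return score / (len(words) - 1)   # length-normalized

        model = BigramModel(["terminology query", "biomedical terminology", "query log analysis"])
        for q in ["terminology query", "terminology quary"]:
            print(q, round(model.log_prob(q), 3))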

  18. Query construction, entropy, and generalization in neural-network models

    Science.gov (United States)

    Sollich, Peter

    1994-05-01

    We study query construction algorithms, which aim at improving the generalization ability of systems that learn from examples by choosing optimal, nonredundant training sets. We set up a general probabilistic framework for deriving such algorithms from the requirement of optimizing a suitable objective function; specifically, we consider the objective functions entropy (or information gain) and generalization error. For two learning scenarios, the high-low game and the linear perceptron, we evaluate the generalization performance obtained by applying the corresponding query construction algorithms and compare it to training on random examples. We find qualitative differences between the two scenarios due to the different structure of the underlying rules (nonlinear and ``noninvertible'' versus linear); in particular, for the linear perceptron, random examples lead to the same generalization ability as a sequence of queries in the limit of an infinite number of examples. We also investigate learning algorithms which are ill matched to the learning environment and find that, in this case, minimum entropy queries can in fact yield a lower generalization ability than random examples. Finally, we study the efficiency of single queries and its dependence on the learning history, i.e., on whether the previous training examples were generated randomly or by querying, and the difference between globally and locally optimal query construction.
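
    For the high-low game mentioned above (learning an unknown threshold from yes/no answers), choosing each query to maximize the entropy of the predicted answer amounts to querying the midpoint of the remaining version space, i.e. binary search; the short sketch below illustrates that special case only and is not the paper's general framework or its perceptron analysis.

        def learn_threshold(oracle, low=0.0, high=1.0, queries=10):
            """Query construction for the high-low game: each query is placed where the
            predicted label is maximally uncertain (entropy of 1 bit), i.e. at the
            midpoint of the remaining version space."""
            for _ in range(queries):
                x = (low + high) / 2.0          # maximum-entropy query point
                if oracle(x):                   # oracle answers True if x is above the threshold
                    high = x
                else:
                    low = x
            return (low + high) / 2.0

        true_threshold = 0.3721
        estimate = learn_threshold(lambda x: x >= true_threshold)
        print(round(estimate, 4))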

  19. Using local language syndromic terminology in participatory epidemiology: Lessons for One Health practitioners among the Maasai of Ngorongoro, Tanzania.

    Science.gov (United States)

    Queenan, Kevin; Mangesho, Peter; Ole-Neselle, Moses; Karimuribo, Esron; Rweyemamu, Mark; Kock, Richard; Häsler, Barbara

    2017-04-01

    Pastoralists and agro-pastoralists often occupy remote and hostile environments, which lack infrastructure and capacity in human and veterinary healthcare and disease surveillance systems. Participatory epidemiology (PE) and Participatory Disease Surveillance (PDS) are particularly useful in situations of resource scarcity, where conventional diagnostics and surveillance data of disease prevalence may be intermittent or limited. Livestock keepers, when participating in PE studies about health issues, commonly use their local language terms, which are often syndromic and descriptive in nature. Practitioners of PE recommend confirmation of their findings with triangulation including biomedical diagnostic techniques. However, the latter is not practiced in all studies, usually due to time, financial or logistical constraints. A cross sectional study was undertaken with the Maasai of Ngorongoro District, Tanzania. It aimed to identify the terms used to describe the infectious diseases of livestock and humans with the greatest perceived impact on livelihoods. Furthermore, it aimed to characterise the usefulness and limitations of relying on local terminology when conducting PE studies in which diagnoses were not confirmed. Semi-structured interviews were held with 23 small groups, totalling 117 community members within five villages across the district. In addition, informal discussions and field observations were conducted with village elders, district veterinary and medical officers, meat inspectors and livestock field officers. For human conditions including zoonoses, several biomedical terms are now part of the common language. Conversely, livestock conditions are described using local Maasai terms, usually associated with the signs observed by the livestock keeper. Several of these descriptive, syndromic terms are used inconsistently and showed temporal and spatial variations. This study highlights the complexity and ambiguity which may exist in local terminology

  20. Accelerating SPARQL Queries and Analytics on RDF Data

    KAUST Repository

    Al-Harbi, Razen

    2016-11-09

    The complexity of SPARQL queries and RDF applications poses great challenges on distributed RDF management systems. SPARQL workloads are dynamic and consist of queries with variable complexities. Hence, systems that use static partitioning suffer from communication overhead for workloads that generate excessive communication. Concurrently, RDF applications are becoming more sophisticated, mandating analytical operations that extend beyond SPARQL queries. Being primarily designed and optimized to execute SPARQL queries, which lack procedural capabilities, existing systems are not suitable for rich RDF analytics. This dissertation tackles the problem of accelerating SPARQL queries and RDF analytics on distributed shared-nothing RDF systems. First, a distributed RDF engine, coined AdPart, is introduced. AdPart uses lightweight hash partitioning for sharding triples using their subject values; rendering its startup overhead very low. The locality-aware query optimizer of AdPart takes full advantage of the partitioning to (i) support the fully parallel processing of join patterns on subjects and (ii) minimize data communication for general queries by applying hash distribution of intermediate results instead of broadcasting, wherever possible. By exploiting hash-based locality, AdPart achieves better or comparable performance to systems that employ sophisticated partitioning schemes. To cope with workload dynamism, AdPart is extended to dynamically adapt to workload changes. AdPart monitors the data access patterns and dynamically redistributes and replicates the instances of the most frequent patterns among workers. Consequently, the communication cost for future queries is drastically reduced or even eliminated. Experiments with synthetic and real data verify that AdPart starts faster than all existing systems and gracefully adapts to the query load. Finally, to support and accelerate rich RDF analytical tasks, a vertex-centric RDF analytics framework is
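
    The lightweight hash partitioning that AdPart starts from can be illustrated very simply: each triple is placed on the worker obtained by hashing its subject, so all triples that share a subject are co-located and subject-centric star joins can be evaluated without communication. The sketch below is illustrative only; the worker count, the data, and the use of SHA-1 as a stable hash are arbitrary choices, not details taken from the dissertation.

        import hashlib

        def worker_for(subject, num_workers):
            # stable hash so that the same subject always maps to the same worker
            digest = hashlib.sha1(subject.encode("utf-8")).hexdigest()
            return int(digest, 16) % num_workers

        def partition(triples, num_workers):
            partitions = [[] for _ in range(num_workers)]
            for s, p, o in triples:
                partitions[worker_for(s, num_workers)].append((s, p, o))
            return partitions

        triples = [
            ("ex:alice", "ex:knows",   "ex:bob"),
            ("ex:alice", "ex:worksAt", "ex:acme"),
            ("ex:bob",   "ex:knows",   "ex:carol"),
        ]
        for i, part in enumerate(partition(triples, num_workers=2)):
            print("worker", i, part)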

  1. A semantic perspective on query log analysis

    NARCIS (Netherlands)

    Hofmann, K.; de Rijke, M.; Huurnink, B.; Meij, E.

    2009-01-01

    We present our views on the CLEF log file analysis task. We argue for a task definition that focuses on the semantic enrichment of query logs. In addition, we discuss how additional information about the context in which queries are being made could further our understanding of users’ information

  2. Synonymy in the English-origin Romanian Medical Terminology

    OpenAIRE

    Oana BADEA

    2013-01-01

    The Romanian medical terminology has been enriched quite a lot lately. This phenomenon was not only due to the significant influence of the English language, but also to the relationships developed between the already existing terms and the new ones. Thus, the present study comprises the analysis of Romanian medical terms of English origin and their native synonymous correspondents in the Romanian medical terminology. The dictionaries used to select the synonymous pairs of medical ter...

  3. GeoSpark SQL: An Effective Framework Enabling Spatial Queries on Spark

    Directory of Open Access Journals (Sweden)

    Zhou Huang

    2017-09-01

    In the era of big data, Internet-based geospatial information services such as various LBS apps are deployed everywhere, followed by an increasing number of queries against the massive spatial data. As a result, traditional relational spatial databases (e.g., PostgreSQL with PostGIS and Oracle Spatial) cannot adapt well to the needs of large-scale spatial query processing. Spark is an emerging outstanding distributed computing framework in the Hadoop ecosystem. This paper aims to address the increasingly large-scale spatial query-processing requirement in the era of big data, and proposes an effective framework, GeoSpark SQL, which enables spatial queries on Spark. On the one hand, GeoSpark SQL provides a convenient SQL interface; on the other hand, GeoSpark SQL achieves both efficient storage management and high-performance parallel computing through integrating Hive and Spark. In this study, the following key issues are discussed and addressed: (1) storage management methods under the GeoSpark SQL framework, (2) the spatial operator implementation approach in the Spark environment, and (3) spatial query optimization methods under Spark. Experimental evaluation is also performed and the results show that GeoSpark SQL is able to achieve real-time query processing. It should be noted that Spark is not a panacea. It is observed that the traditional spatial database PostGIS/PostgreSQL performs better than GeoSpark SQL in some query scenarios, especially for spatial queries with high selectivity, such as the point query and the window query. In general, GeoSpark SQL performs better when dealing with compute-intensive spatial queries such as the kNN query and the spatial join query.
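    GeoSpark SQL itself layers spatial operators over Hive and Spark; as a rough, self-contained approximation of the SQL-interface idea (plain PySpark, no GeoSpark installation; the table, coordinates and the kNN query below are invented), a distance-based kNN query can be expressed through Spark SQL as follows.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("knn-sketch").getOrCreate()

points = spark.createDataFrame(
    [(1, 116.30, 39.99), (2, 116.32, 39.98), (3, 116.40, 39.90), (4, 116.35, 39.95)],
    ["id", "lon", "lat"],
)
points.createOrReplaceTempView("points")

# kNN around a query point (116.31, 39.97): order by squared Euclidean distance and
# keep the 2 nearest. A real spatial SQL layer would use an ST_Distance-style operator
# and a spatial index instead of this full scan.
knn = spark.sql("""
    SELECT id,
           (lon - 116.31) * (lon - 116.31) + (lat - 39.97) * (lat - 39.97) AS dist2
    FROM points
    ORDER BY dist2
    LIMIT 2
""")
knn.show()
spark.stop()
```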

  4. Adaptive and Optimized RDF Query Interface for Distributed WFS Data

    Directory of Open Access Journals (Sweden)

    Tian Zhao

    2017-04-01

    Web Feature Service (WFS) is a protocol for accessing geospatial data stores such as databases and Shapefiles over the Web. However, WFS does not provide direct access to data distributed in multiple servers. In addition, WFS features extracted from their original sources are not convenient for user access due to the lack of connection to high-level concepts. Users are facing the choices of either querying each WFS server first and then integrating the results, or converting the data from all WFS servers to a more expressive format such as RDF (Resource Description Framework) and then querying the integrated data. The first choice requires additional programming while the second choice is not practical for large or frequently updated datasets. The new contribution of this paper is that we propose a novel adaptive and optimized RDF query interface to overcome the aforementioned limitation. Specifically, in this paper, we propose a novel algorithm to query and synthesize distributed WFS data through an RDF query interface, where users can specify data requests to multiple WFS servers using a single RDF query. Users can also define a simple configuration to associate WFS feature types, attributes, and values with RDF classes, properties, and values so that user queries can be written using a more uniform and informative vocabulary. The algorithm translates each RDF query written in SPARQL-like syntax to multiple WFS GetFeature requests, and then converts and integrates the multiple WFS results to get the answers to the original query. The generated GetFeature requests are sent asynchronously and simultaneously to WFS servers to take advantage of the server parallelism. The results of each GetFeature request are cached to improve query response time for subsequent queries that involve one or more of the cached requests. A JavaScript-based prototype is implemented and experimental results show that the query response time can be greatly reduced through caching.
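    The translation step described above, in which one RDF query fans out into several asynchronous, cached WFS GetFeature requests, can be sketched as follows. This is not the authors' algorithm; the mapping table, endpoint URL and the CQL_FILTER vendor parameter (a GeoServer-style extension, not core WFS) are illustrative assumptions.

```python
import requests
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping from RDF classes to WFS servers and feature types.
MAPPING = {"ex:River": {"server": "https://wfs.example.org/wfs", "typeName": "hydro:rivers"}}

_cache = {}  # (server, typeName, filter) -> raw response text

def get_feature(server, type_name, cql_filter=None):
    """Issue one WFS 1.1.0 GetFeature request, caching the result for reuse."""
    key = (server, type_name, cql_filter)
    if key in _cache:
        return _cache[key]
    params = {"service": "WFS", "version": "1.1.0",
              "request": "GetFeature", "typeName": type_name}
    if cql_filter:
        params["CQL_FILTER"] = cql_filter  # vendor extension, used only for illustration
    resp = requests.get(server, params=params, timeout=30)
    resp.raise_for_status()
    _cache[key] = resp.text
    return resp.text

def answer_rdf_query(triple_patterns):
    """Translate each mapped class pattern into a GetFeature call and run them concurrently."""
    targets = [MAPPING[cls] for _, _, cls in triple_patterns if cls in MAPPING]
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(get_feature, t["server"], t["typeName"]) for t in targets]
        return [f.result() for f in futures]

# Example: a single "?r rdf:type ex:River" pattern fans out to one GetFeature request.
# print(answer_rdf_query([("?r", "rdf:type", "ex:River")]))
```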

  5. Federated query processing for the semantic web

    CERN Document Server

    Buil-Aranda, C

    2014-01-01

    During the last years, the amount of RDF data has increased exponentially over the Web, exposed via SPARQL endpoints. These SPARQL endpoints allow users to direct SPARQL queries to the RDF data. Federated SPARQL query processing allows querying several of these RDF databases as if they were a single one, integrating the results from all of them. This is a key concept in the Web of Data and it is also a hot topic in the community. Besides that, the W3C SPARQL-WG has standardized it in the new Recommendation, SPARQL 1.1. This book provides a formalisation of the W3C proposed recommendation.
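    The SPARQL 1.1 federation feature formalised in the book is the SERVICE clause. A minimal sketch using the SPARQLWrapper Python library is shown below; both endpoint URLs are placeholders, not endpoints discussed in the book.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Federated SPARQL 1.1 query: the local endpoint delegates part of the graph
# pattern to a remote endpoint via the SERVICE clause.
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name WHERE {
  ?person a foaf:Person .
  SERVICE <https://remote.example.org/sparql> {
    ?person foaf:name ?name .
  }
}
LIMIT 10
"""

endpoint = SPARQLWrapper("https://local.example.org/sparql")
endpoint.setQuery(query)
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["person"]["value"], row["name"]["value"])
```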

  6. Towards A Streams-Based Framework for Defining Location-Based Queries

    DEFF Research Database (Denmark)

    Huang, Xuegang; Jensen, Christian S.

    2004-01-01

    An infrastructure is emerging that supports the delivery of on-line, location-enabled services to mobile users. Such services involve novel database queries, and the database research community is quite active in proposing techniques for the efficient processing of such queries. In parallel to this, the management of data streams has become an active area of research. While most research in mobile services concerns performance issues, this paper aims to establish a formal framework for defining the semantics of queries encountered in mobile services, most notably the so-called continuous queries that are particularly relevant in this context. Rather than inventing an entirely new framework, the paper proposes a framework that builds on concepts from data streams and temporal databases. Definitions of example queries demonstrate how the framework enables clear formulation of query...

  7. A service-oriented distributed semantic mediator: integrating multiscale biomedical information.

    Science.gov (United States)

    Mora, Oscar; Engelbrecht, Gerhard; Bisbal, Jesus

    2012-11-01

    Biomedical research continuously generates large amounts of heterogeneous and multimodal data spread over multiple data sources. These data, if appropriately shared and exploited, could dramatically improve the research practice itself, and ultimately the quality of health care delivered. This paper presents DISMED (DIstributed Semantic MEDiator), an open source semantic mediator that provides a unified view of a federated environment of multiscale biomedical data sources. DISMED is a Web-based software application to query and retrieve information distributed over a set of registered data sources, using semantic technologies. It also offers a user-friendly interface specifically designed to simplify the usage of these technologies by non-expert users. Although the architecture of the software mediator is generic and domain independent, in the context of this paper, DISMED has been evaluated for managing biomedical environments and facilitating research with respect to the handling of scientific data distributed in multiple heterogeneous data sources. As part of this contribution, a quantitative evaluation framework has been developed. It consists of a benchmarking scenario and the definition of five realistic use-cases. This framework, created entirely with public datasets, has been used to compare the performance of DISMED against other available mediators. It is also available to the scientific community in order to evaluate progress in the domain of semantic mediation, in a systematic and comparable manner. The results show an average improvement in the execution time by DISMED of 55% compared to the second best alternative in four out of the five use-cases of the experimental evaluation.

  8. Linked Heritage: a collaborative terminology management platform for a network of multilingual thesauri and controlled vocabularies

    Directory of Open Access Journals (Sweden)

    Marie-Veronique Leroi

    2013-01-01

    Terminology and multilingualism have been one of the main focuses of the Athena Project. Linked Heritage, as a legacy of this project, also deals with terminology and brings theory to practice by applying the recommendations given in the Athena Project. Linked Heritage, as a direct follow-up of these recommendations on terminology and multilingualism, is currently working on the development of a Terminology Management Platform (TMP). This platform will allow any cultural institution to register, SKOSify and manage its terminology in a collaborative way. This Terminology Management Platform will provide a network of multilingual and cross-domain terminologies.

  9. Mistakes in the usage of anatomical terminology in clinical practice.

    Science.gov (United States)

    Kachlik, David; Bozdechova, Ivana; Cech, Pavel; Musil, Vladimir; Baca, Vaclav

    2009-06-01

    Anatomical terminology serves as a basic communication tool in all the medical fields. Therefore Latin anatomical nomenclature has been repetitively issued and revised from 1895 (Basiliensia Nomina Anatomica) until 1998, when the last version was approved and published as the Terminologia Anatomica (International Anatomical Terminology) by the Federative Committee on Anatomical Terminology. A brief history of the terminology and nomenclature development is mentioned, along with the concept and contributions of the Terminologia Anatomica including the employed abbreviations. Examples of obsolete anatomical terms and their current synonyms are listed. Clinicians entered the process of the nomenclature revision and this aspect is demonstrated with several examples of terms used in clinical fields only, some already incorporated in the Terminologia Anatomica and a few obsolete terms still alive in non-theoretical communication. Frequent mistakes in grammar and orthography are stated as well. Authors of the article strongly recommend the use of the recent revision of the Latin anatomical nomenclature both in theoretical and clinical medicine.

  10. Substance abuse: medical and slang terminology.

    Science.gov (United States)

    Hamid, Humera; El-Mallakh, Rif S; Vandeveir, Keith

    2005-03-01

    Substance abuse is one of the major problems plaguing our society. It has come to the attention of several healthcare professionals that a communication gap exists between themselves and substance abusers. Most of the time the substance abusers are only familiar with the slang terms of abused substances, a terminology that medical professionals are usually unaware of. This paper is an attempt to close that communication gap, allowing health care professionals to understand the slang terminology that their patients use, thus enabling them to make appropriate treatment decisions. In addition, the article presents some key features (including active ingredient, pharmacological classification, medical use, abuse form, usage method, combinations used, effects sought, long-term possible effects, and detectability in urine) of the most commonly abused substances.

  11. Remote sensing terminology: past experience and recent needs

    Science.gov (United States)

    Kancheva, Rumiana

    2013-10-01

    Terminology is a key issue for a better understanding among people using various languages. Terminology accuracy is essential during all phases of international cooperation. It is crucial to keep up with the latest quantitative and qualitative developments and novelties of the terminology in advanced technology fields such as aerospace science and industry. This is especially true in remote sensing and geoinformatics, which develop rapidly and have wide and ever extending applications in various domains of human activity. The importance of the correct use of remote sensing terms refers not only to people working in this field but also to experts in many disciplines who handle remote sensing data and information products. The paper is devoted to terminology issues that refer to all aspects of remote sensing research and application areas. Attention is drawn to the recent needs and peculiarities of compiling specialized dictionaries in the subject area of remote sensing. Details are presented about the work in progress on the preparation of an English-Bulgarian dictionary of remote sensing terms focusing on Earth observations and geoinformation science. Our belief is that the elaboration of bilingual and multilingual dictionaries and glossaries in this spreading, most technically advanced and promising field of human expertise is of great practical importance. Any interest in cooperation and in initiating such collaborative multilingual projects is welcome and highly appreciated.

  12. Eucharistic Hospitality : Reconsidering the Terminology

    NARCIS (Netherlands)

    Casadei, Giulia; Wouda, Fokke

    2016-01-01

    Giulia Casadei MA and Fokke Wouda MA work on PhD projects about Eucharistic sharing in ecumenical relations; a pressing, yet controversial topic in Roman Catholic ecumenical engagement. As they both encounter questions concerning the terminology of this field, they decided on writing an article.

  13. SM4MQ: A Semantic Model for Multidimensional Queries

    DEFF Research Database (Denmark)

    Varga, Jovan; Dobrokhotova, Ekaterina; Romero, Oscar

    2017-01-01

    On-Line Analytical Processing (OLAP) is a data analysis approach to support decision-making. On top of that, Exploratory OLAP is a novel initiative for the convergence of OLAP and the Semantic Web (SW) that enables the use of OLAP techniques on SW data. Moreover, OLAP approaches exploit different ..., sharing, and reuse on the SW. As OLAP is based on the underlying multidimensional (MD) data model we denote such queries as MD queries and define SM4MQ: A Semantic Model for Multidimensional Queries. Furthermore, we propose a method to automate the exploitation of queries by means of SPARQL. We apply...

  14. Accelerating SPARQL Queries and Analytics on RDF Data

    KAUST Repository

    Al-Harbi, Razen

    2016-01-01

    The complexity of SPARQL queries and RDF applications poses great challenges on distributed RDF management systems. SPARQL workloads are dynamic and consist of queries with variable complexities. Hence, systems that use static partitioning suffer ...

  15. Query Language for Location-Based Services: A Model Checking Approach

    Science.gov (United States)

    Hoareau, Christian; Satoh, Ichiro

    We present a model checking approach to the rationale, implementation, and applications of a query language for location-based services. Such query mechanisms are necessary so that users, objects, and/or services can effectively benefit from the location-awareness of their surrounding environment. The underlying data model is founded on a symbolic model of space organized in a tree structure. Once extended to a semantic model for modal logic, we regard location query processing as a model checking problem, and thus define location queries as hybrid logic-based formulas. Our approach is unique to existing research because it explores the connection between location models and query processing in ubiquitous computing systems, relies on a sound theoretical basis, and provides modal logic-based query mechanisms for expressive searches over a decentralized data structure. A prototype implementation is also presented and will be discussed.
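    Stripped of the modal-logic machinery, the core of such a location query over a symbolic tree model is an ancestor test. The sketch below only illustrates that underlying semantics, with a made-up place hierarchy; it does not reproduce the hybrid-logic query language or the model checker described in the paper.

```python
# Symbolic location model: a tree of places, children nested under parents.
# A query such as "is device d somewhere within building-A?" reduces to checking
# whether building-A lies on the path from the device's current node up to the root.

PARENT = {
    "room-101": "floor-1",
    "room-102": "floor-1",
    "floor-1": "building-A",
    "building-A": "campus",
    "campus": None,
}

POSITION = {"device-7": "room-102"}  # current symbolic position of each entity

def located_within(entity, place):
    """True if the entity's position is the place itself or any descendant of it."""
    node = POSITION.get(entity)
    while node is not None:
        if node == place:
            return True
        node = PARENT[node]
    return False

print(located_within("device-7", "building-A"))  # True
print(located_within("device-7", "room-101"))    # False
```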

  16. Anatomical terminology and nomenclature: past, present and highlights.

    Science.gov (United States)

    Kachlik, David; Baca, Vaclav; Bozdechova, Ivana; Cech, Pavel; Musil, Vladimir

    2008-08-01

    The anatomical terminology is a base for medical communication. It is elaborated into a nomenclature in Latin. Its history goes back to 1895, when the first Latin anatomical nomenclature was published as Basiliensia Nomina Anatomica. It was followed by seven revisions (Jenaiensia Nomina Anatomica 1935, Parisiensia Nomina Anatomica 1955, Nomina Anatomica 2nd to 6th edition 1960-1989). The last revision, Terminologia Anatomica (TA), created by the Federative Committee on Anatomical Terminology and approved by the International Federation of Associations of Anatomists, was published in 1998. Apart from the official Latin anatomical terminology, it includes a list of recommended English equivalents. In this article, major changes and pitfalls of the nomenclature are discussed, as well as the clinical anatomy terms. The last revision (TA) is highly recommended to the attention of not only teachers, students and researchers, but also to clinicians, doctors, translators, editors and publishers to be followed in their activities.

  17. Energy-aware SQL query acceleration through FPGA-based dynamic partial reconfiguration

    NARCIS (Netherlands)

    Becher, Andreas; Bauer, Florian; Ziener, Daniel; Teich, Jürgen

    2014-01-01

    In this paper, we propose an approach for energy-aware FPGA-based query acceleration for databases on embedded devices. After the analysis of an incoming query, a query-specific hardware accelerator is generated on-the-fly and loaded on the FPGA for subsequent query execution using partial dynamic reconfiguration.

  18. How Do Children Reformulate Their Search Queries?

    Science.gov (United States)

    Rutter, Sophie; Ford, Nigel; Clough, Paul

    2015-01-01

    Introduction: This paper investigates techniques used by children in year 4 (age eight to nine) of a UK primary school to reformulate their queries, and how they use information retrieval systems to support query reformulation. Method: An in-depth study analysing the interactions of twelve children carrying out search tasks in a primary school…

  19. The Compilation of the Shona–English Biomedical Dictionary: Problems and Challenges

    Directory of Open Access Journals (Sweden)

    Nomalanga Mpofu

    2011-10-01

    Full Text Available

    ABSTRACT: The bilingual Shona–English dictionary of biomedical terms, Duramazwi reUrapi neUtano, was compiled with the aim of improving the efficiency of communication between doctor and patient. The dictionary is composed of terms from both modern and traditional medicinal practices. The article seeks to look at the methods of production of the dictionary, the presentation of entries in the dictionary and the problems and challenges encountered in the compilation process, namely, developing Shona medical terminology in the cultural context and especially the aspect of equivalence between English and Shona biomedical terms.

    Keywords: BIOMEDICAL, ADOPTIVES, ENTRIES, SYNONYMS, CROSS-REFERENCES, IDIOMS, CIRCUMLOCUTION, STANDARDISATION, HEADWORD, EQUIVALENCE, VARIANTS, DEFINITION, CULTURE, EUPHEMISMS, MODERN, TRADITIONAL, MONOLINGUAL, BILINGUAL, CORPUS, BORROWING, SHONA, COMMUNICATION


  20. Investigating Computer-Based Formative Assessments in a Medical Terminology Course

    Science.gov (United States)

    Wilbanks, Jammie T.

    2012-01-01

    Research has been conducted on the effectiveness of formative assessments and on effectively teaching medical terminology; however, research had not been conducted on the use of formative assessments in a medical terminology course. A quantitative study was performed which captured data from a pretest, self-assessment, four module exams, and a…

  1. ConnectomeExplorer: Query-guided visual analysis of large volumetric neuroscience data

    KAUST Repository

    Beyer, Johanna

    2013-12-01

    This paper presents ConnectomeExplorer, an application for the interactive exploration and query-guided visual analysis of large volumetric electron microscopy (EM) data sets in connectomics research. Our system incorporates a knowledge-based query algebra that supports the interactive specification of dynamically evaluated queries, which enable neuroscientists to pose and answer domain-specific questions in an intuitive manner. Queries are built step by step in a visual query builder, building more complex queries from combinations of simpler queries. Our application is based on a scalable volume visualization framework that scales to multiple volumes of several teravoxels each, enabling the concurrent visualization and querying of the original EM volume, additional segmentation volumes, neuronal connectivity, and additional meta data comprising a variety of neuronal data attributes. We evaluate our application on a data set of roughly one terabyte of EM data and 750 GB of segmentation data, containing over 4,000 segmented structures and 1,000 synapses. We demonstrate typical use-case scenarios of our collaborators in neuroscience, where our system has enabled them to answer specific scientific questions using interactive querying and analysis on the full-size data for the first time. © 1995-2012 IEEE.

  2. [Standardization of terminology in laboratory medicine I].

    Science.gov (United States)

    Yoon, Soo Young; Yoon, Jong Hyun; Min, Won Ki; Lim, Hwan Sub; Song, Junghan; Chae, Seok Lae; Lee, Chang Kyu; Kwon, Jung Ah; Lee, Kap No

    2007-04-01

    Standardization of medical terminology is essential for data transmission between health-care institutions or clinical laboratories and for maximizing the benefits of information technology. Purpose of our study was to standardize the medical terms used in the clinical laboratory, such as test names, units, terms used in result descriptions, etc. During the first year of the study, we developed a standard database of concept names for laboratory terms, which covered the terms used in government health care centers, their branch offices, and primary health care units. Laboratory terms were collected from the electronic data interchange (EDI) codes from National Health Insurance Corporation (NHIC), Logical Observation Identifier Names and Codes (LOINC) database, community health centers and their branch offices, and clinical laboratories of representative university medical centers. For standard expression, we referred to the English-Korean/ Korean-English medical dictionary of Korean Medical Association and the rules for foreign language translation. Programs for mapping between LOINC DB and EDI code and for translating English to Korean were developed. A Korean standard laboratory terminology database containing six axial concept names such as components, property, time aspect, system (specimen), scale type, and method type was established for 7,508 test observations. Short names and a mapping table for EDI codes and Unified Medical Language System (UMLS) were added. Synonym tables for concept names, words used in the database, and six axial terms were prepared to make it easier to find the standard terminology with common terms used in the field of laboratory medicine. Here we report for the first time a Korean standard laboratory terminology database for test names, result description terms, result units covering most laboratory tests in primary healthcare centers.

  3. EFSA Scientific Committee; Scientific Opinion on Risk Assessment Terminology

    DEFF Research Database (Denmark)

    Hald, Tine

    The Scientific Committee of the European Food Safety Authority (EFSA) reviewed the use of risk assessment terminology within its Scientific Panels. An external report, commissioned by EFSA, analysed 219 opinions published by the Scientific Committee and Panels to recommend possible ways of improving the expression and communication of risk and/or uncertainties in the selected opinions. The Scientific Committee concluded that risk assessment terminology is not fully harmonised within EFSA. In part this is caused by sectoral legislation defining specific terminology and international standards ..., the Scientific Committee concludes that particular care must be taken that the principles of CAC, OIE or IPPC are followed strictly. EFSA Scientific Panels should identify which specific approach is most useful in dealing with their individual mandates. The Scientific Committee considered detailed aspects...

  4. Representation of ophthalmology concepts by electronic systems: adequacy of controlled medical terminologies.

    Science.gov (United States)

    Chiang, Michael F; Casper, Daniel S; Cimino, James J; Starren, Justin

    2005-02-01

    To assess the adequacy of 5 controlled medical terminologies (International Classification of Diseases 9, Clinical Modification [ICD9-CM]; Current Procedural Terminology 4 [CPT-4]; Systematized Nomenclature of Medicine, Clinical Terms [SNOMED-CT]; Logical Identifiers, Names, and Codes [LOINC]; Medical Entities Dictionary [MED]) for representing concepts in ophthalmology. Noncomparative case series. Twenty complete ophthalmology case presentations were sequentially selected from a publicly available ophthalmology journal. Each of the 20 cases was parsed into discrete concepts, and each concept was classified along 2 axes: (1) diagnosis, finding, or procedure and (2) ophthalmic or medical concept. Electronic or paper browsers were used to assign a code for every concept in each of the 5 terminologies. Adequacy of assignment for each concept was scored on a 3-point scale. Findings from all 20 case presentations were combined and compared based on a coverage score, which was the average score for all concepts in that terminology. Adequacy of assignment for concepts in each terminology, based on a 3-point Likert scale (0, no match; 1, partial match; 2, complete match). Cases were parsed into 1603 concepts. SNOMED-CT had the highest mean overall coverage score (1.625+/-0.667), followed by MED (0.974+/-0.764), LOINC (0.781+/-0.929), ICD9-CM (0.280+/-0.619), and CPT-4 (0.082+/-0.337). SNOMED-CT also had higher coverage scores than any of the other terminologies for concepts in the diagnosis, finding, and procedure categories. Average coverage scores for ophthalmic concepts were lower than those for medical concepts. Controlled terminologies are required for electronic representation of ophthalmology data. SNOMED-CT had significantly higher content coverage than any other terminology in this study.

  5. A distributed query execution engine of big attributed graphs.

    Science.gov (United States)

    Batarfi, Omar; Elshawi, Radwa; Fayoumi, Ayman; Barnawi, Ahmed; Sakr, Sherif

    2016-01-01

    A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations including reachability, pattern matching and shortest path, where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation and performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine of G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores while other parts of the query plan, as necessary, are evaluated via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets in addition to its ability to outperform the performance of Apache Giraph, a popular distributed graph processing system, by orders of magnitude.
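    The hybrid evaluation strategy described above, pushing value predicates to a relational store while answering structural predicates by in-memory traversal, can be illustrated with a toy example. This sketch uses SQLite and a hand-written BFS and is in no way DG-SPARQL itself; the graph, schema and query are invented.

```python
import sqlite3
from collections import deque

# In-memory topology (adjacency list), standing in for the graph kept in main memory.
EDGES = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}

# Attribute data kept in a relational store; value predicates are pushed down as SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vertex (id TEXT PRIMARY KEY, age INTEGER)")
db.executemany("INSERT INTO vertex VALUES (?, ?)",
               [("a", 31), ("b", 45), ("c", 28), ("d", 52)])

def vertices_where(sql_predicate):
    """Evaluate a value-based predicate in the relational store (pushed-down SQL)."""
    return {row[0] for row in db.execute(f"SELECT id FROM vertex WHERE {sql_predicate}")}

def reachable(src, dst):
    """Structural predicate evaluated by index-less in-memory traversal (BFS)."""
    seen, queue = {src}, deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return True
        for w in EDGES.get(v, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

# Hybrid query: "vertices with age > 30 that can reach vertex c"
candidates = vertices_where("age > 30")
print(sorted(v for v in candidates if reachable(v, "c")))  # ['a', 'b', 'd']
```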

  6. [Biomedical informatics].

    Science.gov (United States)

    Capurro, Daniel; Soto, Mauricio; Vivent, Macarena; Lopetegui, Marcelo; Herskovic, Jorge R

    2011-12-01

    Biomedical Informatics is a new discipline that arose from the need to incorporate information technologies into the generation, storage, distribution and analysis of information in the domain of biomedical sciences. This discipline comprises basic biomedical informatics, and public health informatics. The development of the discipline in Chile has been modest and most projects have originated from the interest of individual people or institutions, without a systematic and coordinated national development. Considering the unique features of the health care system of our country, research in the area of biomedical informatics is becoming an imperative.

  7. Path Minima Queries in Dynamic Weighted Trees

    DEFF Research Database (Denmark)

    Davoodi, Pooya; Brodal, Gerth Stølting; Satti, Srinivasa Rao

    2011-01-01

    In the path minima problem on a tree, each edge is assigned a weight and a query asks for the edge with minimum weight on a path between two nodes. For the dynamic version of the problem, where the edge weights can be updated, we give data structures that achieve optimal query time...
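    For contrast with the optimal data structures studied in the paper, a naive baseline for path minima simply walks both endpoints up to their lowest common ancestor. The following sketch assumes a static, parent-pointer tree representation invented for illustration.

```python
# Naive path-minimum query on a weighted tree: climb both endpoints to their lowest
# common ancestor, tracking the minimum edge weight seen. O(n) per query, far from
# the near-optimal bounds discussed in the paper, but it fixes the semantics.

PARENT = {"b": ("a", 4), "c": ("a", 7), "d": ("b", 2), "e": ("b", 9)}  # child -> (parent, weight)

def depth(v):
    d = 0
    while v in PARENT:
        v, d = PARENT[v][0], d + 1
    return d

def path_minimum(u, v):
    best = float("inf")
    du, dv = depth(u), depth(v)
    while du > dv:                       # lift the deeper endpoint
        p, w = PARENT[u]; best = min(best, w); u, du = p, du - 1
    while dv > du:
        p, w = PARENT[v]; best = min(best, w); v, dv = p, dv - 1
    while u != v:                        # lift both until they meet at the LCA
        pu, wu = PARENT[u]; pv, wv = PARENT[v]
        best = min(best, wu, wv)
        u, v = pu, pv
    return best

print(path_minimum("d", "e"))  # min(2, 9) = 2
print(path_minimum("d", "c"))  # min(2, 4, 7) = 2
```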

  8. Secure Nearest Neighbor Query on Crowd-Sensing Data

    Directory of Open Access Journals (Sweden)

    Ke Cheng

    2016-09-01

    Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, sensing data terminals, as the data owners, are numerous and mistrustful, while, on the other hand, in most cases, the terminals find it difficult to perform many security operations due to computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed by protocols of secure two-party computation and a secure Voronoi diagram algorithm. It not only preserves the data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes.

  9. National Drug File - Reference Terminology API

    Data.gov (United States)

    U.S. Department of Health & Human Services — The National Drug File - Reference Terminology (NDF-RT) is produced by the U.S. Department of Veterans Affairs, Veterans Health Administration (VHA). NDF-RT is an...

  10. CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks

    Directory of Open Access Journals (Sweden)

    Hua Dai

    2016-02-01

    In recent years, we have seen many applications of secure query in two-tiered wireless sensor networks. Storage nodes are responsible for storing data from nearby sensor nodes and answering queries from the Sink. It is critical to protect data security from a compromised storage node. In this paper, the Communication-efficient Secure Range Query (CSRQ), a privacy- and integrity-preserving range query protocol, is proposed to prevent attackers from gaining information about both the data collected by sensor nodes and the queries issued by the Sink. To preserve privacy and integrity, in addition to employing the encoding mechanisms, a novel data structure called the encrypted constraint chain is proposed, which embeds the information needed for integrity verification. The Sink can use this encrypted constraint chain to verify the query result. The performance evaluation shows that CSRQ has lower communication cost than the current range query protocols.

  11. PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm

    Directory of Open Access Journals (Sweden)

    Chuong Cheng-Ming

    2006-10-01

    Background: Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. Results: The PubFocus web server automates analysis of MEDLINE/PubMed search queries by enriching them with two widely used human factor-based bibliometric indicators of publication quality: journal impact factor and volume of forward references. In addition to providing basic volumetric statistics, PubFocus also prioritizes citations and evaluates authors' impact on the field of search. PubFocus also analyses the presence and occurrence of biomedical key terms within citations by utilizing controlled vocabularies. Conclusion: We have developed a citation prioritisation algorithm based on journal impact factor, forward referencing volume, referencing dynamics, and author's contribution level. It can be applied either to the primary set of PubMed search results or to subsets of these results identified through key terms from controlled biomedical vocabularies and ontologies. The NCI (National Cancer Institute) thesaurus and MGD (Mouse Genome Database) mammalian gene orthology have been implemented for key terms analytics. PubFocus provides a scalable platform for the integration of multiple available ontology databases. PubFocus analytics can be adapted for input sources of biomedical citations other than PubMed.
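    The abstract names the ranking factors but not the formula, so the following is only a hypothetical weighted score over those factors, written to make the idea of citation prioritisation concrete; the weights, field names and normalisation are assumptions, not PubFocus's actual algorithm.

```python
# Toy prioritisation of citations by the factors named above (journal impact factor,
# forward-reference volume, author contribution). Linear weighting is an assumption.

citations = [
    {"pmid": "111", "impact_factor": 31.7, "times_cited": 120, "author_rank": 0.8},
    {"pmid": "222", "impact_factor": 4.2,  "times_cited": 300, "author_rank": 0.5},
    {"pmid": "333", "impact_factor": 9.1,  "times_cited": 15,  "author_rank": 0.9},
]

WEIGHTS = {"impact_factor": 0.5, "times_cited": 0.3, "author_rank": 0.2}

def normalise(values):
    """Scale a list of non-negative values to [0, 1] by its maximum."""
    hi = max(values) or 1.0
    return [v / hi for v in values]

def rank(citations):
    scores = [0.0] * len(citations)
    for key, weight in WEIGHTS.items():
        for i, v in enumerate(normalise([c[key] for c in citations])):
            scores[i] += weight * v
    order = sorted(range(len(citations)), key=lambda i: -scores[i])
    return [(citations[i]["pmid"], round(scores[i], 3)) for i in order]

print(rank(citations))
```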

  12. Nurse's use of power to standardise nursing terminology in electronic health records.

    Science.gov (United States)

    Ali, Samira; Sieloff, Christina L

    2017-07-01

    To describe nurses' use of power to influence the incorporation of standardised nursing terminology within electronic health records. Little is known about nurses' potential use of power to influence the incorporation of standardised nursing terminology within electronic health records. The theory of group power within organisations informed the design of the descriptive, cross-sectional study used a survey method to assess nurses' use of power to influence the incorporation of standardised nursing terminology within electronic health records. The Sieloff-King Assessment of Group Power within Organizations © and Nursing Power Scale was used. A total of 232 nurses responded to the survey. The mean power capability score was moderately high at 134.22 (SD 18.49), suggesting that nurses could use power to achieve the incorporation of standardised nursing terminology within electronic health records. The nurses' power capacity was significantly correlated with their power capability (r = 0.96, P power to achieve their goals, such as the incorporation of standardised nursing terminology within electronic health records. Nurse administrators may use their power to influence the incorporation of standardised nursing terminology within electronic health records. If nurses lack power, this could decrease nurses' ability to achieve their goals and contribute to the achievement of effective patient outcomes. © 2017 John Wiley & Sons Ltd.

  13. Pecularities of Economic and Information Terminology

    Directory of Open Access Journals (Sweden)

    Elvyra Vida Tadauskienė

    2011-04-01

    The article investigates the peculiarities of economic and information terminology and draws conclusions about their original sources. Economic terms turn out to have appeared earlier than information terms, and their emergence was influenced by the Greek and Latin languages. During the Soviet period economic terms were under the influence of the Russian language. A lot of information terms originated from the English language, so the dominance of this language is still greatly felt. The common language can be considered to be the original source of some of the mentioned terminology when expanding the meaning of adequate terms. Translation of some of the terms creates problems related to the synonymous meaning of the terms or certain variations of the vocabulary meanings.

  14. In-route skyline querying for location-based services

    DEFF Research Database (Denmark)

    Huang, Xuegang; Jensen, Christian S.

    2005-01-01

    With the emergence of an infrastructure for location-aware mobile services, the processing of advanced, location-based queries that are expected to underlie such services is gaining in relevance. While much work has assumed that users move in Euclidean space, this paper assumes that movement ... their efficient computation. The queries take into account several spatial preferences, and they intuitively return a set of most interesting results for each result returned by the corresponding non-skyline queries. The paper also covers a performance study of the proposed techniques based on real point...
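    The skyline operator underlying such queries keeps exactly the results that no other result dominates in all spatial preferences. A minimal Pareto-filter sketch follows; the two preferences (detour distance and price) and the data are invented for illustration and ignore the road-network aspect of the paper.

```python
# A result is in the skyline if no other result dominates it, i.e. is at least as good
# in every preference and strictly better in at least one. Lower values are better here.

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

facilities = [(2.0, 30), (1.5, 45), (3.0, 25), (2.5, 50), (1.0, 60)]  # (detour_km, price)
print(skyline(facilities))  # (2.5, 50) is dominated by (2.0, 30) and drops out
```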

  15. jQuery 2.0 animation techniques beginner's guide

    CERN Document Server

    Culpepper, Adam

    2013-01-01

    This book is a guide to help you create attractive web page animations using jQuery. Written in a friendly and engaging style, this book is designed to be placed alongside your computer as a mentor. If you are a web designer or a frontend developer, or if you want to learn how to animate the user interface of your web applications with jQuery, this book is for you. Experience with jQuery or JavaScript would be helpful, but a solid knowledge of HTML and CSS is assumed.

  16. More terminological clarity in the interprofessional field – a call for reflection on the use of terminologies, in both practice and research, on a national and international level

    Directory of Open Access Journals (Sweden)

    Mitzkat, Anika

    2016-04-01

    The terminology which has been used up until now within interprofessional healthcare has been characterised by a certain definitional weakness, which, among other factors, has been caused by an uncritical adoption of language conventions and a lack of theoretical reflection. However, as terminological clarity plays a significant role in the development and profiling of a discipline, the clarification and definition of commonly-used terminology has manifested itself as a considerable objective for the interprofessional research community. One of the most important journals for research in the area of interprofessional education and care, the Journal of Interprofessional Care, has expanded its author guidelines relating to terminology, modeled after the conceptual considerations of the research group around Barr et al. and Reeves et al. A German translation of the suggested terms therein has been presented in this contribution, and discussed in light of the challenges of a possible adaptation for the German-speaking world. The objective is to assist communication in practice and research in becoming clearer, while promoting an increasing awareness of, and transparency in, the definitions and terminologies that have been determined.

  17. An Adaptive Directed Query Dissemination Scheme for Wireless Sensor Networks

    NARCIS (Netherlands)

    Chatterjea, Supriyo; De Luigi, Simone; Havinga, Paul J.M.; Sun, M.T.

    This paper describes a directed query dissemination scheme, DirQ, that routes queries to the appropriate source nodes based on both constant and dynamic-valued attributes such as sensor types and sensor values. Unlike certain other query dissemination schemes, location information is not essential for...

  18. Querying Natural Logic Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker

    2017-01-01

    This paper describes the principles of a system applying natural logic as a knowledge base language. Natural logics are regimented fragments of natural language employing high level inference rules. We advocate the use of natural logic for knowledge bases dealing with querying of classes in ontologies and class-relationships such as are common in life-science descriptions. The paper adopts a version of natural logic with recursive restrictive clauses such as relative clauses and adnominal prepositional phrases. It includes passive as well as active voice sentences. We outline a prototype for partial translation of natural language into natural logic, featuring further querying and conceptual path finding in natural logic knowledge bases.

  19. Memory aware query scheduling in a database cluster

    NARCIS (Netherlands)

    F. Waas; M.L. Kersten (Martin)

    2000-01-01

    Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications:

  20. Templates and Queries in Contextual Hypermedia

    DEFF Research Database (Denmark)

    Anderson, Kenneth Mark; Hansen, Frank Allan; Bouvin, Niels Olof

    2006-01-01

    ... discuss a framework, HyConSC, that implements this model and describe how it can be used to build new contextual hypermedia systems. Our framework aids the developer in the iterative development of contextual queries (via a dynamic query browser) and offers support for context matching, a key feature of contextual hypermedia. We have tested the framework with data and sensors taken from the HyCon contextual hypermedia system and are now migrating HyCon to this new framework.

  1. Spatial Keyword Querying

    DEFF Research Database (Denmark)

    Cao, Xin; Chen, Lisi; Cong, Gao

    2012-01-01

    The web is increasingly being used by mobile users. In addition, it is increasingly becoming possible to accurately geo-position mobile users and web content. This development gives prominence to spatial web data management. Specifically, a spatial keyword query takes a user location and user-supplied ... different kinds of functionality as well as the ideas underlying their definition.

  2. Parallel Index and Query for Large Scale Data Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chou, Jerry; Wu, Kesheng; Ruebel, Oliver; Howison, Mark; Qiang, Ji; Prabhat,; Austin, Brian; Bethel, E. Wes; Ryne, Rob D.; Shoshani, Arie

    2011-07-18

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for processing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.
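    FastQuery builds on FastBit's bitmap indexing; a toy version of the idea, one bitmap per distinct column value so that equality and range queries become bitwise operations rather than scans, is sketched below. The column, binning and data are made up, and the sketch ignores compression and parallelism entirely.

```python
from collections import defaultdict

energy_bin = [3, 1, 3, 2, 1, 3, 0, 2]          # binned particle energies, one per record

bitmaps = defaultdict(int)                      # value -> bitmap encoded as a Python int
for row, value in enumerate(energy_bin):
    bitmaps[value] |= 1 << row

def select_range(lo, hi):
    """Row ids with lo <= value <= hi, answered purely from the index via bitwise OR."""
    mask = 0
    for value, bm in bitmaps.items():
        if lo <= value <= hi:
            mask |= bm
    return [row for row in range(len(energy_bin)) if mask >> row & 1]

print(select_range(2, 3))  # rows whose binned energy is 2 or 3 -> [0, 2, 3, 5, 7]
```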

  3. Conceptual metaphors in computer networking terminology ...

    African Journals Online (AJOL)

    Conceptual metaphor theory (Lakoff & Johnson, 1980) is used as a basic framework for analysing and explaining the occurrence of metaphor in the terminology used by computer networking professionals in the information technology (IT) industry. An analysis of linguistic ...

  4. Synonymy in the English-origin Romanian Medical Terminology

    Directory of Open Access Journals (Sweden)

    Oana BADEA

    2013-01-01

    The Romanian medical terminology has been enriched quite a lot lately. This phenomenon was not only due to the significant influence of the English language, but also to the relationships developed between the already existing terms and the new ones. Thus, the present study comprises the analysis of Romanian medical terms of English origin and their native synonymous correspondents in the Romanian medical terminology. The dictionaries used to select the synonymous pairs of medical terms were the Medical Dictionary (2010) and The Great Dictionary of Neologisms (2008).

  5. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    We present here an approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach is motivated by an obvious need for users to survey huge volumes of objects in query answers. An ontology formalism and a special notion of "instantiated ontology" are introduced. The latter is a structure reflecting the content in the document collection in that it is a restriction of a general world knowledge ontology to the concepts instantiated in the collection. The notion of ontology-based similarity is briefly described, language constructs for direct navigation and retrieval of concepts in the ontology are discussed and approaches to conceptual summarization are presented.

  6. COEUS: "semantic web in a box" for biomedical applications.

    Science.gov (United States)

    Lopes, Pedro; Oliveira, José Luís

    2012-12-17

    As the "omics" revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter's complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a "semantic web in a box" approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.

  7. STUDYING TECHNOLOGIES FOR CREATING ELECTRONIC TERMINOLOGICAL BASES IN THE PROCESS OF PROFESSIONAL TRAINING OF TRANSLATORS

    Directory of Open Access Journals (Sweden)

    Svitlana M. Amelina

    2017-09-01

    The article deals with the peculiarities of studying the technologies of creating electronic terminology databases at different stages of professional training of future translators in accordance with the level of their information competence. The issues of studying terminology management in foreign universities are considered. It is clarified that the ability to create and to use terminology databases is included in the curricula of disciplines on translation practice and translation technologies. There are various ways of creating terminological databases depending on their structure and technology. Emphasis is placed on mastering the technology of forming terminology databases by extracting terms from specialized texts. It is noted that the accumulation of one's own terminological resources makes it possible to use them in high-tech translation systems.

  8. Constraint-based query distribution framework for an integrated global schema

    DEFF Research Database (Denmark)

    Malik, Ahmad Kamran; Qadir, Muhammad Abdul; Iftikhar, Nadeem

    2009-01-01

    ... and replicated data sources. The provided system is all XML-based: it poses queries in XML form, and transforms and integrates local results into an XML document. Contributions include the use of constraints in our existing global schema, which help in source selection and query optimization, and a global query...

  9. Evaluating XML-Extended OLAP Queries Based on a Physical Algebra

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    2004-01-01

    ... is desirable. In this paper, we extend previous work on the logical federation of OLAP and XML data sources by presenting a simplified query semantics, a physical query algebra and a robust OLAP-XML query engine. Performance experiments with a prototypical implementation suggest that the performance for OLAP...

  10. Parasol: An Architecture for Cross-Cloud Federated Graph Querying

    Energy Technology Data Exchange (ETDEWEB)

    Lieberman, Michael; Choudhury, Sutanay; Hughes, Marisa; Patrone, Dennis; Hider, Sandy; Piatko, Christine; Chapman, Matthew; Marple, JP; Silberberg, David

    2014-06-22

    Large scale data fusion of multiple datasets can often provide insights that examining datasets individually cannot. However, when these datasets reside in different data centers and cannot be collocated due to technical, administrative, or policy barriers, a unique set of problems arise that hamper querying and data fusion. To address these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol’s design is flexible and requires only minimal assumptions for participant clouds. Query optimization techniques are also described that are compatible with Parasol’s lightweight architecture. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.

  11. Labeling RDF Graphs for Linear Time and Space Querying

    Science.gov (United States)

    Furche, Tim; Weinzierl, Antonius; Bry, François

    Indices and data structures for web querying have mostly considered tree shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML) data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph data with focus on support for efficient adjacency and reachability queries. For XML, labeling schemes are an important part of the widespread adoption of XML, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes, the constant time (and per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.
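    On tree-shaped data, the constant-time reachability test mentioned above is classically obtained with pre/post interval labels from a single DFS. The sketch below shows only that textbook tree case, with an invented toy tree; it does not reproduce the paper's extension to general RDF graphs.

```python
# Pre/post interval labelling on a tree: after one DFS, "u is an ancestor of v"
# (and hence v is reachable from u along tree edges) is an O(1) interval test.

TREE = {"r": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}

label = {}
counter = 0

def dfs(node):
    global counter
    start = counter
    counter += 1
    for child in TREE.get(node, []):
        dfs(child)
    label[node] = (start, counter)  # a node's interval contains all its descendants' intervals
    counter += 1

dfs("r")

def reaches(u, v):
    (us, ue), (vs, ve) = label[u], label[v]
    return us <= vs and ve <= ue   # constant-time containment test

print(reaches("r", "d"), reaches("a", "b"))  # True False
```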

  12. Concept-based query language approach to enterprise information systems

    Science.gov (United States)

    Niemi, Timo; Junkkari, Marko; Järvelin, Kalervo

    2014-01-01

    In enterprise information systems (EISs) it is necessary to model, integrate and compute very diverse data. In advanced EISs the stored data often are based both on structured (e.g. relational) and semi-structured (e.g. XML) data models. In addition, the ad hoc information needs of end-users may require the manipulation of data-oriented (structural), behavioural and deductive aspects of data. Contemporary languages capable of treating this kind of diversity suit only persons with good programming skills. In this paper we present a concept-oriented query language approach to manipulate this diversity so that the programming skill requirements are considerably reduced. In our query language, the features which need technical knowledge are hidden in application-specific concepts and structures. Therefore, users need not be aware of the underlying technology. Application-specific concepts and structures are represented by the modelling primitives of the extended RDOOM (relational deductive object-oriented modelling) which contains primitives for all crucial real world relationships (is-a relationship, part-of relationship, association), XML documents and views. Our query language also supports intensional and extensional-intensional queries, in addition to conventional extensional queries. In its query formulation, the end-user combines available application-specific concepts and structures through shared variables.

  13. Efficient Processing of Multiple DTW Queries in Time Series Databases

    DEFF Research Database (Denmark)

    Kremer, Hardy; Günnemann, Stephan; Ivanescu, Anca-Maria

    2011-01-01

    ... In many of today’s applications, however, large numbers of queries arise at any given time. Existing DTW techniques do not process multiple DTW queries simultaneously, a serious limitation which slows down overall processing. In this paper, we propose an efficient processing approach for multiple DTW ... for multiple DTW queries.
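    The primitive being batched here is the standard dynamic time warping distance. A textbook single-query implementation is given below for reference; it contains none of the multi-query sharing that the paper contributes, and the example sequences are invented.

```python
# Textbook dynamic time warping (DTW) distance between two numeric sequences.

def dtw(x, y):
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of the three allowed warping moves
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

query = [1.0, 2.0, 3.0, 2.0]
series = [1.0, 1.0, 2.0, 3.0, 3.0, 2.0]
print(dtw(query, series))  # 0.0: the query warps onto the series exactly
```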

  14. Advanced SPARQL querying in small molecule databases.

    Science.gov (United States)

    Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří

    2016-01-01

    In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.
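    Standard SPARQL over RDF can be tried locally with the rdflib Python library, as in the sketch below with an invented two-compound graph. The procedure-call extension and the query error-checking features described above are specific to the authors' engine and are not reproduced here.

```python
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:aspirin  ex:formula "C9H8O4" ; ex:mass 180.16 .
ex:caffeine ex:formula "C8H10N4O2" ; ex:mass 194.19 .
""", format="turtle")

# Plain SPARQL with a value filter over the small in-memory graph.
rows = g.query("""
PREFIX ex: <http://example.org/>
SELECT ?compound ?mass WHERE {
  ?compound ex:mass ?mass .
  FILTER (?mass < 190)
}
""")
for compound, mass in rows:
    print(compound, mass)
```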

  15. Clinical data integration model. Core interoperability ontology for research using primary care data.

    Science.gov (United States)

    Ethier, J-F; Curcin, V; Barton, A; McGilchrist, M M; Bastiaens, H; Andreasson, A; Rossiter, J; Zhao, L; Arvanitis, T N; Taweel, A; Delaney, B C; Burgun, A

    2015-01-01

    This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". Primary care data is the single richest source of routine health care data. However its use, both in research and clinical work, often requires data from multiple clinical sites, clinical trials databases and registries. Data integration and interoperability are therefore of utmost importance. TRANSFoRm's general approach relies on a unified interoperability framework, described in a previous paper. We developed a core ontology for an interoperability framework based on data mediation. This article presents how such an ontology, the Clinical Data Integration Model (CDIM), can be designed to support, in conjunction with appropriate terminologies, biomedical data federation within TRANSFoRm, an EU FP7 project that aims to develop the digital infrastructure for a learning healthcare system in European Primary Care. TRANSFoRm utilizes a unified structural / terminological interoperability framework, based on the local-as-view mediation paradigm. Such an approach mandates the global information model to describe the domain of interest independently of the data sources to be explored. Following a requirement analysis process, no ontology focusing on primary care research was identified and, thus we designed a realist ontology based on Basic Formal Ontology to support our framework in collaboration with various terminologies used in primary care. The resulting ontology has 549 classes and 82 object properties and is used to support data integration for TRANSFoRm's use cases. Concepts identified by researchers were successfully expressed in queries using CDIM and pertinent terminologies. As an example, we illustrate how, in TRANSFoRm, the Query Formulation Workbench can capture eligibility criteria in a computable representation, which is based on CDIM. A unified mediation approach to semantic interoperability provides a

  16. Algebra-Based Optimization of XML-Extended OLAP Queries

    DEFF Research Database (Denmark)

    Yin, Xuepeng; Pedersen, Torben Bach

    In today’s OLAP systems, integrating fast changing data, e.g., stock quotes, physically into a cube is complex and time-consuming. The widespread use of XML makes it very possible that this data is available in XML format on the WWW; thus, making XML data logically federated with OLAP systems is desirable. This report presents a complete foundation for such OLAP-XML federations. This includes a prototypical query engine, a simplified query semantics based on previous work, and a complete physical algebra which enables precise modeling of the execution tasks of an OLAP-XML query. Effective algebra...

  17. QueryArch3D: Querying and Visualising 3D Models of a Maya Archaeological Site in a Web-Based Interface

    Directory of Open Access Journals (Sweden)

    Giorgio Agugiaro

    2011-12-01

    Constant improvements in the field of surveying, computing and distribution of digital content are reshaping the way Cultural Heritage can be digitised and virtually accessed, even remotely via the web. A traditional 2D approach for data access, exploration and retrieval may generally suffice; however, more complex analyses concerning spatial and temporal features require 3D tools, which, in some cases, have not yet been implemented or are not yet generally commercially available. Efficient organisation and integration strategies applicable to the wide array of heterogeneous data in the field of Cultural Heritage represent a hot research topic nowadays. This article presents a visualisation and query tool (QueryArch3D) conceived to deal with multi-resolution 3D models. Geometric data are organised in successive levels of detail (LoD), provided with geometric and semantic hierarchies and enriched with attributes coming from external data sources. The visualisation and query front-end enables the 3D navigation of the models in a virtual environment, as well as the interaction with the objects by means of queries based on attributes or on geometries. The tool can be used as a standalone application, or served through the web. The characteristics of the research work, along with some implementation issues and the developed QueryArch3D tool, will be discussed and presented.

  18. The SQL++ Query Language: Configurable, Unifying and Semi-structured

    OpenAIRE

    Ong, Kian Win; Papakonstantinou, Yannis; Vernoux, Romain

    2014-01-01

    NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, the many variations, and the lack of formal semantics inhibit deep understanding of the query languages, and also impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON native stores and SQL databases. The SQL++ sem...

  19. Customizable Electronic Laboratory Online (CELO): A Web-based Data Management System Builder for Biomedical Research Laboratories

    Science.gov (United States)

    Fong, Christine; Brinkley, James F.

    2006-01-01

    A common challenge among today’s biomedical research labs is managing growing amounts of research data. In order to reduce the time and resource costs of building data management tools, we designed the Customizable Electronic Laboratory Online (CELO) system. CELO automatically creates a generic database and web interface for laboratories that submit a simple web registration form. Laboratories can then use a collection of predefined XML templates to assist with the design of a database schema. Users can immediately utilize the web-based system to query data, manage multimedia files, and securely share data remotely over the internet. PMID:17238541

  20. 21 CFR 25.5 - Terminology.

    Science.gov (United States)

    2010-04-01

    21 CFR 25.5 (Food and Drugs; Food and Drug Administration, Department of Health and Human Services; General; Environmental Impact Considerations) defines the terminology used in the part largely by cross-reference to definitions in 40 CFR 1508, e.g. (12) Legislation (40 CFR 1508.17), (13) Major Federal action (40 CFR 1508.18), (14) Mitigation (40 CFR ...

  1. Path Index Based Keywords to SPARQL Query Transformation for Semantic Data Federations

    Directory of Open Access Journals (Sweden)

    Thilini Cooray

    2016-06-01

    The Semantic Web is a rapidly emerging research domain. Enhancing keyword query processing on Semantic Web data goes a long way towards making the usefulness of the Semantic Web familiar to the general public. Most of the existing approaches focus on simply matching user keywords to RDF graphs and outputting the connecting elements as results. The Semantic Web also offers the SPARQL query language, which can process queries more accurately and efficiently than general keyword matching. Only a couple of approaches are available for transforming keyword queries to SPARQL. They basically rely on real-time graph traversals for identifying subgraphs which can connect user keywords. Those approaches are either limited to query processing on a single data store or on a set of interlinked data sets. They have not focused on query processing on a federation of independent data sets which belong to the same domain. This research proposes a Path Index based approach that eliminates real-time graph traversal when transforming keyword queries to SPARQL. We have introduced an ontology alignment based approach for keyword query transformation on a federation of RDF data stored using multiple heterogeneous vocabularies. Evaluation shows that the proposed approach has the ability to generate SPARQL queries which can provide highly relevant results for user keyword queries. The Path Index based query transformation approach also achieves high efficiency compared to the existing approach.

  2. Lazy Toggle PRM: A single-query approach to motion planning

    KAUST Repository

    Denny, Jory

    2013-05-01

    Probabilistic RoadMaps (PRMs) are quite successful in solving complex and high-dimensional motion planning problems. While particularly suited for multiple-query scenarios and expansive spaces, they lack efficiency in both solving single-query scenarios and mapping narrow spaces. Two PRM variants separately tackle these gaps. Lazy PRM reduces the computational cost of roadmap construction for single-query scenarios by delaying roadmap validation until query time. Toggle PRM is well suited for mapping narrow spaces by mapping both Cfree and Cobst, which gives certain theoretical benefits. However, fully validating the two resulting roadmaps can be costly. We present a strategy, Lazy Toggle PRM, for integrating these two approaches into a method which is both suited for narrow passages and efficient single-query calculations. This simultaneously addresses two challenges of PRMs. Like Lazy PRM, Lazy Toggle PRM delays validation of roadmaps until query time, but if no path is found, the algorithm augments the roadmap using the Toggle PRM methodology. We demonstrate the effectiveness of Lazy Toggle PRM in a wide range of scenarios, including those with narrow passages and high descriptive complexity (e.g., those described by many triangles), concluding that it is more effective than existing methods in solving difficult queries. © 2013 IEEE.
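
    The lazy-validation idea can be illustrated in a few lines: build a roadmap without checking any edges, then validate only the edges that appear on candidate shortest paths at query time. The sketch below is limited to that idea (it does not include the Toggle extension), and the 2D point robot, circular obstacles and networkx-based search are illustrative assumptions.

    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(0)
    obstacles = [((0.5, 0.5), 0.15), ((0.3, 0.8), 0.10)]  # (center, radius)

    def segment_collides(p, q, steps=20):
        """Check a straight-line edge against the circular obstacles by sampling."""
        for t in np.linspace(0.0, 1.0, steps):
            x = (1 - t) * np.asarray(p) + t * np.asarray(q)
            if any(np.linalg.norm(x - np.asarray(c)) <= r for c, r in obstacles):
                return True
        return False

    def build_lazy_roadmap(samples, k=8):
        """Connect each sample to its k nearest neighbours without any collision checks."""
        G = nx.Graph()
        for i, s in enumerate(samples):
            d = np.linalg.norm(samples - s, axis=1)
            for j in np.argsort(d)[1:k + 1]:
                G.add_edge(i, int(j), weight=float(d[j]))
        return G

    def lazy_query(G, samples, start, goal):
        """Repeatedly plan, validate only the edges on the planned path, and replan."""
        while True:
            try:
                path = nx.shortest_path(G, start, goal, weight="weight")
            except nx.NetworkXNoPath:
                return None
            bad = [(u, v) for u, v in zip(path, path[1:])
                   if segment_collides(samples[u], samples[v])]
            if not bad:
                return path
            G.remove_edges_from(bad)

    samples = rng.random((200, 2))
    roadmap = build_lazy_roadmap(samples)
    print(lazy_query(roadmap, samples, 0, 1))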

  3. KNODWAT: a scientific framework application for testing knowledge discovery methods for the biomedical domain.

    Science.gov (United States)

    Holzinger, Andreas; Zupan, Mario

    2013-06-13

    Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that, for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed using Java on the Spring Framework 3.1, following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as the MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, as well as jQuery for interactive user interface operations. The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers who are new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
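
    The testing idea behind the framework can be mimicked outside WEKA; the sketch below uses scikit-learn's CART-style decision tree on a bundled dataset as an analogous stand-in for trying an algorithm on existing data before applying it to one's own, and is not part of KNODWAT itself.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    # Stand-in biomedical dataset; KNODWAT itself wraps WEKA's CART and C4.5 in Java.
    X, y = load_breast_cancer(return_X_y=True)
    tree = DecisionTreeClassifier(random_state=0)   # scikit-learn trees are CART-style
    scores = cross_val_score(tree, X, y, cv=5)
    print("5-fold accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))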

  4. Learning Scientific and Medical Terminology with a Mnemonic Strategy Using an Illogical Association Technique

    Science.gov (United States)

    Brahler, C. Jayne; Walker, Diane

    2008-01-01

    For students pursuing careers in medical fields, knowledge of technical and medical terminology is prerequisite to being able to solve problems in their respective disciplines and professions. The Dean Vaughn Medical Terminology 350 Total Retention System, also known as Medical Terminology 350 (25), is a mnemonic instructional and learning…

  5. Efficient external memory structures for range-aggregate queries

    DEFF Research Database (Denmark)

    Agarwal, P.K.; Yang, J.; Arge, L.

    2013-01-01

    We present external memory data structures for efficiently answering range-aggregate queries. The range-aggregate problem is defined as follows: Given a set of weighted points in Rd, compute the aggregate of the weights of the points that lie inside a d-dimensional orthogonal query rectangle. The...

  6. Segmenting healthcare terminology users: a strategic approach to large scale evolutionary development.

    Science.gov (United States)

    Price, C; Briggs, K; Brown, P J

    1999-01-01

    Healthcare terminologies have become larger and more complex, aiming to support a diverse range of functions across the whole spectrum of healthcare activity. Prioritization of development, implementation and evaluation can be achieved by regarding the "terminology" as an integrated system of content-based and functional components. Matching these components to target segments within the healthcare community, supports a strategic approach to evolutionary development and provides essential product differentiation to enable terminology providers and systems suppliers to focus on end-user requirements.

  7. Entropy Based Analysis of DNS Query Traffic in the Campus Network

    Directory of Open Access Journals (Sweden)

    Dennis Arturo Ludeña Romaña

    2008-10-01

    We carried out an entropy-based study on the DNS query traffic from the campus network of a university from January 1st, 2006 through March 31st, 2007. The results are summarized as follows: (1) the source IP address- and query keyword-based entropies change symmetrically in the DNS query traffic from the outside of the campus network when detecting spam bot activity on the campus network. On the other hand, (2) the source IP address- and query keyword-based entropies change similarly to each other when detecting large DNS query traffic caused by pre-scanning or distributed denial of service (DDoS) attacks from the campus network. Therefore, we can detect the spam bot and/or DDoS attack bot by only watching DNS query access traffic.
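
    The measure behind the study is plain Shannon entropy computed over the source-IP and query-keyword distributions of a traffic window; the sketch below illustrates it on a toy log, and the (source_ip, qname) record layout is an assumption for illustration only.

    from collections import Counter
    from math import log2

    def shannon_entropy(values):
        """Shannon entropy (in bits) of the empirical distribution of the values."""
        counts = Counter(values)
        total = sum(counts.values())
        return -sum((c / total) * log2(c / total) for c in counts.values())

    # Toy window of DNS query records: (source IP, queried name).
    window = [
        ("192.0.2.1", "example.com"),
        ("192.0.2.1", "mail.example.com"),
        ("192.0.2.7", "example.org"),
        ("203.0.113.5", "example.com"),
    ]
    src_entropy = shannon_entropy(ip for ip, _ in window)
    key_entropy = shannon_entropy(name for _, name in window)
    print(f"H(source IP) = {src_entropy:.3f} bits, H(query keyword) = {key_entropy:.3f} bits")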

  8. Development of an information retrieval tool for biomedical patents.

    Science.gov (United States)

    Alves, Tiago; Rodrigues, Rúben; Costa, Hugo; Rocha, Miguel

    2018-06-01

    The volume of biomedical literature has been increasing in the last years. Patent documents have also followed this trend, being important sources of biomedical knowledge, technical details and curated data, which are put together along the granting process. The field of Biomedical text mining (BioTM) has been creating solutions for the problems posed by the unstructured nature of natural language, which makes the search of information a challenging task. Several BioTM techniques can be applied to patents. From those, Information Retrieval (IR) includes processes where relevant data are obtained from collections of documents. In this work, the main goal was to build a patent pipeline addressing IR tasks over patent repositories to make these documents amenable to BioTM tasks. The pipeline was developed within @Note2, an open-source computational framework for BioTM, adding a number of modules to the core libraries, including patent metadata and full text retrieval, PDF to text conversion and optical character recognition. Also, user interfaces were developed for the main operations materialized in a new @Note2 plug-in. The integration of these tools in @Note2 opens opportunities to run BioTM tools over patent texts, including tasks from Information Extraction, such as Named Entity Recognition or Relation Extraction. We demonstrated the pipeline's main functions with a case study, using an available benchmark dataset from BioCreative challenges. Also, we show the use of the plug-in with a user query related to the production of vanillin. This work makes available all the relevant content from patents to the scientific community, decreasing drastically the time required for this task, and provides graphical interfaces to ease the use of these tools. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Common usage of cardiologic anatomical terminology: critical analysis and a trilingual discussion proposal.

    Science.gov (United States)

    Werneck, Alexandre Lins; Batigália, Fernando

    2009-01-01

    Terminology and Lexicography have been especially addressed to the Allied Health Sciences regarding discussion of case reports or concerning publication of scientific articles. The knowledge of Human Anatomy enables the understanding of medical terms and the refinement of Medical Terminology makes possible a better anatomicomedical communication in a highly technical level. Most of the scientific publications in both Anatomy and Medicine are found only in English and most of dictionaries or search resources available do not have specificity enough to explain anatomicomedical, terminological, or lexicographical occurrences. To design and produce a multilingual terminological dictionary (Latin-English-Portuguese-Spanish) containing a list of English anatomicomedical terms in common usage in cardiology subspecialties addressed to medical students and professionals, to other allied health sciences professionals, and to translators working in this specific field. Terms, semantical and grammatical components were selected to compose an anatomicocardiological corpus. The adequacy to the thematic terminological research requests and the translation reliability level will be settled from the terminology specificity in contrast to the semantics, as well as from a peer survey of the main terms used by national and international experts in specialized journals, Internet sites, and from text-books on Anatomy and Cardiology. The inclusion criteria will be the terms included in the English, Portuguese, and Spanish Terminologia Anatomica - the official terminology of the anatomical sciences; nonofficial technical commonly used terms which lead to terminology or translation misunderstanding often being a source of confusion. A table with a sample of the 508 most used anatomical cardiologic terms in English language peer-reviewed journals of cardiology and (pediatric and adult) thoracic surgery is shown. The working up of a multilingual terminological dictionary reduces the risk of

  10. Top-k aggregation queries in large-scale distributed systems

    OpenAIRE

    Michel, Sebastian

    2007-01-01

    Distributed top-k query processing has recently become an essential functionality in a large number of emerging application classes like Internet traffic monitoring and Peer-to-Peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers. More precisely, in this thesis, we make the following contributions: We present the fa...

  11. Knowledge based word-concept model estimation and refinement for biomedical text mining.

    Science.gov (United States)

    Jimeno Yepes, Antonio; Berlanga, Rafael

    2015-02-01

    Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have not been devised for text mining tasks but for human interpretation, thus the performance of KB-based methods is usually lower when compared to supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore not useful for large-scale biomedical text mining systems. KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method not only takes into account the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
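
    A toy version of the core idea, a word-concept probability table estimated from concept descriptions with Laplace smoothing, is sketched below; the concept identifiers and descriptions are invented, and the paper's full model additionally folds in co-occurrence patterns mined from MEDLINE with MetaMap.

    from collections import Counter, defaultdict

    # Hypothetical KB fragment: concept identifier -> textual description.
    descriptions = {
        "C_ASPIRIN": "aspirin is an anti inflammatory drug used to reduce pain and fever",
        "C_NAUSEA": "nausea is a sensation of unease and discomfort in the stomach",
    }

    def word_concept_probs(descriptions, alpha=1.0):
        """Estimate P(word | concept) from the descriptions with Laplace smoothing."""
        vocab = {w for text in descriptions.values() for w in text.split()}
        probs = defaultdict(dict)
        for concept, text in descriptions.items():
            counts = Counter(text.split())
            denom = sum(counts.values()) + alpha * len(vocab)
            for w in vocab:
                probs[concept][w] = (counts[w] + alpha) / denom
        return probs

    p = word_concept_probs(descriptions)
    print(p["C_ASPIRIN"]["pain"], p["C_NAUSEA"]["pain"])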

  12. Terminologi Rumah Adat Dalam Loka Sumbawa: Sebuah Tinjauan Antropolinguistik

    Directory of Open Access Journals (Sweden)

    Wawan Hermansyah

    2016-10-01

    The Dalam Loka custom home, which still stands firm in the middle of the town of Sumbawa Besar, is a historical witness to the glory of the Sultanate of Sumbawa in its time. The richness of terminology contained in the Dalam Loka custom home gives language and culture researchers room to understand more deeply, through its symbols, what happened in the past and how ancient life carried profound meaning. Thus, to describe the forms of terminology and understand the values held in the Dalam Loka custom home, they need to be traced through a linguistic and cultural approach called anthropolinguistics. Therefore, the theories used in this research are anthropolinguistic theory and social semiotic theory. This study used a qualitative descriptive approach. The types and sources of data used are classified into two types: primary data and secondary data. The data collection methods were listening and conversation. The data were analyzed using the intralingual and extralingual equivalence methods. The results and discussion of this research found that the forms of terminology in the Dalam Loka custom home in Sumbawa derive from several languages: Javanese, Makassar and Malay. Other forms of terminology found in the Dalam Loka custom home derive from foreign languages such as Arabic and Sanskrit. The cultural context that shapes the terminology in the Dalam Loka custom home indicates the existence of a civilization with a system of government and an imperial system in the form of an aristocracy. The system of government rests on the king (sultan) and includes customs, governance and law.

  13. Fast Inbound Top-K Query for Random Walk with Restart.

    Science.gov (United States)

    Zhang, Chao; Jiang, Shan; Chen, Yucheng; Sun, Yidan; Han, Jiawei

    2015-09-01

    Random walk with restart (RWR) is widely recognized as one of the most important node proximity measures for graphs, as it captures the holistic graph structure and is robust to noise in the graph. In this paper, we study a novel query based on the RWR measure, called the inbound top-k (Ink) query. Given a query node q and a number k, the Ink query aims at retrieving k nodes in the graph that have the largest weighted RWR scores to q. Ink queries can be highly useful for various applications such as traffic scheduling, disease treatment, and targeted advertising. Nevertheless, none of the existing RWR computation techniques can accurately and efficiently process the Ink query in large graphs. We propose two algorithms, namely Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner. To identify the top-k nodes, Squeeze iteratively performs matrix-vector multiplication and estimates the lower and upper bounds for all the nodes in the graph. Ripple employs a more aggressive strategy by only estimating the RWR scores for the nodes falling in the vicinity of q; the nodes outside the vicinity do not need to be evaluated because their RWR scores are propagated from the boundary of the vicinity and thus upper bounded. Ripple incrementally expands the vicinity until the top-k result set can be obtained. Our extensive experiments on real-life graph data sets show that Ink queries can retrieve interesting results, and the proposed algorithms are orders of magnitude faster than the state-of-the-art method.
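
    For orientation, the brute-force baseline that Squeeze and Ripple are designed to avoid can be written in a few lines: run a separate RWR power iteration from every node v and rank nodes by their score at the query node q. The adjacency matrix, restart probability and iteration count below are illustrative choices.

    import numpy as np

    def rwr(P, seed, restart=0.15, iters=100):
        """Power iteration for random walk with restart; P is a row-stochastic matrix."""
        n = P.shape[0]
        r = np.full(n, 1.0 / n)
        e = np.zeros(n)
        e[seed] = 1.0
        for _ in range(iters):
            r = (1 - restart) * P.T @ r + restart * e
        return r

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    P = A / A.sum(axis=1, keepdims=True)   # row-normalised transition matrix

    q, k = 3, 2
    scores = {v: rwr(P, v)[q] for v in range(A.shape[0]) if v != q}
    topk = sorted(scores, key=scores.get, reverse=True)[:k]
    print("inbound top-%d nodes for node %d:" % (k, q), topk)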

  14. Query-by-example surgical activity detection.

    Science.gov (United States)

    Gao, Yixin; Vedula, S Swaroop; Lee, Gyusung I; Lee, Mija R; Khudanpur, Sanjeev; Hager, Gregory D

    2016-06-01

    Easy acquisition of surgical data opens many opportunities to automate skill evaluation and teaching. Current technology to search tool motion data for surgical activity segments of interest is limited by the need for manual pre-processing, which can be prohibitive at scale. We developed a content-based information retrieval method, query-by-example (QBE), to automatically detect activity segments within surgical data recordings of long duration that match a query. The example segment of interest (query) and the surgical data recording (target trial) are time series of kinematics. Our approach includes an unsupervised feature learning module using a stacked denoising autoencoder (SDAE), two scoring modules based on asymmetric subsequence dynamic time warping (AS-DTW) and template matching, respectively, and a detection module. A distance matrix of the query against the trial is computed using the SDAE features, followed by AS-DTW combined with template scoring, to generate a ranked list of candidate subsequences (substrings). To evaluate the quality of the ranked list against the ground-truth, thresholding conventional DTW distances and bipartite matching are applied. We computed the recall, precision, F1-score, and a Jaccard index-based score on three experimental setups. We evaluated our QBE method using a suture throw maneuver as the query, on two tool motion datasets (JIGSAWS and MISTIC-SL) captured in a training laboratory. We observed a recall of 93, 90 and 87 % and a precision of 93, 91, and 88 % with same surgeon same trial (SSST), same surgeon different trial (SSDT) and different surgeon (DS) experiment setups on JIGSAWS, and a recall of 87, 81 and 75 % and a precision of 72, 61, and 53 % with SSST, SSDT and DS experiment setups on MISTIC-SL, respectively. We developed a novel, content-based information retrieval method to automatically detect multiple instances of an activity within long surgical recordings. Our method demonstrated adequate recall
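
    The scoring backbone can be illustrated with plain subsequence DTW, which allows a query to start and end anywhere inside a longer target sequence; the 1-D toy signals below stand in for the learned SDAE feature sequences, and the asymmetric step pattern and template scoring of the actual method are not reproduced.

    import numpy as np

    def subsequence_dtw(query, target):
        """Subsequence DTW with free start and end points in the target sequence."""
        n, m = len(query), len(target)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, :] = 0.0                       # the match may start anywhere in the target
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(query[i - 1] - target[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        end = int(np.argmin(D[n, 1:])) + 1  # the match may end anywhere in the target
        return D[n, end], end - 1

    query = np.array([0.0, 1.0, 2.0, 1.0])
    trial = np.array([5.0, 5.0, 0.1, 0.9, 2.1, 1.2, 5.0])
    dist, end_idx = subsequence_dtw(query, trial)
    print(f"best match ends at target index {end_idx} with DTW distance {dist:.2f}")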

  15. Geometric Representations of Condition Queries on Three-Dimensional Vector Fields

    Science.gov (United States)

    Henze, Chris

    1999-01-01

    Condition queries on distributed data ask where particular conditions are satisfied. It is possible to represent condition queries as geometric objects by plotting field data in various spaces derived from the data, and by selecting loci within these derived spaces which signify the desired conditions. Rather simple geometric partitions of derived spaces can represent complex condition queries because much complexity can be encapsulated in the derived space mapping itself. A geometric view of condition queries provides a useful conceptual unification, allowing one to intuitively understand many existing vector field feature detection algorithms -- and to design new ones -- as variations on a common theme. A geometric representation of condition queries also provides a simple and coherent basis for computer implementation, reducing a wide variety of existing and potential vector field feature detection techniques to a few simple geometric operations.

  16. Biomedical engineering and nanotechnology

    International Nuclear Information System (INIS)

    Pawar, S.H.; Khyalappa, R.J.; Yakhmi, J.V.

    2009-01-01

    This book is predominantly a compilation of papers presented at the conference, which focused on developments in biomedical materials, biomedical devices and instrumentation, biomedical effects of electromagnetic radiation, electrotherapy, radiotherapy, biosensors, biotechnology, bioengineering, tissue engineering, clinical engineering and surgical planning, medical imaging, hospital system management, biomedical education, biomedical industry and society, bioinformatics, structured nanomaterials for biomedical applications, nano-composites, nano-medicine, synthesis of nanomaterials, and nano science and technology development. The papers presented herein contain scientific substance that serves the academic interests of researchers from the fields of biomedicine, biomedical engineering, material science and nanotechnology. Papers relevant to INIS are indexed separately

  17. Efficient processing of containment queries on nested sets

    NARCIS (Netherlands)

    Ibrahim, A.; Fletcher, G.H.L.

    2013-01-01

    We study the problem of computing containment queries on sets which can have both atomic and set-valued objects as elements, i.e., nested sets. Containment is a fundamental query pattern with many basic applications. Our study of nested set containment is motivated by the ubiquity of nested data in

  18. Ontology Based Queries - Investigating a Natural Language Interface

    NARCIS (Netherlands)

    van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.

    2010-01-01

    In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.

  19. On Logical Characterisation of Human Concept Learning based on Terminological Systems

    DEFF Research Database (Denmark)

    Badie, Farshad

    2018-01-01

    The central focus of this article is the epistemological assumption that knowledge could be generated based on human beings' experiences and over their conceptions of the world. Logical characterisation of human inductive learning over their produced conceptions within terminological systems ... and analysis of actual human inductive reasoning (and learning). This research connects with the topics 'logic & learning', 'cognitive modelling' and 'terminological knowledge representation'.

  20. DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data.

    Science.gov (United States)

    Putri, Fadhilah Kurnia; Song, Giltae; Kwon, Joonho; Rao, Praveen

    2017-09-25

    One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ), which efficiently identifies profitable areas by exploiting the Apache Software Foundation's Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Then, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose a local Z-Skyline optimization, which reduces the number of dominance tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries from huge amounts of big taxi trip data.
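
    The final refinement step rests on skyline (dominance) processing; a plain block-nested-loop skyline over two toy criteria is sketched below to show what "dominance tests" mean here, whereas the actual system runs a distributed Z-Skyline over a Z-order space filling curve and its real scoring attributes are not reproduced.

    def dominates(a, b):
        """a dominates b if it is no worse on every criterion and strictly better on one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    # Candidate areas scored by (expected waiting time in minutes, distance in km); lower is better.
    candidates = {"A": (3.0, 1.2), "B": (5.0, 0.8), "C": (3.5, 2.5), "D": (6.0, 3.0)}

    skyline = [name for name, v in candidates.items()
               if not any(dominates(w, v) for other, w in candidates.items() if other != name)]
    print("profitable areas on the skyline:", skyline)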

  1. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Directory of Open Access Journals (Sweden)

    Giovanni Delussu

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  2. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

    Science.gov (United States)

    Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. PMID:27936191

  3. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Science.gov (United States)

    Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  4. Biomedical photonics handbook biomedical diagnostics

    CERN Document Server

    Vo-Dinh, Tuan

    2014-01-01

    Shaped by Quantum Theory, Technology, and the Genomics RevolutionThe integration of photonics, electronics, biomaterials, and nanotechnology holds great promise for the future of medicine. This topic has recently experienced an explosive growth due to the noninvasive or minimally invasive nature and the cost-effectiveness of photonic modalities in medical diagnostics and therapy. The second edition of the Biomedical Photonics Handbook presents fundamental developments as well as important applications of biomedical photonics of interest to scientists, engineers, manufacturers, teachers, studen

  5. Terminology of the public relations field: corpus — automatic term recognition — terminology database

    Directory of Open Access Journals (Sweden)

    Nataša Logar Berginc

    2013-12-01

    The article describes an analysis of automatic term recognition results performed for single- and multi-word terms with the LUIZ term extraction system. The target application of the results is a terminology database of Public Relations, and the main resource is the KoRP Public Relations Corpus. Our analysis is focused on two segments: (a) single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and whose termhood we evaluate on the basis of the judgements of two domain experts, and (b) multi-word term candidates with a verb or a noun as headword. In order to better assess the performance of the system and the soundness of our approach, we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The most productive patterns of multi-word terms with a noun as headword have the following structure: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.

  6. Biomedical engineering principles

    CERN Document Server

    Ritter, Arthur B; Valdevit, Antonio; Ascione, Alfred N

    2011-01-01

    Introduction: Modeling of Physiological Processes; Cell Physiology and Transport; Principles and Biomedical Applications of Hemodynamics; A Systems Approach to Physiology; The Cardiovascular System; Biomedical Signal Processing; Signal Acquisition and Processing; Techniques for Physiological Signal Processing; Examples of Physiological Signal Processing; Principles of Biomechanics; Practical Applications of Biomechanics; Biomaterials; Principles of Biomedical Capstone Design; Unmet Clinical Needs; Entrepreneurship: Reasons why Most Good Designs Never Get to Market; An Engineering Solution in Search of a Biomedical Problem

  7. Incentives for Delay-Constrained Data Query and Feedback in Mobile Opportunistic Crowdsensing

    Directory of Open Access Journals (Sweden)

    Yang Liu

    2016-07-01

    In this paper, we propose effective data collection schemes that stimulate cooperation between selfish users in mobile opportunistic crowdsensing. A query issuer generates a query and requests replies within a given delay budget. When a data provider receives the query for the first time from an intermediate user, the former replies to it and authorizes the latter as the owner of the reply. Different data providers can reply to the same query. When a user that owns a reply meets the query issuer that generates the query, it requests the query issuer to pay credits. The query issuer pays credits and provides feedback to the data provider, which gives the reply. When a user that carries a feedback meets the data provider, the data provider pays credits to the user in order to adjust its claimed expertise. Queries, replies and feedbacks can be traded between mobile users. We propose an effective mechanism to define rewards for queries, replies and feedbacks. We formulate the bargain process as a two-person cooperative game, whose solution is found by using the Nash theorem. To improve the credit circulation, we design an online auction process, in which the wealthy user can buy replies and feedbacks from the starving one using credits. We have carried out extensive simulations based on real-world traces to evaluate the proposed schemes.
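
    For readers unfamiliar with the game-theoretic step, the standard two-person Nash bargaining solution that the abstract appeals to picks the feasible utility pair maximizing the product of the gains over the disagreement (no-trade) payoffs; the symbols below are the textbook ones, and the paper's concrete utility definitions are not reproduced.

    % u_1, u_2: utilities of the two traders; d_1, d_2: disagreement payoffs; \mathcal{F}: feasible set.
    \max_{(u_1, u_2) \in \mathcal{F}} \; (u_1 - d_1)\,(u_2 - d_2)
    \qquad \text{subject to } u_1 \ge d_1, \; u_2 \ge d_2 .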

  8. Public health terminology: Hindrance to a Health in All Policies approach?

    Science.gov (United States)

    Synnevåg, Ellen S; Amdam, Roar; Fosse, Elisabeth

    2018-02-01

    National public health policies in Norway are based on a Health in All Policies (HiAP) approach. At the local level, this means that public health, as a cross-sectional responsibility, should be implemented in all municipal sectors by integrating public health policies in municipal planning and management systems. The paper investigates these local processes, focusing on the use of public health terminology and how this terminology is translated from national to local contexts. We ask whether the terms 'public health' and 'public health work' are suitable when implementing an HiAP approach. A qualitative case study based on analyses of interviews and planning documents was performed in three Norwegian municipalities. The results present dilemmas associated with using public health terminology when implementing an HiAP approach. On the one hand, the terms are experienced as wide, complex, advanced and unnecessary. On the other hand, the terms are experienced as important for a systematic approach towards understanding public health ideology and cross-sectional responsibility. One municipality used alternative terminology. This paper promotes debate about the appropriateness of using the terms 'public health' and 'public health work' at the local level. It suggests that adaptation is suitable and necessary, unless it compromises knowledge, responsibility and a systematic approach. This study concludes that the use of terminology is a central factor when implementing the Norwegian Public Health Act at the local level.

  9. Schedule Sales Query Raw Data

    Data.gov (United States)

    General Services Administration — Schedule Sales Query presents sales volume figures as reported to GSA by contractors. The reports are generated as quarterly reports for the current year and the...

  10. Doha agreement meeting on terminology and definitions in groin pain in athletes

    DEFF Research Database (Denmark)

    Weir, Adam; Brukner, Peter; Delahunt, Eamonn

    2015-01-01

    BACKGROUND: Heterogeneous taxonomy of groin injuries in athletes adds confusion to this complicated area. AIM: The 'Doha agreement meeting on terminology and definitions in groin pain in athletes' was convened to attempt to resolve this problem. Our aim was to agree on a standard terminology, along.... All members participated in a Delphi questionnaire prior to the meeting. RESULTS: Unanimous agreement was reached on the following terminology. The classification system has three major subheadings of groin pain in athletes: 1. Defined clinical entities for groin pain: Adductor-related, iliopsoas-related, inguinal-related and pubic-related groin pain. 2. Hip-related groin pain. 3. Other causes of groin pain in athletes. The definitions are included in this paper. CONCLUSIONS: The Doha agreement meeting on terminology and definitions in groin pain in athletes reached a consensus on a clinically based...

  11. Flexible Query Answering Systems

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 12th International Conference on Flexible Query Answering Systems, FQAS 2017, held in London, UK, in June 2017. The 21 full papers presented in this book together with 4 short papers were carefully reviewed and selected from 43 submissions...

  12. Collaboration of the CMEA countries concerning the treatment of radiation protection terminology

    International Nuclear Information System (INIS)

    Arkhangel'skaya, G.V.; Rodnyanskaya, L.I.

    1986-01-01

    At present, particular attention is directed to the terminology of radiation hygiene because of its intensive development and the multitude of English terms integrated into it. The Leningrad Research Institute of Radiation Hygiene has elaborated a terminology list and characterized the terms of health physics. Regarding terminology cooperation among CMEA specialists, it is proposed to elaborate a multilingual dictionary, preceded by the drawing up of equivalent terms and coordination of their definitions. A further proposal concerns the cooperative publication of compilations of recommended terms.

  13. Querying Sentiment Development over Time

    DEFF Research Database (Denmark)

    Andreasen, Troels; Christiansen, Henning; Have, Christian Theil

    2013-01-01

    A new language is introduced for describing hypotheses about fluctuations of measurable properties in streams of timestamped data, and as a prime example, we consider trends of emotions in the constantly flowing stream of Twitter messages. The language, called EmoEpisodes, has a precise semantics that measures how well a hypothesis characterizes a given time interval; the semantics is parameterized so it can be adjusted to different views of the data. EmoEpisodes is extended to a query language with variables standing for unknown topics and emotions, and the query-answering mechanism will return instantiations for topics and emotions as well as time intervals that provide the largest deflections in this measurement. Experiments are performed on a selection of Twitter data to demonstrate the usefulness of the approach.

  14. Evidence for the Existing American Nurses Association-Recognized Standardized Nursing Terminologies: A Systematic Review

    Science.gov (United States)

    Tastan, Sevinc; Linch, Graciele C. F.; Keenan, Gail M.; Stifter, Janet; McKinney, Dawn; Fahey, Linda; Dunn Lopez, Karen; Yao, Yingwei; Wilkie, Diana J.

    2014-01-01

    Objective To determine the state of the science for the five standardized nursing terminology sets in terms of level of evidence and study focus. Design Systematic Review. Data sources Keyword search of PubMed, CINAHL, and EMBASE databases from the 1960s to March 19, 2012 revealed 1,257 publications. Review Methods From abstract review we removed duplicate articles, those not in English or with no identifiable standardized nursing terminology, and those with a low level of evidence. From full text review of the remaining 312 articles, eight trained raters used a coding system to record standardized nursing terminology names, publication year, country, and study focus. Inter-rater reliability confirmed the level of evidence. We analyzed coded results. Results On average there were 4 studies per year between 1985 and 1995. The yearly number increased to 14 for the decade 1996–2005, 21 between 2006–2010, and 25 in 2011. Investigators conducted the research in 27 countries. By evidence level, of the 312 studies, 72.4% were descriptive, 18.9% were observational, and 8.7% were intervention studies. Of the 312 reports, 72.1% focused on North American Nursing Diagnosis-International, Nursing Interventions Classification, Nursing Outcome Classification, or some combination of those three standardized nursing terminologies; 9.6% on Omaha System; 7.1% on International Classification for Nursing Practice; 1.6% on Clinical Care Classification/Home Health Care Classification; 1.6% on Perioperative Nursing Data Set; and 8.0% on two or more standardized nursing terminology sets. There were studies in all 10 foci categories including those focused on concept analysis/classification infrastructure (n = 43), the identification of the standardized nursing terminology concepts applicable to a health setting from registered nurses’ documentation (n = 54), mapping one terminology to another (n = 58), implementation of standardized nursing terminologies into electronic health

  15. Harnessing Biomedical Natural Language Processing Tools to Identify Medicinal Plant Knowledge from Historical Texts.

    Science.gov (United States)

    Sharma, Vivekanand; Law, Wayne; Balick, Michael J; Sarkar, Indra Neil

    2017-01-01

    The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there has been only limited adoption of informatics approaches for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from the historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that informatics methods have potential utility for identifying medicinal plant knowledge from digitized resources, and they highlight opportunities for improvement.

  16. Multiple k Nearest Neighbor Query Processing in Spatial Network Databases

    DEFF Research Database (Denmark)

    Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas

    2006-01-01

    This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries...... for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case...... where an upper bound on k is known a priori and then extends the techniques to the case where this is not so. Based on empirical studies with real-world data, the paper offers insight into the circumstances under which the different proposed techniques can be used with advantage for multiple k nearest...

  17. Parallel main-memory indexing for moving-object query and update workloads

    DEFF Research Database (Denmark)

    Sidlauskas, Darius; Saltenis, Simonas; Jensen, Christian Søndergaard

    2012-01-01

    of supporting the location-related query and update workloads generated by very large populations of such moving objects. This paper presents a main-memory indexing technique that aims to support such workloads. The technique, called PGrid, uses a grid structure that is capable of exploiting the parallelism...... offered by modern processors. Unlike earlier proposals that maintain separate structures for updates and queries, PGrid allows both long-running queries and rapid updates to operate on a single data structure and thus offers up-to-date query results. Because PGrid does not rely on creating snapshots...... on the same current data-store state, PGrid outperforms snapshot-based techniques in terms of both query freshness and CPU cycle-wise efficiency....

  18. VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans

    Science.gov (United States)

    Wang, Song; Gupta, Chetan; Mehta, Abhay

    There are data streams all around us that can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling of complex query plans with resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex DAG query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex directed acyclic graph (DAG) query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the "Chain" pipelining scheduling approach to complex DAG query plans.

  19. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion

    DEFF Research Database (Denmark)

    Sordoni, Alessandro; Bengio, Yoshua; Vahabi, Hossein

    2015-01-01

    Users may strive to formulate an adequate textual query for their information need. Search engines assist the users by presenting query suggestions. To preserve the original search intent, suggestions should be context-aware and account for the previous queries issued by the user. Achieving context...

  20. Bat-Inspired Algorithm Based Query Expansion for Medical Web Information Retrieval.

    Science.gov (United States)

    Khennak, Ilyes; Drias, Habiba

    2017-02-01

    With the increasing amount of medical data available on the Web, looking for health information has become one of the most widely searched topics on the Internet. Patients and people of several backgrounds are now using Web search engines to acquire medical information, including information about a specific disease, medical treatment or professional advice. Nonetheless, due to a lack of medical knowledge, many laypeople have difficulties in forming appropriate queries to articulate their inquiries, which makes their search queries imprecise due to the use of unclear keywords. The use of these ambiguous and vague queries to describe the patients' needs has resulted in a failure of Web search engines to retrieve accurate and relevant information. One of the most natural and promising methods to overcome this drawback is Query Expansion. In this paper, an original approach based on the Bat Algorithm is proposed to improve the retrieval effectiveness of query expansion in the medical field. In contrast to the existing literature, the proposed approach uses the Bat Algorithm to find the best expanded query among a set of expanded query candidates, while maintaining low computational complexity. Moreover, this new approach allows the length of the expanded query to be determined empirically. Numerical results on MEDLINE, the on-line medical information database, show that the proposed approach is more effective and efficient compared to the baseline.
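
    To make the metaheuristic concrete, the sketch below runs a textbook continuous Bat Algorithm on the sphere function; the population size, frequency range, loudness and pulse-rate updates follow the standard formulation, while the paper's actual discrete encoding of candidate expanded queries and its retrieval-based fitness function are not reproduced.

    import numpy as np

    def bat_algorithm(objective, dim=5, n_bats=20, iters=200,
                      f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.uniform(-5, 5, (n_bats, dim))   # bat positions (candidate solutions)
        v = np.zeros((n_bats, dim))             # velocities
        loud = np.ones(n_bats)                  # loudness A_i
        pulse = np.zeros(n_bats)                # pulse emission rate r_i
        fit = np.array([objective(b) for b in x])
        best = x[fit.argmin()].copy()
        best_fit = fit.min()

        for t in range(1, iters + 1):
            for i in range(n_bats):
                freq = f_min + (f_max - f_min) * rng.random()
                v[i] += (x[i] - best) * freq
                cand = x[i] + v[i]
                if rng.random() > pulse[i]:     # local random walk around the current best
                    cand = best + 0.01 * loud.mean() * rng.normal(size=dim)
                cand_fit = objective(cand)
                if cand_fit <= fit[i] and rng.random() < loud[i]:
                    x[i], fit[i] = cand, cand_fit
                    loud[i] *= alpha            # bats get quieter as they home in
                    pulse[i] = 1 - np.exp(-gamma * t)
                if fit[i] < best_fit:
                    best, best_fit = x[i].copy(), fit[i]
        return best, best_fit

    _, value = bat_algorithm(lambda z: float(np.sum(z ** 2)))
    print("best objective value found:", round(value, 6))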

  1. Using a Java Dynamic Tree to manage the terminology in a suite of medical applications.

    Science.gov (United States)

    Yang, K; Evens, M W; Trace, D A

    2008-01-01

    Now that the National Library of Medicine has made SNOMED-CT widely available, we are trying to manage the terminology of a whole suite of medical applications and map our terminology into that of SNOMED. This paper describes the design and implementation of the Java Dynamic Tree that provides structure to our medical terminology and explains how it functions as the core of our system. The tree was designed to reflect the stages in a patient interview, so it contains components for identifying the patient and the provider, a large set of chief complaints, review of systems, physical examination, several history modules, medications, laboratory tests, imaging, and special procedures. The tree is mirrored in a commercial DBMS, which also stores multi-encounter patient data, disorder patterns for our Bayesian diagnostic system, and the data and rules for other expert systems. The DBMS facilitates the import and export of large terminology files. Our Java Dynamic Tree allows the health care provider to view the entire terminology along with the structure that supports it, as well as the mechanism for the generation of progress notes and other documents, in terms of a single hierarchical structure. Changes in terminology can be propagated through the system under the control of the expert. The import/export facility has been a major help by allowing us to replace our original terminology with the terminology in SNOMED-CT.

  2. Active Learning by Querying Informative and Representative Examples.

    Science.gov (United States)

    Huang, Sheng-Jun; Jin, Rong; Zhou, Zhi-Hua

    2014-10-01

    Active learning reduces the labeling cost by iteratively selecting the most valuable data to query their labels. It has attracted a lot of interest given the abundance of unlabeled data and the high cost of labeling. Most active learning approaches select either informative or representative unlabeled instances to query their labels, which could significantly limit their performance. Although several active learning algorithms have been proposed to combine the two query selection criteria, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this limitation by developing a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an unlabeled instance. Further, by incorporating the correlation among labels, we extend the QUIRE approach to multi-label learning by actively querying instance-label pairs. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches in both single-label and multi-label learning.

  3. Cognitive approach to the construction of three-language terminological thesaurus

    Directory of Open Access Journals (Sweden)

    Grigorij Chetverikov

    2015-11-01

    Full Text Available The paper is devoted to developing a lexicographic database for a three-language terminological dictionary. A detailed analysis of the relation scheme between tables and of the table structures is carried out with the help of the three-layer predicate decomposition method, which makes it possible to define ways of solving the problem of creating direct and reverse three-language electronic dictionaries.
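
    A minimal relational sketch, under assumed table and column names, of how such a lexicographic database could support both direct and reverse lookups between languages; the language codes and sample entries are placeholders, not the dictionary's actual content or schema.

        import sqlite3

        # Hypothetical layout: one concept table plus one term table keyed by language,
        # so the same concept can be reached from a term in any of the three languages.
        conn = sqlite3.connect(":memory:")
        conn.executescript("""
        CREATE TABLE concept (id INTEGER PRIMARY KEY, note TEXT);
        CREATE TABLE term (
            concept_id INTEGER REFERENCES concept(id),
            lang TEXT CHECK (lang IN ('uk', 'ru', 'en')),
            surface TEXT,
            PRIMARY KEY (concept_id, lang, surface)
        );
        """)
        conn.execute("INSERT INTO concept VALUES (1, 'example concept')")
        conn.executemany("INSERT INTO term VALUES (?, ?, ?)",
                         [(1, 'en', 'thesaurus'), (1, 'uk', 'тезаурус'), (1, 'ru', 'тезаурус')])

        # Reverse lookup: from a term in one language to its equivalents in the others.
        rows = conn.execute("""
            SELECT t2.lang, t2.surface FROM term t1
            JOIN term t2 ON t1.concept_id = t2.concept_id
            WHERE t1.surface = ? AND t2.lang != t1.lang
        """, ("thesaurus",)).fetchall()
        print(rows)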

  4. SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases.

    Science.gov (United States)

    Schweiger, Dominik; Trajanoski, Zlatko; Pabinger, Stephan

    2014-08-15

    Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers. This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.
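
    For comparison with the graphical builder, a minimal sketch of what an equivalent hand-written call looks like from Python using the SPARQLWrapper library; the endpoint URL and the query are placeholder examples and are not part of SPARQLGraph itself.

        from SPARQLWrapper import SPARQLWrapper, JSON

        # Placeholder endpoint and query; SPARQLGraph builds such queries graphically
        # and runs them for you, this only shows the equivalent hand-written call.
        endpoint = SPARQLWrapper("https://sparql.uniprot.org/sparql")  # example public endpoint
        endpoint.setQuery("""
            SELECT ?s ?p ?o
            WHERE { ?s ?p ?o }
            LIMIT 5
        """)
        endpoint.setReturnFormat(JSON)

        results = endpoint.query().convert()
        for binding in results["results"]["bindings"]:
            print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])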

  5. Features of standardized nursing terminology sets in Japan.

    Science.gov (United States)

    Sagara, Kaoru; Abe, Akinori; Ozaku, Hiromi Itoh; Kuwahara, Noriaki; Kogure, Kiyoshi

    2006-01-01

    This paper reports the features of and relationships between standardized nursing terminology sets used in Japan. First, we analyzed the common parts of five standardized nursing terminology sets: the Japan Nursing Practice Standard Master (JNPSM), which includes the names of nursing activities and is built by the Medical Information System Development Center (MEDIS-DC); the labels of the Japan Classification of Nursing Practice (JCNP), built by the term advisory committee of the Japan Academy of Nursing Science; the labels of the International Classification for Nursing Practice (ICNP) translated into Japanese; the labels, domain names, and class names of the North American Nursing Diagnosis Association (NANDA) Nursing Diagnoses 2003-2004 translated into Japanese; and the terms included in the labels of the Nursing Interventions Classification (NIC) translated into Japanese. Then we compared them with terms in a thesaurus, the Bunrui Goihyo, which contains general Japanese words and is built by the National Institute for Japanese Language. We found that: 1) the level of interchangeability between the four standardized nursing terminology sets is quite low; 2) abbreviations and katakana words are frequently used to express nursing activities; and 3) general Japanese words are usually used to express the status or situation of patients.
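
    A toy sketch of the kind of overlap measurement such a comparison involves, here as Jaccard similarity over made-up English stand-ins for the Japanese term sets; the term lists are invented purely to make the example runnable.

        def jaccard(a, b):
            """Share of terms common to two terminology sets (intersection over union)."""
            a, b = set(a), set(b)
            return len(a & b) / len(a | b) if a | b else 0.0

        # Toy stand-ins for terms drawn from the nursing terminology sets in the study.
        jnpsm = {"vital signs measurement", "wound care", "patient transfer"}
        icnp_ja = {"wound care", "pain assessment", "patient transfer"}
        nanda_ja = {"acute pain", "impaired skin integrity"}

        for name, other in [("ICNP", icnp_ja), ("NANDA", nanda_ja)]:
            print(f"JNPSM vs {name}: {jaccard(jnpsm, other):.2f}")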

  6. Biomedical engineering fundamentals

    CERN Document Server

    Bronzino, Joseph D

    2014-01-01

    Known as the bible of biomedical engineering, The Biomedical Engineering Handbook, Fourth Edition, sets the standard against which all other references of this nature are measured. As such, it has served as a major resource for both skilled professionals and novices to biomedical engineering. Biomedical Engineering Fundamentals, the first volume of the handbook, presents material from respected scientists with diverse backgrounds in physiological systems, biomechanics, biomaterials, bioelectric phenomena, and neuroengineering. More than three dozen specific topics are examined, including cardia

  7. Graphical modeling and query language for hospitals.

    Science.gov (United States)

    Barzdins, Janis; Barzdins, Juris; Rencis, Edgars; Sostaks, Agris

    2013-01-01

    So far there has been little evidence that implementation of health information technologies (HIT) is leading to health care cost savings. One of the reasons for this lack of impact likely lies in the complexity of business process ownership in hospitals. The goal of our research is to develop a business model-based method for hospital use which would allow doctors to retrieve ad-hoc information directly from various hospital databases. We have developed a special domain-specific process modelling language called MedMod. Formally, we define the MedMod language as a profile on UML Class diagrams, but we also demonstrate it on examples, where we explain the semantics of all its elements informally. Moreover, we have developed the Process Query Language (PQL), which is based on the MedMod process definition language. The purpose of PQL is to allow a doctor to query (filter) runtime data of hospital processes described using MedMod. The MedMod language tries to overcome deficiencies in existing process modeling languages by allowing the loosely defined sequence of steps performed in a clinical process to be specified. The main advantages of PQL lie in two areas - usability and efficiency: 1) data are viewed through the "glasses" of a familiar process; 2) the simple and easy-to-perceive means of setting filtering conditions require no more expertise than using spreadsheet applications; 3) the dynamic response to each step in the construction of the complete query greatly shortens the learning curve and reduces the error rate; and 4) the selected means of filtering and data retrieval allow queries to be executed in O(n) time with respect to the size of the dataset. We plan to continue this project with three further steps. First, we are planning to develop user-friendly graphical editors for the MedMod process modeling and query languages. The second step is to evaluate the usability of the proposed language and tool
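
    A toy sketch of the single-pass, O(n) filtering over process runtime data that PQL is described as providing; the record layout and the filtering condition are invented for illustration and are not PQL syntax.

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class ProcessStep:
            """One runtime step of a clinical process (illustrative record layout)."""
            patient_id: str
            step: str
            duration_min: Optional[int]

        steps = [
            ProcessStep("p1", "admission", 15),
            ProcessStep("p1", "lab test", 120),
            ProcessStep("p2", "admission", 20),
        ]

        # A PQL-like filter: a single pass over the dataset, i.e. O(n) in its size.
        slow_lab_tests = [s for s in steps if s.step == "lab test" and (s.duration_min or 0) > 60]
        print(slow_lab_tests)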

  8. On (dynamic) range minimum queries in external memory

    DEFF Research Database (Denmark)

    Arge, L.; Fischer, Johannes; Sanders, Peter

    2013-01-01

    We study the one-dimensional range minimum query (RMQ) problem in the external memory model. We provide the first space-optimal solution to the batched static version of the problem. On an instance with N elements and Q queries, our solution takes Θ(sort(N + Q)) = Θ((N + Q)/B · log_{M/B}((N + Q)/B)) I/Os...
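
    For intuition only, a standard in-memory sparse-table RMQ sketch (O(N log N) preprocessing, O(1) queries); the paper's contribution is the batched, external-memory setting, which this toy code does not address.

        def build_sparse_table(a):
            """Precompute range minima over power-of-two windows."""
            n = len(a)
            table = [a[:]]
            j = 1
            while (1 << j) <= n:
                prev = table[j - 1]
                table.append([min(prev[i], prev[i + (1 << (j - 1))])
                              for i in range(n - (1 << j) + 1)])
                j += 1
            return table

        def rmq(table, left, right):
            """Minimum of a[left..right], inclusive, in O(1) time."""
            j = (right - left + 1).bit_length() - 1
            return min(table[j][left], table[j][right - (1 << j) + 1])

        a = [5, 2, 4, 7, 1, 3]
        t = build_sparse_table(a)
        print(rmq(t, 1, 4))  # -> 1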

  9. Incidence Rate of Canonical vs. Derived Medical Terminology in Natural Language.

    Science.gov (United States)

    Topac, Vasile; Jurcau, Daniel-Alexandru; Stoicu-Tivadar, Vasile

    2015-01-01

    Medical terminology appears in natural language in multiple forms: canonical, derived, or inflected. This research presents an analysis of the forms in which medical terminology appears in Romanian and English. The sources of medical language used for the study are web pages presenting medical information for patients and other lay users. The results show that, in English, medical terminology tends to appear more in canonical form, while in Romanian it is the opposite. This paper also presents the service that was created to perform this analysis. The tool is available to the general public and is designed to be easily extensible, allowing the addition of other languages.
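
    A deliberately naive sketch of counting canonical versus derived or inflected occurrences by prefix matching; real morphological analysis is language-specific, and the stem length and sample text used here are arbitrary assumptions.

        import re

        def count_forms(text, canonical, stem_len=6):
            """Count exact (canonical) hits vs. other word forms sharing a naive stem prefix."""
            words = re.findall(r"[a-zA-Z]+", text.lower())
            canonical = canonical.lower()
            stem = canonical[:stem_len]
            exact = sum(1 for w in words if w == canonical)
            derived = sum(1 for w in words if w != canonical and w.startswith(stem))
            return exact, derived

        sample = "Inflammation of the joints; the joint was inflamed and inflammatory markers rose."
        print(count_forms(sample, "inflammation"))  # -> (1, 2): 'inflamed', 'inflammatory' are non-canonical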

  10. The Use of Non-linguistic Data in a Terminology and Knowledge Bank

    DEFF Research Database (Denmark)

    Madsen, Bodil Nistrup

    2016-01-01

    This paper will discuss definitions and give examples of linguistic and non-linguistic representation of concepts in a terminology and knowledge bank, and it will be argued that there is a need for a taxonomy of terminological data categories. As a background, the DanTermBank project, which is carried out at Copenhagen Business School, will be introduced. In order to illustrate the need for a taxonomy for terminological data, some examples from the Data Category Registry of ISO TC 37 (ISOcat) will be given, and the taxonomy which has been developed for the DanTermBank project will be compared to the structure of ISOcat, to the first printed standard comprising data categories for terminology management, ISO 12620:1999, and to other standards from ISO TC 37. Finally, some examples of linguistic and non-linguistic representations of concepts which we plan to introduce into the DanTermBank will be presented.

  11. Countering SQL Injection Attacks with Parameterized Queries

    Directory of Open Access Journals (Sweden)

    Yulianingsih Yulianingsih

    2016-06-01

    Full Text Available As the growth of information services increases, so does the security vulnerability of an information source. This paper presents an experimental study of database attacks carried out through SQL Injection. The attack is performed through the authentication page, because this page is the first gateway for access and should therefore have adequate protection. An experiment is then conducted on the Parameterized Query method to obtain a solution to this problem.   Keywords — Information Services, Attacks, experiment, SQL Injection, Parameterized Query.
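
    A minimal sketch of the defence being evaluated, using Python's sqlite3 driver: the same login check written with string concatenation (vulnerable) and with a parameterized query, where the driver binds the values so the injection payload is treated as plain data. The table and credentials are illustrative.

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
        conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

        username = "alice"
        password = "' OR '1'='1"   # classic injection payload typed into the login form

        # Vulnerable: string concatenation lets the payload rewrite the query logic.
        unsafe_sql = ("SELECT * FROM users WHERE username = '" + username +
                      "' AND password = '" + password + "'")
        # conn.execute(unsafe_sql)  # would match a row even with a wrong password

        # Parameterized query: the driver binds the values, so the payload is just data.
        safe = conn.execute("SELECT * FROM users WHERE username = ? AND password = ?",
                            (username, password)).fetchall()
        print(safe)   # [] -> login correctly rejected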

  12. A Revisit of Query Expansion with Different Semantic Levels

    DEFF Research Database (Denmark)

    Zhang, Ce; Cui, Bin; Cong, Gao

    2009-01-01

    Query expansion has received extensive attention in information retrieval community. Although semantic based query expansion appears to be promising in improving retrieval performance, previous research has shown that it cannot consistently improve retrieval performance. It is a tricky problem to...

  13. Revisiting the Global Software Engineering Terminology

    DEFF Research Database (Denmark)

    Tell, Paolo; Giuffrida, Rosalba; Shah, Hina

    2013-01-01

    Even though Global Software Engineering (GSE) has been a research topic of interest for many years, some of its ground terminology is still lacking a unified, coherent, and shared definition and/or classification. The purpose of this report is to collect, outline, and relate several fundamental...

  14. 9 CFR 101.2 - Administrative terminology.

    Science.gov (United States)

    2010-01-01

    ... toxic to microorganisms, e.g., antibiotics), or analogous products at any stage of production, shipment... — 9 CFR 101.2, Administrative terminology; Title 9, Animals and Animal Products; Animal and Plant Health Inspection Service, Department of Agriculture (2010-01-01 edition).

  15. Node Query Preservation for Deterministic Linear Top-Down Tree Transducers

    Directory of Open Access Journals (Sweden)

    Kazuki Miyahara

    2013-11-01

    Full Text Available This paper discusses the decidability of node query preservation problems for XML document transformations. We assume a transformation given by a deterministic linear top-down data tree transducer (abbreviated as DLT^V) and an n-ary query based on runs of a tree automaton. We say that a DLT^V Tr strongly preserves a query Q if there is a query Q' such that for every document t, the answer set of Q' for Tr(t) is equal to the answer set of Q for t. Also we say that Tr weakly preserves Q if there is a query Q' such that for every t_d in the range of Tr, the answer set of Q' for t_d is equal to the union of the answer sets of Q for every t such that t_d = Tr(t). We show that the weak preservation problem is coNP-complete and the strong preservation problem is in 2-EXPTIME.

  16. Predicting Drug Recalls From Internet Search Engine Queries.

    Science.gov (United States)

    Yom-Tov, Elad

    2017-01-01

    Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine which mentioned one of 5195 pharmaceutical drugs during 2015, together with all recall notifications issued by the Food and Drug Administration (FDA) during that year. Using attributes that quantify the change in query volume at the state level, we attempted to predict whether a recall of a specific drug would be ordered by the FDA within a time horizon ranging from 1 to 40 days in the future. Our results show that future drug recalls can indeed be identified, with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as predictions are made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate early warning of faulty batches of medicines.
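
    A small sketch of one plausible attribute of the kind described: a z-score of the latest day's query volume for a drug in each state against the preceding window. The data, window length, and the idea that a large score flags a candidate recall are assumptions for illustration, not the study's actual feature set.

        import numpy as np

        def spike_score(daily_counts, window=14):
            """Z-score of the latest day's query volume against the preceding `window` days."""
            recent = np.asarray(daily_counts[-(window + 1):-1], dtype=float)
            today = float(daily_counts[-1])
            mu, sigma = recent.mean(), recent.std()
            return (today - mu) / (sigma + 1e-9)

        # Made-up per-state daily query counts for one drug.
        state_series = {
            "NY": [3, 4, 2, 5, 3, 4, 3, 2, 4, 3, 5, 4, 3, 4, 19],   # sudden spike on the last day
            "CA": [6, 5, 7, 6, 5, 6, 7, 6, 5, 6, 7, 6, 5, 6, 6],
        }
        features = {state: spike_score(series) for state, series in state_series.items()}
        print(features)   # NY gets a large score, CA stays near zero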

  17. External Data Structures for Shortest Path Queries on Planar Digraphs

    DEFF Research Database (Denmark)

    Arge, Lars; Toma, Laura

    2005-01-01

    In this paper we present space-query trade-offs for external memory data structures that answer shortest path queries on planar directed graphs. For any S = Ω(N^(1+ε)) and S = O(N^2/B), our main result is a family of structures that use S space and answer queries in O(N^2/(S·B)) I/Os, thus obtaining an optimal space-query product of O(N^2/B). An S-space structure can be constructed in O(√S · sort(N)) I/Os, where sort(N) is the number of I/Os needed to sort N elements, B is the disk block size, and N is the size of the graph.

  18. Harmonizing intelligence terminologies in business: Literature review

    Directory of Open Access Journals (Sweden)

    Sivave Mashingaidze

    2014-11-01

    Full Text Available The principal objective of this article is to review the literature on different intelligence terminologies with the aim of establishing their common attributes and differences, and to propose a universal and comprehensive definition of intelligence for common understanding amongst users. The findings showed that Competitive Intelligence has the broadest scope of intelligence activities, covering the whole external operating environment of the company and targeting all levels of decision-making, for instance strategic intelligence, tactical intelligence, and operative intelligence. Another terminology found is Cyber Intelligence™, which encompasses competitor intelligence, strategic intelligence, market intelligence, and counterintelligence. In conclusion, although CI has the broadest scope of intelligence activities and serves as an umbrella for many intelligence concepts, Business Intelligence and Corporate Intelligence are still often used interchangeably with CI

  19. Minimizing I/O Costs of Multi-Dimensional Queries with BitmapIndices

    Energy Technology Data Exchange (ETDEWEB)

    Rotem, Doron; Stockinger, Kurt; Wu, Kesheng

    2006-03-30

    Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. A common approach for reducing the size of a bitmap index for high-cardinality attributes is to group ranges of values of an attribute into bins and then build a bitmap for each bin rather than a bitmap for each value of the attribute. Binning reduces storage costs; however, results of queries based on bins often require additional filtering for discarding false positives, i.e., records in the result that do not satisfy the query constraints. This additional filtering, also known as "candidate checking," requires access to the base data on disk and involves significant I/O costs. This paper studies strategies for minimizing the I/O costs of candidate checking for multi-dimensional queries. This is done by determining the number of bins allocated for each dimension and then placing bin boundaries in optimal locations. Our algorithms use knowledge of the data distribution and query workload. We derive several analytical results concerning optimal bin allocation for a probabilistic query model. Our experimental evaluation with real-life data shows an average I/O cost improvement of at least a factor of 10 for multi-dimensional queries on datasets from two different applications. Our experiments also indicate that the speedup increases with the number of query dimensions.
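
    A toy sketch of binned bitmaps and the candidate check they make necessary: rows in bins fully covered by the query range are accepted directly, while rows in partially covered boundary bins are re-checked against the base data. The bin edges and data are arbitrary, and real systems use compressed bitmaps rather than Python sets.

        from bisect import bisect_right

        values = [3.2, 7.8, 1.1, 9.4, 5.5, 7.1, 2.0, 8.8]       # base data kept on "disk"
        bin_edges = [0.0, 2.5, 5.0, 7.5, 10.0]                   # arbitrary bin boundaries

        # One bitmap (here: a set of row ids) per bin instead of one per distinct value.
        bitmaps = [set() for _ in range(len(bin_edges) - 1)]
        for row, v in enumerate(values):
            bitmaps[bisect_right(bin_edges, v) - 1].add(row)

        def range_query(lo, hi):
            """Rows with lo <= value < hi; rows from partially covered edge bins need a candidate check."""
            first = bisect_right(bin_edges, lo) - 1
            last = min(bisect_right(bin_edges, hi) - 1, len(bitmaps) - 1)
            hits, candidates = set(), set()
            for b in range(first, last + 1):
                fully_inside = bin_edges[b] >= lo and bin_edges[b + 1] <= hi
                (hits if fully_inside else candidates).update(bitmaps[b])
            # Candidate check: touch the base data only for rows in boundary bins.
            hits |= {r for r in candidates if lo <= values[r] < hi}
            return sorted(hits)

        print(range_query(2.0, 8.0))  # -> rows whose values lie in [2.0, 8.0)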

  20. Video Stream Retrieval of Unseen Queries using Semantic Memory

    NARCIS (Netherlands)

    Cappallo, S.; Mensink, T.; Snoek, C.G.M.; Wilson, R.C.; Hancock, E.R.; Smith, W.A.P.

    2016-01-01

    Retrieval of live, user-broadcast video streams is an under-addressed and increasingly relevant challenge. The on-line nature of the problem requires temporal evaluation and the unforeseeable scope of potential queries motivates an approach which can accommodate arbitrary search queries. To account