WorldWideScience

Sample records for semantic annotation challenges

  1. Semantic annotation in biomedicine: the current landscape.

    Science.gov (United States)

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators.Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  2. The surplus value of semantic annotations

    NARCIS (Netherlands)

    Marx, M.

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,

  3. Semantic annotation of consumer health questions.

    Science.gov (United States)

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most

  4. Semantator: annotating clinical narratives with semantic web ontologies.

    Science.gov (United States)

    Song, Dezhao; Chute, Christopher G; Tao, Cui

    2012-01-01

    To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manual annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfying. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.

  5. A Novel Approach to Semantic and Coreference Annotation at LLNL

    Energy Technology Data Exchange (ETDEWEB)

    Firpo, M

    2005-02-04

    A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (6) A pilot study is recommend in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work is evaluated relative to that of their neuro-typical peers.

  6. Ontology modularization to improve semantic medical image annotation.

    Science.gov (United States)

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.

  7. A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

    OpenAIRE

    Hamed Hassanzadeh; MohammadReza Keyvanpour

    2011-01-01

    The Semantic Web is an extension of the current web in which information is given well-defined meaning. The perspective of Semantic Web is to promote the quality and intelligence of the current web by changing its contents into machine understandable form. Therefore, semantic level information is one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles against the Semantic Annotation, such as ...

  8. Semantic Annotation of Unstructured Documents Using Concepts Similarity

    Directory of Open Access Journals (Sweden)

    Fernando Pech

    2017-01-01

    Full Text Available There is a large amount of information in the form of unstructured documents which pose challenges in the information storage, search, and retrieval. This situation has given rise to several information search approaches. Some proposals take into account the contextual meaning of the terms specified in the query. Semantic annotation technique can help to retrieve and extract information in unstructured documents. We propose a semantic annotation strategy for unstructured documents as part of a semantic search engine. In this proposal, ontologies are used to determine the context of the entities specified in the query. Our strategy for extracting the context is focused on concepts similarity. Each relevant term of the document is associated with an instance in the ontology. The similarity between each of the explicit relationships is measured through the combination of two types of associations: the association between each pair of concepts and the calculation of the weight of the relationships.

  9. Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator

    Science.gov (United States)

    Seyed, P.; Chastain, K.; McGuinness, D. L.

    2013-12-01

    Use of Semantic Web technologies for data management in the Earth sciences (and beyond) has great potential but is still in its early stages, since the challenges of translating data into a more explicit or semantic form for immediate use within applications has not been fully addressed. In this abstract we help address this challenge by introducing the SemantEco Annotator, which enables anyone, regardless of expertise, to semantically annotate tabular Earth Science data and translate it into linked data format, while applying the logic inherent in community-standard vocabularies to guide the process. The Annotator was conceived under a desire to unify dataset content from a variety of sources under common vocabularies, for use in semantically-enabled web applications. Our current use case employs linked data generated by the Annotator for use in the SemantEco environment, which utilizes semantics to help users explore, search, and visualize water or air quality measurement and species occurrence data through a map-based interface. The generated data can also be used immediately to facilitate discovery and search capabilities within 'big data' environments. The Annotator provides a method for taking information about a dataset, that may only be known to its maintainers, and making it explicit, in a uniform and machine-readable fashion, such that a person or information system can more easily interpret the underlying structure and meaning. Its primary mechanism is to enable a user to formally describe how columns of a tabular dataset relate and/or describe entities. For example, if a user identifies columns for latitude and longitude coordinates, we can infer the data refers to a point that can be plotted on a map. Further, it can be made explicit that measurements of 'nitrate' and 'NO3-' are of the same entity through vocabulary assignments, thus more easily utilizing data sets that use different nomenclatures. The Annotator provides an extensive and searchable

  10. Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

    Directory of Open Access Journals (Sweden)

    Danuta Roszko

    2015-06-01

    Full Text Available Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish-Lithuanian electronic dictionary, and as help for a sworn translator. The semantic annotation being brought into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies, and also proves helpful in translation work.

  11. Construction of coffee transcriptome networks based on gene annotation semantics

    Directory of Open Access Journals (Sweden)

    Castillo Luis F.

    2012-12-01

    Full Text Available Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST and Interproscan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained along the Wordnet dictionary in order to construct a Resource Description Framework (RDF using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.

  12. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    Directory of Open Access Journals (Sweden)

    Jianfang Cao

    2015-01-01

    Full Text Available With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.

  13. SAS- Semantic Annotation Service for Geoscience resources on the web

    Science.gov (United States)

    Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.

    2015-12-01

    There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosematnic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use of the existing resources based on standard vocabularies that are encoded and published using semantic technologies.

  14. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    Full Text Available This work elaborates the semi-semantic part of speech annotation guidelines for the URDU.KON-TB treebank: an annotated corpus. A hierarchical annotation scheme was designed to label the part of speech and then applied on the corpus. This raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part of speech labels. The corpus contains text of local & international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise finally contributed a part of speech annotation to the URDU.KON-TB treebank. Twenty-two main part of speech categories are divided into subcategories, which conclude the morphological, and semantical information encoded in it. This article reports the annotation guidelines in major; however, it also briefs the development of the URDU.KON-TB treebank, which includes the raw corpus collection, designing & employment of annotation scheme and finally, its statistical evaluation and results. The guidelines presented as follows, will be useful for linguistic community to annotate the sentences not only for the national language Urdu but for the other indigenous languages like Punjab, Sindhi, Pashto, etc., as well.

  15. Arabic web pages clustering and annotation using semantic class features

    Directory of Open Access Journals (Sweden)

    Hanan M. Alghamdi

    2014-12-01

    Full Text Available To effectively manage the great amount of data on Arabic web pages and to enable the classification of relevant information are very important research problems. Studies on sentiment text mining have been very limited in the Arabic language because they need to involve deep semantic processing. Therefore, in this paper, we aim to retrieve machine-understandable data with the help of a Web content mining technique to detect covert knowledge within these data. We propose an approach to achieve clustering with semantic similarities. This approach comprises integrating k-means document clustering with semantic feature extraction and document vectorization to group Arabic web pages according to semantic similarities and then show the semantic annotation. The document vectorization helps to transform text documents into a semantic class probability distribution or semantic class density. To reach semantic similarities, the approach extracts the semantic class features and integrates them into the similarity weighting schema. The quality of the clustering result has evaluated the use of the purity and the mean intra-cluster distance (MICD evaluation measures. We have evaluated the proposed approach on a set of common Arabic news web pages. We have acquired favorable clustering results that are effective in minimizing the MICD, expanding the purity and lowering the runtime.

  16. iPad: Semantic annotation and markup of radiological images.

    Science.gov (United States)

    Rubin, Daniel L; Rodriguez, Cesar; Shah, Priyanka; Beaulieu, Chris

    2008-11-06

    Radiological images contain a wealth of information,such as anatomy and pathology, which is often not explicit and computationally accessible. Information schemes are being developed to describe the semantic content of images, but such schemes can be unwieldy to operationalize because there are few tools to enable users to capture structured information easily as part of the routine research workflow. We have created iPad, an open source tool enabling researchers and clinicians to create semantic annotations on radiological images. iPad hides the complexity of the underlying image annotation information model from users, permitting them to describe images and image regions using a graphical interface that maps their descriptions to structured ontologies semi-automatically. Image annotations are saved in a variety of formats,enabling interoperability among medical records systems, image archives in hospitals, and the Semantic Web. Tools such as iPad can help reduce the burden of collecting structured information from images, and it could ultimately enable researchers and physicians to exploit images on a very large scale and glean the biological and physiological significance of image content.

  17. Annotation of rule-based models with formal semantics to enable creation, analysis, reuse and visualization

    Science.gov (United States)

    Misirli, Goksel; Cavaliere, Matteo; Waites, William; Pocock, Matthew; Madsen, Curtis; Gilfellon, Owen; Honorato-Zimmer, Ricardo; Zuliani, Paolo; Danos, Vincent; Wipat, Anil

    2016-01-01

    Motivation: Biological systems are complex and challenging to model and therefore model reuse is highly desirable. To promote model reuse, models should include both information about the specifics of simulations and the underlying biology in the form of metadata. The availability of computationally tractable metadata is especially important for the effective automated interpretation and processing of models. Metadata are typically represented as machine-readable annotations which enhance programmatic access to information about models. Rule-based languages have emerged as a modelling framework to represent the complexity of biological systems. Annotation approaches have been widely used for reaction-based formalisms such as SBML. However, rule-based languages still lack a rich annotation framework to add semantic information, such as machine-readable descriptions, to the components of a model. Results: We present an annotation framework and guidelines for annotating rule-based models, encoded in the commonly used Kappa and BioNetGen languages. We adapt widely adopted annotation approaches to rule-based models. We initially propose a syntax to store machine-readable annotations and describe a mapping between rule-based modelling entities, such as agents and rules, and their annotations. We then describe an ontology to both annotate these models and capture the information contained therein, and demonstrate annotating these models using examples. Finally, we present a proof of concept tool for extracting annotations from a model that can be queried and analyzed in a uniform way. The uniform representation of the annotations can be used to facilitate the creation, analysis, reuse and visualization of rule-based models. Although examples are given, using specific implementations the proposed techniques can be applied to rule-based models in general. Availability and implementation: The annotation ontology for rule-based models can be found at http

  18. Annotation and retrieval system of CAD models based on functional semantics

    Science.gov (United States)

    Wang, Zhansong; Tian, Ling; Duan, Wenrui

    2014-11-01

    CAD model retrieval based on functional semantics is more significant than content-based 3D model retrieval during the mechanical conceptual design phase. However, relevant research is still not fully discussed. Therefore, a functional semantic-based CAD model annotation and retrieval method is proposed to support mechanical conceptual design and design reuse, inspire designer creativity through existing CAD models, shorten design cycle, and reduce costs. Firstly, the CAD model functional semantic ontology is constructed to formally represent the functional semantics of CAD models and describe the mechanical conceptual design space comprehensively and consistently. Secondly, an approach to represent CAD models as attributed adjacency graphs(AAG) is proposed. In this method, the geometry and topology data are extracted from STEP models. On the basis of AAG, the functional semantics of CAD models are annotated semi-automatically by matching CAD models that contain the partial features of which functional semantics have been annotated manually, thereby constructing CAD Model Repository that supports model retrieval based on functional semantics. Thirdly, a CAD model retrieval algorithm that supports multi-function extended retrieval is proposed to explore more potential creative design knowledge in the semantic level. Finally, a prototype system, called Functional Semantic-based CAD Model Annotation and Retrieval System(FSMARS), is implemented. A case demonstrates that FSMARS can successfully botain multiple potential CAD models that conform to the desired function. The proposed research addresses actual needs and presents a new way to acquire CAD models in the mechanical conceptual design phase.

  19. Evaluating the effect of annotation size on measures of semantic similarity

    KAUST Repository

    Kulmanov, Maxat

    2017-02-13

    Background: Ontologies are widely used as metadata in biological and biomedical datasets. Measures of semantic similarity utilize ontologies to determine how similar two entities annotated with classes from ontologies are, and semantic similarity is increasingly applied in applications ranging from diagnosis of disease to investigation in gene networks and functions of gene products.

  20. Semantic Web Services Challenge, Results from the First Year. Series: Semantic Web And Beyond, Volume 8.

    Science.gov (United States)

    Petrie, C.; Margaria, T.; Lausen, H.; Zaremba, M.

    Explores trade-offs among existing approaches. Reveals strengths and weaknesses of proposed approaches, as well as which aspects of the problem are not yet covered. Introduces software engineering approach to evaluating semantic web services. Service-Oriented Computing is one of the most promising software engineering trends because of the potential to reduce the programming effort for future distributed industrial systems. However, only a small part of this potential rests on the standardization of tools offered by the web services stack. The larger part of this potential rests upon the development of sufficient semantics to automate service orchestration. Currently there are many different approaches to semantic web service descriptions and many frameworks built around them. A common understanding, evaluation scheme, and test bed to compare and classify these frameworks in terms of their capabilities and shortcomings, is necessary to make progress in developing the full potential of Service-Oriented Computing. The Semantic Web Services Challenge is an open source initiative that provides a public evaluation and certification of multiple frameworks on common industrially-relevant problem sets. This edited volume reports on the first results in developing common understanding of the various technologies intended to facilitate the automation of mediation, choreography and discovery for Web Services using semantic annotations. Semantic Web Services Challenge: Results from the First Year is designed for a professional audience composed of practitioners and researchers in industry. Professionals can use this book to evaluate SWS technology for their potential practical use. The book is also suitable for advanced-level students in computer science.

  1. The role of automated speech and audio analysis in semantic multimedia annotation

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Ordelman, Roeland J.F.; van Hessen, Adrianus J.

    This paper overviews the various ways in which automatic speech and audio analysis can be deployed to enhance the semantic annotation of multimedia content, and as a consequence to improve the effectiveness of conceptual access tools. A number of techniques will be presented, including the alignment

  2. Needle Custom Search: Recall-oriented search on the Web using semantic annotations

    NARCIS (Netherlands)

    Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon; de Rijke, Maarten; Kenter, Tom; de Vries, A.P.; Zhai, Chen Xiang; de Jong, Franciska M.G.; Radinsky, Kira; Hofmann, Katja

    Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increase the efficiency

  3. Needle Custom Search : Recall-oriented search on the web using semantic annotations

    NARCIS (Netherlands)

    Kaptein, Rianne; Koot, Gijs; Huis in 't Veld, Mirjam A.A.; van den Broek, Egon L.

    2014-01-01

    Web search engines are optimized for early precision, which makes it difficult to perform recall-oriented tasks using these search engines. In this article, we present our tool Needle Custom Search. This tool exploits semantic annotations of Web search results and, thereby, increase the efficiency

  4. Genre-adaptive Semantic Computing and Audio-based Modelling for Music Mood Annotation

    DEFF Research Database (Denmark)

    Saari, Pasi; Fazekas, György; Eerola, Tuomas

    2016-01-01

    This study investigates whether taking genre into account is beneficial for automatic music mood annotation in terms of core affects valence, arousal, and tension, as well as several other mood scales. Novel techniques employing genre-adaptive semantic computing and audio-based modelling are prop......This study investigates whether taking genre into account is beneficial for automatic music mood annotation in terms of core affects valence, arousal, and tension, as well as several other mood scales. Novel techniques employing genre-adaptive semantic computing and audio-based modelling...... related to a set of 600 popular music tracks spanning multiple genres. The results show that ACTwg outperforms a semantic computing technique that does not exploit genre information, and ACTwg-SLPwg outperforms conventional techniques and other genre-adaptive alternatives. In particular, improvements......-based genre representation for genre-adaptive music mood analysis....

  5. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.

    Science.gov (United States)

    Bolleman, Jerven T; Mungall, Christopher J; Strozzi, Francesco; Baran, Joachim; Dumontier, Michel; Bonnal, Raoul J P; Buels, Robert; Hoehndorf, Robert; Fujisawa, Takatomo; Katayama, Toshiaki; Cock, Peter J A

    2016-06-13

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.

  6. Extending eScience Provenance with User-Submitted Semantic Annotations

    Science.gov (United States)

    Michaelis, J.; Zednik, S.; West, P.; Fox, P. A.; McGuinness, D. L.

    2010-12-01

    eScience based systems generate provenance of their data products, related to such things as: data processing, data collection conditions, expert evaluation, and data product quality. Recent advances in web-based technology offer users the possibility of making annotations to both data products and steps in accompanying provenance traces, thereby expanding the utility of such provenance for others. These contributing users may have varying backgrounds, ranging from system experts to outside domain experts to citizen scientists. Furthermore, such users may wish to make varying types of annotations - ranging from documenting the purpose of a provenance step to raising concerns about the quality of data dependencies. Semantic Web technologies allow for such kinds of rich annotations to be made to provenance through the use of ontology vocabularies for (i) organizing provenance, and (ii) organizing user/annotation classifications. Furthermore, through Linked Data practices, Semantic linkages may be made from provenance steps to external data of interest. A desire for Semantically-annotated provenance has been motivated by data management issues in the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photomoeter-based readings are taken of solar activity and subsequently processed into final data products consumable by end users. At intermediate stages of ACOS processing, factors such as evaluations by human experts and weather conditions are logged, which could impact data product quality. If such factors are linked via user-submitted annotations to provenance, it could be significantly beneficial for other users. Likewise, the background of a user could impact the credibility of their annotations. For example, an annotation made by a citizen scientist describing the purpose of a provenance step may not be as reliable as a similar annotation made by an ACOS project member. For this work, we have developed a software package that

  7. Fast and accurate semantic annotation of bioassays exploiting a hybrid of machine learning and user confirmation.

    Science.gov (United States)

    Clark, Alex M; Bunin, Barry A; Litterman, Nadia K; Schürer, Stephan C; Visser, Ubbo

    2014-01-01

    Bioinformatics and computer aided drug design rely on the curation of a large number of protocols for biological assays that measure the ability of potential drugs to achieve a therapeutic effect. These assay protocols are generally published by scientists in the form of plain text, which needs to be more precisely annotated in order to be useful to software methods. We have developed a pragmatic approach to describing assays according to the semantic definitions of the BioAssay Ontology (BAO) project, using a hybrid of machine learning based on natural language processing, and a simplified user interface designed to help scientists curate their data with minimum effort. We have carried out this work based on the premise that pure machine learning is insufficiently accurate, and that expecting scientists to find the time to annotate their protocols manually is unrealistic. By combining these approaches, we have created an effective prototype for which annotation of bioassay text within the domain of the training set can be accomplished very quickly. Well-trained annotations require single-click user approval, while annotations from outside the training set domain can be identified using the search feature of a well-designed user interface, and subsequently used to improve the underlying models. By drastically reducing the time required for scientists to annotate their assays, we can realistically advocate for semantic annotation to become a standard part of the publication process. Once even a small proportion of the public body of bioassay data is marked up, bioinformatics researchers can begin to construct sophisticated and useful searching and analysis algorithms that will provide a diverse and powerful set of tools for drug discovery researchers.

  8. Fast and accurate semantic annotation of bioassays exploiting a hybrid of machine learning and user confirmation

    Directory of Open Access Journals (Sweden)

    Alex M. Clark

    2014-08-01

    Full Text Available Bioinformatics and computer aided drug design rely on the curation of a large number of protocols for biological assays that measure the ability of potential drugs to achieve a therapeutic effect. These assay protocols are generally published by scientists in the form of plain text, which needs to be more precisely annotated in order to be useful to software methods. We have developed a pragmatic approach to describing assays according to the semantic definitions of the BioAssay Ontology (BAO project, using a hybrid of machine learning based on natural language processing, and a simplified user interface designed to help scientists curate their data with minimum effort. We have carried out this work based on the premise that pure machine learning is insufficiently accurate, and that expecting scientists to find the time to annotate their protocols manually is unrealistic. By combining these approaches, we have created an effective prototype for which annotation of bioassay text within the domain of the training set can be accomplished very quickly. Well-trained annotations require single-click user approval, while annotations from outside the training set domain can be identified using the search feature of a well-designed user interface, and subsequently used to improve the underlying models. By drastically reducing the time required for scientists to annotate their assays, we can realistically advocate for semantic annotation to become a standard part of the publication process. Once even a small proportion of the public body of bioassay data is marked up, bioinformatics researchers can begin to construct sophisticated and useful searching and analysis algorithms that will provide a diverse and powerful set of tools for drug discovery researchers.

  9. Prototype semantic infrastructure for automated small molecule classification and annotation in lipidomics.

    Science.gov (United States)

    Chepelev, Leonid L; Riazanov, Alexandre; Kouznetsov, Alexandre; Low, Hong Sang; Dumontier, Michel; Baker, Christopher J O

    2011-07-26

    The development of high-throughput experimentation has led to astronomical growth in biologically relevant lipids and lipid derivatives identified, screened, and deposited in numerous online databases. Unfortunately, efforts to annotate, classify, and analyze these chemical entities have largely remained in the hands of human curators using manual or semi-automated protocols, leaving many novel entities unclassified. Since chemical function is often closely linked to structure, accurate structure-based classification and annotation of chemical entities is imperative to understanding their functionality. As part of an exploratory study, we have investigated the utility of semantic web technologies in automated chemical classification and annotation of lipids. Our prototype framework consists of two components: an ontology and a set of federated web services that operate upon it. The formal lipid ontology we use here extends a part of the LiPrO ontology and draws on the lipid hierarchy in the LIPID MAPS database, as well as literature-derived knowledge. The federated semantic web services that operate upon this ontology are deployed within the Semantic Annotation, Discovery, and Integration (SADI) framework. Structure-based lipid classification is enacted by two core services. Firstly, a structural annotation service detects and enumerates relevant functional groups for a specified chemical structure. A second service reasons over lipid ontology class descriptions using the attributes obtained from the annotation service and identifies the appropriate lipid classification. We extend the utility of these core services by combining them with additional SADI services that retrieve associations between lipids and proteins and identify publications related to specified lipid types. We analyze the performance of SADI-enabled eicosanoid classification relative to the LIPID MAPS classification and reflect on the contribution of our integrative methodology in the context of

  10. Prototype semantic infrastructure for automated small molecule classification and annotation in lipidomics

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2011-07-01

    Full Text Available Abstract Background The development of high-throughput experimentation has led to astronomical growth in biologically relevant lipids and lipid derivatives identified, screened, and deposited in numerous online databases. Unfortunately, efforts to annotate, classify, and analyze these chemical entities have largely remained in the hands of human curators using manual or semi-automated protocols, leaving many novel entities unclassified. Since chemical function is often closely linked to structure, accurate structure-based classification and annotation of chemical entities is imperative to understanding their functionality. Results As part of an exploratory study, we have investigated the utility of semantic web technologies in automated chemical classification and annotation of lipids. Our prototype framework consists of two components: an ontology and a set of federated web services that operate upon it. The formal lipid ontology we use here extends a part of the LiPrO ontology and draws on the lipid hierarchy in the LIPID MAPS database, as well as literature-derived knowledge. The federated semantic web services that operate upon this ontology are deployed within the Semantic Annotation, Discovery, and Integration (SADI framework. Structure-based lipid classification is enacted by two core services. Firstly, a structural annotation service detects and enumerates relevant functional groups for a specified chemical structure. A second service reasons over lipid ontology class descriptions using the attributes obtained from the annotation service and identifies the appropriate lipid classification. We extend the utility of these core services by combining them with additional SADI services that retrieve associations between lipids and proteins and identify publications related to specified lipid types. We analyze the performance of SADI-enabled eicosanoid classification relative to the LIPID MAPS classification and reflect on the contribution of

  11. Semi-supervised learning based probabilistic latent semantic analysis for automatic image annotation

    Institute of Scientific and Technical Information of China (English)

    Tian Dongping

    2017-01-01

    In recent years, multimedia annotation problem has been attracting significant research attention in multimedia and computer vision areas, especially for automatic image annotation, whose purpose is to provide an efficient and effective searching environment for users to query their images more easily.In this paper, a semi-supervised learning based probabilistic latent semantic analysis ( PL-SA) model for automatic image annotation is presenred.Since it' s often hard to obtain or create la-beled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine ( TSVM) is exploited to enhance the quality of the training image data.Then, differ-ent image features with different magnitudes will result in different performance for automatic image annotation.To this end, a Gaussian normalization method is utilized to normalize different features extracted from effective image regions segmented by the normalized cuts algorithm so as to reserve the intrinsic content of images as complete as possible.Finally, a PLSA model with asymmetric mo-dalities is constructed based on the expectation maximization( EM) algorithm to predict a candidate set of annotations with confidence scores.Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model can significantly improve performance of traditional PL-SA for the task of automatic image annotation.

  12. Improving integrative searching of systems chemical biology data using semantic annotation.

    Science.gov (United States)

    Chen, Bin; Ding, Ying; Wild, David J

    2012-03-08

    Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued. We developed a generalized chemogenomics and systems chemical biology OWL ontology called Chem2Bio2OWL that describes the semantics of chemical compounds, drugs, protein targets, pathways, genes, diseases and side-effects, and the relationships between them. The ontology also includes data provenance. We used it to annotate our Chem2Bio2RDF dataset, making it a rich semantic resource. Through a series of scientific case studies we demonstrate how this (i) simplifies the process of building SPARQL queries, (ii) enables useful new kinds of queries on the data and (iii) makes possible intelligent reasoning and semantic graph mining in chemogenomics and systems chemical biology. Chem2Bio2OWL is available at http://chem2bio2rdf.org/owl. The document is available at http://chem2bio2owl.wikispaces.com.

  13. Improving integrative searching of systems chemical biology data using semantic annotation

    Directory of Open Access Journals (Sweden)

    Chen Bin

    2012-03-01

    Full Text Available Abstract Background Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued. Results We developed a generalized chemogenomics and systems chemical biology OWL ontology called Chem2Bio2OWL that describes the semantics of chemical compounds, drugs, protein targets, pathways, genes, diseases and side-effects, and the relationships between them. The ontology also includes data provenance. We used it to annotate our Chem2Bio2RDF dataset, making it a rich semantic resource. Through a series of scientific case studies we demonstrate how this (i simplifies the process of building SPARQL queries, (ii enables useful new kinds of queries on the data and (iii makes possible intelligent reasoning and semantic graph mining in chemogenomics and systems chemical biology. Availability Chem2Bio2OWL is available at http://chem2bio2rdf.org/owl. The document is available at http://chem2bio2owl.wikispaces.com.

  14. Desiderata for ontologies to be used in semantic annotation of biomedical documents.

    Science.gov (United States)

    Bada, Michael; Hunter, Lawrence

    2011-02-01

    A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus has illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities. Copyright © 2010 Elsevier Inc. All rights reserved.

  15. IntelliGO: a new vector-based semantic similarity measure including annotation origin

    Directory of Open Access Journals (Sweden)

    Devignes Marie-Dominique

    2010-12-01

    Full Text Available Abstract Background The Gene Ontology (GO is a well known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes. Results We present here a new semantic similarity measure called IntelliGO which integrates several complementary properties in a novel vector space model. The coefficients associated with each GO term that annotates a given gene or protein include its information content as well as a customized value for each type of GO evidence code. The generalized cosine similarity measure, used for calculating the dot product between two vectors, has been rigorously adapted to the context of the GO graph. The IntelliGO similarity measure is tested on two benchmark datasets consisting of KEGG pathways and Pfam domains grouped as clans, considering the GO biological process and molecular function terms, respectively, for a total of 683 yeast and human genes and involving more than 67,900 pair-wise comparisons. The ability of the IntelliGO similarity measure to express the biological cohesion of sets of genes compares favourably to four existing similarity measures. For inter-set comparison, it consistently discriminates between distinct sets of genes. Furthermore, the IntelliGO similarity measure allows the influence of weights assigned to evidence codes to be checked. Finally, the results obtained with a complementary reference technique give intermediate but correct correlation values with the sequence similarity, Pfam, and Enzyme classifications when compared to

  16. The use of semantic similarity measures for optimally integrating heterogeneous Gene Ontology data from large scale annotation pipelines

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    2014-08-01

    Full Text Available With the advancement of new high throughput sequencing technologies, there has been an increase in the number of genome sequencing projects worldwide, which has yielded complete genome sequences of human, animals and plants. Subsequently, several labs have focused on genome annotation, consisting of assigning functions to gene products, mostly using Gene Ontology (GO terms. As a consequence, there is an increased heterogeneity in annotations across genomes due to different approaches used by different pipelines to infer these annotations and also due to the nature of the GO structure itself. This makes a curator's task difficult, even if they adhere to the established guidelines for assessing these protein annotations. Here we develop a genome-scale approach for integrating GO annotations from different pipelines using semantic similarity measures. We used this approach to identify inconsistencies and similarities in functional annotations between orthologs of human and Drosophila melanogaster, to assess the quality of GO annotations derived from InterPro2GO mappings compared to manually annotated GO annotations for the Drosophila melanogaster proteome from a FlyBase dataset and human, and to filter GO annotation data for these proteomes. Results obtained indicate that an efficient integration of GO annotations eliminates redundancy up to 27.08 and 22.32% in the Drosophila melanogaster and human GO annotation datasets, respectively. Furthermore, we identified lack of and missing annotations for some orthologs, and annotation mismatches between InterPro2GO and manual pipelines in these two proteomes, thus requiring further curation. This simplifies and facilitates tasks of curators in assessing protein annotations, reduces redundancy and eliminates inconsistencies in large annotation datasets for ease of comparative functional genomics.

  17. Annotation and Curation of Uncharacterized proteins- Challenges

    Directory of Open Access Journals (Sweden)

    Johny eIjaq

    2015-03-01

    Full Text Available Hypothetical Proteins are the proteins that are predicted to be expressed from an open reading frame (ORF, constituting a substantial fraction of proteomes in both prokaryotes and eukaryotes. Genome projects have led to the identification of many therapeutic targets, the putative function of the protein and their interactions. In this review we have enlisted various methods. Annotation linked to structural and functional prediction of hypothetical proteins assist in the discovery of new structures and functions serving as markers and pharmacological targets for drug designing, discovery and screening. Mass spectrometry is an analytical technique for validating protein characterisation. Matrix-assisted laser desorption ionization–mass spectrometry (MALDI-MS is an efficient analytical method. Microarrays and Protein expression profiles help understanding the biological systems through a systems-wide study of proteins and their interactions with other proteins and non-proteinaceous molecules to control complex processes in cells and tissues and even whole organism. Next generation sequencing technology accelerates multiple areas of genomics research.

  18. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~;;10k genes. ~;;12percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~;;9percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also,>90percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. Weconclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  19. SEMANTIC WEB MINING: ISSUES AND CHALLENGES

    OpenAIRE

    Karan Singh*, Anil kumar, Arun Kumar Yadav

    2016-01-01

    The combination of the two fast evolving scientific research areas “Semantic Web” and “Web Mining” are well-known as “Semantic Web Mining” in computer science. These two areas cover way for the mining of related and meaningful information from the web, by this means giving growth to the term “Semantic Web Mining”. The “Semantic Web” makes mining easy and “Web Mining” can construct new structure of Web. Web Mining applies Data Mining technique on web content, Structure and Usage. This paper gi...

  20. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases.

    Science.gov (United States)

    Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel

    2013-04-15

    In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.

  1. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases

    Science.gov (United States)

    2013-01-01

    Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic. PMID:23586394

  2. MO-DE-207B-08: Radiomic CT Features Complement Semantic Annotations to Predict EGFR Mutations in Lung Adenocarcinomas

    Energy Technology Data Exchange (ETDEWEB)

    Rios Velazquez, E; Parmar, C; Narayan, V; Aerts, H [Dana-Farber Cancer Institute, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts (United States); Liu, Y; Gillies, R [H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL (United States)

    2016-06-15

    Purpose: To compare the complementary value of quantitative radiomic features to that of radiologist-annotated semantic features in predicting EGFR mutations in lung adenocarcinomas. Methods: Pre-operative CT images of 258 lung adenocarcinoma patients were available. Tumors were segmented using the sing-click ensemble segmentation algorithm. A set of radiomic features was extracted using 3D-Slicer. Test-retest reproducibility and unsupervised dimensionality reduction were applied to select a subset of reproducible and independent radiomic features. Twenty semantic annotations were scored by an expert radiologist, describing the tumor, surrounding tissue and associated findings. Minimum-redundancy-maximum-relevance (MRMR) was used to identify the most informative radiomic and semantic features in 172 patients (training-set, temporal split). Radiomic, semantic and combined radiomic-semantic logistic regression models to predict EGFR mutations were evaluated in and independent validation dataset of 86 patients using the area under the receiver operating curve (AUC). Results: EGFR mutations were found in 77/172 (45%) and 39/86 (45%) of the training and validation sets, respectively. Univariate AUCs showed a similar range for both feature types: radiomics median AUC = 0.57 (range: 0.50 – 0.62); semantic median AUC = 0.53 (range: 0.50 – 0.64, Wilcoxon p = 0.55). After MRMR feature selection, the best-performing radiomic, semantic, and radiomic-semantic logistic regression models, for EGFR mutations, showed a validation AUC of 0.56 (p = 0.29), 0.63 (p = 0.063) and 0.67 (p = 0.004), respectively. Conclusion: Quantitative volumetric and textural Radiomic features complement the qualitative and semi-quantitative radiologist annotations. The prognostic value of informative qualitative semantic features such as cavitation and lobulation is increased with the addition of quantitative textural features from the tumor region.

  3. MO-DE-207B-08: Radiomic CT Features Complement Semantic Annotations to Predict EGFR Mutations in Lung Adenocarcinomas

    International Nuclear Information System (INIS)

    Rios Velazquez, E; Parmar, C; Narayan, V; Aerts, H; Liu, Y; Gillies, R

    2016-01-01

    Purpose: To compare the complementary value of quantitative radiomic features to that of radiologist-annotated semantic features in predicting EGFR mutations in lung adenocarcinomas. Methods: Pre-operative CT images of 258 lung adenocarcinoma patients were available. Tumors were segmented using the sing-click ensemble segmentation algorithm. A set of radiomic features was extracted using 3D-Slicer. Test-retest reproducibility and unsupervised dimensionality reduction were applied to select a subset of reproducible and independent radiomic features. Twenty semantic annotations were scored by an expert radiologist, describing the tumor, surrounding tissue and associated findings. Minimum-redundancy-maximum-relevance (MRMR) was used to identify the most informative radiomic and semantic features in 172 patients (training-set, temporal split). Radiomic, semantic and combined radiomic-semantic logistic regression models to predict EGFR mutations were evaluated in and independent validation dataset of 86 patients using the area under the receiver operating curve (AUC). Results: EGFR mutations were found in 77/172 (45%) and 39/86 (45%) of the training and validation sets, respectively. Univariate AUCs showed a similar range for both feature types: radiomics median AUC = 0.57 (range: 0.50 – 0.62); semantic median AUC = 0.53 (range: 0.50 – 0.64, Wilcoxon p = 0.55). After MRMR feature selection, the best-performing radiomic, semantic, and radiomic-semantic logistic regression models, for EGFR mutations, showed a validation AUC of 0.56 (p = 0.29), 0.63 (p = 0.063) and 0.67 (p = 0.004), respectively. Conclusion: Quantitative volumetric and textural Radiomic features complement the qualitative and semi-quantitative radiologist annotations. The prognostic value of informative qualitative semantic features such as cavitation and lobulation is increased with the addition of quantitative textural features from the tumor region.

  4. Challenges as enablers for high quality Linked Data: insights from the Semantic Publishing Challenge

    Directory of Open Access Journals (Sweden)

    Anastasia Dimou

    2017-01-01

    Full Text Available While most challenges organized so far in the Semantic Web domain are focused on comparing tools with respect to different criteria such as their features and competencies, or exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare them based on their output, namely the produced dataset. The Semantic Publishing Challenge is one of these challenges. Its goal is to involve participants in extracting data from heterogeneous sources on scholarly publications, and producing Linked Data that can be exploited by the community itself. This paper reviews lessons learned from both (i the overall organization of the Semantic Publishing Challenge, regarding the definition of the tasks, building the input dataset and forming the evaluation, and (ii the results produced by the participants, regarding the proposed approaches, the used tools, the preferred vocabularies and the results produced in the three editions of 2014, 2015 and 2016. We compared these lessons to other Semantic Web Evaluation Challenges. In this paper, we (i distill best practices for organizing such challenges that could be applied to similar events, and (ii report observations on Linked Data publishing derived from the submitted solutions. We conclude that higher quality may be achieved when Linked Data is produced as a result of a challenge, because the competition becomes an incentive, while solutions become better with respect to Linked Data publishing best practices when they are evaluated against the rules of the  challenge.

  5. Data mart construction based on semantic annotation of scientific articles: A case study for the prioritization of drug targets.

    Science.gov (United States)

    Teixeira, Marlon Amaro Coelho; Belloze, Kele Teixeira; Cavalcanti, Maria Cláudia; Silva-Junior, Floriano P

    2018-04-01

    Semantic text annotation enables the association of semantic information (ontology concepts) to text expressions (terms), which are readable by software agents. In the scientific scenario, this is particularly useful because it reveals a lot of scientific discoveries that are hidden within academic articles. The Biomedical area has more than 300 ontologies, most of them composed of over 500 concepts. These ontologies can be used to annotate scientific papers and thus, facilitate data extraction. However, in the context of a scientific research, a simple keyword-based query using the interface of a digital scientific texts library can return more than a thousand hits. The analysis of such a large set of texts, annotated with such numerous and large ontologies, is not an easy task. Therefore, the main objective of this work is to provide a method that could facilitate this task. This work describes a method called Text and Ontology ETL (TOETL), to build an analytical view over such texts. First, a corpus of selected papers is semantically annotated using distinct ontologies. Then, the annotation data is extracted, organized and aggregated into the dimensional schema of a data mart. Besides the TOETL method, this work illustrates its application through the development of the TaP DM (Target Prioritization data mart). This data mart has focus on the research of gene essentiality, a key concept to be considered when searching for genes showing potential as anti-infective drug targets. This work reveals that the proposed approach is a relevant tool to support decision making in the prioritization of new drug targets, being more efficient than the keyword-based traditional tools. Copyright © 2018 Elsevier B.V. All rights reserved.

  6. WebQuest y anotaciones semánticas WebQuest and semantic annotations

    Directory of Open Access Journals (Sweden)

    Santiago Blanco Suárez

    2007-03-01

    Full Text Available En este artículo se presenta un sistema de búsqueda y recuperación de metadatos de actividades educativas que siguen el modelo WebQuest. Se trata de una base de datos relacional, accesible a través del web, que se complementa con un módulo que permite realizar anotaciones semánticas y cuyo objetivo es capturar y enriquecer el conocimiento acerca del uso de dichos ejercicios por parte de la comunidad de docentes que experimentan con ellos, así como documentar los recursos o sitios web de interés didáctico buscando construir un repositorio de enlaces educativos de calidad. This paper presents a system of searching and recovering educational activities that follow the Web-Quest model through the web, complemented with a module to make semantic annotations aimed at getting and enriching the knowledge on the use of these exercises by the teaching community. It also tries to document the resources or websites with didactic interest in order to build a qualified account of educational links.

  7. A way toward analyzing high-content bioimage data by means of semantic annotation and visual data mining

    Science.gov (United States)

    Herold, Julia; Abouna, Sylvie; Zhou, Luxian; Pelengaris, Stella; Epstein, David B. A.; Khan, Michael; Nattkemper, Tim W.

    2009-02-01

    In the last years, bioimaging has turned from qualitative measurements towards a high-throughput and highcontent modality, providing multiple variables for each biological sample analyzed. We present a system which combines machine learning based semantic image annotation and visual data mining to analyze such new multivariate bioimage data. Machine learning is employed for automatic semantic annotation of regions of interest. The annotation is the prerequisite for a biological object-oriented exploration of the feature space derived from the image variables. With the aid of visual data mining, the obtained data can be explored simultaneously in the image as well as in the feature domain. Especially when little is known of the underlying data, for example in the case of exploring the effects of a drug treatment, visual data mining can greatly aid the process of data evaluation. We demonstrate how our system is used for image evaluation to obtain information relevant to diabetes study and screening of new anti-diabetes treatments. Cells of the Islet of Langerhans and whole pancreas in pancreas tissue samples are annotated and object specific molecular features are extracted from aligned multichannel fluorescence images. These are interactively evaluated for cell type classification in order to determine the cell number and mass. Only few parameters need to be specified which makes it usable also for non computer experts and allows for high-throughput analysis.

  8. Knowledge representation and management: benefits and challenges of the semantic web for the fields of KRM and NLP.

    Science.gov (United States)

    Rassinoux, A-M

    2011-01-01

    To summarize excellent current research in the field of knowledge representation and management (KRM). A synopsis of the articles selected for the IMIA Yearbook 2011 is provided and an attempt to highlight the current trends in the field is sketched. This last decade, with the extension of the text-based web towards a semantic-structured web, NLP techniques have experienced a renewed interest in knowledge extraction. This trend is corroborated through the five papers selected for the KRM section of the Yearbook 2011. They all depict outstanding studies that exploit NLP technologies whenever possible in order to accurately extract meaningful information from various biomedical textual sources. Bringing semantic structure to the meaningful content of textual web pages affords the user with cooperative sharing and intelligent finding of electronic data. As exemplified by the best paper selection, more and more advanced biomedical applications aim at exploiting the meaningful richness of free-text documents in order to generate semantic metadata and recently to learn and populate domain ontologies. These later are becoming a key piece as they allow portraying the semantics of the Semantic Web content. Maintaining their consistency with documents and semantic annotations that refer to them is a crucial challenge of the Semantic Web for the coming years.

  9. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System

    OpenAIRE

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-01-01

    Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making process in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), it still could not represent the complex relationships betwe...

  10. A Collaborative Semantic Annotation System in Health: Towards a SOA Design for Knowledge Sharing in Ambient Intelligence

    Directory of Open Access Journals (Sweden)

    Gabriel Guerrero-Contreras

    2017-01-01

    Full Text Available People nowadays spend more and more time performing collaborative tasks at anywhere and anytime. Specifically, professionals want to collaborate with each other by using advanced technologies for sharing knowledge in order to improve/automatize business processes. Semantic web technologies offer multiple benefits such as data integration across sources and automation enablers. The conversion of the widespread Content Management Systems into its semantic equivalent is a relevant step, as this enables the benefits of the semantic web to be extended. The FLERSA annotation tool makes it possible. In particular, it converts the Joomla! CMS into its semantic equivalent. However, this tool is highly coupled with that specific Joomla! platform. Furthermore, ambient intelligent (AmI environments can be seen as a natural way to address complex interactions between users and their environment, which could be transparently supported through distributed information systems. However, to build distributed information systems for AmI environments it is necessary to make important design decisions and apply techniques at system/software architecture level. In this paper, a SOA-based design solution consisting of two services and an underlying middleware is combined with the FLERSA tool. It allows end-users to collaborate independently of technical details and specific context conditions and in a distributed, decentralized way.

  11. Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

    Science.gov (United States)

    Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

    2018-01-01

    Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is getting importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity have a distinct behavior. These findings are of significant effect for future works on protein comparison, and will help understand the semantic similarity between proteins in a better way.

  12. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System.

    Science.gov (United States)

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-02-20

    Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making process in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), it still could not represent the complex relationships between devices when dynamic composition and collaboration occur, and it totally depends on manual construction of a knowledge base with low scalability. In this paper, to addresses these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution with both ontological engineering methods by extending SSN and machine learning methods based on an entity linking (EL) model. To testify to the feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has a relative high accuracy and the time complexity is at a tolerant level. Advantages and disadvantages of SWoT4CPS with future work are also discussed.

  13. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System

    Directory of Open Access Journals (Sweden)

    Zhenyu Wu

    2017-02-01

    Full Text Available Web of Things (WoT facilitates the discovery and interoperability of Internet of Things (IoT devices in a cyber-physical system (CPS. Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making process in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN, it still could not represent the complex relationships between devices when dynamic composition and collaboration occur, and it totally depends on manual construction of a knowledge base with low scalability. In this paper, to addresses these limitations, we propose the semantic Web of Things (SWoT framework for CPS (SWoT4CPS. SWoT4CPS provides a hybrid solution with both ontological engineering methods by extending SSN and machine learning methods based on an entity linking (EL model. To testify to the feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has a relative high accuracy and the time complexity is at a tolerant level. Advantages and disadvantages of SWoT4CPS with future work are also discussed.

  14. In the pursuit of a semantic similarity metric based on UMLS annotations for articles in PubMed Central Open Access.

    Science.gov (United States)

    Garcia Castro, Leyla Jael; Berlanga, Rafael; Garcia, Alexander

    2015-10-01

    Although full-text articles are provided by the publishers in electronic formats, it remains a challenge to find related work beyond the title and abstract context. Identifying related articles based on their abstract is indeed a good starting point; this process is straightforward and does not consume as many resources as full-text based similarity would require. However, further analyses may require in-depth understanding of the full content. Two articles with highly related abstracts can be substantially different regarding the full content. How similarity differs when considering title-and-abstract versus full-text and which semantic similarity metric provides better results when dealing with full-text articles are the main issues addressed in this manuscript. We have benchmarked three similarity metrics - BM25, PMRA, and Cosine, in order to determine which one performs best when using concept-based annotations on full-text documents. We also evaluated variations in similarity values based on title-and-abstract against those relying on full-text. Our test dataset comprises the Genomics track article collection from the 2005 Text Retrieval Conference. Initially, we used an entity recognition software to semantically annotate titles and abstracts as well as full-text with concepts defined in the Unified Medical Language System (UMLS®). For each article, we created a document profile, i.e., a set of identified concepts, term frequency, and inverse document frequency; we then applied various similarity metrics to those document profiles. We considered correlation, precision, recall, and F1 in order to determine which similarity metric performs best with concept-based annotations. For those full-text articles available in PubMed Central Open Access (PMC-OA), we also performed dispersion analyses in order to understand how similarity varies when considering full-text articles. We have found that the PubMed Related Articles similarity metric is the most suitable for

  15. Combining rules, background knowledge and change patterns to maintain semantic annotations.

    Science.gov (United States)

    Cardoso, Silvio Domingos; Chantal, Reynaud-Delaître; Da Silveira, Marcos; Pruski, Cédric

    2017-01-01

    Knowledge Organization Systems (KOS) play a key role in enriching biomedical information in order to make it machine-understandable and shareable. This is done by annotating medical documents, or more specifically, associating concept labels from KOS with pieces of digital information, e.g., images or texts. However, the dynamic nature of KOS may impact the annotations, thus creating a mismatch between the evolved concept and the associated information. To solve this problem, methods to maintain the quality of the annotations are required. In this paper, we define a framework based on rules, background knowledge and change patterns to drive the annotation adaption process. We evaluate experimentally the proposed approach in realistic cases-studies and demonstrate the overall performance of our approach in different KOS considering the precision, recall, F1-score and AUC value of the system.

  16. The Semantic Web: opportunities and challenges for next-generation Web applications

    Directory of Open Access Journals (Sweden)

    2002-01-01

    Full Text Available Recently there has been a growing interest in the investigation and development of the next generation web - the Semantic Web. While most of the current forms of web content are designed to be presented to humans, but are barely understandable by computers, the content of the Semantic Web is structured in a semantic way so that it is meaningful to computers as well as to humans. In this paper, we report a survey of recent research on the Semantic Web. In particular, we present the opportunities that this revolution will bring to us: web-services, agent-based distributed computing, semantics-based web search engines, and semantics-based digital libraries. We also discuss the technical and cultural challenges of realizing the Semantic Web: the development of ontologies, formal semantics of Semantic Web languages, and trust and proof models. We hope that this will shed some light on the direction of future work on this field.

  17. A Lexical Framework for Semantic Annotation of Positive and Negative Regulation Relations in Biomedical Pathways

    DEFF Research Database (Denmark)

    Zambach, Sine; Lassen, Tine

    presented here, we analyze 6 frequently used verbs denoting the regulation relations regulates, positively regulates and negatively regulates through corpus analysis, and propose a formal representation of the acquired knowledge as domain speci¯c semantic frames. The acquired knowledge patterns can thus...

  18. A Survey on Challenges of Semantics Application in the Internet of Things Domain

    Directory of Open Access Journals (Sweden)

    Harlamova Marina

    2017-05-01

    Full Text Available The Internet of Things (IoT, a global Internet-based system of computing devices and machines, is one of the most significant trends in the information technology area. An accepted unified communication approach would be a prerequisite for its mass adoption. Semantic technologies (Semantic Web have been advocated as enablers of unified communication. However, while there are particular advancements in research on application of Semantic Web in the IoT domain, the dynamic and complex nature of the IoT often requires case specific solutions hard to be applied widely. In the present survey, the semantic technology challenges in the IoT domain are amalgamated to provide background for further studies in the use of semantic technologies in the IoT.

  19. Interoperable Multimedia Annotation and Retrieval for the Tourism Sector

    NARCIS (Netherlands)

    Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.

    2015-01-01

    The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies.

  20. Ontology-based semantic information technology for safeguards: opportunities and challenges

    International Nuclear Information System (INIS)

    McDaniel, Michael

    2014-01-01

    The challenge of efficiently handling large volumes of heterogeneous information is a barrier to more effective safeguards implementation. With the emergence of new technologies for generating and collecting information this is an issue common to many industries and problem domains. Several diverse information‑intensive fields are developing and adopting ontology‑based semantic information technology solutions to address issues of information integration, federation and interoperability. Ontology, in this context, refers to the formal specification of the content, structure, and logic of knowledge within a domain of interest. Ontology‑based semantic information technologies have the potential to impact nearly every level of safeguards implementation, from information collection and integration, to personnel training and knowledge retention, to planning and analysis. However, substantial challenges remain before the full benefits of semantic technology can be realized. Perhaps the most significant challenge is the development of a nuclear fuel cycle ontology. For safeguards, existing knowledge resources such as the IAEA’s Physical Model and established upper level ontologies can be used as starting points for ontology development, but a concerted effort must be taken by the safeguards community for such an activity to be successful. This paper provides a brief background of ontologies and semantic information technology, demonstrates how these technologies are used in other areas, offers examples of how ontologies can be applied to safeguards, and discusses the challenges of developing and implementing this technology as well as a possible path forward.

  1. Exploring and linking biomedical resources through multidimensional semantic spaces.

    Science.gov (United States)

    Berlanga, Rafael; Jiménez-Ruiz, Ernesto; Nebot, Victoria

    2012-01-25

    The semantic integration of biomedical resources is still a challenging issue which is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions regard the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former one arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter one reduces the annotation set associated to each collection item into a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. Current knowledge resources and semantic-aware technology make possible the integration of biomedical resources. Such an integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for

  2. Annotating Logical Forms for EHR Questions.

    Science.gov (United States)

    Roberts, Kirk; Demner-Fushman, Dina

    2016-05-01

    This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.

  3. ODMSummary: A Tool for Automatic Structured Comparison of Multiple Medical Forms Based on Semantic Annotation with the Unified Medical Language System.

    Science.gov (United States)

    Storck, Michael; Krumm, Rainer; Dugas, Martin

    2016-01-01

    Medical documentation is applied in various settings including patient care and clinical research. Since procedures of medical documentation are heterogeneous and developed further, secondary use of medical data is complicated. Development of medical forms, merging of data from different sources and meta-analyses of different data sets are currently a predominantly manual process and therefore difficult and cumbersome. Available applications to automate these processes are limited. In particular, tools to compare multiple documentation forms are missing. The objective of this work is to design, implement and evaluate the new system ODMSummary for comparison of multiple forms with a high number of semantically annotated data elements and a high level of usability. System requirements are the capability to summarize and compare a set of forms, enable to estimate the documentation effort, track changes in different versions of forms and find comparable items in different forms. Forms are provided in Operational Data Model format with semantic annotations from the Unified Medical Language System. 12 medical experts were invited to participate in a 3-phase evaluation of the tool regarding usability. ODMSummary (available at https://odmtoolbox.uni-muenster.de/summary/summary.html) provides a structured overview of multiple forms and their documentation fields. This comparison enables medical experts to assess multiple forms or whole datasets for secondary use. System usability was optimized based on expert feedback. The evaluation demonstrates that feedback from domain experts is needed to identify usability issues. In conclusion, this work shows that automatic comparison of multiple forms is feasible and the results are usable for medical experts.

  4. Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases.

    Science.gov (United States)

    Panahiazar, Maryam; Taslimitehrani, Vahid; Jadhav, Ashutosh; Pathak, Jyotishman

    2014-10-01

    In healthcare, big data tools and technologies have the potential to create significant value by improving outcomes while lowering costs for each individual patient. Diagnostic images, genetic test results and biometric information are increasingly generated and stored in electronic health records presenting us with challenges in data that is by nature high volume, variety and velocity, thereby necessitating novel ways to store, manage and process big data. This presents an urgent need to develop new, scalable and expandable big data infrastructure and analytical methods that can enable healthcare providers access knowledge for the individual patient, yielding better decisions and outcomes. In this paper, we briefly discuss the nature of big data and the role of semantic web and data analysis for generating "smart data" which offer actionable information that supports better decision for personalized medicine. In our view, the biggest challenge is to create a system that makes big data robust and smart for healthcare providers and patients that can lead to more effective clinical decision-making, improved health outcomes, and ultimately, managing the healthcare costs. We highlight some of the challenges in using big data and propose the need for a semantic data-driven environment to address them. We illustrate our vision with practical use cases, and discuss a path for empowering personalized medicine using big data and semantic web technology.

  5. Semantically Interoperable XML Data.

    Science.gov (United States)

    Vergara-Niedermayr, Cristobal; Wang, Fusheng; Pan, Tony; Kurc, Tahsin; Saltz, Joel

    2013-09-01

    XML is ubiquitously used as an information exchange platform for web-based applications in healthcare, life sciences, and many other domains. Proliferating XML data are now managed through latest native XML database technologies. XML data sources conforming to common XML schemas could be shared and integrated with syntactic interoperability. Semantic interoperability can be achieved through semantic annotations of data models using common data elements linked to concepts from ontologies. In this paper, we present a framework and software system to support the development of semantic interoperable XML based data sources that can be shared through a Grid infrastructure. We also present our work on supporting semantic validated XML data through semantic annotations for XML Schema, semantic validation and semantic authoring of XML data. We demonstrate the use of the system for a biomedical database of medical image annotations and markups.

  6. Semantically Interoperable XML Data

    Science.gov (United States)

    Vergara-Niedermayr, Cristobal; Wang, Fusheng; Pan, Tony; Kurc, Tahsin; Saltz, Joel

    2013-01-01

    XML is ubiquitously used as an information exchange platform for web-based applications in healthcare, life sciences, and many other domains. Proliferating XML data are now managed through latest native XML database technologies. XML data sources conforming to common XML schemas could be shared and integrated with syntactic interoperability. Semantic interoperability can be achieved through semantic annotations of data models using common data elements linked to concepts from ontologies. In this paper, we present a framework and software system to support the development of semantic interoperable XML based data sources that can be shared through a Grid infrastructure. We also present our work on supporting semantic validated XML data through semantic annotations for XML Schema, semantic validation and semantic authoring of XML data. We demonstrate the use of the system for a biomedical database of medical image annotations and markups. PMID:25298789

  7. Contribution of Clinical Archetypes, and the Challenges, towards Achieving Semantic Interoperability for EHRs.

    Science.gov (United States)

    Tapuria, Archana; Kalra, Dipak; Kobayashi, Shinji

    2013-12-01

    The objective is to introduce 'clinical archetype' which is a formal and agreed way of representing clinical information to ensure interoperability across and within Electronic Health Records (EHRs). The paper also aims at presenting the challenges building quality labeled clinical archetypes and the challenges towards achieving semantic interoperability between EHRs. Twenty years of international research, various European healthcare informatics projects and the pioneering work of the openEHR Foundation have led to the following results. The requirements for EHR information architectures have been consolidated within ISO 18308 and adopted within the ISO 13606 EHR interoperability standard. However, a generic EHR architecture cannot ensure that the clinical meaning of information from heterogeneous sources can be reliably interpreted by receiving systems and services. Therefore, clinical models called 'clinical archetypes' are required to formalize the representation of clinical information within the EHR. Part 2 of ISO 13606 defines how archetypes should be formally represented. The current challenge is to grow clinical communities to build a library of clinical archetypes and to identify how evidence of best practice and multi-professional clinical consensus should best be combined to define archetypes at the optimal level of granularity and specificity and quality label them for wide adoption. Standardizing clinical terms within EHRs using clinical terminology like Systematized Nomenclature of Medicine Clinical Terms is also a challenge. Clinical archetypes would play an important role in achieving semantic interoperability within EHRs. Attempts are being made in exploring the design and adoption challenges for clinical archetypes.

  8. The 'PEARL' Data Warehouse: Initial Challenges Faced with Semantic and Syntactic Interoperability.

    Science.gov (United States)

    Mahmoud, Samhar; Boyd, Andy; Curcin, Vasa; Bache, Richard; Ali, Asad; Miles, Simon; Taweel, Adel; Delaney, Brendan; Macleod, John

    2017-01-01

    Data about patients are available from diverse sources, including those routinely collected as individuals interact with service providers, and those provided directly by individuals through surveys. Linking these data can lead to a more complete picture about the individual, to inform either care decision making or research investigations. However, post-linkage, differences in data recording systems and formats present barriers to achieving these aims. This paper describes an approach to combine linked GP records with study observations, and reports initial challenges related to semantic and syntactic interoperability issues.

  9. Reasoning with Annotations of Texts

    OpenAIRE

    Ma , Yue; Lévy , François; Ghimire , Sudeep

    2011-01-01

    International audience; Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  10. Propagating semantic information in biochemical network models

    Directory of Open Access Journals (Sweden)

    Schulz Marvin

    2012-01-01

    Full Text Available Abstract Background To enable automatic searches, alignments, and model combination, the elements of systems biology models need to be compared and matched across models. Elements can be identified by machine-readable biological annotations, but assigning such annotations and matching non-annotated elements is tedious work and calls for automation. Results A new method called "semantic propagation" allows the comparison of model elements based not only on their own annotations, but also on annotations of surrounding elements in the network. One may either propagate feature vectors, describing the annotations of individual elements, or quantitative similarities between elements from different models. Based on semantic propagation, we align partially annotated models and find annotations for non-annotated model elements. Conclusions Semantic propagation and model alignment are included in the open-source library semanticSBML, available on sourceforge. Online services for model alignment and for annotation prediction can be used at http://www.semanticsbml.org.

  11. Semantic Web research anno 2006 : Main streams, popular fallacies, current status and future challenges

    NARCIS (Netherlands)

    Van Harmelen, Frank

    2006-01-01

    In this topical paper we try to give an analysis and overview of the current state of Semantic Web research. We point to different interpretations of the Semantic Web as the reason underlying many controversies, we list (and debunk) four false objections which are often raised against the Semantic

  12. Semantic Enrichment of GPS Trajectories

    NARCIS (Netherlands)

    de Graaff, V.; van Keulen, Maurice; de By, R.A.

    2012-01-01

    Semantic annotation of GPS trajectories helps us to recognize the interests of the creator of the GPS trajectories. Automating this trajectory annotation circumvents the requirement of additional user input. To annotate the GPS traces automatically, two types of automated input are required: 1) a

  13. Semantic Search of Web Services

    Science.gov (United States)

    Hao, Ke

    2013-01-01

    This dissertation addresses semantic search of Web services using natural language processing. We first survey various existing approaches, focusing on the fact that the expensive costs of current semantic annotation frameworks result in limited use of semantic search for large scale applications. We then propose a vector space model based service…

  14. Semantic Role Labeling

    CERN Document Server

    Palmer, Martha; Xue, Nianwen

    2011-01-01

    This book is aimed at providing an overview of several aspects of semantic role labeling. Chapter 1 begins with linguistic background on the definition of semantic roles and the controversies surrounding them. Chapter 2 describes how the theories have led to structured lexicons such as FrameNet, VerbNet and the PropBank Frame Files that in turn provide the basis for large scale semantic annotation of corpora. This data has facilitated the development of automatic semantic role labeling systems based on supervised machine learning techniques. Chapter 3 presents the general principles of applyin

  15. Geospatial semantic web

    CERN Document Server

    Zhang, Chuanrong; Li, Weidong

    2015-01-01

    This book covers key issues related to Geospatial Semantic Web, including geospatial web services for spatial data interoperability; geospatial ontology for semantic interoperability; ontology creation, sharing, and integration; querying knowledge and information from heterogeneous data source; interfaces for Geospatial Semantic Web, VGI (Volunteered Geographic Information) and Geospatial Semantic Web; challenges of Geospatial Semantic Web; and development of Geospatial Semantic Web applications. This book also describes state-of-the-art technologies that attempt to solve these problems such as WFS, WMS, RDF, OWL, and GeoSPARQL, and demonstrates how to use the Geospatial Semantic Web technologies to solve practical real-world problems such as spatial data interoperability.

  16. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general...... requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation...... systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations...

  17. CASL The Common Algebraic Specification Language Semantics

    DEFF Research Database (Denmark)

    Haxthausen, Anne

    1998-01-01

    This is version 1.0 of the CASL Language Summary, annotated by the CoFI Semantics Task Group with the semantics of constructs. This is the first complete but possibly imperfect version of the semantics. It was compiled prior to the CoFI workshop at Cachan in November 1998.......This is version 1.0 of the CASL Language Summary, annotated by the CoFI Semantics Task Group with the semantics of constructs. This is the first complete but possibly imperfect version of the semantics. It was compiled prior to the CoFI workshop at Cachan in November 1998....

  18. Results from levels 2/3 fusion implementations: issues, challenges, retrospectives, and perspectives for the future an annotated perspective

    Science.gov (United States)

    Kadar, Ivan; Bosse, Eloi; Salerno, John; Lambert, Dale A.; Das, Subrata; Ruspini, Enrique H.; Rhodes, Bradley J.; Biermann, Joachim

    2008-04-01

    Even though the definition of the Joint Director of Laboratories (JDL) "fusion levels" were established in 1987, published 1991, revised in 1999 and 2004, the meaning, effects, control and optimization of interactions among the fusion levels have not as yet been fully explored and understood. Specifically, this is apparent from the abstract JDL definitions of "Levels 2/3 Fusion" - situation and threat assessment (SA/TA), which involve deriving relations among entities, e.g., the aggregation of object states (i.e., classification and location) in SA, while TA uses SA products to estimate/predict the impact of actions/interactions effects on situations taken by the participant entities involved. Given all the existing knowledge in the information fusion and human factors literature, (both prior to and after the introduction of "fusion levels" in 1987) there are still open questions remaining in regard to implementation of knowledge representation and reasoning methods under uncertainty to afford SA/TA. Therefore, to promote exchange of ideas and to illuminate the historical, current and future issues associated with Levels 2/3 implementations, leading experts were invited to present their respective views on various facets of this complex problem. This paper is a retrospective annotated view of the invited panel discussion organized by Ivan Kadar (first author), supported by John Salerno, in order to provide both a historical perspective of the evolution of the state-of-the-art (SOA) in higher-level "Levels 2/3" information fusion implementations by looking back over the past ten or more years (before JDL), and based upon the lessons learned to forecast where focus should be placed to further enhance and advance the SOA by addressing key issues and challenges. In order to convey the panel discussion to audiences not present at the panel, annotated position papers summarizing the panel presentation are included.

  19. Semantic web data warehousing for caGrid.

    Science.gov (United States)

    McCusker, James P; Phillips, Joshua A; González Beltrán, Alejandra; Finkelstein, Anthony; Krauthammer, Michael

    2009-10-01

    The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically-annotated caBIG Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. Our approach is generalizable and of broad utility to researchers facing similar integration challenges.

  20. The new challenge for e-learning : the educational semantic web

    NARCIS (Netherlands)

    Aroyo, L.M.; Dicheva, D.

    2004-01-01

    The big question for many researchers in the area of educational systems now is what is the next step in the evolution of e-learning? Are we finally moving from a scattered intelligence to a coherent space of collaborative intelligence? How close we are to the vision of the Educational Semantic Web

  1. Clustering semantics for hypermedia presentation

    NARCIS (Netherlands)

    Alberink, M.J.; Rutledge, L.W.; Hardman, H.L.; Veenstra, M.J.A.

    2004-01-01

    Semantic annotations of media repositories make relationships among the stored media and relevant concepts explicit. However, these relationships and the media they join are not directly presentable as hypermedia. Previous work shows how clustering over the annotations in the repositories can

  2. Algorithms for limited-view computed tomography: an annotated bibliography and a challenge

    International Nuclear Information System (INIS)

    Rangayyan, R.; Dhawan, A.P.; Gordon, R.

    1985-01-01

    In many applications of computed tomography, it may not be possible to acquire projection data at all angles, as required by the most commonly used algorithm of convolution backprojection. In such a limited-data situation, we face an ill-posed problem in attempting to reconstruct an image from an incomplete set of projections. Many techniques have been proposed to tackle this situation, employing diverse theories such as signal recovery, image restoration, constrained deconvolution, and constrained optimization, as well as novel schemes such as iterative object-dependent algorithms incorporating a priori knowledge and use of multispectral radiation. The authors present an overview of such techniques and offer a challenge to all readers to reconstruct images from a set of limited-view data provided here

  3. Hierarchy-associated semantic-rule inference framework for classifying indoor scenes

    Science.gov (United States)

    Yu, Dan; Liu, Peng; Ye, Zhipeng; Tang, Xianglong; Zhao, Wei

    2016-03-01

    Typically, the initial task of classifying indoor scenes is challenging, because the spatial layout and decoration of a scene can vary considerably. Recent efforts at classifying object relationships commonly depend on the results of scene annotation and predefined rules, making classification inflexible. Furthermore, annotation results are easily affected by external factors. Inspired by human cognition, a scene-classification framework was proposed using the empirically based annotation (EBA) and a match-over rule-based (MRB) inference system. The semantic hierarchy of images is exploited by EBA to construct rules empirically for MRB classification. The problem of scene classification is divided into low-level annotation and high-level inference from a macro perspective. Low-level annotation involves detecting the semantic hierarchy and annotating the scene with a deformable-parts model and a bag-of-visual-words model. In high-level inference, hierarchical rules are extracted to train the decision tree for classification. The categories of testing samples are generated from the parts to the whole. Compared with traditional classification strategies, the proposed semantic hierarchy and corresponding rules reduce the effect of a variable background and improve the classification performance. The proposed framework was evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.

  4. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2017-11-09

    In this work we study the task of image annotation, of which the goal is to describe an image using a few tags. Instead of predicting the full list of tags, here we target for providing a short list of tags under a limited number (e.g., 3), to cover as much information as possible of the image. The tags in such a short list should be representative and diverse. It means they are required to be not only corresponding to the contents of the image, but also be different to each other. To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. Human study through Amazon Mechanical Turk verifies that the proposed metrics are more close to the humans judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.

  5. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan; Jia, Fan; Liu, Wei; Ghanem, Bernard

    2017-01-01

    In this work we study the task of image annotation, of which the goal is to describe an image using a few tags. Instead of predicting the full list of tags, here we target for providing a short list of tags under a limited number (e.g., 3), to cover as much information as possible of the image. The tags in such a short list should be representative and diverse. It means they are required to be not only corresponding to the contents of the image, but also be different to each other. To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. Human study through Amazon Mechanical Turk verifies that the proposed metrics are more close to the humans judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.

  6. Multi-Label Classification Based on Low Rank Representation for Image Annotation

    Directory of Open Access Journals (Sweden)

    Qiaoyu Tan

    2017-01-01

    Full Text Available Annotating remote sensing images is a challenging task for its labor demanding annotation process and requirement of expert knowledge, especially when images can be annotated with multiple semantic concepts (or labels. To automatically annotate these multi-label images, we introduce an approach called Multi-Label Classification based on Low Rank Representation (MLC-LRR. MLC-LRR firstly utilizes low rank representation in the feature space of images to compute the low rank constrained coefficient matrix, then it adapts the coefficient matrix to define a feature-based graph and to capture the global relationships between images. Next, it utilizes low rank representation in the label space of labeled images to construct a semantic graph. Finally, these two graphs are exploited to train a graph-based multi-label classifier. To validate the performance of MLC-LRR against other related graph-based multi-label methods in annotating images, we conduct experiments on a public available multi-label remote sensing images (Land Cover. We perform additional experiments on five real-world multi-label image datasets to further investigate the performance of MLC-LRR. Empirical study demonstrates that MLC-LRR achieves better performance on annotating images than these comparing methods across various evaluation criteria; it also can effectively exploit global structure and label correlations of multi-label images.

  7. Synesthesia, sensory-motor contingency, and semantic emulation: how swimming style-color synesthesia challenges the traditional view of synesthesia.

    Science.gov (United States)

    Mroczko-Wąsowicz, Aleksandra; Werning, Markus

    2012-01-01

    Synesthesia is traditionally regarded as a phenomenon in which an additional non-standard phenomenal experience occurs consistently in response to ordinary stimulation applied to the same or another modality. Recent studies suggest an important role of semantic representations in the induction of synesthesia. In the present proposal we try to link the empirically grounded theory of sensory-motor contingency and mirror system based embodied simulation/emulation to newly discovered cases of swimming style-color synesthesia. In the latter color experiences are evoked only by showing the synesthetes a picture of a swimming person or asking them to think about a given swimming style. Neural mechanisms of mirror systems seem to be involved here. It has been shown that for mirror-sensory synesthesia, such as mirror-touch or mirror-pain synesthesia (when visually presented tactile or noxious stimulation of others results in the projection of the tactile or pain experience onto oneself), concurrent experiences are caused by overactivity in the mirror neuron system responding to the specific observation. The comparison of different forms of synesthesia has the potential of challenging conventional thinking on this phenomenon and providing a more general, sensory-motor account of synesthesia encompassing cases driven by semantic or emulational rather than pure sensory or motor representations. Such an interpretation could include top-down associations, questioning the explanation in terms of hard-wired structural connectivity. In the paper the hypothesis is developed that the wide-ranging phenomenon of synesthesia might result from a process of hyperbinding between "too many" semantic attribute domains. This hypothesis is supplemented by some suggestions for an underlying neural mechanism.

  8. Synesthesia, sensory-motor contingency and semantic emulation: How swimming style-color synesthesia challenges the traditional view of synesthesia

    Directory of Open Access Journals (Sweden)

    Aleksandra eMroczko-Wąsowicz

    2012-08-01

    Full Text Available Synesthesia is a phenomenon in which an additional nonstandard perceptual experience occurs consistently in response to ordinary stimulation applied to the same or another modality. Recent studies suggest an important role of semantic representations in the induction of synesthesia. In the present proposal we try to link the empirically grounded theory of sensory-motor contingency and mirror system based embodied simulation to newly discovered cases of swimming-style color synesthesia. In the latter color experiences are evoked only by showing the synesthetes a picture of a swimming person or asking them to think about a given swimming style. Neural mechanisms of mirror systems seem to be involved here. It has been shown that for mirror-sensory synesthesia, such as mirror-touch or mirror-pain synesthesia, concurrent experiences are caused by the overactivity in the mirror neuron system responding to the specific observation. The comparison of different forms of synesthesia has the potential of challenging conventional thinking on this phenomenon and providing a more general, sensory-motor account of synesthesia encompassing cases driven by semantic or emulational rather than pure sensory or motor representations.

  9. Causality in the semantics of Esterel : revisited

    NARCIS (Netherlands)

    Mousavi, M.R.; Klin, B.; Sobocinski, P.

    2010-01-01

    We re-examine the challenges concerning causality in the semantics of Esterel and show that they pertain to the known issues in the semantics of Structured Operational Semantics with negative premises. We show that the solutions offered for the semantics of SOS also provide answers to the semantic

  10. CASL - The CoFI Algebraic Specification Language - Semantics

    DEFF Research Database (Denmark)

    Haxthausen, Anne

    1999-01-01

    This is version 1.0 of the CASL Language Summary, annotated by the CoFI Semantics Task Group with the semantics of constructs. This is the second complete but possibly imperfect version of the semantics. It was compiled prior to the CoFI workshop in Amsterdam in March 1999.......This is version 1.0 of the CASL Language Summary, annotated by the CoFI Semantics Task Group with the semantics of constructs. This is the second complete but possibly imperfect version of the semantics. It was compiled prior to the CoFI workshop in Amsterdam in March 1999....

  11. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...... and metonymic. We have conducted an analysis in English, Danish and Spanish. Later on, we have tried to replicate the human judgments by means of unsupervised and semi-supervised sense prediction. The automatic sense-prediction systems have been unable to find empiric evidence for the underspecified sense, even...

  12. The Semantic Web: From Representation to Realization

    Science.gov (United States)

    Thórisson, Kristinn R.; Spivack, Nova; Wissner, James M.

    A semantically-linked web of electronic information - the Semantic Web - promises numerous benefits including increased precision in automated information sorting, searching, organizing and summarizing. Realizing this requires significantly more reliable meta-information than is readily available today. It also requires a better way to represent information that supports unified management of diverse data and diverse Manipulation methods: from basic keywords to various types of artificial intelligence, to the highest level of intelligent manipulation - the human mind. How this is best done is far from obvious. Relying solely on hand-crafted annotation and ontologies, or solely on artificial intelligence techniques, seems less likely for success than a combination of the two. In this paper describe an integrated, complete solution to these challenges that has already been implemented and tested with hundreds of thousands of users. It is based on an ontological representational level we call SemCards that combines ontological rigour with flexible user interface constructs. SemCards are machine- and human-readable digital entities that allow non-experts to create and use semantic content, while empowering machines to better assist and participate in the process. SemCards enable users to easily create semantically-grounded data that in turn acts as examples for automation processes, creating a positive iterative feedback loop of metadata creation and refinement between user and machine. They provide a holistic solution to the Semantic Web, supporting powerful management of the full lifecycle of data, including its creation, retrieval, classification, sorting and sharing. We have implemented the SemCard technology on the semantic Web site Twine.com, showing that the technology is indeed versatile and scalable. Here we present the key ideas behind SemCards and describe the initial implementation of the technology.

  13. Investigating and Annotating the Role of Citation in Biomedical Full-Text Articles.

    Science.gov (United States)

    Yu, Hong; Agarwal, Shashank; Frid, Nadya

    2009-11-01

    Citations are ubiquitous in scientific articles and play important roles for representing the semantic content of a full-text biomedical article. In this work, we manually examined full-text biomedical articles to analyze the semantic content of citations in full-text biomedical articles. After developing a citation relation schema and annotation guideline, our pilot annotation results show an overall agreement of 0.71, and here we report on the research challenges and the lessons we've learned while trying to overcome them. Our work is a first step toward automatic citation classification in full-text biomedical articles, which may contribute to many text mining tasks, including information retrieval, extraction, summarization, and question answering.

  14. A reasonable Semantic Web

    NARCIS (Netherlands)

    Hitzler, Pascal; Van Harmelen, Frank

    2010-01-01

    The realization of Semantic Web reasoning is central to substantiating the Semantic Web vision. However, current mainstream research on this topic faces serious challenges, which forces us to question established lines of research and to rethink the underlying approaches. We argue that reasoning for

  15. Semantics, contrastive linguistics and parallel corpora

    Directory of Open Access Journals (Sweden)

    Violetta Koseska

    2014-09-01

    Full Text Available Semantics, contrastive linguistics and parallel corpora In view of the ambiguity of the term “semantics”, the author shows the differences between the traditional lexical semantics and the contemporary semantics in the light of various semantic schools. She examines semantics differently in connection with contrastive studies where the description must necessary go from the meaning towards the linguistic form, whereas in traditional contrastive studies the description proceeded from the form towards the meaning. This requirement regarding theoretical contrastive studies necessitates construction of a semantic interlanguage, rather than only singling out universal semantic categories expressed with various language means. Such studies can be strongly supported by parallel corpora. However, in order to make them useful for linguists in manual and computer translations, as well as in the development of dictionaries, including online ones, we need not only formal, often automatic, annotation of texts, but also semantic annotation - which is unfortunately manual. In the article we focus on semantic annotation concerning time, aspect and quantification of names and predicates in the whole semantic structure of the sentence on the example of the “Polish-Bulgarian-Russian parallel corpus”.

  16. Challenges Facing the Semantic Web and Social Software as Communication Technology Agents in E-Learning Environments

    Science.gov (United States)

    Olaniran, Bolanle A.

    2010-01-01

    The semantic web describes the process whereby information content is made available for machine consumption. With increased reliance on information communication technologies, the semantic web promises effective and efficient information acquisition and dissemination of products and services in the global economy, in particular, e-learning.…

  17. Semantic Advertising

    OpenAIRE

    Zamanzadeh, Ben; Ashish, Naveen; Ramakrishnan, Cartic; Zimmerman, John

    2013-01-01

    We present the concept of Semantic Advertising which we see as the future of online advertising. Semantic Advertising is online advertising powered by semantic technology which essentially enables us to represent and reason with concepts and the meaning of things. This paper aims to 1) Define semantic advertising, 2) Place it in the context of broader and more widely used concepts such as the Semantic Web and Semantic Search, 3) Provide a survey of work in related areas such as context matchi...

  18. Annotation of toponyms in TEI digital literary editions and linking to the web of data

    Directory of Open Access Journals (Sweden)

    Frontini, Francesca

    2016-07-01

    Full Text Available This paper aims to discuss the challenges and benefits of the annotation of place names in literary texts and literary criticism. We shall first highlight the problems of encoding spatial information in digital editions using the TEI format by means of two manual annotation experiments and the discussion of various cases. This will lead to the question of how to use existing semantic web resources to complement and enrich toponym mark-up, in particular to provide mentions with precise georeferencing. Finally the automatic annotation of a large corpus will show the potential of visualizing places from texts, by illustrating an analysis of the evolution of literary life from the spatial and geographical point of view.

  19. A Semantic Web-based System for Managing Clinical Archetypes.

    Science.gov (United States)

    Fernandez-Breis, Jesualdo Tomas; Menarguez-Tortosa, Marcos; Martinez-Costa, Catalina; Fernandez-Breis, Eneko; Herrero-Sempere, Jose; Moner, David; Sanchez, Jesus; Valencia-Garcia, Rafael; Robles, Montserrat

    2008-01-01

    Archetypes facilitate the sharing of clinical knowledge and therefore are a basic tool for achieving interoperability between healthcare information systems. In this paper, a Semantic Web System for Managing Archetypes is presented. This system allows for the semantic annotation of archetypes, as well for performing semantic searches. The current system is capable of working with both ISO13606 and OpenEHR archetypes.

  20. Semantic attributes for people's appearance description: an appearance modality for video surveillance applications

    Science.gov (United States)

    Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed

    2017-09-01

    Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We proposed a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body in uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn the semantic classifiers uniformly. Their critical limitation is the inability to capture the dominant visual characteristics for each trait separately. The proposed approach consists of extracting low-level features in an attribute-adaptive way by automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training-dataset would easily lead to poor performance due to the large intraclass and interclass variations. We annotated large scale people images collected from different person reidentification benchmarks covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset that contains 3590 images annotated with 14 attributes. Various experiments prove that carefully designed features for learning the visual characteristics for an attribute provide an improvement of the correct classification accuracy and a reduction of both spatial and temporal complexities against state-of-the-art approaches.

  1. Semantic role labeling for protein transport predicates

    Directory of Open Access Journals (Sweden)

    Martin James H

    2008-06-01

    Full Text Available Abstract Background Automatic semantic role labeling (SRL is a natural language processing (NLP technique that maps sentences to semantic representations. This technique has been widely studied in the recent years, but mostly with data in newswire domains. Here, we report on a SRL model for identifying the semantic roles of biomedical predicates describing protein transport in GeneRIFs – manually curated sentences focusing on gene functions. To avoid the computational cost of syntactic parsing, and because the boundaries of our protein transport roles often did not match up with syntactic phrase boundaries, we approached this problem with a word-chunking paradigm and trained support vector machine classifiers to classify words as being at the beginning, inside or outside of a protein transport role. Results We collected a set of 837 GeneRIFs describing movements of proteins between cellular components, whose predicates were annotated for the semantic roles AGENT, PATIENT, ORIGIN and DESTINATION. We trained these models with the features of previous word-chunking models, features adapted from phrase-chunking models, and features derived from an analysis of our data. Our models were able to label protein transport semantic roles with 87.6% precision and 79.0% recall when using manually annotated protein boundaries, and 87.0% precision and 74.5% recall when using automatically identified ones. Conclusion We successfully adapted the word-chunking classification paradigm to semantic role labeling, applying it to a new domain with predicates completely absent from any previous studies. By combining the traditional word and phrasal role labeling features with biomedical features like protein boundaries and MEDPOST part of speech tags, we were able to address the challenges posed by the new domain data and subsequently build robust models that achieved F-measures as high as 83.1. This system for extracting protein transport information from Gene

  2. Reflect: a practical approach to web semantics

    DEFF Research Database (Denmark)

    O'Donoghue, S.I.; Horn, Heiko; Pafilisa, E.

    2010-01-01

    To date, adding semantic capabilities to web content usually requires considerable server-side re-engineering, thus only a tiny fraction of all web content currently has semantic annotations. Recently, we announced Reflect (http://reflect.ws), a free service that takes a more practical approach......: Reflect uses augmented browsing to allow end-users to add systematic semantic annotations to any web-page in real-time, typically within seconds. In this paper we describe the tagging process in detail and show how further entity types can be added to Reflect; we also describe how publishers and content...... web technologies....

  3. Advancing translational research with the Semantic Web.

    Science.gov (United States)

    Ruttenberg, Alan; Clark, Tim; Bug, William; Samwald, Matthias; Bodenreider, Olivier; Chen, Helen; Doherty, Donald; Forsberg, Kerstin; Gao, Yong; Kashyap, Vipul; Kinoshita, June; Luciano, Joanne; Marshall, M Scott; Ogbuji, Chimezie; Rees, Jonathan; Stephens, Susie; Wong, Gwendolyn T; Wu, Elizabeth; Zaccagnini, Davide; Hongsermeier, Tonya; Neumann, Eric; Herman, Ivan; Cheung, Kei-Hoi

    2007-05-09

    A fundamental goal of the U.S. National Institute of Health (NIH) "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries in basic research to application at the clinical level. A significant barrier to translational research is the lack of uniformly structured data across related biomedical domains. The Semantic Web is an extension of the current Web that enables navigation and meaningful use of digital resources by automatic processes. It is based on common formats that support aggregation and integration of data drawn from diverse sources. A variety of technologies have been built on this foundation that, together, support identifying, representing, and reasoning across a wide range of biomedical data. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG), set up within the framework of the World Wide Web Consortium, was launched to explore the application of these technologies in a variety of areas. Subgroups focus on making biomedical data available in RDF, working with biomedical ontologies, prototyping clinical decision support systems, working on drug safety and efficacy communication, and supporting disease researchers navigating and annotating the large amount of potentially relevant literature. We present a scenario that shows the value of the information environment the Semantic Web can support for aiding neuroscience researchers. We then report on several projects by members of the HCLSIG, in the process illustrating the range of Semantic Web technologies that have applications in areas of biomedicine. Semantic Web technologies present both promise and challenges. Current tools and standards are already adequate to implement components of the bench-to-bedside vision. On the other hand, these technologies are young. Gaps in standards and implementations still exist and adoption is limited by typical problems with early technology, such as the need for a critical mass of practitioners and installed base

  4. Biomedical semantics in the Semantic Web.

    Science.gov (United States)

    Splendiani, Andrea; Burger, Albert; Paschke, Adrian; Romano, Paolo; Marshall, M Scott

    2011-03-07

    The Semantic Web offers an ideal platform for representing and linking biomedical information, which is a prerequisite for the development and application of analytical tools to address problems in data-intensive areas such as systems biology and translational medicine. As for any new paradigm, the adoption of the Semantic Web offers opportunities and poses questions and challenges to the life sciences scientific community: which technologies in the Semantic Web stack will be more beneficial for the life sciences? Is biomedical information too complex to benefit from simple interlinked representations? What are the implications of adopting a new paradigm for knowledge representation? What are the incentives for the adoption of the Semantic Web, and who are the facilitators? Is there going to be a Semantic Web revolution in the life sciences?We report here a few reflections on these questions, following discussions at the SWAT4LS (Semantic Web Applications and Tools for Life Sciences) workshop series, of which this Journal of Biomedical Semantics special issue presents selected papers from the 2009 edition, held in Amsterdam on November 20th.

  5. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need forautomated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discoveredknowledge into Gene Ontology (GO concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching to a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached to the precision level of 78% at therecall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i automating the annotation of genomic entities with Gene Ontology concepts, and (ii providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieve high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exactmatching" with the advantage of locating approximate

  6. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so that they are inevitably trapped into suboptimal performance of these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems based on which a variety of loss functions with respect to objective-guided measures are defined. And then, we formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. According to the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper, we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four

  7. Design and Applications of a GeoSemantic Framework for Integration of Data and Model Resources in Hydrologic Systems

    Science.gov (United States)

    Elag, M.; Kumar, P.

    2016-12-01

    Hydrologists today have to integrate resources such as data and models, which originate and reside in multiple autonomous and heterogeneous repositories over the Web. Several resource management systems have emerged within geoscience communities for sharing long-tail data, which are collected by individual or small research groups, and long-tail models, which are developed by scientists or small modeling communities. While these systems have increased the availability of resources within geoscience domains, deficiencies remain due to the heterogeneity in the methods, which are used to describe, encode, and publish information about resources over the Web. This heterogeneity limits our ability to access the right information in the right context so that it can be efficiently retrieved and understood without the Hydrologist's mediation. A primary challenge of the Web today is the lack of the semantic interoperability among the massive number of resources, which already exist and are continually being generated at rapid rates. To address this challenge, we have developed a decentralized GeoSemantic (GS) framework, which provides three sets of micro-web services to support (i) semantic annotation of resources, (ii) semantic alignment between the metadata of two resources, and (iii) semantic mediation among Standard Names. Here we present the design of the framework and demonstrate its application for semantic integration between data and models used in the IML-CZO. First we show how the IML-CZO data are annotated using the Semantic Annotation Services. Then we illustrate how the Resource Alignment Services and Knowledge Integration Services are used to create a semantic workflow among TopoFlow model, which is a spatially-distributed hydrologic model and the annotated data. Results of this work are (i) a demonstration of how the GS framework advances the integration of heterogeneous data and models of water-related disciplines by seamless handling of their semantic

  8. Semantic Multimedia

    NARCIS (Netherlands)

    S. Staab; A. Scherp; R. Arndt; R. Troncy (Raphael); M. Grzegorzek; C. Saathoff; S. Schenk; L. Hardman (Lynda)

    2008-01-01

    htmlabstractMultimedia constitutes an interesting field of application for Semantic Web and Semantic Web reasoning, as the access and management of multimedia content and context depends strongly on the semantic descriptions of both. At the same time, multimedia resources constitute complex objects,

  9. Generative Semantics.

    Science.gov (United States)

    King, Margaret

    The first section of this paper deals with the attempts within the framework of transformational grammar to make semantics a systematic part of linguistic description, and outlines the characteristics of the generative semantics position. The second section takes a critical look at generative semantics in its later manifestations, and makes a case…

  10. Tagging like Humans: Diverse and Distinct Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2018-03-31

    In this work we propose a new automatic image annotation model, dubbed {\\\\bf diverse and distinct image annotation} (D2IA). The generative model D2IA is inspired by the ensemble of human annotations, which create semantically relevant, yet distinct and diverse tags. In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct to each other, using sequential sampling from a determinantal point process (DPP) model. Multiple such tag subsets that cover diverse semantic aspects or diverse semantic levels of the image contents are generated by randomly perturbing the DPP sampling process. We leverage a generative adversarial network (GAN) model to train D2IA. Extensive experiments including quantitative and qualitative comparisons, as well as human subject studies, on two benchmark datasets demonstrate that the proposed model can produce more diverse and distinct tags than the state-of-the-arts.

  11. Semantic integration of medication data into the EHOP Clinical Data Warehouse.

    Science.gov (United States)

    Delamarre, Denis; Bouzille, Guillaume; Dalleau, Kevin; Courtel, Denis; Cuggia, Marc

    2015-01-01

    Reusing medication data is crucial for many medical research domains. Semantic integration of such data in clinical data warehouse (CDW) is quite challenging. Our objective was to develop a reliable and scalable method for integrating prescription data into EHOP (a French CDW). PN13/PHAST was used as the semantic interoperability standard during the ETL process, and to store the prescriptions as documents in the CDW. Theriaque was used as a drug knowledge database (DKDB), to annotate the prescription dataset with the finest granularity, and to provide semantic capabilities to the EHOP query workbench. the system was evaluated on a clinical data set. Depending on the use case, the precision ranged from 52% to 100%, Recall was always 100%. interoperability standards and DKDB, document approach, and the finest granularity approach are the key factors for successful drug data integration in CDW.

  12. Determining Semantically Related Significant Genes.

    Science.gov (United States)

    Taha, Kamal

    2014-01-01

    GO relation embodies some aspects of existence dependency. If GO term xis existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene sets enrichment. However, most of these methods overlook the structural dependencies between GO terms in GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term xcannot be existence-dependent on GO term y, if x- and y- have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.

  13. Annotated bibliography

    International Nuclear Information System (INIS)

    1997-08-01

    Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography which contains almost 100 citations of articles/books/resources involving topics related to communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted past years proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners

  14. BioAssay templates for the semantic web

    Directory of Open Access Journals (Sweden)

    Alex M. Clark

    2016-05-01

    Full Text Available Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive ontology, some or all of the pertinent information can be captured by asserting a series of facts, expressed as semantic web triples (subject, predicate, object. With appropriate annotation, assays can be searched, clustered, tagged and evaluated in a multitude of ways, analogous to other segments of drug discovery informatics. The BioAssay Ontology (BAO has been previously designed for this express purpose, and provides a layered hierarchy of meaningful terms which can be linked to. Currently the biggest challenge is the issue of content creation: scientists cannot be expected to use the BAO effectively without having access to software tools that make it straightforward to use the vocabulary in a canonical way. We have sought to remove this barrier by: (1 defining a BioAssay Template (BAT data model; (2 creating a software tool for experts to create or modify templates to suit their needs; and (3 designing a common assay template (CAT to leverage the most value from the BAO terms. The CAT was carefully assembled by biologists in order to find a balance between the maximum amount of information captured vs. low degrees of freedom in order to keep the user experience as simple as possible. The data format that we use for describing templates and corresponding annotations is the native format of the semantic web (RDF triples, and we demonstrate some of the ways that generated content can be meaningfully queried using the SPARQL language. We have made all of these materials available as open source (http://github.com/cdd/bioassay-template, in order to encourage community input and use within diverse projects, including but not limited to our own

  15. Semantic integration to identify overlapping functional modules in protein interaction networks

    Directory of Open Access Journals (Sweden)

    Ramanathan Murali

    2007-07-01

    Full Text Available Abstract Background The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification.

  16. Semantic metrics

    OpenAIRE

    Hu, Bo; Kalfoglou, Yannis; Dupplaw, David; Alani, Harith; Lewis, Paul; Shadbolt, Nigel

    2006-01-01

    In the context of the Semantic Web, many ontology-related operations, e.g. ontology ranking, segmentation, alignment, articulation, reuse, evaluation, can be boiled down to one fundamental operation: computing the similarity and/or dissimilarity among ontological entities, and in some cases among ontologies themselves. In this paper, we review standard metrics for computing distance measures and we propose a series of semantic metrics. We give a formal account of semantic metrics drawn from a...

  17. Learning Semantic Segmentation with Diverse Supervision

    OpenAIRE

    Ye, Linwei; Liu, Zhi; Wang, Yang

    2018-01-01

    Models based on deep convolutional neural networks (CNN) have significantly improved the performance of semantic segmentation. However, learning these models requires a large amount of training images with pixel-level labels, which are very costly and time-consuming to collect. In this paper, we propose a method for learning CNN-based semantic segmentation models from images with several types of annotations that are available for various computer vision tasks, including image-level labels fo...

  18. Concept annotation in the CRAFT corpus.

    Science.gov (United States)

    Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E

    2012-07-09

    Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

  19. Hierarchical Semantic Model of Geovideo

    Directory of Open Access Journals (Sweden)

    XIE Xiao

    2015-05-01

    Full Text Available The public security incidents were getting increasingly challenging with regard to their new features, including multi-scale mobility, multistage dynamic evolution, as well as spatiotemporal concurrency and uncertainty in the complex urban environment. However, the existing video models, which were used/designed for independent archive or local analysis of surveillance video, have seriously inhibited emergency response to the urgent requirements.Aiming at the explicit representation of change mechanism in video, the paper proposed a novel hierarchical geovideo semantic model using UML. This model was characterized by the hierarchical representation of both data structure and semantics based on the change-oriented three domains (feature domain, process domain and event domain instead of overall semantic description of video streaming; combining both geographical semantics and video content semantics, in support of global semantic association between multiple geovideo data. The public security incidents by video surveillance are inspected as an example to illustrate the validity of this model.

  20. Semantic stability is more pleasurable in unstable episodic contexts. On the relevance of perceptual challenge in art appreciation

    Directory of Open Access Journals (Sweden)

    Claudia eMuth

    2016-02-01

    Full Text Available Research in the field of psychological aesthetics points to the appeal of stimuli which defy easy recognition by being semantically unstable but which still allow for creating meaning—in the ongoing process of elaborative perception or as an end product of the entire process. Such effects were reported for hidden images (Muth & Carbon, 2013 as well as Cubist artworks concealing detectable—although fragmented—objects (Muth, Pepperell, & Carbon, 2013. To test the stability of the relationship between semantic determinacy and appreciation across different episodic contexts, 30 volunteers evaluated an artistic movie continuously on visual determinacy or liking via the Continuous Evaluation Procedure (CEP, Muth, Raab, & Carbon, 2015. The movie consisted of five episodes with emerging Gestalts. In the first between-participants condition, the hidden Gestalts in the movie episodes were of increasing determinacy, in the second condition, the episodes showed decreasing determinacies of hidden Gestalts. In the increasing-determinacy group, visual determinacy was rated higher and showed better predictive quality for liking than in the decreasing-determinacy group. Furthermore, when the movie started with low visual determinacy of hidden Gestalts, unexpectedly strong increases in visual determinacy had a bigger effect on liking than in the condition which allowed for weaker Gestalt recognition after having started with highly determinate Gestalts. The resulting pattern calls for consideration of the episodic context when examining art appreciation.

  1. Montague semantics

    NARCIS (Netherlands)

    Janssen, T.M.V.

    2012-01-01

    Montague semantics is a theory of natural language semantics and of its relation with syntax. It was originally developed by the logician Richard Montague (1930-1971) and subsequently modified and extended by linguists, philosophers, and logicians. The most important features of the theory are its

  2. LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUEPRVISED WSD

    Energy Technology Data Exchange (ETDEWEB)

    VERSPOOR, KARIN [Los Alamos National Laboratory; LIN, SHOU-DE [Los Alamos National Laboratory

    2007-01-29

    An N-gram language model aims at capturing statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.

  3. A Semantic Approach for Recommendations generation: some Cultural Heritage applications

    Directory of Open Access Journals (Sweden)

    Maurizio De Tommasi

    2011-12-01

    Full Text Available EnThe growing availability of data in the information systems has raised the challenging problem of distinguishing between the resources that belong to the same information context. Starting from the hypothesis that the information system is based on Semantic Web technologies, is it possible to use these technologies to make an information system more adaptive to user requirements in order to enable personalization and differentiation mechanisms in the information delivery process?This paper proposes an approach to building recommendations by using Semantic Web technologies, in order to give the users a different access to the information. The outcome is a semantic recommender engine, capable of retrieving and ranking semantically annotated resources, by using a set of domain ontologies and a semantic matching algorithm. We are showing some applications of this model in the Cultural Heritage domain in which the presented approach seems to be particularly effective, due to the richness of semantic structures and models existing for such domain.ItLa crescente quantità di dati disponibili da parte dei sistemi informativi ha sollevato il complesso problema della distinzione tra risorse appartenenti allo stesso contesto informativo. Partendo dall'ipotesi che il sistema informativo si basi sulle tecnologie proprie del Web Semantico, è possibile utilizzare tali tecnologie per rendere il sistema adattivo ai requisiti dell'utente, abilitando, in questo modo, meccanismi di personalizzazione e differenziazione?Questo articolo propone un approccio per la generazione di recommendation,  utilizzando le tecnologie del Web Semantico, al fine di fornire, ai singoli utenti, accessi differenziati alle informazioni. Il risultato è un motore di generazione di recommendation semantiche, in grado di recuperare e classificare risorse annotate semanticamente, avvalendosi di un set di ontologie di dominio e di un algoritmo di matching semantico. Saranno

  4. Semantic Support for Complex Ecosystem Research Environments

    Science.gov (United States)

    Klawonn, M.; McGuinness, D. L.; Pinheiro, P.; Santos, H. O.; Chastain, K.

    2015-12-01

    As ecosystems come under increasing stresses from diverse sources, there is growing interest in research efforts aimed at monitoring, modeling, and improving understanding of ecosystems and protection options. We aimed to provide a semantic infrastructure capable of representing data initially related to one large aquatic ecosystem research effort - the Jefferson project at Lake George. This effort includes significant historical observational data, extensive sensor-based monitoring data, experimental data, as well as model and simulation data covering topics including lake circulation, watershed runoff, lake biome food webs, etc. The initial measurement representation has been centered on monitoring data and related provenance. We developed a human-aware sensor network ontology (HASNetO) that leverages existing ontologies (PROV-O, OBOE, VSTO*) in support of measurement annotations. We explicitly support the human-aware aspects of human sensor deployment and collection activity to help capture key provenance that often is lacking. Our foundational ontology has since been generalized into a family of ontologies and used to create our human-aware data collection infrastructure that now supports the integration of measurement data along with simulation data. Interestingly, we have also utilized the same infrastructure to work with partners who have some more specific needs for specifying the environmental conditions where measurements occur, for example, knowing that an air temperature is not an external air temperature, but of the air temperature when windows are shut and curtains are open. We have also leveraged the same infrastructure to work with partners more interested in modeling smart cities with data feeds more related to people, mobility, environment, and living. We will introduce our human-aware data collection infrastructure, and demonstrate how it uses HASNetO and its supporting SOLR-based search platform to support data integration and semantic browsing

  5. Advancing translational research with the Semantic Web

    Science.gov (United States)

    Ruttenberg, Alan; Clark, Tim; Bug, William; Samwald, Matthias; Bodenreider, Olivier; Chen, Helen; Doherty, Donald; Forsberg, Kerstin; Gao, Yong; Kashyap, Vipul; Kinoshita, June; Luciano, Joanne; Marshall, M Scott; Ogbuji, Chimezie; Rees, Jonathan; Stephens, Susie; Wong, Gwendolyn T; Wu, Elizabeth; Zaccagnini, Davide; Hongsermeier, Tonya; Neumann, Eric; Herman, Ivan; Cheung, Kei-Hoi

    2007-01-01

    Background A fundamental goal of the U.S. National Institute of Health (NIH) "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries in basic research to application at the clinical level. A significant barrier to translational research is the lack of uniformly structured data across related biomedical domains. The Semantic Web is an extension of the current Web that enables navigation and meaningful use of digital resources by automatic processes. It is based on common formats that support aggregation and integration of data drawn from diverse sources. A variety of technologies have been built on this foundation that, together, support identifying, representing, and reasoning across a wide range of biomedical data. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG), set up within the framework of the World Wide Web Consortium, was launched to explore the application of these technologies in a variety of areas. Subgroups focus on making biomedical data available in RDF, working with biomedical ontologies, prototyping clinical decision support systems, working on drug safety and efficacy communication, and supporting disease researchers navigating and annotating the large amount of potentially relevant literature. Results We present a scenario that shows the value of the information environment the Semantic Web can support for aiding neuroscience researchers. We then report on several projects by members of the HCLSIG, in the process illustrating the range of Semantic Web technologies that have applications in areas of biomedicine. Conclusion Semantic Web technologies present both promise and challenges. Current tools and standards are already adequate to implement components of the bench-to-bedside vision. On the other hand, these technologies are young. Gaps in standards and implementations still exist and adoption is limited by typical problems with early technology, such as the need for a critical mass of

  6. Advancing translational research with the Semantic Web

    Directory of Open Access Journals (Sweden)

    Marshall M Scott

    2007-05-01

    Full Text Available Abstract Background A fundamental goal of the U.S. National Institute of Health (NIH "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries in basic research to application at the clinical level. A significant barrier to translational research is the lack of uniformly structured data across related biomedical domains. The Semantic Web is an extension of the current Web that enables navigation and meaningful use of digital resources by automatic processes. It is based on common formats that support aggregation and integration of data drawn from diverse sources. A variety of technologies have been built on this foundation that, together, support identifying, representing, and reasoning across a wide range of biomedical data. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG, set up within the framework of the World Wide Web Consortium, was launched to explore the application of these technologies in a variety of areas. Subgroups focus on making biomedical data available in RDF, working with biomedical ontologies, prototyping clinical decision support systems, working on drug safety and efficacy communication, and supporting disease researchers navigating and annotating the large amount of potentially relevant literature. Results We present a scenario that shows the value of the information environment the Semantic Web can support for aiding neuroscience researchers. We then report on several projects by members of the HCLSIG, in the process illustrating the range of Semantic Web technologies that have applications in areas of biomedicine. Conclusion Semantic Web technologies present both promise and challenges. Current tools and standards are already adequate to implement components of the bench-to-bedside vision. On the other hand, these technologies are young. Gaps in standards and implementations still exist and adoption is limited by typical problems with early technology, such as the need

  7. A Compositional Semantics for Stochastic Reo Connectors

    NARCIS (Netherlands)

    Y.-J. Moon (Young-Joo); A.M. Silva (Alexandra); , C. (born Köhler, , C.) Krause (Christian); F. Arbab (Farhad)

    2010-01-01

    htmlabstractIn this paper we present a compositional semantics for the channel-based coordination language Reo which enables the analysis of quality of service (QoS) properties of service compositions. For this purpose, we annotate Reo channels with stochastic delay rates and explicitly model

  8. SemVisM: semantic visualizer for medical image

    Science.gov (United States)

    Landaeta, Luis; La Cruz, Alexandra; Baranya, Alexander; Vidal, María.-Esther

    2015-01-01

    SemVisM is a toolbox that combines medical informatics and computer graphics tools for reducing the semantic gap between low-level features and high-level semantic concepts/terms in the images. This paper presents a novel strategy for visualizing medical data annotated semantically, combining rendering techniques, and segmentation algorithms. SemVisM comprises two main components: i) AMORE (A Modest vOlume REgister) to handle input data (RAW, DAT or DICOM) and to initially annotate the images using terms defined on medical ontologies (e.g., MesH, FMA or RadLex), and ii) VOLPROB (VOlume PRObability Builder) for generating the annotated volumetric data containing the classified voxels that belong to a particular tissue. SemVisM is built on top of the semantic visualizer ANISE.1

  9. Semantic Desktop

    Science.gov (United States)

    Sauermann, Leo; Kiesel, Malte; Schumacher, Kinga; Bernardi, Ansgar

    In diesem Beitrag wird gezeigt, wie der Arbeitsplatz der Zukunft aussehen könnte und wo das Semantic Web neue Möglichkeiten eröffnet. Dazu werden Ansätze aus dem Bereich Semantic Web, Knowledge Representation, Desktop-Anwendungen und Visualisierung vorgestellt, die es uns ermöglichen, die bestehenden Daten eines Benutzers neu zu interpretieren und zu verwenden. Dabei bringt die Kombination von Semantic Web und Desktop Computern besondere Vorteile - ein Paradigma, das unter dem Titel Semantic Desktop bekannt ist. Die beschriebenen Möglichkeiten der Applikationsintegration sind aber nicht auf den Desktop beschränkt, sondern können genauso in Web-Anwendungen Verwendung finden.

  10. Semantic Interoperability in Heterogeneous IoT Infrastructure for Healthcare

    Directory of Open Access Journals (Sweden)

    Sohail Jabbar

    2017-01-01

    Full Text Available Interoperability remains a significant burden to the developers of Internet of Things’ Systems. This is due to the fact that the IoT devices are highly heterogeneous in terms of underlying communication protocols, data formats, and technologies. Secondly due to lack of worldwide acceptable standards, interoperability tools remain limited. In this paper, we proposed an IoT based Semantic Interoperability Model (IoT-SIM to provide Semantic Interoperability among heterogeneous IoT devices in healthcare domain. Physicians communicate their patients with heterogeneous IoT devices to monitor their current health status. Information between physician and patient is semantically annotated and communicated in a meaningful way. A lightweight model for semantic annotation of data using heterogeneous devices in IoT is proposed to provide annotations for data. Resource Description Framework (RDF is a semantic web framework that is used to relate things using triples to make it semantically meaningful. RDF annotated patients’ data has made it semantically interoperable. SPARQL query is used to extract records from RDF graph. For simulation of system, we used Tableau, Gruff-6.2.0, and Mysql tools.

  11. Coordinating Communities and Building Governance in the Development of Schematic and Semantic Standards: the Key to Solving Global Earth and Space Science Challenges in the 21st Century.

    Science.gov (United States)

    Wyborn, L. A.

    2007-12-01

    The Information Age in Science is being driven partly by the data deluge as exponentially growing volumes of data are being generated by research. Such large volumes of data cannot be effectively processed by humans and efficient and timely processing by computers requires development of specific machine readable formats. Further, as key challenges in earth and space sciences, such as climate change, hazard prediction and sustainable development resources require a cross disciplinary approach, data from various domains will need to be integrated from globally distributed sources also via machine to machine formats. However, it is becoming increasingly apparent that the existing standards can be very domain specific and most existing data transfer formats require human intervention. Where groups from different communities do try combine data across the domain/discipline boundaries much time is spent reformatting and reorganizing the data and it is conservatively estimated that this can take 80% of a project's time and resources. Four different types of standards are required for machine to machine interaction: systems, syntactic, schematic and semantic. Standards at the systems (WMS, WFS, etc) and at the syntactic level (GML, Observation and Measurement, SensorML) are being developed through international standards bodies such as ISO, OGC, W3C, IEEE etc. In contrast standards at the schematic level (e.g., GeoSciML, LandslidesML, WaterML, QuakeML) and at the semantic level (ie ontologies and vocabularies) are currently developing rapidly, in a very uncoordinated way and with little governance. As the size of the community that can machine read each others data depends on the size of the community that has developed the schematic or semantic standards, it is essential that to achieve global integration of earth and space science data, the required standards need to be developed through international collaboration using accepted standard proceedures. Once developed the

  12. Predicting Protein Function via Semantic Integration of Multiple Networks.

    Science.gov (United States)

    Yu, Guoxian; Fu, Guangyuan; Wang, Jun; Zhu, Hailong

    2016-01-01

    Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapidly accumulated large volumes of proteomic and genomic data drives to develop computational models for automatically predicting protein function in large scale. Recent approaches focus on integrating multiple heterogeneous data sources and they often get better results than methods that use single data source alone. In this paper, we investigate how to integrate multiple biological data sources with the biological knowledge, i.e., Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogenous data sources. SimNet firstly utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on the similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experiment results on heterogenous proteomic data sources of Yeast, Human, Mouse, and Fly show that, SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The Matlab codes of SimNet are available at https://sites.google.com/site/guoxian85/simnet.

  13. A topic modeling approach for web service annotation

    Directory of Open Access Journals (Sweden)

    Leandro Ordóñez-Ante

    2014-06-01

    Full Text Available The actual implementation of semantic-based mechanisms for service retrieval has been restricted, given the resource-intensive procedure involved in the formal specification of services, which generally comprises associating semantic annotations to their documentation sources. Typically, developer performs such a procedure by hand, requiring specialized knowledge on models for semantic description of services (e.g. OWL-S, WSMO, SAWSDL, as well as formal specifications of knowledge. Thus, this semantic-based service description procedure turns out to be a cumbersome and error-prone task. This paper introduces a proposal for service annotation, based on processing web service documentation for extracting information regarding its offered capabilities. By uncovering the hidden semantic structure of such information through statistical analysis techniques, we are able to associate meaningful annotations to the services operations/resources, while grouping those operations into non-exclusive semantic related categories. This research paper belongs to the TelComp 2.0 project, which Colciencas and University of Cauca founded in cooperation.

  14. Active learning reduces annotation time for clinical concept extraction.

    Science.gov (United States)

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Supporting Keyword Search for Image Retrieval with Integration of Probabilistic Annotation

    Directory of Open Access Journals (Sweden)

    Tie Hua Zhou

    2015-05-01

    Full Text Available The ever-increasing quantities of digital photo resources are annotated with enriching vocabularies to form semantic annotations. Photo-sharing social networks have boosted the need for efficient and intuitive querying to respond to user requirements in large-scale image collections. In order to help users formulate efficient and effective image retrieval, we present a novel integration of a probabilistic model based on keyword query architecture that models the probability distribution of image annotations: allowing users to obtain satisfactory results from image retrieval via the integration of multiple annotations. We focus on the annotation integration step in order to specify the meaning of each image annotation, thus leading to the most representative annotations of the intent of a keyword search. For this demonstration, we show how a probabilistic model has been integrated to semantic annotations to allow users to intuitively define explicit and precise keyword queries in order to retrieve satisfactory image results distributed in heterogeneous large data sources. Our experiments on SBU (collected by Stony Brook University database show that (i our integrated annotation contains higher quality representatives and semantic matches; and (ii the results indicating annotation integration can indeed improve image search result quality.

  16. Semantic Web repositories for genomics data using the eXframe platform.

    Science.gov (United States)

    Merrill, Emily; Corlosquet, Stéphane; Ciccarese, Paolo; Clark, Tim; Das, Sudeshna

    2014-01-01

    With the advent of inexpensive assay technologies, there has been an unprecedented growth in genomics data as well as the number of databases in which it is stored. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a machine-readable format, or for standardized queries using SPARQL. This makes large-scale reuse, or integration with other knowledge bases very difficult. To address this challenge, we have developed the second generation of our eXframe platform, a reusable framework for creating online repositories of genomics experiments. This second generation model now publishes Semantic Web data. To accomplish this, we created an experiment model that covers provenance, citations, external links, assays, biomaterials used in the experiment, and the data collected during the process. The elements of our model are mapped to classes and properties from various established biomedical ontologies. Resource Description Framework (RDF) data is automatically produced using these mappings and indexed in an RDF store with a built-in Sparql Protocol and RDF Query Language (SPARQL) endpoint. Using the open-source eXframe software, institutions and laboratories can create Semantic Web repositories of their experiments, integrate it with heterogeneous resources and make it interoperable with the vast Semantic Web of biomedical knowledge.

  17. Action representation: crosstalk between semantics and pragmatics.

    Science.gov (United States)

    Prinz, Wolfgang

    2014-03-01

    Marc Jeannerod pioneered a representational approach to movement and action. In his approach, motor representations provide both, declarative knowledge about action and procedural knowledge for action (action semantics and action pragmatics, respectively). Recent evidence from language comprehension and action simulation supports the claim that action pragmatics and action semantics draw on common representational resources, thus challenging the traditional divide between declarative and procedural action knowledge. To account for these observations, three kinds of theoretical frameworks are discussed: (i) semantics is grounded in pragmatics, (ii) pragmatics is anchored in semantics, and (iii) pragmatics is part and parcel of semantics. © 2013 Elsevier Ltd. All rights reserved.

  18. A framework for automatic annotation of web pages using the Google Rich Snippets vocabulary

    NARCIS (Netherlands)

    Meer, van der J.; Boon, F.; Hogenboom, F.P.; Frasincar, F.; Kaymak, U.

    2011-01-01

    One of the latest developments for the Semantic Web is Google Rich Snippets, a service that uses Web page annotations for displaying search results in a visually appealing manner. In this paper we propose the Automatic Review Recognition and annOtation of Web pages (ARROW) framework, which is able

  19. A Linked Data-Based Collaborative Annotation System for Increasing Learning Achievements

    Science.gov (United States)

    Zarzour, Hafed; Sellami, Mokhtar

    2017-01-01

    With the emergence of the Web 2.0, collaborative annotation practices have become more mature in the field of learning. In this context, several recent studies have shown the powerful effects of the integration of annotation mechanism in learning process. However, most of these studies provide poor support for semantically structured resources,…

  20. A semantic medical multimedia retrieval approach using ontology information hiding.

    Science.gov (United States)

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.

  1. A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants.

    Science.gov (United States)

    Pilkington, Sarah M; Crowhurst, Ross; Hilario, Elena; Nardozza, Simona; Fraser, Lena; Peng, Yongyan; Gunaseelan, Kularajathevan; Simpson, Robert; Tahir, Jibran; Deroles, Simon C; Templeton, Kerry; Luo, Zhiwei; Davy, Marcus; Cheng, Canhong; McNeilage, Mark; Scaglione, Davide; Liu, Yifei; Zhang, Qiong; Datson, Paul; De Silva, Nihal; Gardiner, Susan E; Bassett, Heather; Chagné, David; McCallum, John; Dzierzon, Helge; Deng, Cecilia; Wang, Yen-Yi; Barron, Lorna; Manako, Kelvina; Bowen, Judith; Foster, Toshi M; Erridge, Zoe A; Tiffin, Heather; Waite, Chethi N; Davies, Kevin M; Grierson, Ella P; Laing, William A; Kirk, Rebecca; Chen, Xiuyin; Wood, Marion; Montefiori, Mirco; Brummell, David A; Schwinn, Kathy E; Catanach, Andrew; Fullerton, Christina; Li, Dawei; Meiyalaghan, Sathiyamoorthy; Nieuwenhuizen, Niels; Read, Nicola; Prakash, Roneel; Hunter, Don; Zhang, Huaibi; McKenzie, Marian; Knäbel, Mareike; Harris, Alastair; Allan, Andrew C; Gleave, Andrew; Chen, Angela; Janssen, Bart J; Plunkett, Blue; Ampomah-Dwamena, Charles; Voogd, Charlotte; Leif, Davin; Lafferty, Declan; Souleyre, Edwige J F; Varkonyi-Gasic, Erika; Gambi, Francesco; Hanley, Jenny; Yao, Jia-Long; Cheung, Joey; David, Karine M; Warren, Ben; Marsh, Ken; Snowden, Kimberley C; Lin-Wang, Kui; Brian, Lara; Martinez-Sanchez, Marcela; Wang, Mindy; Ileperuma, Nadeesha; Macnee, Nikolai; Campin, Robert; McAtee, Peter; Drummond, Revel S M; Espley, Richard V; Ireland, Hilary S; Wu, Rongmei; Atkinson, Ross G; Karunairetnam, Sakuntala; Bulley, Sean; Chunkath, Shayhan; Hanley, Zac; Storey, Roy; Thrimawithana, Amali H; Thomson, Susan; David, Charles; Testolin, Raffaele; Huang, Hongwen; Hellens, Roger P; Schaffer, Robert J

    2018-04-16

    Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) 'Hongyang' draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models. A second genome of an A. chinensis (genotype Red5) was fully sequenced. This new sequence resulted in a 554.0 Mb assembly with all but 6 Mb assigned to pseudo-chromosomes. Pseudo-chromosomal comparisons showed a considerable number of translocation events have occurred following a whole genome duplication (WGD) event some consistent with centromeric Robertsonian-like translocations. RNA sequencing data from 12 tissues and ab initio analysis informed a genome-wide manual annotation, using the WebApollo tool. In total, 33,044 gene loci represented by 33,123 isoforms were identified, named and tagged for quality of evidential support. Of these 3114 (9.4%) were identical to a protein within 'Hongyang' The Kiwifruit Information Resource (KIR v2). Some proportion of the differences will be varietal polymorphisms. However, as most computationally predicted Red5 models required manual re-annotation this proportion is expected to be small. The quality of the new gene models was tested by fully sequencing 550 cloned 'Hort16A' cDNAs and comparing with the predicted protein models for Red5 and both the original 'Hongyang' assembly and the revised annotation from KIR v2. Only 48.9% and 63.5% of the cDNAs had a match with 90% identity or better to the original and revised 'Hongyang' annotation, respectively, compared with 90.9% to the Red5 models. Our study highlights the need to take a cautious approach to draft genomes and computationally predicted genes. Our use of the manual annotation tool WebApollo facilitated manual checking and

  2. Semantic Activity Recognition

    OpenAIRE

    Thonnat , Monique

    2008-01-01

    International audience; Extracting automatically the semantics from visual data is a real challenge. We describe in this paper how recent work in cognitive vision leads to significative results in activity recognition for visualsurveillance and video monitoring. In particular we present work performed in the domain of video understanding in our PULSAR team at INRIA in Sophia Antipolis. Our main objective is to analyse in real-time video streams captured by static video cameras and to recogniz...

  3. Generative Semantics

    Science.gov (United States)

    Bagha, Karim Nazari

    2011-01-01

    Generative semantics is (or perhaps was) a research program within linguistics, initiated by the work of George Lakoff, John R. Ross, Paul Postal and later McCawley. The approach developed out of transformational generative grammar in the mid 1960s, but stood largely in opposition to work by Noam Chomsky and his students. The nature and genesis of…

  4. Inferentializing Semantics

    Czech Academy of Sciences Publication Activity Database

    Peregrin, Jaroslav

    2010-01-01

    Roč. 39, č. 3 (2010), s. 255-274 ISSN 0022-3611 R&D Projects: GA ČR(CZ) GA401/07/0904 Institutional research plan: CEZ:AV0Z90090514 Keywords : inference * proof theory * model theory * inferentialism * semantics Subject RIV: AA - Philosophy ; Religion

  5. Trust estimation of the semantic web using semantic web clustering

    Science.gov (United States)

    Shirgahi, Hossein; Mohsenzadeh, Mehran; Haj Seyyed Javadi, Hamid

    2017-05-01

    Development of semantic web and social network is undeniable in the Internet world these days. Widespread nature of semantic web has been very challenging to assess the trust in this field. In recent years, extensive researches have been done to estimate the trust of semantic web. Since trust of semantic web is a multidimensional problem, in this paper, we used parameters of social network authority, the value of pages links authority and semantic authority to assess the trust. Due to the large space of semantic network, we considered the problem scope to the clusters of semantic subnetworks and obtained the trust of each cluster elements as local and calculated the trust of outside resources according to their local trusts and trust of clusters to each other. According to the experimental result, the proposed method shows more than 79% Fscore that is about 11.9% in average more than Eigen, Tidal and centralised trust methods. Mean of error in this proposed method is 12.936, that is 9.75% in average less than Eigen and Tidal trust methods.

  6. The semantic similarity ensemble

    Directory of Open Access Journals (Sweden)

    Andrea Ballatore

    2013-12-01

    Full Text Available Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.

  7. High Performance Descriptive Semantic Analysis of Semantic Graph Databases

    Energy Technology Data Exchange (ETDEWEB)

    Joslyn, Cliff A.; Adolf, Robert D.; al-Saffar, Sinan; Feo, John T.; Haglin, David J.; Mackey, Greg E.; Mizell, David W.

    2011-06-02

    As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to understand their inherent semantic structure, whether codified in explicit ontologies or not. Our group is researching novel methods for what we call descriptive semantic analysis of RDF triplestores, to serve purposes of analysis, interpretation, visualization, and optimization. But data size and computational complexity makes it increasingly necessary to bring high performance computational resources to bear on this task. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multi-threaded architecture of the Cray XMT platform, conventional servers, and large data stores. In this paper we describe that architecture and our methods, and present the results of our analyses of basic properties, connected components, namespace interaction, and typed paths such for the Billion Triple Challenge 2010 dataset.

  8. Semantic Blogging : Spreading the Semantic Web Meme

    OpenAIRE

    Cayzer, Steve

    2004-01-01

    This paper is about semantic blogging, an application of the semantic web to blogging. The semantic web promises to make the web more useful by endowing metadata with machine processable semantics. Blogging is a lightweight web publishing paradigm which provides a very low barrier to entry, useful syndication and aggregation behaviour, a simple to understand structure and decentralized construction of a rich information network. Semantic blogging builds upon the success and clear network valu...

  9. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  10. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  11. The MGED Ontology: a resource for semantics-based description of microarray experiments.

    Science.gov (United States)

    Whetzel, Patricia L; Parkinson, Helen; Causton, Helen C; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Game, Laurence; Heiskanen, Mervi; Morrison, Norman; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Taylor, Chris; White, Joseph; Stoeckert, Christian J

    2006-04-01

    The generation of large amounts of microarray data and the need to share these data bring challenges for both data management and annotation and highlights the need for standards. MIAME specifies the minimum information needed to describe a microarray experiment and the Microarray Gene Expression Object Model (MAGE-OM) and resulting MAGE-ML provide a mechanism to standardize data representation for data exchange, however a common terminology for data annotation is needed to support these standards. Here we describe the MGED Ontology (MO) developed by the Ontology Working Group of the Microarray Gene Expression Data (MGED) Society. The MO provides terms for annotating all aspects of a microarray experiment from the design of the experiment and array layout, through to the preparation of the biological sample and the protocols used to hybridize the RNA and analyze the data. The MO was developed to provide terms for annotating experiments in line with the MIAME guidelines, i.e. to provide the semantics to describe a microarray experiment according to the concepts specified in MIAME. The MO does not attempt to incorporate terms from existing ontologies, e.g. those that deal with anatomical parts or developmental stages terms, but provides a framework to reference terms in other ontologies and therefore facilitates the use of ontologies in microarray data annotation. The MGED Ontology version.1.2.0 is available as a file in both DAML and OWL formats at http://mged.sourceforge.net/ontologies/index.php. Release notes and annotation examples are provided. The MO is also provided via the NCICB's Enterprise Vocabulary System (http://nciterms.nci.nih.gov/NCIBrowser/Dictionary.do). Stoeckrt@pcbi.upenn.edu Supplementary data are available at Bioinformatics online.

  12. BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology

    Directory of Open Access Journals (Sweden)

    Gregorio Sergio E

    2009-05-01

    Full Text Available Abstract Motivation Ontology development and the annotation of biological data using ontologies are time-consuming exercises that currently require input from expert curators. Open, collaborative platforms for biological data annotation enable the wider scientific community to become involved in developing and maintaining such resources. However, this openness raises concerns regarding the quality and correctness of the information added to these knowledge bases. The combination of a collaborative web-based platform with logic-based approaches and Semantic Web technology can be used to address some of these challenges and concerns. Results We have developed the BOWiki, a web-based system that includes a biological core ontology. The core ontology provides background knowledge about biological types and relations. Against this background, an automated reasoner assesses the consistency of new information added to the knowledge base. The system provides a platform for research communities to integrate information and annotate data collaboratively. Availability The BOWiki and supplementary material is available at http://www.bowiki.net/. The source code is available under the GNU GPL from http://onto.eva.mpg.de/trac/BoWiki.

  13. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement standard lecture format provide new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offers study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  14. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to little functional annotation associated with these arrays. The Affymetrix GenChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO. However the GO annotation data presented by Affymetrix is incomplete, for example, they do not show references linked to manually annotated functions. In addition, there is no tool that facilitates microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers amount of time in searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM tool to help researchers to quickly retrieve corresponding functional information for their dataset. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using AGOM tool. The disease, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via AgBase website and

  15. NoGOA: predicting noisy GO annotations using evidences and sparse representation.

    Science.gov (United States)

    Yu, Guoxian; Lu, Chang; Wang, Jun

    2017-07-21

    Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resources limitation, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently report that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important but yet seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Secondly, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived on different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using GO hierarchy. Finally, it integrates evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. Taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .

  16. Semantic mashups intelligent reuse of web resources

    CERN Document Server

    Endres-Niggemeyer, Brigitte

    2013-01-01

    Mashups are mostly lightweight Web applications that offer new functionalities by combining, aggregating and transforming resources and services available on the Web. Popular examples include a map in their main offer, for instance for real estate, hotel recommendations, or navigation tools.  Mashups may contain and mix client-side and server-side activity. Obviously, understanding the incoming resources (services, statistical figures, text, videos, etc.) is a precondition for optimally combining them, so that there is always some undercover semantics being used.  By using semantic annotations

  17. Annotating images by mining image search results.

    Science.gov (United States)

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged-one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.

  18. Jigsaw Semantics

    Directory of Open Access Journals (Sweden)

    Paul J. E. Dekker

    2010-12-01

    Full Text Available In the last decade the enterprise of formal semantics has been under attack from several philosophical and linguistic perspectives, and it has certainly suffered from its own scattered state, which hosts quite a variety of paradigms which may seem to be incompatible. It will not do to try and answer the arguments of the critics, because the arguments are often well-taken. The negative conclusions, however, I believe are not. The only adequate reply seems to be a constructive one, which puts several pieces of formal semantics, in particular dynamic semantics, together again. In this paper I will try and sketch an overview of tasks, techniques, and results, which serves to at least suggest that it is possible to develop a coherent overall picture of undeniably important and structural phenomena in the interpretation of natural language. The idea is that the concept of meanings as truth conditions after all provides an excellent start for an integrated study of the meaning and use of natural language, and that an extended notion of goal directed pragmatics naturally complements this picture. None of the results reported here are really new, but we think it is important to re-collect them.ReferencesAsher, Nicholas & Lascarides, Alex. 1998. ‘Questions in Dialogue’. Linguistics and Philosophy 23: 237–309.http://dx.doi.org/10.1023/A:1005364332007Borg, Emma. 2007. ‘Minimalism versus contextualism in semantics’. In Gerhard Preyer & Georg Peter (eds. ‘Context-Sensitivity and Semantic Minimalism’, pp. 339–359. Oxford: Oxford University Press.Cappelen, Herman & Lepore, Ernest. 1997. ‘On an Alleged Connection between Indirect Quotation and Semantic Theory’. Mind and Language 12: pp. 278–296.Cappelen, Herman & Lepore, Ernie. 2005. Insensitive Semantics. Oxford: Blackwell.http://dx.doi.org/10.1002/9780470755792Dekker, Paul. 2002. ‘Meaning and Use of Indefinite Expressions’. Journal of Logic, Language and Information 11: pp. 141–194

  19. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  20. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  1. Academic Program Administration via Semantic Web – A Case Study

    OpenAIRE

    Qurban A Memon; Shakeel A. Khoja

    2009-01-01

    Generally, administrative systems in an academic environment are disjoint and support independent queries. The objective in this work is to semantically connect these independent systems to provide support to queries run on the integrated platform. The proposed framework, by enriching educational material in the legacy systems, provides a value-added semantics layer where activities such as annotation, query and reasoning can be carried out to support management requirements. We discuss the d...

  2. Semantic framework for mapping object-oriented model to semantic web languages.

    Science.gov (United States)

    Ježek, Petr; Mouček, Roman

    2015-01-01

    The article deals with and discusses two main approaches in building semantic structures for electrophysiological metadata. It is the use of conventional data structures, repositories, and programming languages on one hand and the use of formal representations of ontologies, known from knowledge representation, such as description logics or semantic web languages on the other hand. Although knowledge engineering offers languages supporting richer semantic means of expression and technological advanced approaches, conventional data structures and repositories are still popular among developers, administrators and users because of their simplicity, overall intelligibility, and lower demands on technical equipment. The choice of conventional data resources and repositories, however, raises the question of how and where to add semantics that cannot be naturally expressed using them. As one of the possible solutions, this semantics can be added into the structures of the programming language that accesses and processes the underlying data. To support this idea we introduced a software prototype that enables its users to add semantically richer expressions into a Java object-oriented code. This approach does not burden users with additional demands on programming environment since reflective Java annotations were used as an entry for these expressions. Moreover, additional semantics need not to be written by the programmer directly to the code, but it can be collected from non-programmers using a graphic user interface. The mapping that allows the transformation of the semantically enriched Java code into the Semantic Web language OWL was proposed and implemented in a library named the Semantic Framework. This approach was validated by the integration of the Semantic Framework in the EEG/ERP Portal and by the subsequent registration of the EEG/ERP Portal in the Neuroscience Information Framework.

  3. Towards a Reactive Semantic Execution Environment

    Science.gov (United States)

    Komazec, Srdjan; Facca, Federico Michele

    Managing complex and distributed software systems built on top of the service-oriented paradigm has never been more challenging. While Semantic Web Service technologies offer a promising set of languages and tools as a foundation to resolve the heterogeneity and scalability issues, they are still failing to provide an autonomic execution environment. In this paper we present an approach based on Semantic Web Services to enable the monitoring and self-management of a Semantic Execution Environment (SEE), a brokerage system for Semantic Web Services. Our approach is founded on the event-triggered reactivity paradigm in order to facilitate environment control, thus contributing to its autonomicity, robustness and flexibility.

  4. An analysis on the entity annotations in biological corpora [v1; ref status: indexed, http://f1000r.es/2o0

    Directory of Open Access Journals (Sweden)

    Mariana Neves

    2014-04-01

    Full Text Available Collection of documents annotated with semantic entities and relationships are crucial resources to support development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotations for entity types were classified into six semantic groups and an overview on the semantic entities which can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.

  5. Open semantic analysis: The case of word level semantics in Danish

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup; Hansen, Lars Kai

    2017-01-01

    The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We...... describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction. Open Danish corpora were assembled and unsupervised learning implemented for explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different...... annotated word datasets. We test the semantic representations’ alignment with single word sentiment using supervised learning. We find that logistic regression and large random forests perform well with Word2vec features....

  6. USI: a fast and accurate approach for conceptual document annotation.

    Science.gov (United States)

    Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent

    2015-03-14

    Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high-level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage from the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By only relying on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion - instead of one score per concept.

  7. annot8r: GO, EC and KEGG annotation of EST datasets

    Directory of Open Access Journals (Sweden)

    Schmid Ralf

    2008-04-01

    Full Text Available Abstract Background The expressed sequence tag (EST methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO, Enzyme Commission (EC and Kyoto Encyclopaedia of Genes and Genomes (KEGG annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non

  8. Semantic Web Technologies for the Adaptive Web

    DEFF Research Database (Denmark)

    Dolog, Peter; Nejdl, Wolfgang

    2007-01-01

    Ontologies and reasoning are the key terms brought into focus by the semantic web community. Formal representation of ontologies in a common data model on the web can be taken as a foundation for adaptive web technologies as well. This chapter describes how ontologies shared on the semantic web...... provide conceptualization for the links which are a main vehicle to access information on the web. The subject domain ontologies serve as constraints for generating only those links which are relevant for the domain a user is currently interested in. Furthermore, user model ontologies provide additional...... means for deciding which links to show, annotate, hide, generate, and reorder. The semantic web technologies provide means to formalize the domain ontologies and metadata created from them. The formalization enables reasoning for personalization decisions. This chapter describes which components...

  9. A Comparison of Fuzzy and Annotated Logic Programming

    Czech Academy of Sciences Publication Activity Database

    Krajči, S.; Lencses, R.; Vojtáš, Peter

    2004-01-01

    Roč. 144, - (2004), s. 173-192 ISSN 0165-0114 R&D Projects: GA ČR GA201/00/1489 Grant - others:VEGA(SK) 1/7557/20; VEGA(SK) 1/7555/20; VEGA(SK) 1/0385/03 Institutional research plan: CEZ:AV0Z1030915 Keywords : fuzzy logic programming * generalized annotated programs * declarative and procedural semantics * continuous semantics and computable fixpoint * soundness and completeness Subject RIV: BA - General Mathematics Impact factor: 0.734, year: 2004

  10. Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

    Science.gov (United States)

    He, Bin; Dong, Bin; Guan, Yi; Yang, Jinfeng; Jiang, Zhipeng; Yu, Qiubin; Cheng, Jianyi; Qu, Chunyan

    2017-05-01

    To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baselines for research on Chinese texts in the clinical domain. An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. The syntactic corpus consists of 138 Chinese clinical documents with 47,426 tokens and 2612 full parsing trees, while the semantic corpus includes 992 documents that annotated 39,511 entities with their assertions and 7693 relations. IAA evaluation shows that this comprehensive corpus is of good quality, and the system modules are effective. The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage and active learning methods should be utilized to promote annotation efficiency. In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus with its NLP modules were constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain. Copyright © 2017. Published by Elsevier Inc.

  11. Preserved musical semantic memory in semantic dementia.

    Science.gov (United States)

    Weinstein, Jessica; Koenig, Phyllis; Gunawardena, Delani; McMillan, Corey; Bonner, Michael; Grossman, Murray

    2011-02-01

    To understand the scope of semantic impairment in semantic dementia. Case study. Academic medical center. A man with semantic dementia, as demonstrated by clinical, neuropsychological, and imaging studies. Music performance and magnetic resonance imaging results. Despite profoundly impaired semantic memory for words and objects due to left temporal lobe atrophy, this semiprofessional musician was creative and expressive in demonstrating preserved musical knowledge. Long-term representations of words and objects in semantic memory may be dissociated from meaningful knowledge in other domains, such as music.

  12. Inquisitive semantics and pragmatics

    NARCIS (Netherlands)

    Groenendijk, J.; Roelofsen, F.; Larrazabal, J.M.; Zubeldia, L.

    2009-01-01

    This paper starts with an informal introduction to inquisitive semantics. After that, we present a formal definition of the semantics, and introduce the basic semantic notions of inquisitiveness and informativeness, in terms of wich we define the semantic categories of questions, assertions, and

  13. Semi-Supervised Learning to Identify UMLS Semantic Relations.

    Science.gov (United States)

    Luo, Yuan; Uzuner, Ozlem

    2014-01-01

    The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configuration. Using KL divergence consistently performs the best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top level UMLS relations (2-way), and above 50% for clustering the second level relations (7-way).

  14. Semantic Web

    Directory of Open Access Journals (Sweden)

    Anna Lamandini

    2011-06-01

    Full Text Available The semantic Web is a technology at the service of knowledge which is aimed at accessibility and the sharing of content; facilitating interoperability between different systems and as such is one of the nine key technological pillars of TIC (technologies for information and communication within the third theme, programme specific cooperation of the seventh programme framework for research and development (7°PQRS, 2007-2013. As a system it seeks to overcome overload or excess of irrelevant information in Internet, in order to facilitate specific or pertinent research. It is an extension of the existing Web in which the aim is for cooperation between and the computer and people (the dream of Sir Tim Berners –Lee where machines can give more support to people when integrating and elaborating data in order to obtain inferences and a global sharing of data. It is a technology that is able to favour the development of a “data web” in other words the creation of a space in both sets of interconnected and shared data (Linked Data which allows users to link different types of data coming from different sources. It is a technology that will have great effect on everyday life since it will permit the planning of “intelligent applications” in various sectors such as education and training, research, the business world, public information, tourism, health, and e-government. It is an innovative technology that activates a social transformation (socio-semantic Web on a world level since it redefines the cognitive universe of users and enables the sharing not only of information but of significance (collective and connected intelligence.

  15. Automated evaluation of annotators for museum collections using subjective login

    NARCIS (Netherlands)

    Ceolin, D.; Nottamkandath, A.; Fokkink, W.J.; Dimitrakos, Th.; Moona, R.; Patel, Dh.; Harrison McKnight, D.

    2012-01-01

    Museums are rapidly digitizing their collections, and face a huge challenge to annotate every digitized artifact in store. Therefore they are opening up their archives for receiving annotations from experts world-wide. This paper presents an architecture for choosing the most eligible set of

  16. Annotation-based feature extraction from sets of SBML models.

    Science.gov (United States)

    Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron

    2015-01-01

    Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow to identify additional features for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three out of the methods are suitable to determine characteristic features for arbitrary sets of models: The selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map on concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison, or model-to-model comparison.

  17. Automated Speech and Audio Analysis for Semantic Access to Multimedia

    NARCIS (Netherlands)

    Jong, F.M.G. de; Ordelman, R.; Huijbregts, M.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  18. Automated speech and audio analysis for semantic access to multimedia

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Ordelman, Roeland J.F.; Huijbregts, M.A.H.; Avrithis, Y.; Kompatsiaris, Y.; Staab, S.; O' Connor, N.E.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  19. Semantic guidance of eye movements in real-world scenes.

    Science.gov (United States)

    Hwang, Alex D; Wang, Hsueh-Cheng; Pomplun, Marc

    2011-05-25

    The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects' gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects' eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Explaining Semantic Short-Term Memory Deficits: Evidence for the Critical Role of Semantic Control

    Science.gov (United States)

    Hoffman, Paul; Jefferies, Elizabeth; Lambon Ralph, Matthew A.

    2011-01-01

    Patients with apparently selective short-term memory (STM) deficits for semantic information have played an important role in developing multi-store theories of STM and challenge the idea that verbal STM is supported by maintaining activation in the language system. We propose that semantic STM deficits are not as selective as previously thought…

  1. A Semantic Web-based System for Mining Genetic Mutations in Cancer Clinical Trials.

    Science.gov (United States)

    Priya, Sambhawa; Jiang, Guoqian; Dasari, Surendra; Zimmermann, Michael T; Wang, Chen; Heflin, Jeff; Chute, Christopher G

    2015-01-01

    Textual eligibility criteria in clinical trial protocols contain important information about potential clinically relevant pharmacogenomic events. Manual curation for harvesting this evidence is intractable as it is error prone and time consuming. In this paper, we develop and evaluate a Semantic Web-based system that captures and manages mutation evidences and related contextual information from cancer clinical trials. The system has 2 main components: an NLP-based annotator and a Semantic Web ontology-based annotation manager. We evaluated the performance of the annotator in terms of precision and recall. We demonstrated the usefulness of the system by conducting case studies in retrieving relevant clinical trials using a collection of mutations identified from TCGA Leukemia patients and Atlas of Genetics and Cytogenetics in Oncology and Haematology. In conclusion, our system using Semantic Web technologies provides an effective framework for extraction, annotation, standardization and management of genetic mutations in cancer clinical trials.

  2. Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

    Science.gov (United States)

    Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

    2016-12-01

    Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. We present lessons learned from applying CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality

  3. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation - the process of transposing objects or events represented in a video or still image to the semantic level, may involve human interactions and computer-assisted solutions. Marine image annotation software (MIAS) have enabled over 500 publications to date. We review the functioning, application trends and developments, by comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input are basically a graphical user interface, with a video player or image browser that recognizes a specific time code or image code, allowing to log events in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software by the capability of integrating data associated to video collection, the most simple being the position coordinates of the video recording platform. MIAS have three main characteristics: annotating events in real time, posteriorly to annotation and interact with a database. These range from simple annotation interfaces, to full onboard data management systems, with a variety of toolboxes. Advanced packages allow to input and display data from multiple sensors or multiple annotators via intranet or internet. Posterior human-mediated annotation often include tools for data display and image analysis, e.g. length, area, image segmentation, point count; and in a few cases the possibility of browsing and editing previous dive logs or to analyze the annotations. The interaction with a database allows the automatic integration of annotations from different surveys, repeated annotation and collaborative annotation of shared datasets, browsing and querying of data. Progress in the field of automated annotation is mostly in post processing, for stable platforms or still images

  4. Programming the semantic web

    CERN Document Server

    Segaran, Toby; Taylor, Jamie

    2009-01-01

    With this book, the promise of the Semantic Web -- in which machines can find, share, and combine data on the Web -- is not just a technical possibility, but a practical reality Programming the Semantic Web demonstrates several ways to implement semantic web applications, using current and emerging standards and technologies. You'll learn how to incorporate existing data sources into semantically aware applications and publish rich semantic data. Each chapter walks you through a single piece of semantic technology and explains how you can use it to solve real problems. Whether you're writing

  5. Vox Populi : generating video documentaries from semantically annotated media repositories

    NARCIS (Netherlands)

    Bocconi, S.

    2006-01-01

    The context of this research is one or more online video repositories containing several hours of documentary footage and users possibly interested only in particular topics of that material. In such a setting it is not possible to craft a single version containing all possible topics the user might

  6. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  7. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely to amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  8. Semantic Web Technologies and Big Data Infrastructures: SPARQL Federated Querying of Heterogeneous Big Data Stores

    OpenAIRE

    Konstantopoulos, Stasinos; Charalambidis, Angelos; Mouchakis, Giannis; Troumpoukis, Antonis; Jakobitsch, Jürgen; Karkaletsis, Vangelis

    2016-01-01

    The ability to cross-link large scale data with each other and with structured Semantic Web data, and the ability to uniformly process Semantic Web and other data adds value to both the Semantic Web and to the Big Data community. This paper presents work in progress towards integrating Big Data infrastructures with Semantic Web technologies, allowing for the cross-linking and uniform retrieval of data stored in both Big Data infrastructures and Semantic Web data. The technical challenges invo...

  9. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2010-09-14

    The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation including. The annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".

  10. Semantic Document Model to Enhance Data and Knowledge Interoperability

    Science.gov (United States)

    Nešić, Saša

    To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources, that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.

  11. Preserved cumulative semantic interference despite amnesia

    Directory of Open Access Journals (Sweden)

    Gary Michael Oppenheim

    2015-05-01

    As predicted by Oppenheim et al’s (2010 implicit incremental learning account, WRP’s BCN RTs demonstrated strong (and significant repetition priming and semantic blocking effects (Figure 1. Similar to typical results from neurally intact undergraduates, WRP took longer to name pictures presented in semantically homogeneous blocks than in heterogeneous blocks, an effect that increased with each cycle. This result challenges accounts that ascribe cumulative semantic interference in this task to explicit memory mechanisms, instead suggesting that the effect has the sort of implicit learning bases that are typically spared in hippocampal amnesia.

  12. Snapshots for Semantic Maps

    National Research Council Canada - National Science Library

    Nielsen, Curtis W; Ricks, Bob; Goodrich, Michael A; Bruemmer, David; Few, Doug; Walton, Miles

    2004-01-01

    .... Semantic maps are a relatively new approach to information presentation. Semantic maps provide more detail about an environment than typical maps because they are augmented by icons or symbols that provide meaning for places or objects of interest...

  13. Propagating annotations of molecular networks using in silico fragmentation.

    Science.gov (United States)

    da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C

    2018-04-18

    The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.

  14. Reliability in content analysis: The case of semantic feature norms classification.

    Science.gov (United States)

    Bolognesi, Marianna; Pilgram, Roosmaryn; van den Heerik, Romy

    2017-12-01

    Semantic feature norms (e.g., STIMULUS: car → RESPONSE: ) are commonly used in cognitive psychology to look into salient aspects of given concepts. Semantic features are typically collected in experimental settings and then manually annotated by the researchers into feature types (e.g., perceptual features, taxonomic features, etc.) by means of content analyses-that is, by using taxonomies of feature types and having independent coders perform the annotation task. However, the ways in which such content analyses are typically performed and reported are not consistent across the literature. This constitutes a serious methodological problem that might undermine the theoretical claims based on such annotations. In this study, we first offer a review of some of the released datasets of annotated semantic feature norms and the related taxonomies used for content analysis. We then provide theoretical and methodological insights in relation to the content analysis methodology. Finally, we apply content analysis to a new dataset of semantic features and show how the method should be applied in order to deliver reliable annotations and replicable coding schemes. We tackle the following issues: (1) taxonomy structure, (2) the description of categories, (3) coder training, and (4) sustainability of the coding scheme-that is, comparison of the annotations provided by trained versus novice coders. The outcomes of the project are threefold: We provide methodological guidelines for semantic feature classification; we provide a revised and adapted taxonomy that can (arguably) be applied to both concrete and abstract concepts; and we provide a dataset of annotated semantic feature norms.

  15. Retrieval from semantic memory.

    NARCIS (Netherlands)

    Noordman-Vonk, Wietske

    1977-01-01

    The present study has been concerned with the retrieval of semantic information. Retrieving semantic information is a fundamental process in almost any kind of cognitive behavior. The introduction presented the main experimental paradigms and results found in the literature on semantic memory as

  16. Towards Universal Semantic Tagging

    NARCIS (Netherlands)

    Abzianidze, Lasha; Bos, Johan

    2017-01-01

    The paper proposes the task of universal semantic tagging---tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of

  17. Annotation: The Savant Syndrome

    Science.gov (United States)

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  18. Annotating Emotions in Meetings

    NARCIS (Netherlands)

    Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.

    We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete

  19. A semantically rich and standardised approach enhancing discovery of sensor data and metadata

    Science.gov (United States)

    Kokkinaki, Alexandra; Buck, Justin; Darroch, Louise

    2016-04-01

    The marine environment plays an essential role in the earth's climate. To enhance the ability to monitor the health of this important system, innovative sensors are being produced and combined with state of the art sensor technology. As the number of sensors deployed is continually increasing,, it is a challenge for data users to find the data that meet their specific needs. Furthermore, users need to integrate diverse ocean datasets originating from the same or even different systems. Standards provide a solution to the above mentioned challenges. The Open Geospatial Consortium (OGC) has created Sensor Web Enablement (SWE) standards that enable different sensor networks to establish syntactic interoperability. When combined with widely accepted controlled vocabularies, they become semantically rich and semantic interoperability is achievable. In addition, Linked Data is the recommended best practice for exposing, sharing and connecting information on the Semantic Web using Uniform Resource Identifiers (URIs), Resource Description Framework (RDF) and RDF Query Language (SPARQL). As part of the EU-funded SenseOCEAN project, the British Oceanographic Data Centre (BODC) is working on the standardisation of sensor metadata enabling 'plug and play' sensor integration. Our approach combines standards, controlled vocabularies and persistent URIs to publish sensor descriptions, their data and associated metadata as 5 star Linked Data and OGC SWE (SensorML, Observations & Measurements) standard. Thus sensors become readily discoverable, accessible and useable via the web. Content and context based searching is also enabled since sensors descriptions are understood by machines. Additionally, sensor data can be combined with other sensor or Linked Data datasets to form knowledge. This presentation will describe the work done in BODC to achieve syntactic and semantic interoperability in the sensor domain. It will illustrate the reuse and extension of the Semantic Sensor

  20. Enhancing acronym/abbreviation knowledge bases with semantic information.

    Science.gov (United States)

    Torii, Manabu; Liu, Hongfang

    2007-10-11

    In the biomedical domain, a terminology knowledge base that associates acronyms/abbreviations (denoted as SFs) with the definitions (denoted as LFs) is highly needed. For the construction such terminology knowledge base, we investigate the feasibility to build a system automatically assigning semantic categories to LFs extracted from text. Given a collection of pairs (SF,LF) derived from text, we i) assess the coverage of LFs and pairs (SF,LF) in the UMLS and justify the need of a semantic category assignment system; and ii) automatically derive name phrases annotated with semantic category and construct a system using machine learning. Utilizing ADAM, an existing collection of (SF,LF) pairs extracted from MEDLINE, our system achieved an f-measure of 87% when assigning eight UMLS-based semantic groups to LFs. The system has been incorporated into a web interface which integrates SF knowledge from multiple SF knowledge bases. Web site: http://gauss.dbb.georgetown.edu/liblab/SFThesurus.

  1. Opinion Mining in Latvian Text Using Semantic Polarity Analysis and Machine Learning Approach

    Directory of Open Access Journals (Sweden)

    Gatis Špats

    2016-07-01

    Full Text Available In this paper we demonstrate approaches for opinion mining in Latvian text. Authors have applied, combined and extended results of several previous studies and public resources to perform opinion mining in Latvian text using two approaches, namely, semantic polarity analysis and machine learning. One of the most significant constraints that make application of opinion mining for written content classification in Latvian text challenging is the limited publicly available text corpora for classifier training. We have joined several sources and created a publically available extended lexicon. Our results are comparable to or outperform current achievements in opinion mining in Latvian. Experiments show that lexicon-based methods provide more accurate opinion mining than the application of Naive Bayes machine learning classifier on Latvian tweets. Methods used during this study could be further extended using human annotators, unsupervised machine learning and bootstrapping to create larger corpora of classified text.

  2. Semantic markup of sensor capabilities: how simple it too simple?

    Science.gov (United States)

    Rueda-Velasquez, C. A.; Janowicz, K.; Fredericks, J.

    2016-12-01

    Semantics plays a key role for the publication, retrieval, integration, and reuse of observational data across the geosciences. In most cases, one can safely assume that the providers of such data, e.g., individual scientists, understand the observation context in which their data are collected,e.g., the used observation procedure, the sampling strategy, the feature of interest being studied, and so forth. However, can we expect that the same is true for the technical details of the used sensors and especially the nuanced changes that can impact observations in often unpredictable ways? Should the burden of annotating the sensor capabilities, firmware, operation ranges, and so forth be really part of a scientist's responsibility? Ideally, semantic annotations should be provided by the parties that understand these details and have a vested interest in maintaining these data. With manufactures providing semantically-enabled metadata for their sensors and instruments, observations could more easily be annotated and thereby enriched using this information. Unfortunately, today's sensor ontologies and tool chains developed for the Semantic Web community require expertise beyond the knowledge and interest of most manufacturers. Consequently, knowledge engineers need to better understand the sweet spot between simple ontologies/vocabularies and sufficient expressivity as well as the tools required to enable manufacturers to share data about their sensors. Here, we report on the current results of EarthCube's X-Domes project that aims to address the questions outlined above.

  3. Automatic medical image annotation and keyword-based image retrieval using relevance feedback.

    Science.gov (United States)

    Ko, Byoung Chul; Lee, JiHyeon; Nam, Jae-Yeal

    2012-08-01

    This paper presents novel multiple keywords annotation for medical images, keyword-based medical image retrieval, and relevance feedback method for image retrieval for enhancing image retrieval performance. For semantic keyword annotation, this study proposes a novel medical image classification method combining local wavelet-based center symmetric-local binary patterns with random forests. For keyword-based image retrieval, our retrieval system use the confidence score that is assigned to each annotated keyword by combining probabilities of random forests with predefined body relation graph. To overcome the limitation of keyword-based image retrieval, we combine our image retrieval system with relevance feedback mechanism based on visual feature and pattern classifier. Compared with other annotation and relevance feedback algorithms, the proposed method shows both improved annotation performance and accurate retrieval results.

  4. Model and Interoperability using Meta Data Annotations

    Science.gov (United States)

    David, O.

    2011-12-01

    Software frameworks and architectures are in need for meta data to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format, consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing meta data, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings, however the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach for meta data representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programing language features known as Annotations or Attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interfaces (API) calls is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to its originating code. Since models and

  5. Semantic-Based RFID Data Management

    Science.gov (United States)

    de Virgilio, Roberto; di Sciascio, Eugenio; Ruta, Michele; Scioscia, Floriano; Torlone, Riccardo

    Traditional Radio-Frequency IDentification (RFID) applications have been focused on replacing bar codes in supply chain management. Leveraging a ubiquitous computing architecture, the chapter presents a framework allowing both quick decentralized on-line item discovery and centralized off-line massive business logic analysis, according to needs and requirements of supply chain actors. A semantic-based environment, where tagged objects become resources exposing to an RFID reader not a trivial identification code but a semantic annotation, enables tagged objects to describe themselves on the fly without depending on a centralized infrastructure. On the other hand, facing on data management issues, a proposal is formulated for an effective off-line multidimensional analysis of huge amounts of RFID data generated and stored along the supply chain.

  6. A Compositional Semantics for Stochastic Reo Connectors

    Directory of Open Access Journals (Sweden)

    Young-Joo Moon

    2010-07-01

    Full Text Available In this paper we present a compositional semantics for the channel-based coordination language Reo which enables the analysis of quality of service (QoS properties of service compositions. For this purpose, we annotate Reo channels with stochastic delay rates and explicitly model data-arrival rates at the boundary of a connector, to capture its interaction with the services that comprise its environment. We propose Stochastic Reo automata as an extension of Reo automata, in order to compositionally derive a QoS-aware semantics for Reo. We further present a translation of Stochastic Reo automata to Continuous-Time Markov Chains (CTMCs. This translation enables us to use third-party CTMC verification tools to do an end-to-end performance analysis of service compositions.

  7. Handbook of metadata, semantics and ontologies

    CERN Document Server

    Sicilia, Miguel-Angel

    2013-01-01

    Metadata research has emerged as a discipline cross-cutting many domains, focused on the provision of distributed descriptions (often called annotations) to Web resources or applications. Such associated descriptions are supposed to serve as a foundation for advanced services in many application areas, including search and location, personalization, federation of repositories and automated delivery of information. Indeed, the Semantic Web is in itself a concrete technological framework for ontology-based metadata. For example, Web-based social networking requires metadata describing people and

  8. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Veri cation and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Veri cation and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following ve topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  9. Geospatial Semantics and the Semantic Web

    CERN Document Server

    Ashish, Naveen

    2011-01-01

    The availability of geographic and geospatial information and services, especially on the open Web has become abundant in the last several years with the proliferation of online maps, geo-coding services, geospatial Web services and geospatially enabled applications. The need for geospatial reasoning has significantly increased in many everyday applications including personal digital assistants, Web search applications, local aware mobile services, specialized systems for emergency response, medical triaging, intelligence analysis and more. Geospatial Semantics and the Semantic Web: Foundation

  10. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

    Science.gov (United States)

    Cohen, K Bretonnel; Lanfranchi, Arrick; Choi, Miji Joo-Young; Bada, Michael; Baumgartner, William A; Panteleyeva, Natalya; Verspoor, Karin; Palmer, Martha; Hunter, Lawrence E

    2017-08-17

    Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. The corpus was manually annotated with coreference relations, including identity and appositives for all coreferring base noun phrases. The OntoNotes annotation guidelines, with minor adaptations, were used. Interannotator agreement ranges from 0.480 (entity-based CEAF) to 0.858 (Class-B3), depending on the metric that is used to assess it. The resulting corpus adds nearly 30,000 annotations to the previous release of the CRAFT corpus. Differences from related projects include a much broader definition of markables, connection to extensive annotation of several domain-relevant semantic classes, and connection to complete syntactic annotation. Tool performance was benchmarked on the data. A publicly available out-of-the-box, general-domain coreference resolution system achieved an F-measure of 0.14 (B3), while a simple domain-adapted rule-based system achieved an F-measure of 0.42. An ensemble of the two reached F of 0.46. Following the IDENTITY chains in the data would add 106,263 additional named entities in the full 97-paper corpus, for an increase of 76% percent in the semantic classes of the eight ontologies that have been annotated in earlier versions of the CRAFT corpus. The project produced a large data set for further investigation of coreference and coreference resolution in the scientific literature. The work raised issues in the phenomenon of reference in this domain and genre, and the paper proposes that many mentions that would be considered generic in the general domain are not

  11. Meinongian Semantics and Artificial Intelligence

    Directory of Open Access Journals (Sweden)

    William J. Rapaport

    2013-12-01

    Full Text Available This essay describes computational semantic networks for a philosophical audience and surveys several approaches to semantic-network semantics. In particular, propositional semantic networks (exemplified by SNePS are discussed; it is argued that only a fully intensional, Meinongian semantics is appropriate for them; and several Meinongian systems are presented.

  12. ADEpedia: a scalable and standardized knowledge base of Adverse Drug Events using semantic web technology.

    Science.gov (United States)

    Jiang, Guoqian; Solbrig, Harold R; Chute, Christopher G

    2011-01-01

    A source of semantically coded Adverse Drug Event (ADE) data can be useful for identifying common phenotypes related to ADEs. We proposed a comprehensive framework for building a standardized ADE knowledge base (called ADEpedia) through combining ontology-based approach with semantic web technology. The framework comprises four primary modules: 1) an XML2RDF transformation module; 2) a data normalization module based on NCBO Open Biomedical Annotator; 3) a RDF store based persistence module; and 4) a front-end module based on a Semantic Wiki for the review and curation. A prototype is successfully implemented to demonstrate the capability of the system to integrate multiple drug data and ontology resources and open web services for the ADE data standardization. A preliminary evaluation is performed to demonstrate the usefulness of the system, including the performance of the NCBO annotator. In conclusion, the semantic web technology provides a highly scalable framework for ADE data source integration and standard query service.

  13. SemFunSim: a new method for measuring disease similarity by integrating semantic and gene functional association.

    Directory of Open Access Journals (Sweden)

    Liang Cheng

    Full Text Available Measuring similarity between diseases plays an important role in disease-related molecular function research. Functional associations between disease-related genes and semantic associations between diseases are often used to identify pairs of similar diseases from different perspectives. Currently, it is still a challenge to exploit both of them to calculate disease similarity. Therefore, a new method (SemFunSim that integrates semantic and functional association is proposed to address the issue.SemFunSim is designed as follows. First of all, FunSim (Functional similarity is proposed to calculate disease similarity using disease-related gene sets in a weighted network of human gene function. Next, SemSim (Semantic Similarity is devised to calculate disease similarity using the relationship between two diseases from Disease Ontology. Finally, FunSim and SemSim are integrated to measure disease similarity.The high average AUC (area under the receiver operating characteristic curve (96.37% shows that SemFunSim achieves a high true positive rate and a low false positive rate. 79 of the top 100 pairs of similar diseases identified by SemFunSim are annotated in the Comparative Toxicogenomics Database (CTD as being targeted by the same therapeutic compounds, while other methods we compared could identify 35 or less such pairs among the top 100. Moreover, when using our method on diseases without annotated compounds in CTD, we could confirm many of our predicted candidate compounds from literature. This indicates that SemFunSim is an effective method for drug repositioning.

  14. The Semantic Web Revisited

    OpenAIRE

    Shadbolt, Nigel; Berners-Lee, Tim; Hall, Wendy

    2006-01-01

    The original Scientific American article on the Semantic Web appeared in 2001. It described the evolution of a Web that consisted largely of documents for humans to read to one that included data and information for computers to manipulate. The Semantic Web is a Web of actionable information--information derived from data through a semantic theory for interpreting the symbols.This simple idea, however, remains largely unrealized. Shopbots and auction bots abound on the Web, but these are esse...

  15. Practical solutions to implementing "Born Semantic" data systems

    Science.gov (United States)

    Leadbetter, A.; Buck, J. J. H.; Stacey, P.

    2015-12-01

    The concept of data being "Born Semantic" has been proposed in recent years as a Semantic Web analogue to the idea of data being "born digital"[1], [2]. Within the "Born Semantic" concept, data are captured digitally and at a point close to the time of creation are annotated with markup terms from semantic web resources (controlled vocabularies, thesauri or ontologies). This allows heterogeneous data to be more easily ingested and amalgamated in near real-time due to the standards compliant annotation of the data. In taking the "Born Semantic" proposal from concept to operation, a number of difficulties have been encountered. For example, although there are recognised methods such as Header, Dictionary, Triples [3] for the compression, publication and dissemination of large volumes of triples these systems are not practical to deploy in the field on low-powered (both electrically and computationally) devices. Similarly, it is not practical for instruments to output fully formed semantically annotated data files if they are designed to be plugged into a modular system and the data to be centrally logged in the field as is the case on Argo floats and oceanographic gliders where internal bandwidth becomes an issue [2]. In light of these issues, this presentation will concentrate on pragmatic solutions being developed to the problem of generating Linked Data in near real-time systems. Specific examples from the European Commission SenseOCEAN project where Linked Data systems are being developed for autonomous underwater platforms, and from work being undertaken in the streaming of data from the Irish Galway Bay Cable Observatory initiative will be highlighted. Further, developments of a set of tools for the LogStash-ElasticSearch software ecosystem to allow the storing and retrieval of Linked Data will be introduced. References[1] A. Leadbetter & J. Fredericks, We have "born digital" - now what about "born semantic"?, European Geophysical Union General Assembly, 2014

  16. Applied Semantic Web Technologies

    CERN Document Server

    Sugumaran, Vijayan

    2011-01-01

    The rapid advancement of semantic web technologies, along with the fact that they are at various levels of maturity, has left many practitioners confused about the current state of these technologies. Focusing on the most mature technologies, Applied Semantic Web Technologies integrates theory with case studies to illustrate the history, current state, and future direction of the semantic web. It maintains an emphasis on real-world applications and examines the technical and practical issues related to the use of semantic technologies in intelligent information management. The book starts with

  17. Semantic web for dummies

    CERN Document Server

    Pollock, Jeffrey T

    2009-01-01

    Semantic Web technology is already changing how we interact with data on the Web. By connecting random information on the Internet in new ways, Web 3.0, as it is sometimes called, represents an exciting online evolution. Whether you're a consumer doing research online, a business owner who wants to offer your customers the most useful Web site, or an IT manager eager to understand Semantic Web solutions, Semantic Web For Dummies is the place to start! It will help you:Know how the typical Internet user will recognize the effects of the Semantic WebExplore all the benefits the data Web offers t

  18. A "Semantic" View of Scientific Models for Science Education

    Science.gov (United States)

    Adúriz-Bravo, Agustín

    2013-01-01

    In this paper I inspect a "semantic" view of scientific models taken from contemporary philosophy of science-I draw upon the so-called "semanticist family", which frontally challenges the received, syntactic conception of scientific theories. I argue that a semantic view may be of use both for science education in the…

  19. Semantic Analysis of Verbal Collocations with Lexical Functions

    CERN Document Server

    Gelbukh, Alexander

    2013-01-01

    This book is written for both linguists and computer scientists working in the field of artificial intelligence as well as to anyone interested in intelligent text processing. Lexical function is a concept that formalizes semantic and syntactic relations between lexical units. Collocational relation is a type of institutionalized lexical relations which holds between the base and its partner in a collocation. Knowledge of collocation is important for natural language processing because collocation comprises the restrictions on how words can be used together. The book shows how collocations can be annotated with lexical functions in a computer readable dictionary - allowing their precise semantic analysis in texts and their effective use in natural language applications including parsers, high quality machine translation, periphrasis system and computer-aided learning of lexica. The books shows how to extract collocations from corpora and annotate them with lexical functions automatically. To train algorithms,...

  20. Towards the multilingual semantic web principles, methods and applications

    CERN Document Server

    Buitelaar, Paul

    2014-01-01

    To date, the relation between multilingualism and the Semantic Web has not yet received enough attention in the research community. One major challenge for the Semantic Web community is to develop architectures, frameworks and systems that can help in overcoming national and language barriers, facilitating equal access to information produced in different cultures and languages. As such, this volume aims at documenting the state-of-the-art with regard to the vision of a Multilingual Semantic Web, in which semantic information will be accessible in and across multiple languages. The Multiling

  1. Learning With Social Semantic Technologies - Exploiting Latest Tools

    Directory of Open Access Journals (Sweden)

    Gisela Granitzer

    2008-12-01

    Full Text Available Even though it was only about three years ago that Social Software became a trend, it has become a common practice to utilize Social Software in learning institutions. It brought about a lot of advantages, but also challenges. Amounts of distributed and often unstructured user generated content make it difficult to meaningfully process and find relevant information. According to the estimate of the authors, the solution lies in underpinning Social Software with structure resulting in Social Semantic Software. In this contribution we introduce the central concepts Social Software, Semantic Web and Social Semantic Web and show how Social Semantic Technologies might be utilized in the higher education context.

  2. Impingement: an annotated bibliography

    International Nuclear Information System (INIS)

    Uziel, M.S.; Hannon, E.H.

    1979-04-01

    This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title

  3. Semantic querying of data guided by Formal Concept Analysis

    OpenAIRE

    Codocedo , Victor; Lykourentzou , Ioanna; Napoli , Amedeo

    2012-01-01

    International audience; In this paper we present a novel approach to handle querying over a concept lattice of documents and annotations. We focus on the problem of "non-matching documents", which are those that, despite being semantically relevant to the user query, do not contain the query's elements and hence cannot be retrieved by typical string matching approaches. In order to find these documents, we modify the initial user query using the concept lattice as a guide. We achieve this by ...

  4. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotations. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...

  5. Morphological Cues for Lexical Semantics

    National Research Council Canada - National Science Library

    Light, Marc

    1996-01-01

    Most natural language processing tasks require lexical semantic information such as verbal argument structure and selectional restrictions, corresponding nominal semantic class, verbal aspectual class...

  6. Phylogenetic molecular function annotation

    International Nuclear Information System (INIS)

    Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  7. Exploiting semantic linkages among multiple sources for semantic information retrieval

    Science.gov (United States)

    Li, JianQiang; Yang, Ji-Jiang; Liu, Chunchen; Zhao, Yu; Liu, Bo; Shi, Yuliang

    2014-07-01

    The vision of the Semantic Web is to build a global Web of machine-readable data to be consumed by intelligent applications. As the first step to make this vision come true, the initiative of linked open data has fostered many novel applications aimed at improving data accessibility in the public Web. Comparably, the enterprise environment is so different from the public Web that most potentially usable business information originates in an unstructured form (typically in free text), which poses a challenge for the adoption of semantic technologies in the enterprise environment. Considering that the business information in a company is highly specific and centred around a set of commonly used concepts, this paper describes a pilot study to migrate the concept of linked data into the development of a domain-specific application, i.e. the vehicle repair support system. The set of commonly used concepts, including the part name of a car and the phenomenon term on the car repairing, are employed to build the linkage between data and documents distributed among different sources, leading to the fusion of documents and data across source boundaries. Then, we describe the approaches of semantic information retrieval to consume these linkages for value creation for companies. The experiments on two real-world data sets show that the proposed approaches outperform the best baseline 6.3-10.8% and 6.4-11.1% in terms of top five and top 10 precisions, respectively. We believe that our pilot study can serve as an important reference for the development of similar semantic applications in an enterprise environment.

  8. Image annotation by deep neural networks with attention shaping

    Science.gov (United States)

    Zheng, Kexin; Lv, Shaohe; Ma, Fang; Chen, Fei; Jin, Chi; Dou, Yong

    2017-07-01

    Image annotation is a task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been utilized successfully in many computer vision tasks. In this paper, we show that conventional attention mechanism is easily misled by the salient class, i.e., the attended region always contains part of the image area describing the content of salient class at different attention iterations. To this end, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention processes by taking into account the history of previous attention vectors. Several weighting polices are studied to utilize the history information in different manners. In two benchmark datasets, i.e., PASCAL VOC2012 and MIRFlickr-25k, the average precision is improved by up to 10% in comparison with the state-of-the-art annotation methods.

  9. Chemical Entity Semantic Specification: Knowledge representation for efficient semantic cheminformatics and facile data integration

    Science.gov (United States)

    2011-01-01

    preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by the virtue of RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research. PMID:21595881

  10. Quality of semantic standards

    NARCIS (Netherlands)

    Folmer, Erwin Johan Albert

    2012-01-01

    Little scientific literature addresses the issue of quality of semantic standards, albeit a problem with high economic and social impact. Our problem survey, including 34 semantic Standard Setting Organizations (SSOs), gives evidence that quality of standards can be improved, but for improvement a

  11. Semantic Business Process Modeling

    OpenAIRE

    Markovic, Ivan

    2010-01-01

    This book presents a process-oriented business modeling framework based on semantic technologies. The framework consists of modeling languages, methods, and tools that allow for semantic modeling of business motivation, business policies and rules, and business processes. Quality of the proposed modeling framework is evaluated based on the modeling content of SAP Solution Composer and several real-world business scenarios.

  12. Semantic Web Primer

    NARCIS (Netherlands)

    Antoniou, Grigoris; Harmelen, Frank van

    2004-01-01

    The development of the Semantic Web, with machine-readable content, has the potential to revolutionize the World Wide Web and its use. A Semantic Web Primer provides an introduction and guide to this still emerging field, describing its key ideas, languages, and technologies. Suitable for use as a

  13. Pragmatics for formal semantics

    DEFF Research Database (Denmark)

    Danvy, Olivier

    2011-01-01

    This tech talk describes how to write and how to inter-derive formal semantics for sequential programming languages. The progress reported here is (1) concrete guidelines to write each formal semantics to alleviate their proof obligations, and (2) simple calculational tools to obtain a formal...

  14. Semantic Web status model

    CSIR Research Space (South Africa)

    Gerber, AJ

    2006-06-01

    Full Text Available Semantic Web application areas are experiencing intensified interest due to the rapid growth in the use of the Web, together with the innovation and renovation of information content technologies. The Semantic Web is regarded as an integrator across...

  15. Semantic Modeling for Exposomics with Exploratory Evaluation in Clinical Context

    Directory of Open Access Journals (Sweden)

    Jung-wei Fan

    2017-01-01

    Full Text Available Exposome is a critical dimension in the precision medicine paradigm. Effective representation of exposomics knowledge is instrumental to melding nongenetic factors into data analytics for clinical research. There is still limited work in (1 modeling exposome entities and relations with proper integration to mainstream ontologies and (2 systematically studying their presence in clinical context. Through selected ontological relations, we developed a template-driven approach to identifying exposome concepts from the Unified Medical Language System (UMLS. The derived concepts were evaluated in terms of literature coverage and the ability to assist in annotating clinical text. The generated semantic model represents rich domain knowledge about exposure events (454 pairs of relations between exposure and outcome. Additionally, a list of 5667 disorder concepts with microbial etiology was created for inferred pathogen exposures. The model consistently covered about 90% of PubMed literature on exposure-induced iatrogenic diseases over 10 years (2001–2010. The model contributed to the efficiency of exposome annotation in clinical text by filtering out 78% of irrelevant machine annotations. Analysis into 50 annotated discharge summaries helped advance our understanding of the exposome information in clinical text. This pilot study demonstrated feasibility of semiautomatically developing a useful semantic resource for exposomics.

  16. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  17. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Directory of Open Access Journals (Sweden)

    Gustavo Arango-Argoty

    Full Text Available Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/, which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  18. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  19. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics,a number of tools have been developed that facilitate the creation of annotations to source material

  20. THE DIMENSIONS OF COMPOSITION ANNOTATION.

    Science.gov (United States)

    MCCOLLY, WILLIAM

    ENGLISH TEACHER ANNOTATIONS WERE STUDIED TO DETERMINE THE DIMENSIONS AND PROPERTIES OF THE ENTIRE SYSTEM FOR WRITING CORRECTIONS AND CRITICISMS ON COMPOSITIONS. FOUR SETS OF COMPOSITIONS WERE WRITTEN BY STUDENTS IN GRADES 9 THROUGH 13. TYPESCRIPTS OF THE COMPOSITIONS WERE ANNOTATED BY CLASSROOM ENGLISH TEACHERS. THEN, 32 ENGLISH TEACHERS JUDGED…

  1. Introducing glycomics data into the Semantic Web.

    Science.gov (United States)

    Aoki-Kinoshita, Kiyoko F; Bolleman, Jerven; Campbell, Matthew P; Kawano, Shin; Kim, Jin-Dong; Lütteke, Thomas; Matsubara, Masaaki; Okuda, Shujiro; Ranzinger, Rene; Sawaki, Hiromichi; Shikanai, Toshihide; Shinmachi, Daisuke; Suzuki, Yoshinori; Toukach, Philip; Yamada, Issaku; Packer, Nicolle H; Narimatsu, Hisashi

    2013-11-26

    Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans)a, which can, for example, serve as "switches" that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as "proofs-of-concept" to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.

  2. Accelerating cancer systems biology research through Semantic Web technology.

    Science.gov (United States)

    Wang, Zhihui; Sagotsky, Jonathan; Taylor, Thomas; Shironoshita, Patrick; Deisboeck, Thomas S

    2013-01-01

    Cancer systems biology is an interdisciplinary, rapidly expanding research field in which collaborations are a critical means to advance the field. Yet the prevalent database technologies often isolate data rather than making it easily accessible. The Semantic Web has the potential to help facilitate web-based collaborative cancer research by presenting data in a manner that is self-descriptive, human and machine readable, and easily sharable. We have created a semantically linked online Digital Model Repository (DMR) for storing, managing, executing, annotating, and sharing computational cancer models. Within the DMR, distributed, multidisciplinary, and inter-organizational teams can collaborate on projects, without forfeiting intellectual property. This is achieved by the introduction of a new stakeholder to the collaboration workflow, the institutional licensing officer, part of the Technology Transfer Office. Furthermore, the DMR has achieved silver level compatibility with the National Cancer Institute's caBIG, so users can interact with the DMR not only through a web browser but also through a semantically annotated and secure web service. We also discuss the technology behind the DMR leveraging the Semantic Web, ontologies, and grid computing to provide secure inter-institutional collaboration on cancer modeling projects, online grid-based execution of shared models, and the collaboration workflow protecting researchers' intellectual property. Copyright © 2012 Wiley Periodicals, Inc.

  3. Recommendation of standardized health learning contents using archetypes and semantic web technologies.

    Science.gov (United States)

    Legaz-García, María del Carmen; Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás

    2012-01-01

    Linking Electronic Healthcare Records (EHR) content to educational materials has been considered a key international recommendation to enable clinical engagement and to promote patient safety. This would suggest citizens to access reliable information available on the web and to guide them properly. In this paper, we describe an approach in that direction, based on the use of dual model EHR standards and standardized educational contents. The recommendation method will be based on the semantic coverage of the learning content repository for a particular archetype, which will be calculated by applying semantic web technologies like ontologies and semantic annotations.

  4. Managing and Querying Image Annotation and Markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167

  5. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  6. Students as Designers of Semantic Web Applications

    Science.gov (United States)

    Tracy, Fran; Jordan, Katy

    2012-01-01

    This paper draws upon the experience of an interdisciplinary research group in engaging undergraduate university students in the design and development of semantic web technologies. A flexible approach to participatory design challenged conventional distinctions between "designer" and "user" and allowed students to play a role…

  7. Cases, Simulacra, and Semantic Web Technologies

    Science.gov (United States)

    Carmichael, P.; Tscholl, M.

    2013-01-01

    "Ensemble" is an interdisciplinary research and development project exploring the potential role of emerging Semantic Web technologies in case-based learning across learning environments in higher education. Empirical findings have challenged the claim that cases "bring reality into the classroom" and that this, in turn, might…

  8. Validity Semantics in Educational and Psychological Assessment

    Science.gov (United States)

    Hathcoat, John D.

    2013-01-01

    The semantics, or meaning, of validity is a fluid concept in educational and psychological testing. Contemporary controversies surrounding this concept appear to stem from the proper location of validity. Under one view, validity is a property of score-based inferences and entailed uses of test scores. This view is challenged by the…

  9. VuWiki: An Ontology-Based Semantic Wiki for Vulnerability Assessments

    Science.gov (United States)

    Khazai, Bijan; Kunz-Plapp, Tina; Büscher, Christian; Wegner, Antje

    2014-05-01

    The concept of vulnerability, as well as its implementation in vulnerability assessments, is used in various disciplines and contexts ranging from disaster management and reduction to ecology, public health or climate change and adaptation, and a corresponding multitude of ideas about how to conceptualize and measure vulnerability exists. Three decades of research in vulnerability have generated a complex and growing body of knowledge that challenges newcomers, practitioners and even experienced researchers. To provide a structured representation of the knowledge field "vulnerability assessment", we have set up an ontology-based semantic wiki for reviewing and representing vulnerability assessments: VuWiki, www.vuwiki.org. Based on a survey of 55 vulnerability assessment studies, we first developed an ontology as an explicit reference system for describing vulnerability assessments. We developed the ontology in a theoretically controlled manner based on general systems theory and guided by principles for ontology development in the field of earth and environment (Raskin and Pan 2005). Four key questions form the first level "branches" or categories of the developed ontology: (1) Vulnerability of what? (2) Vulnerability to what? (3) What reference framework was used in the vulnerability assessment?, and (4) What methodological approach was used in the vulnerability assessment? These questions correspond to the basic, abstract structure of the knowledge domain of vulnerability assessments and have been deduced from theories and concepts of various disciplines. The ontology was then implemented in a semantic wiki which allows for the classification and annotation of vulnerability assessments. As a semantic wiki, VuWiki does not aim at "synthesizing" a holistic and overarching model of vulnerability. Instead, it provides both scientists and practitioners with a uniform ontology as a reference system and offers easy and structured access to the knowledge field of

  10. Bio-jETI: a framework for semantics-based service composition

    Directory of Open Access Journals (Sweden)

    Margaria Tiziana

    2009-10-01

    Full Text Available Abstract Background The development of bioinformatics databases, algorithms, and tools throughout the last years has lead to a highly distributed world of bioinformatics services. Without adequate management and development support, in silico researchers are hardly able to exploit the potential of building complex, specialized analysis processes from these services. The Semantic Web aims at thoroughly equipping individual data and services with machine-processable meta-information, while workflow systems support the construction of service compositions. However, even in this combination, in silico researchers currently would have to deal manually with the service interfaces, the adequacy of the semantic annotations, type incompatibilities, and the consistency of service compositions. Results In this paper, we demonstrate by means of two examples how Semantic Web technology together with an adequate domain modelling frees in silico researchers from dealing with interfaces, types, and inconsistencies. In Bio-jETI, bioinformatics services can be graphically combined to complex services without worrying about details of their interfaces or about type mismatches of the composition. These issues are taken care of at the semantic level by Bio-jETI's model checking and synthesis features. Whenever possible, they automatically resolve type mismatches in the considered service setting. Otherwise, they graphically indicate impossible/incorrect service combinations. In the latter case, the workflow developer may either modify his service composition using semantically similar services, or ask for help in developing the missing mediator that correctly bridges the detected type gap. Newly developed mediators should then be adequately annotated semantically, and added to the service library for later reuse in similar situations. Conclusion We show the power of semantic annotations in an adequately modelled and semantically enabled domain setting. Using model

  11. A Defense of Semantic Minimalism

    Science.gov (United States)

    Kim, Su

    2012-01-01

    Semantic Minimalism is a position about the semantic content of declarative sentences, i.e., the content that is determined entirely by syntax. It is defined by the following two points: "Point 1": The semantic content is a complete/truth-conditional proposition. "Point 2": The semantic content is useful to a theory of…

  12. Basic semantics of product sounds

    NARCIS (Netherlands)

    Özcan Vieira, E.; Van Egmond, R.

    2012-01-01

    Product experience is a result of sensory and semantic experiences with product properties. In this paper, we focus on the semantic attributes of product sounds and explore the basic components for product sound related semantics using a semantic differential paradigmand factor analysis. With two

  13. Benchmarking semantic web technology

    CERN Document Server

    García-Castro, R

    2009-01-01

    This book addresses the problem of benchmarking Semantic Web Technologies; first, from a methodological point of view, proposing a general methodology to follow in benchmarking activities over Semantic Web Technologies and, second, from a practical point of view, presenting two international benchmarking activities that involved benchmarking the interoperability of Semantic Web technologies using RDF(S) as the interchange language in one activity and OWL in the other.The book presents in detail how the different resources needed for these interoperability benchmarking activities were defined:

  14. Reactive Kripke semantics

    CERN Document Server

    Gabbay, Dov M

    2013-01-01

    This text offers an extension to the traditional Kripke semantics for non-classical logics by adding the notion of reactivity. Reactive Kripke models change their accessibility relation as we progress in the evaluation process of formulas in the model. This feature makes the reactive Kripke semantics strictly stronger and more applicable than the traditional one. Here we investigate the properties and axiomatisations of this new and most effective semantics, and we offer a wide landscape of applications of the idea of reactivity. Applied topics include reactive automata, reactive grammars, rea

  15. UML 2 Semantics and Applications

    CERN Document Server

    Lano, Kevin

    2009-01-01

    A coherent and integrated account of the leading UML 2 semantics work and the practical applications of UML semantics development With contributions from leading experts in the field, the book begins with an introduction to UML and goes on to offer in-depth and up-to-date coverage of: The role of semantics Considerations and rationale for a UML system model Definition of the UML system model UML descriptive semantics Axiomatic semantics of UML class diagrams The object constraint language Axiomatic semantics of state machines A coalgebraic semantic framework for reasoning about interaction des

  16. Annotation of phenotypic diversity: decoupling data curation and ontology curation using Phenex.

    Science.gov (United States)

    Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J

    2014-01-01

    Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.

  17. ONEMercury: Towards Automatic Annotation of Earth Science Metadata

    Science.gov (United States)

    Tuarob, S.; Pouchard, L. C.; Noy, N.; Horsburgh, J. S.; Palanisamy, G.

    2012-12-01

    Earth sciences have become more data-intensive, requiring access to heterogeneous data collected from multiple places, times, and thematic scales. For example, research on climate change may involve exploring and analyzing observational data such as the migration of animals and temperature shifts across the earth, as well as various model-observation inter-comparison studies. Recently, DataONE, a federated data network built to facilitate access to and preservation of environmental and ecological data, has come to exist. ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for discovering and accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple data repositories and makes it searchable via a common search interface built upon cutting edge search engine technology, allowing users to interact with the system, intelligently filter the search results on the fly, and fetch the data from distributed data sources. Linking data from heterogeneous sources always has a cost. A problem that ONEMercury faces is the different levels of annotation in the harvested metadata records. Poorly annotated records tend to be missed during the search process as they lack meaningful keywords. Furthermore, such records would not be compatible with the advanced search functionality offered by ONEMercury as the interface requires a metadata record be semantically annotated. The explosion of the number of metadata records harvested from an increasing number of data repositories makes it impossible to annotate the harvested records manually, urging the need for a tool capable of automatically annotating poorly curated metadata records. In this paper, we propose a topic-model (TM) based approach for automatic metadata annotation. Our approach mines topics in the set of well annotated records and suggests keywords for poorly annotated records based on topic similarity. We utilize the

  18. Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations

    NARCIS (Netherlands)

    van Noord, Rik; Bos, Johannes

    2017-01-01

    We evaluate the character-level translation method for neural semantic parsing on a large corpus of sentences annotated with Abstract Meaning Representations (AMRs). Using a sequence-to-sequence model, and some trivial preprocessing and postprocessing of AMRs, we obtain a baseline accuracy of 53.1

  19. Semantic Metadata for Heterogeneous Spatial Planning Documents

    Science.gov (United States)

    Iwaniak, A.; Kaczmarek, I.; Łukowicz, J.; Strzelecki, M.; Coetzee, S.; Paluszyński, W.

    2016-09-01

    Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.

  20. SEMANTIC METADATA FOR HETEROGENEOUS SPATIAL PLANNING DOCUMENTS

    Directory of Open Access Journals (Sweden)

    A. Iwaniak

    2016-09-01

    Full Text Available Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa. The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.

  1. Chado controller: advanced annotation management with a community annotation system.

    Science.gov (United States)

    Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie

    2012-04-01

    We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system, and documentation is available at http://www.gnpannot.org/content/chado-controller-doc The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form valentin.guignon@cirad.fr; stephanie.sidibe-bocs@cirad.fr Supplementary data are available at Bioinformatics online.

  2. Displaying Annotations for Digitised Globes

    Science.gov (United States)

    Gede, Mátyás; Farbinger, Anna

    2018-05-01

    Thanks to the efforts of the various globe digitising projects, nowadays there are plenty of old globes that can be examined as 3D models on the computer screen. These globes usually contain a lot of interesting details that an average observer would not entirely discover for the first time. The authors developed a website that can display annotations for such digitised globes. These annotations help observers of the globe to discover all the important, interesting details. Annotations consist of a plain text title, a HTML formatted descriptive text and a corresponding polygon and are stored in KML format. The website is powered by the Cesium virtual globe engine.

  3. Algebraic Semantics for Narrative

    Science.gov (United States)

    Kahn, E.

    1974-01-01

    This paper uses discussion of Edmund Spenser's "The Faerie Queene" to present a theoretical framework for explaining the semantics of narrative discourse. The algebraic theory of finite automata is used. (CK)

  4. Semantic Web Development

    National Research Council Canada - National Science Library

    Berners-Lee, Tim; Swick, Ralph

    2006-01-01

    ...) project between 2002 and 2005 provided key steps in the research in the Semantic Web technology, and also played an essential role in delivering the technology to industry and government in the form...

  5. The Meaningful Use of Big Data: Four Perspectives - Four Challenges

    NARCIS (Netherlands)

    C. Bizer; P.A. Boncz (Peter); M Brodie; O. Erling (Orri)

    2011-01-01

    textabstractTwenty-five Semantic Web and Database researchers met at the 2011 STI Semantic Summit in Riga, Latvia July 6-8, 2011 to discuss the opportunities and challenges posed by Big Data for the Semantic Web, Semantic Technologies, and Database communities. The unanimous conclusion was that the

  6. A framework for annotating human genome in disease context.

    Science.gov (United States)

    Xu, Wei; Wang, Huisong; Cheng, Wenqing; Fu, Dong; Xia, Tian; Kibbe, Warren A; Lin, Simon M

    2012-01-01

    Identification of gene-disease association is crucial to understanding disease mechanism. A rapid increase in biomedical literatures, led by advances of genome-scale technologies, poses challenge for manually-curated-based annotation databases to characterize gene-disease associations effectively and timely. We propose an automatic method-The Disease Ontology Annotation Framework (DOAF) to provide a comprehensive annotation of the human genome using the computable Disease Ontology (DO), the NCBO Annotator service and NCBI Gene Reference Into Function (GeneRIF). DOAF can keep the resulting knowledgebase current by periodically executing automatic pipeline to re-annotate the human genome using the latest DO and GeneRIF releases at any frequency such as daily or monthly. Further, DOAF provides a computable and programmable environment which enables large-scale and integrative analysis by working with external analytic software or online service platforms. A user-friendly web interface (doa.nubic.northwestern.edu) is implemented to allow users to efficiently query, download, and view disease annotations and the underlying evidences.

  7. A semi-automatic annotation tool for cooking video

    Science.gov (United States)

    Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe

    2013-03-01

    In order to create a cooking assistant application to guide the users in the preparation of the dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods of the cook. These videos present particular annotation challenges such as frequent occlusions, food appearance changes, etc. Manually annotate the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.

  8. Virtual patients on the semantic Web: a proof-of-application study.

    Science.gov (United States)

    Dafli, Eleni; Antoniou, Panagiotis; Ioannidis, Lazaros; Dombros, Nicholas; Topps, David; Bamidis, Panagiotis D

    2015-01-22

    Virtual patients are interactive computer simulations that are increasingly used as learning activities in modern health care education, especially in teaching clinical decision making. A key challenge is how to retrieve and repurpose virtual patients as unique types of educational resources between different platforms because of the lack of standardized content-retrieving and repurposing mechanisms. Semantic Web technologies provide the capability, through structured information, for easy retrieval, reuse, repurposing, and exchange of virtual patients between different systems. An attempt to address this challenge has been made through the mEducator Best Practice Network, which provisioned frameworks for the discovery, retrieval, sharing, and reuse of medical educational resources. We have extended the OpenLabyrinth virtual patient authoring and deployment platform to facilitate the repurposing and retrieval of existing virtual patient material. A standalone Web distribution and Web interface, which contains an extension for the OpenLabyrinth virtual patient authoring system, was implemented. This extension was designed to semantically annotate virtual patients to facilitate intelligent searches, complex queries, and easy exchange between institutions. The OpenLabyrinth extension enables OpenLabyrinth authors to integrate and share virtual patient case metadata within the mEducator3.0 network. Evaluation included 3 successive steps: (1) expert reviews; (2) evaluation of the ability of health care professionals and medical students to create, share, and exchange virtual patients through specific scenarios in extended OpenLabyrinth (OLabX); and (3) evaluation of the repurposed learning objects that emerged from the procedure. We evaluated 30 repurposed virtual patient cases. The evaluation, with a total of 98 participants, demonstrated the system's main strength: the core repurposing capacity. The extensive metadata schema presentation facilitated user exploration

  9. PSG: Peer-to-Peer semantic grid framework architecture

    Directory of Open Access Journals (Sweden)

    Amira Soliman

    2011-07-01

    Full Text Available The grid vision, of sharing diverse resources in a flexible, coordinated and secure manner, strongly depends on metadata. Currently, grid metadata is generated and used in an ad-hoc fashion, much of it buried in the grid middleware code libraries and database schemas. This ad-hoc expression and use of metadata causes chronic dependency on human intervention during the operation of grid machinery. Therefore, the Semantic Grid is emerged as an extension of the grid in which rich resource metadata is exposed and handled explicitly, and shared and managed via grid protocols. The layering of an explicit semantic infrastructure over the grid infrastructure potentially leads to increase interoperability and flexibility. In this paper, we present PSG framework architecture that offers semantic-based grid services. PSG architecture allows the explicit use of semantics and defining the associated grid services. PSG architecture is originated from the integration of Peer-to-Peer (P2P computing with semantics and agents. Ontologies are used in annotating each grid component, developing users/nodes profiles and organizing framework agents. While, P2P is responsible for organizing and coordinating the grid nodes and resources.

  10. Foundations of semantic web technologies

    CERN Document Server

    Hitzler, Pascal; Rudolph, Sebastian

    2009-01-01

    The Quest for Semantics Building Models Calculating with Knowledge Exchanging Information Semanic Web Technologies RESOURCE DESCRIPTION LANGUAGE (RDF)Simple Ontologies in RDF and RDF SchemaIntroduction to RDF Syntax for RDF Advanced Features Simple Ontologies in RDF Schema Encoding of Special Data Structures An ExampleRDF Formal Semantics Why Semantics? Model-Theoretic Semantics for RDF(S) Syntactic Reasoning with Deduction Rules The Semantic Limits of RDF(S)WEB ONTOLOGY LANGUAGE (OWL) Ontologies in OWL OWL Syntax and Intuitive Semantics OWL Species The Forthcoming OWL 2 StandardOWL Formal Sem

  11. Comparing Refinements for Failure and Bisimulation Semantics

    NARCIS (Netherlands)

    Eshuis, H.; Fokkinga, M.M.

    2002-01-01

    Refinement in bisimulation semantics is defined differently from refinement in failure semantics: in bisimulation semantics refinement is based on simulations between labelled transition systems, whereas in failure semantics refinement is based on inclusions between failure systems. There exist

  12. Integrating semantic dimension into openEHR archetypes for the management of cerebral palsy electronic medical records.

    Science.gov (United States)

    Ellouze, Afef Samet; Bouaziz, Rafik; Ghorbel, Hanen

    2016-10-01

    Integrating semantic dimension into clinical archetypes is necessary once modeling medical records. First, it enables semantic interoperability and, it offers applying semantic activities on clinical data and provides a higher design quality of Electronic Medical Record (EMR) systems. However, to obtain these advantages, designers need to use archetypes that cover semantic features of clinical concepts involved in their specific applications. In fact, most of archetypes filed within open repositories are expressed in the Archetype Definition Language (ALD) which allows defining only the syntactic structure of clinical concepts weakening semantic activities on the EMR content in the semantic web environment. This paper focuses on the modeling of an EMR prototype for infants affected by Cerebral Palsy (CP), using the dual model approach and integrating semantic web technologies. Such a modeling provides a better delivery of quality of care and ensures semantic interoperability between all involved therapies' information systems. First, data to be documented are identified and collected from the involved therapies. Subsequently, data are analyzed and arranged into archetypes expressed in accordance of ADL. During this step, open archetype repositories are explored, in order to find the suitable archetypes. Then, ADL archetypes are transformed into archetypes expressed in OWL-DL (Ontology Web Language - Description Language). Finally, we construct an ontological source related to these archetypes enabling hence their annotation to facilitate data extraction and providing possibility to exercise semantic activities on such archetypes. Semantic dimension integration into EMR modeled in accordance to the archetype approach. The feasibility of our solution is shown through the development of a prototype, baptized "CP-SMS", which ensures semantic exploitation of CP EMR. This prototype provides the following features: (i) creation of CP EMR instances and their checking by

  13. Modeling Views for Semantic Web Using eXtensible Semantic (XSemantic) Nets

    NARCIS (Netherlands)

    Rajugan, R.; Chang, E.; Feng, L.; Dillon, T.; meersman, R; Tari, Z; herrero, p; Méndez, G.; Cavedon, L.; Martin, D.; Hinze, A.; Buchanan, G.

    2005-01-01

    The emergence of Semantic Web (SW) and the related technologies promise to make the web a meaningful experience. Yet, high level modeling, design and querying techniques proves to be a challenging task for organizations that are hoping utilize the SW paradigm for their industrial applications, which

  14. MATCHING ALTERNATIVE ADDRESSES: A SEMANTIC WEB APPROACH

    Directory of Open Access Journals (Sweden)

    S. Ariannamazi

    2015-12-01

    Full Text Available Rapid development of crowd-sourcing or volunteered geographic information (VGI provides opportunities for authoritatives that deal with geospatial information. Heterogeneity of multiple data sources and inconsistency of data types is a key characteristics of VGI datasets. The expansion of cities resulted in the growing number of POIs in the OpenStreetMap, a well-known VGI source, which causes the datasets to outdate in short periods of time. These changes made to spatial and aspatial attributes of features such as names and addresses might cause confusion or ambiguity in the processes that require feature’s literal information like addressing and geocoding. VGI sources neither will conform specific vocabularies nor will remain in a specific schema for a long period of time. As a result, the integration of VGI sources is crucial and inevitable in order to avoid duplication and the waste of resources. Information integration can be used to match features and qualify different annotation alternatives for disambiguation. This study enhances the search capabilities of geospatial tools with applications able to understand user terminology to pursuit an efficient way for finding desired results. Semantic web is a capable tool for developing technologies that deal with lexical and numerical calculations and estimations. There are a vast amount of literal-spatial data representing the capability of linguistic information in knowledge modeling, but these resources need to be harmonized based on Semantic Web standards. The process of making addresses homogenous generates a helpful tool based on spatial data integration and lexical annotation matching and disambiguating.

  15. Annotating images by harnessing worldwide user-tagged photos

    NARCIS (Netherlands)

    Li, X.; Snoek, C.G.M.; Worring, M.

    2009-01-01

    Automatic image tagging is important yet challenging due to the semantic gap and the lack of learning examples to model a tag's visual diversity. Meanwhile, social user tagging is creating rich multimedia content on the Web. In this paper, we propose to combine the two tagging approaches in a

  16. Gazetteer Brokering through Semantic Mediation

    Science.gov (United States)

    Hobona, G.; Bermudez, L. E.; Brackin, R.

    2013-12-01

    A gazetteer is a geographical directory containing some information regarding places. It provides names, location and other attributes for places which may include points of interest (e.g. buildings, oilfields and boreholes), and other features. These features can be published via web services conforming to the Gazetteer Application Profile of the Web Feature Service (WFS) standard of the Open Geospatial Consortium (OGC). Against the backdrop of advances in geophysical surveys, there has been a significant increase in the amount of data referenced to locations. Gazetteers services have played a significant role in facilitating access to such data, including through provision of specialized queries such as text, spatial and fuzzy search. Recent developments in the OGC have led to advances in gazetteers such as support for multilingualism, diacritics, and querying via advanced spatial constraints (e.g. search by radial search and nearest neighbor). A challenge remaining however, is that gazetteers produced by different organizations have typically been modeled differently. Inconsistencies from gazetteers produced by different organizations may include naming the same feature in a different way, naming the attributes differently, locating the feature in a different location, and providing fewer or more attributes than the other services. The Gazetteer application profile of the WFS is a starting point to address such inconsistencies by providing a standardized interface based on rules specified in ISO 19112, the international standard for spatial referencing by geographic identifiers. The profile, however, does not provide rules to deal with semantic inconsistencies. The USGS and NGA commissioned research into the potential for a Single Point of Entry Global Gazetteer (SPEGG). The research was conducted by the Cross Community Interoperability thread of the OGC testbed, referenced OWS-9. The testbed prototyped approaches for brokering gazetteers through use of semantic

  17. Semantic Learning Service Personalized

    Directory of Open Access Journals (Sweden)

    Yibo Chen

    2012-02-01

    Full Text Available To provide users with more suitable and personalized service, personalization is widely used in various fields. Current e-Learning systems search for learning resources using information search technology, based on the keywords that selected or inputted by the user. Due to lack of semantic analysis for keywords and exploring the user contexts, the system cannot provide a good learning experiment. In this paper, we defined the concept and characteristic of the personalized learning service, and proposed a semantic learning service personalized framework. Moreover, we made full use of semantic technology, using ontologies to represent the learning contents and user profile, mining and utilizing the friendship and membership of the social relationship to construct the user social relationship profile, and improved the collaboration filtering algorithm to recommend personalized learning resources for users. The results of the empirical evaluation show that the approach is effectiveness in augmenting recommendation.

  18. Semantic Observation Integration

    Directory of Open Access Journals (Sweden)

    Werner Kuhn

    2012-09-01

    Full Text Available Although the integration of sensor-based information into analysis and decision making has been a research topic for many years, semantic interoperability has not yet been reached. The advent of user-generated content for the geospatial domain, Volunteered Geographic Information (VGI, makes it even more difficult to establish semantic integration. This paper proposes a novel approach to integrating conventional sensor information and VGI, which is exploited in the context of detecting forest fires. In contrast to common logic-based semantic descriptions, we present a formal system using algebraic specifications to unambiguously describe the processing steps from natural phenomena to value-added information. A generic ontology of observations is extended and profiled for forest fire detection in order to illustrate how the sensing process, and transformations between heterogeneous sensing systems, can be represented as mathematical functions and grouped into abstract data types. We discuss the required ontological commitments and a possible generalization.

  19. The Semantics of "Violence"

    DEFF Research Database (Denmark)

    Levisen, Carsten

    This paper presents a semantic analysis of “violence” – a word around which Anglo-internationaldiscourses revolve. Many ethnolinguistic communities around the world are currently adapting thisEnglish lexical concept into their linguistic systems, and, presumably also, the view of the worldembodied...... by the “violence” concept.Based on semantic fieldwork in Port Vila, the creolophone capital of Vanuatu in the SouthPacific, the paper investigates the discursive introduction of “violence” into a community which,until recently, lived by other concepts. I compare and contrast the traditional Bislama concepts...... kilimand faetem with the newly imported English word vaeolens (violence). My study provides newevidence for how cognitive and semantic change co-occur in the context of postcolonial linguisticcommunities, and my paper addresses an important, ongoing controversy related to the notion of“Anglocentric bias...

  20. Semantic Keys and Reading

    Directory of Open Access Journals (Sweden)

    Zev bar-Lev

    2016-12-01

    Full Text Available Semantic Keys are elements (word-parts of written language that give an iconic, general representation of the whole word’s meaning. In written Sino-Japanese the “radical” or semantic components play this role. For example, the character meaning ‘woman, female’ is the Semantic Key of the character for Ma ‘Mama’ (alongside the phonetic component Ma, which means ‘horse’ as a separate character. The theory of semantic Keys in both graphic and phonemic aspects is called qTheory or nanosemantics. The most innovative aspect of the present article is the hypothesis that, in languages using alphabetic writing systems, the role of Semantic Key is played by consonants, more specifically the first consonant. Thus, L meaning ‘LIFT’ is the Semantic Key of English Lift, Ladle, Lofty, aLps, eLevator, oLympus; Spanish Leva, Lecantarse, aLto, Lengua; Arabic aLLah, and Hebrew① ªeL-ºaL ‘upto-above’ (the Israeli airline, Polish Lot ‘flight’ (the Polish airline; Hebrew ªeL, ªeLohim ‘God’, and haLLeluyah ‘praise-ye God’ (using Parallels, ‘Lift up God’. Evidence for the universality of the theory is shown by many examples drawn from various languages, including Indo-European Semitic, Chinese and Japanese. The theory reveals hundreds of relationships within and between languages, related and unrelated, that have been “Hiding in Plain Sight”, to mention just one example: the Parallel between Spanish Pan ‘bread’ and Mandarin Fan ‘rice’.

  1. Image annotation under X Windows

    Science.gov (United States)

    Pothier, Steven

    1991-08-01

    A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts; a graphics overlay, a dithered overlay, an image overly, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.

  2. Temporal Representation in Semantic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Levandoski, J J; Abdulla, G M

    2007-08-07

    A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.

  3. Semantic Versus Syntactic Cutting Planes

    OpenAIRE

    Filmus, Yuval; Hrubeš, Pavel; Lauria, Massimo

    2016-01-01

    In this paper, we compare the strength of the semantic and syntactic version of the cutting planes proof system. First, we show that the lower bound technique of Pudlák applies also to semantic cutting planes: the proof system has feasible interpolation via monotone real circuits, which gives an exponential lower bound on lengths of semantic cutting planes refutations. Second, we show that semantic refutations are stronger than syntactic ones. In particular, we give a formula for whic...

  4. Flow Logics and Operational Semantics

    DEFF Research Database (Denmark)

    Nielson, Flemming; Nielson, Hanne Riis

    1998-01-01

    Flow logic is a “fast prototyping” approach to program analysis that shows great promise of being able to deal with a wide variety of languages and calculi for computation. However, seemingly innocent choices in the flow logic as well as in the operational semantics may inhibit proving the analys...... correct. Our main conclusion is that environment based semantics is more flexible than either substitution based semantics or semantics making use of structural congruences (like alpha-renaming)....

  5. Evolution of semantic systems

    CERN Document Server

    Küppers, Bernd-Olaf; Artmann, Stefan

    2013-01-01

    Complex systems in nature and society make use of information for the development of their internal organization and the control of their functional mechanisms. Alongside technical aspects of storing, transmitting and processing information, the various semantic aspects of information, such as meaning, sense, reference and function, play a decisive part in the analysis of such systems.With the aim of fostering a better understanding of semantic systems from an evolutionary and multidisciplinary perspective, this volume collects contributions by philosophers and natural scientists, linguists, i

  6. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Full Text Available Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  7. Ontology-based Semantic Search Engine for Healthcare Services

    OpenAIRE

    Jotsna Molly Rajan; M. Deepa Lakshmi

    2012-01-01

    With the development of Web Services, the retrieval of relevant services has become a challenge. The keyword-based discovery mechanism using UDDI and WSDL is insufficient due to the retrievalof a large amount of irrelevant information. Also, keywords are insufficient in expressing semantic concepts since a single concept can be referred using syntactically different terms. Hence, service capabilities need to be manually analyzed, which lead to the development of the Semantic Web for automatic...

  8. A Semantic-Based Indexing for Indoor Moving Objects

    OpenAIRE

    Tingting Ben; Xiaolin Qin; Ning Wang

    2014-01-01

    The increasing availability of indoor positioning, driven by techniques like RFID, Bluetooth, and smart phones, enables a variety of indoor location-based services (LBSs). Efficient queries based on semantic-constraint in indoor spaces play an important role in supporting and boosting LBSs. However, the existing indoor index techniques cannot support these semantic constraints-based queries. To solve this problem, this paper addresses the challenge of indexing moving objects in indoor spaces,...

  9. Semantic memory in object use.

    Science.gov (United States)

    Silveri, Maria Caterina; Ciccarelli, Nicoletta

    2009-10-01

    We studied five patients with semantic memory disorders, four with semantic dementia and one with herpes simplex virus encephalitis, to investigate the involvement of semantic conceptual knowledge in object use. Comparisons between patients who had semantic deficits of different severity, as well as the follow-up, showed that the ability to use objects was largely preserved when the deficit was mild but progressively decayed as the deficit became more severe. Naming was generally more impaired than object use. Production tasks (pantomime execution and actual object use) and comprehension tasks (pantomime recognition and action recognition) as well as functional knowledge about objects were impaired when the semantic deficit was severe. Semantic and unrelated errors were produced during object use, but actions were always fluent and patients performed normally on a novel tools task in which the semantic demand was minimal. Patients with severe semantic deficits scored borderline on ideational apraxia tasks. Our data indicate that functional semantic knowledge is crucial for using objects in a conventional way and suggest that non-semantic factors, mainly non-declarative components of memory, might compensate to some extent for semantic disorders and guarantee some residual ability to use very common objects independently of semantic knowledge.

  10. Introducing glycomics data into the Semantic Web

    Science.gov (United States)

    2013-01-01

    Background Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans)a, which can, for example, serve as “switches” that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. Results In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as “proofs-of-concept” to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. Conclusions We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains. PMID:24280648

  11. Streamlining geospatial metadata in the Semantic Web

    Science.gov (United States)

    Fugazza, Cristiano; Pepe, Monica; Oggioni, Alessandro; Tagliolato, Paolo; Carrara, Paola

    2016-04-01

    In the geospatial realm, data annotation and discovery rely on a number of ad-hoc formats and protocols. These have been created to enable domain-specific use cases generalized search is not feasible for. Metadata are at the heart of the discovery process and nevertheless they are often neglected or encoded in formats that either are not aimed at efficient retrieval of resources or are plainly outdated. Particularly, the quantum leap represented by the Linked Open Data (LOD) movement did not induce so far a consistent, interlinked baseline in the geospatial domain. In a nutshell, datasets, scientific literature related to them, and ultimately the researchers behind these products are only loosely connected; the corresponding metadata intelligible only to humans, duplicated on different systems, seldom consistently. Instead, our workflow for metadata management envisages i) editing via customizable web- based forms, ii) encoding of records in any XML application profile, iii) translation into RDF (involving the semantic lift of metadata records), and finally iv) storage of the metadata as RDF and back-translation into the original XML format with added semantics-aware features. Phase iii) hinges on relating resource metadata to RDF data structures that represent keywords from code lists and controlled vocabularies, toponyms, researchers, institutes, and virtually any description one can retrieve (or directly publish) in the LOD Cloud. In the context of a distributed Spatial Data Infrastructure (SDI) built on free and open-source software, we detail phases iii) and iv) of our workflow for the semantics-aware management of geospatial metadata.

  12. Semantic Indexing of Medical Learning Objects: Medical Students' Usage of a Semantic Network.

    Science.gov (United States)

    Tix, Nadine; Gießler, Paul; Ohnesorge-Radtke, Ursula; Spreckelsen, Cord

    2015-11-11

    The Semantically Annotated Media (SAM) project aims to provide a flexible platform for searching, browsing, and indexing medical learning objects (MLOs) based on a semantic network derived from established classification systems. Primarily, SAM supports the Aachen emedia skills lab, but SAM is ready for indexing distributed content and the Simple Knowledge Organizing System standard provides a means for easily upgrading or even exchanging SAM's semantic network. There is a lack of research addressing the usability of MLO indexes or search portals like SAM and the user behavior with such platforms. The purpose of this study was to assess the usability of SAM by investigating characteristic user behavior of medical students accessing MLOs via SAM. In this study, we chose a mixed-methods approach. Lean usability testing was combined with usability inspection by having the participants complete four typical usage scenarios before filling out a questionnaire. The questionnaire was based on the IsoMetrics usability inventory. Direct user interaction with SAM (mouse clicks and pages accessed) was logged. The study analyzed the typical usage patterns and habits of students using a semantic network for accessing MLOs. Four scenarios capturing characteristics of typical tasks to be solved by using SAM yielded high ratings of usability items and showed good results concerning the consistency of indexing by different users. Long-tail phenomena emerge as they are typical for a collaborative Web 2.0 platform. Suitable but nonetheless rarely used keywords were assigned to MLOs by some users. It is possible to develop a Web-based tool with high usability and acceptance for indexing and retrieval of MLOs. SAM can be applied to indexing multicentered repositories of MLOs collaboratively.

  13. Incorporating Semantic Knowledge into Dynamic Data Processing for Smart Power Grids

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Qunzhi; Simmhan, Yogesh; Prasanna, Viktor

    2012-11-15

    Semantic Web allows us to model and query time-invariant or slowly evolving knowledge using ontologies. Emerging applications in Cyber Physical Systems such as Smart Power Grids that require continuous information monitoring and integration present novel opportunities and challenges for Semantic Web technologies. Semantic Web is promising to model diverse Smart Grid domain knowledge for enhanced situation awareness and response by multi-disciplinary participants. However, current technology does pose a performance overhead for dynamic analysis of sensor measurements. In this paper, we combine semantic web and complex event processing for stream based semantic querying. We illustrate its adoption in the USC Campus Micro-Grid for detecting and enacting dynamic response strategies to peak power situations by diverse user roles. We also describe the semantic ontology and event query model that supports this. Further, we introduce and evaluate caching techniques to improve the response time for semantic event queries to meet our application needs and enable sustainable energy management.

  14. Specifying semantic information on functional requirements

    OpenAIRE

    YAO, WUPING

    2012-01-01

    Requirements engineering is a challenging process in software development projects. Requirements, in general, are documented in natural language. They often have issues related to ambiguity, completeness and consistency. How to improve the quality of requirements documentation remains a classic research topic. This research aims at improving the way of editing and documenting functional requirements. We propose a meta-model to specify the semantic information of functional requirements, and d...

  15. Graph-based sequence annotation using a data integration approach

    Directory of Open Access Journals (Sweden)

    Pesch Robert

    2008-06-01

    Full Text Available The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara- Cyc which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation.

  16. Graph-based sequence annotation using a data integration approach.

    Science.gov (United States)

    Pesch, Robert; Lysenko, Artem; Hindle, Matthew; Hassani-Pak, Keywan; Thiele, Ralf; Rawlings, Christopher; Köhler, Jacob; Taubert, Jan

    2008-08-25

    The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara-Cyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.

  17. Elucidating high-dimensional cancer hallmark annotation via enriched ontology.

    Science.gov (United States)

    Yan, Shankai; Wong, Ka-Chun

    2017-09-01

    Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Automatic annotation of lecture videos for multimedia driven pedagogical platforms

    Directory of Open Access Journals (Sweden)

    Ali Shariq Imran

    2016-12-01

    Full Text Available Today’s eLearning websites are heavily loaded with multimedia contents, which are often unstructured, unedited, unsynchronized, and lack inter-links among different multimedia components. Hyperlinking different media modality may provide a solution for quick navigation and easy retrieval of pedagogical content in media driven eLearning websites. In addition, finding meta-data information to describe and annotate media content in eLearning platforms is challenging, laborious, prone to errors, and time-consuming task. Thus annotations for multimedia especially of lecture videos became an important part of video learning objects. To address this issue, this paper proposes three major contributions namely, automated video annotation, the 3-Dimensional (3D tag clouds, and the hyper interactive presenter (HIP eLearning platform. Combining existing state-of-the-art SIFT together with tag cloud, a novel approach for automatic lecture video annotation for the HIP is proposed. New video annotations are implemented automatically providing the needed random access in lecture videos within the platform, and a 3D tag cloud is proposed as a new way of user interaction mechanism. A preliminary study of the usefulness of the system has been carried out, and the initial results suggest that 70% of the students opted for using HIP as their preferred eLearning platform at Gjøvik University College (GUC.

  19. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, no plugins nor Java are required and therefore Alignment-Anotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Semantic Barbs and Biorthogonality

    OpenAIRE

    Rathke, J.; Sassone, V.; Sobocinski, P.

    2007-01-01

    We use the framework of biorthogonality to introduce a novel semantic definition of the concept of barb (basic observable) for process calculi. We develop a uniform basic theory of barbs and demonstrate its robustness by showing that it gives rise to the correct observables in specific process calculi which model synchronous, asynchronous and broadcast communication regimes.

  1. Semantic data bank

    International Nuclear Information System (INIS)

    Anoreewsky, Evelyne; Nicolas, P.; Grillo, J.P.

    1977-01-01

    A system is proposed for determining semantic relations between lexical items. To do this, a descriptor is associated with each lexical item; two types of algorithms are used to calculate the relationships between descriptors ('similarity' or 'predicativity' relations). This system makes it possible to simulate linguistic experiences. Some results have been predicted and verified experimentally. [fr

  2. Learning semantic query suggestions

    NARCIS (Netherlands)

    Meij, E.; Bron, M.; Hollink, L.; Huurnink, B.; de Rijke, M.

    2009-01-01

    An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query and most popular search engines provide

  3. Bridging the semantic gap in sports

    Science.gov (United States)

    Li, Baoxin; Errico, James; Pan, Hao; Sezan, M. Ibrahim

    2003-01-01

    One of the major challenges facing current media management systems and the related applications is the so-called "semantic gap" between the rich meaning that a user desires and the shallowness of the content descriptions that are automatically extracted from the media. In this paper, we address the problem of bridging this gap in the sports domain. We propose a general framework for indexing and summarizing sports broadcast programs. The framework is based on a high-level model of sports broadcast video using the concept of an event, defined according to domain-specific knowledge for different types of sports. Within this general framework, we develop automatic event detection algorithms that are based on automatic analysis of the visual and aural signals in the media. We have successfully applied the event detection algorithms to different types of sports including American football, baseball, Japanese sumo wrestling, and soccer. Event modeling and detection contribute to the reduction of the semantic gap by providing rudimentary semantic information obtained through media analysis. We further propose a novel approach, which makes use of independently generated rich textual metadata, to fill the gap completely through synchronization of the information-laden textual data with the basic event segments. An MPEG-7 compliant prototype browsing system has been implemented to demonstrate semantic retrieval and summarization of sports video.

  4. GIF Video Sentiment Detection Using Semantic Sequence

    Directory of Open Access Journals (Sweden)

    Dazhen Lin

    2017-01-01

    Full Text Available With the development of social media, an increasing number of people use short videos in social media applications to express their opinions and sentiments. However, sentiment detection of short videos is a very challenging task because of the semantic gap problem and sequence based sentiment understanding problem. In this context, we propose a SentiPair Sequence based GIF video sentiment detection approach with two contributions. First, we propose a Synset Forest method to extract sentiment related semantic concepts from WordNet to build a robust SentiPair label set. This approach considers the semantic gap between label words and selects a robust label subset which is related to sentiment. Secondly, we propose a SentiPair Sequence based GIF video sentiment detection approach that learns the semantic sequence to understand the sentiment from GIF videos. Our experiment results on GSO-2016 (GIF Sentiment Ontology data show that our approach not only outperforms four state-of-the-art classification methods but also shows better performance than the state-of-the-art middle level sentiment ontology features, Adjective Noun Pairs (ANPs.

  5. Expressed Peptide Tags: An additional layer of data for genome annotation

    Energy Technology Data Exchange (ETDEWEB)

    Savidor, Alon [ORNL; Donahoo, Ryan S [ORNL; Hurtado-Gonzales, Oscar [University of Tennessee, Knoxville (UTK); Verberkmoes, Nathan C [ORNL; Shah, Manesh B [ORNL; Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL

    2006-01-01

    While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller sub-databases to be searched against, and definition of EPT criteria that accommodates the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While ~77% of Phytophthora EPTs supported the current annotation, a portion of them (7.2% and 12.6% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.

  6. Clinical phenotype-based gene prioritization: an initial study using semantic similarity and the human phenotype ontology.

    Science.gov (United States)

    Masino, Aaron J; Dechene, Elizabeth T; Dulik, Matthew C; Wilkens, Alisha; Spinner, Nancy B; Krantz, Ian D; Pennington, Jeffrey W; Robinson, Peter N; White, Peter S

    2014-07-21

    Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging variants that must be individually assessed to establish clear association with patient phenotype. To assist interpretation, we implemented an algorithm that ranks a given set of genes relative to patient phenotype. The algorithm orders genes by the semantic similarity computed between phenotypic descriptors associated with each gene and those describing the patient. Phenotypic descriptor terms are taken from the Human Phenotype Ontology (HPO) and semantic similarity is derived from each term's information content. Model validation was performed via simulation and with clinical data. We simulated 33 Mendelian diseases with 100 patients per disease. We modeled clinical conditions by adding noise and imprecision, i.e. phenotypic terms unrelated to the disease and terms less specific than the actual disease terms. We ranked the causative gene against all 2488 HPO annotated genes. The median causative gene rank was 1 for the optimal and noise cases, 12 for the imprecision case, and 60 for the imprecision with noise case. Additionally, we examined a clinical cohort of subjects with hearing impairment. The disease gene median rank was 22. However, when also considering the patient's exome data and filtering non-exomic and common variants, the median rank improved to 3. Semantic similarity can rank a causative gene highly within a gene list relative to patient phenotype characteristics, provided that imprecision is mitigated. The clinical case results suggest that phenotype rank combined with variant analysis provides significant improvement over the individual approaches. We expect that this combined prioritization approach may increase accuracy and decrease effort for

  7. A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain.

    Science.gov (United States)

    Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K

    2013-08-12

    A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text. Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and compared the average rank of correctly identified rule definition or corresponding rule template using both our semantic-based approach and a standard term-based approach. We examined three separate scenarios: (1) the snippet of text contained a definition already in the knowledge base; (2) the snippet contained an alternative definition for a concept in the knowledge base; and (3) the snippet contained a definition not in the knowledge base. Our semantic-based approach had a higher average rank than the term-based approach for each of the three scenarios (scenario 1: 3.8 vs. 5.0; scenario 2: 2.8 vs. 4.9; and scenario 3: 4.5 vs. 6.2), with each comparison significant at the p-value of 0.05 using the Wilcoxon signed-rank test. Our work shows that leveraging existing domain knowledge in the information extraction of biomedical definitions significantly improves the correct identification of such knowledge within sentences. Our method can thus help researchers rapidly acquire knowledge about biomedical definitions that are specified and evolving within an ever-growing corpus of scientific publications.

  8. Data management and query processing in semantic web databases

    CERN Document Server

    Groppe, Sven

    2011-01-01

    The Semantic Web, which is intended to establish a machine-understandable Web, is currently changing from being an emerging trend to a technology used in complex real-world applications. A number of standards and techniques have been developed by the World Wide Web Consortium (W3C), e.g., the Resource Description Framework (RDF), which provides a general method for conceptual descriptions for Web resources, and SPARQL, an RDF querying language. Recent examples of large RDF data with billions of facts include the UniProt comprehensive catalog of protein sequence, function and annotation data, t

  9. Public Relations: Selected, Annotated Bibliography.

    Science.gov (United States)

    Demo, Penny

    Designed for students and practitioners of public relations (PR), this annotated bibliography focuses on recent journal articles and ERIC documents. The 34 citations include the following: (1) surveys of public relations professionals on career-related education; (2) literature reviews of research on measurement and evaluation of PR and…

  10. Persuasion: A Selected, Annotated Bibliography.

    Science.gov (United States)

    McDermott, Steven T.

    Designed to reflect the diversity of approaches to persuasion, this annotated bibliography cites materials selected for their contribution to that diversity as well as for being relatively current and/or especially significant representatives of particular approaches. The bibliography starts with a list of 17 general textbooks on approaches to…

  11. [Prescription annotations in Welfare Pharmacy].

    Science.gov (United States)

    Han, Yi

    2018-03-01

    Welfare Pharmacy contains medical formulas documented by the government and official prescriptions used by the official pharmacy in the pharmaceutical process. In the last years of Southern Song Dynasty, anonyms gave a lot of prescription annotations, made textual researches for the name, source, composition and origin of the prescriptions, and supplemented important historical data of medical cases and researched historical facts. The annotations of Welfare Pharmacy gathered the essence of medical theory, and can be used as precious materials to correctly understand the syndrome differentiation, compatibility regularity and clinical application of prescriptions. This article deeply investigated the style and form of the prescription annotations in Welfare Pharmacy, the name of prescriptions and the evolution of terminology, the major functions of the prescriptions, processing methods, instructions for taking medicine and taboos of prescriptions, the medical cases and clinical efficacy of prescriptions, the backgrounds, sources, composition and cultural meanings of prescriptions, proposed that the prescription annotations played an active role in the textual dissemination, patent medicine production and clinical diagnosis and treatment of Welfare Pharmacy. This not only helps understand the changes in the names and terms of traditional Chinese medicines in Welfare Pharmacy, but also provides the basis for understanding the knowledge sources, compatibility regularity, important drug innovations and clinical medications of prescriptions in Welfare Pharmacy. Copyright© by the Chinese Pharmaceutical Association.

  12. Systems Theory and Communication. Annotated Bibliography.

    Science.gov (United States)

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  13. Semantic similarity measure in biomedical domain leverage web search engine.

    Science.gov (United States)

    Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

    2010-01-01

    Semantic similarity measure plays an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous researches in semantic web related applications have deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenge task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provide by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.

  14. Semantic web for the working ontologist effective modeling in RDFS and OWL

    CERN Document Server

    Allemang, Dean

    2011-01-01

    Semantic Web models and technologies provide information in machine-readable languages that enable computers to access the Web more intelligently and perform tasks automatically without the direction of users. These technologies are relatively recent and advancing rapidly, creating a set of unique challenges for those developing applications. Semantic Web for the Working Ontologist is the essential, comprehensive resource on semantic modeling, for practitioners in health care, artificial intelligence, finance, engineering, military intelligence, enterprise architecture, and more. Focused on

  15. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search

  16. From Data to Semantic Information

    Directory of Open Access Journals (Sweden)

    Luciano Floridi

    2003-06-01

    Full Text Available Abstract: There is no consensus yet on the definition of semantic information. This paper contributes to the current debate by criticising and revising the Standard Definition of semantic Information (SDI as meaningful data, in favour of the Dretske-Grice approach: meaningful and well-formed data constitute semantic information only if they also qualify as contingently truthful. After a brief introduction, SDI is criticised for providing necessary but insufficient conditions for the definition of semantic information. SDI is incorrect because truth-values do not supervene on semantic information, and misinformation (that is, false semantic information is not a type of semantic information, but pseudo-information, that is not semantic information at all. This is shown by arguing that none of the reasons for interpreting misinformation as a type of semantic information is convincing, whilst there are compelling reasons to treat it as pseudo-information. As a consequence, SDI is revised to include a necessary truth-condition. The last section summarises the main results of the paper and indicates the important implications of the revised definition for the analysis of the deflationary theories of truth, the standard definition of knowledge and the classic, quantitative theory of semantic information.

  17. A Survey of Data Semantization in Internet of Things

    Directory of Open Access Journals (Sweden)

    Feifei Shi

    2018-01-01

    Full Text Available With the development of Internet of Things (IoT, more and more sensors, actuators and mobile devices have been deployed into our daily lives. The result is that tremendous data are produced and it is urgent to dig out hidden information behind these volumous data. However, IoT data generated by multi-modal sensors or devices show great differences in formats, domains and types, which poses challenges for machines to process and understand. Therefore, adding semantics to Internet of Things becomes an overwhelming tendency. This paper provides a systematic review of data semantization in IoT, including its backgrounds, processing flows, prevalent techniques, applications, existing challenges and open issues. It surveys development status of adding semantics to IoT data, mainly referring to sensor data and points out current issues and challenges that are worth further study.

  18. A Survey of Data Semantization in Internet of Things.

    Science.gov (United States)

    Shi, Feifei; Li, Qingjuan; Zhu, Tao; Ning, Huansheng

    2018-01-22

    With the development of Internet of Things (IoT), more and more sensors, actuators and mobile devices have been deployed into our daily lives. The result is that tremendous data are produced and it is urgent to dig out hidden information behind these volumous data. However, IoT data generated by multi-modal sensors or devices show great differences in formats, domains and types, which poses challenges for machines to process and understand. Therefore, adding semantics to Internet of Things becomes an overwhelming tendency. This paper provides a systematic review of data semantization in IoT, including its backgrounds, processing flows, prevalent techniques, applications, existing challenges and open issues. It surveys development status of adding semantics to IoT data, mainly referring to sensor data and points out current issues and challenges that are worth further study.

  19. SEMANTIC DERIVATION OF BORROWINGS

    Directory of Open Access Journals (Sweden)

    Shigapova, F.F.

    2017-09-01

    Full Text Available The author carried out the contrastive analysis of the word спикер borrowed into Russian from English and the English word speaker. The findings of the analysis include confirm (1 different derivational abilities and functions of the borrowed word and the native word; (2 distinctive features in the definitions, i.e. semantic structures, registered in monolingual non-abridged dictionaries; (3 heterogeneous parameters of frequencies recorded in the National Corpus of the Russian language and the British National Corpus; (4 absence of bilingual equivalent collocations with words спикер and speaker. The collocations with words studied revealed new lexical and connotative senses in the meaning of the word. Relevance of the study conducted is justified by the new facts revealed about the semantic adaptation of the borrowed word in the system of the Russian language and its paradigmatic and syntagmatic connections in the system of the recipient language.

  20. Promoting positive parenting: an annotated bibliography.

    Science.gov (United States)

    Ahmann, Elizabeth

    2002-01-01

    Positive parenting is built on respect for children and helps develop self-esteem, inner discipline, self-confidence, responsibility, and resourcefulness. Positive parenting is also good for parents: parents feel good about parenting well. It builds a sense of dignity. Positive parenting can be learned. Understanding normal development is a first step, so that parents can distinguish common behaviors in a stage of development from "problems." Central to positive parenting is developing thoughtful approaches to child guidance that can be used in place of anger, manipulation, punishment, and rewards. Support for developing creative and loving approaches to meet special parenting challenges, such as temperament, disabilities, separation and loss, and adoption, is sometimes necessary as well. This annotated bibliography offers resources to professionals helping parents and to parents wishing to develop positive parenting skills.

  1. Semantically Enhanced Recommender Systems

    Science.gov (United States)

    Ruiz-Montiel, Manuela; Aldana-Montes, José F.

    Recommender Systems have become a significant area in the context of web personalization, given the large amount of available data. Ontologies can be widely taken advantage of in recommender systems, since they provide a means of classifying and discovering of new information about the items to recommend, about user profiles and even about their context. We have developed a semantically enhanced recommender system based on this kind of ontologies. In this paper we present a description of the proposed system.

  2. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  3. Insensitive Enough Semantics

    Directory of Open Access Journals (Sweden)

    Richard Vallée

    2006-06-01

    Full Text Available According to some philosophers, sentences like (1 “It is raining” and (2 “John is ready” are context sensitive sentences even if they do not contain indexicals or demonstratives. That view initiated a context sensitivity frenzy. Cappelen and Lepore (2005 summarize the frenzy by the slogan “Every sentence is context sensitive” (Insensitive Semantics, p. 6, note 5. They suggest a view they call Minimalism according to which the truth conditions of utterances of sentences like (1/(2 are exactly what Convention T gives you. I will distinguish different propositions, and refocus semantics on sentences. As distinct from what the protagonists in the ongoing debate think, I argue that the content or truth conditions of utterances of both context sensitive sentences and sentences like (1/(2 are not interesting from a semantic point of view, and that the problem sentences like (1/(2 raises is not about context sensitivity or context insensitivity of sentences, but relevance of the content of utterances.

  4. Knowledge Representation and Management. From Ontology to Annotation. Findings from the Yearbook 2015 Section on Knowledge Representation and Management.

    Science.gov (United States)

    Charlet, J; Darmoni, S J

    2015-08-13

    To summarize the best papers in the field of Knowledge Representation and Management (KRM). A comprehensive review of medical informatics literature was performed to select some of the most interesting papers of KRM published in 2014. Four articles were selected, two focused on annotation and information retrieval using an ontology. The two others focused mainly on ontologies, one dealing with the usage of a temporal ontology in order to analyze the content of narrative document, one describing a methodology for building multilingual ontologies. Semantic models began to show their efficiency, coupled with annotation tools.

  5. Subliminal semantic priming in speech.

    Directory of Open Access Journals (Sweden)

    Jérôme Daltrozzo

    Full Text Available Numerous studies have reported subliminal repetition and semantic priming in the visual modality. We transferred this paradigm to the auditory modality. Prime awareness was manipulated by a reduction of sound intensity level. Uncategorized prime words (according to a post-test were followed by semantically related, unrelated, or repeated target words (presented without intensity reduction and participants performed a lexical decision task (LDT. Participants with slower reaction times in the LDT showed semantic priming (faster reaction times for semantically related compared to unrelated targets and negative repetition priming (slower reaction times for repeated compared to semantically related targets. This is the first report of semantic priming in the auditory modality without conscious categorization of the prime.

  6. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

    Science.gov (United States)

    Holt, Carson; Yandell, Mark

    2011-12-22

    Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

  7. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  8. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR, it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  9. Survey of semantic modeling techniques

    Energy Technology Data Exchange (ETDEWEB)

    Smith, C.L.

    1975-07-01

    The analysis of the semantics of programing languages was attempted with numerous modeling techniques. By providing a brief survey of these techniques together with an analysis of their applicability for answering semantic issues, this report attempts to illuminate the state-of-the-art in this area. The intent is to be illustrative rather than thorough in the coverage of semantic models. A bibliography is included for the reader who is interested in pursuing this area of research in more detail.

  10. Semantic multimedia analysis and processing

    CERN Document Server

    Spyrou, Evaggelos; Mylonas, Phivos

    2014-01-01

    Broad in scope, Semantic Multimedia Analysis and Processing provides a complete reference of techniques, algorithms, and solutions for the design and the implementation of contemporary multimedia systems. Offering a balanced, global look at the latest advances in semantic indexing, retrieval, analysis, and processing of multimedia, the book features the contributions of renowned researchers from around the world. Its contents are based on four fundamental thematic pillars: 1) information and content retrieval, 2) semantic knowledge exploitation paradigms, 3) multimedia personalization, and 4)

  11. EST-PAC a web package for EST annotation and protein sequence prediction

    Directory of Open Access Journals (Sweden)

    Strahm Yvan

    2006-10-01

    Full Text Available Abstract With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequences tags (EST from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC a web oriented multi-platform software package for expressed sequences tag (EST annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1 searching local or remote biological databases for sequence similarities using Blast services, 2 predicting protein coding sequence from EST data and, 3 annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results and, management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics.

  12. Semantic Representatives of the Concept

    Directory of Open Access Journals (Sweden)

    Elena N. Tsay

    2013-01-01

    Full Text Available In the article concept as one of the principle notions of cognitive linguistics is investigated. Considering concept as culture phenomenon, having language realization and ethnocultural peculiarities, the description of the concept “happiness” is presented. Lexical and semantic paradigm of the concept of happiness correlates with a great number of lexical and semantic variants. In the work semantic representatives of the concept of happiness, covering supreme spiritual values are revealed and semantic interpretation of their functioning in the Biblical discourse is given.

  13. System semantics of explanatory dictionaries

    Directory of Open Access Journals (Sweden)

    Volodymyr Shyrokov

    2015-11-01

    Full Text Available System semantics of explanatory dictionaries Some semantic properties of the language to be followed from the structure of lexicographical systems of big explanatory dictionaries are considered. The hyperchains and hypercycles are determined as the definite kind of automorphisms of the lexicographical system of explanatory dictionary. Some semantic consequencies following from the principles of lexicographic closure and lexicographic completeness are investigated using the hyperchains and hypercycles formalism. The connection between the hypercyle properties of the lexicographical system semantics and Goedel’s incompleteness theorem is discussed.

  14. Where does it break? or : Why the semantic web is not just "research as usual"

    NARCIS (Netherlands)

    Van Harmelen, Frank

    2006-01-01

    Work on the Semantic Web is all too often phrased as a technological challenge: how to improve the precision of search engines, how to personalise web-sites, how to integrate weakly-structured data-sources, etc. This suggests that we will be able to realise the Semantic Web by merely applying (and

  15. Semantic memory is impaired in patients with unilateral anterior temporal lobe resection for temporal lobe epilepsy.

    Science.gov (United States)

    Lambon Ralph, Matthew A; Ehsan, Sheeba; Baker, Gus A; Rogers, Timothy T

    2012-01-01

    Contemporary clinical and basic neuroscience studies have increasingly implicated the anterior temporal lobe regions, bilaterally, in the formation of coherent concepts. Mounting convergent evidence for the importance of the anterior temporal lobe in semantic memory is found in patients with bilateral anterior temporal lobe damage (e.g. semantic dementia), functional neuroimaging and repetitive transcranial magnetic stimulation studies. If this proposal is correct, then one might expect patients with anterior temporal lobe resection for long-standing temporal lobe epilepsy to be semantically impaired. Such patients, however, do not present clinically with striking comprehension deficits but with amnesia and variable anomia, leading some to conclude that semantic memory is intact in resection for temporal lobe epilepsy and thus casting doubt over the conclusions drawn from semantic dementia and linked basic neuroscience studies. Whilst there is a considerable neuropsychological literature on temporal lobe epilepsy, few studies have probed semantic memory directly, with mixed results, and none have undertaken the same type of systematic investigation of semantic processing that has been conducted with other patient groups. In this study, therefore, we investigated the semantic performance of 20 patients with resection for chronic temporal lobe epilepsy with a full battery of semantic assessments, including more sensitive measures of semantic processing. The results provide a bridge between the current clinical observations about resection for temporal lobe epilepsy and the expectations from semantic dementia and other neuroscience findings. Specifically, we found that on simple semantic tasks, the patients' accuracy fell in the normal range, with the exception that some patients with left resection for temporal lobe epilepsy had measurable anomia. Once the semantic assessments were made more challenging, by probing specific-level concepts, lower frequency

  16. Early experiences with crowdsourcing airway annotations in chest CT

    DEFF Research Database (Denmark)

    Cheplygina, Veronika; Perez-Rovira, Adria; Kuo, Wieying

    2016-01-01

    Measuring airways in chest computed tomography (CT) images is important for characterizing diseases such as cystic fibrosis, yet very time-consuming to perform manually. Machine learning algorithms offer an alternative, but need large sets of annotated data to perform well. We investigate whether...... a number of further research directions and provide insight into the challenges of crowdsourcing in medical images from the perspective of first-time users....

  17. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function-hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology and the association of individual genes or proteins with these concepts (e.g., GO terms, our method will assign a Hierarchical Modularity Score (HMS to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a bi-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  18. Semantic orchestration of image processing services for environmental analysis

    Science.gov (United States)

    Ranisavljević, Élisabeth; Devin, Florent; Laffly, Dominique; Le Nir, Yannick

    2013-09-01

    In order to analyze environmental dynamics, a major process is the classification of the different phenomena of the site (e.g. ice and snow for a glacier). When using in situ pictures, this classification requires data pre-processing. Not all the pictures need the same sequence of processes depending on the disturbances. Until now, these sequences have been done manually, which restricts the processing of large amount of data. In this paper, we present how to realize a semantic orchestration to automate the sequencing for the analysis. It combines two advantages: solving the problem of the amount of processing, and diversifying the possibilities in the data processing. We define a BPEL description to express the sequences. This BPEL uses some web services to run the data processing. Each web service is semantically annotated using an ontology of image processing. The dynamic modification of the BPEL is done using SPARQL queries on these annotated web services. The results obtained by a prototype implementing this method validate the construction of the different workflows that can be applied to a large number of pictures.

  19. Exploring DBpedia and Wikipedia for Portuguese Semantic Relationship Extraction

    Directory of Open Access Journals (Sweden)

    David Soares Batista

    2013-07-01

    Full Text Available The identification of semantic relationships, as expressed between named entities in text, is an important step for extracting knowledge from large document collections, such as the Web. Previous works have addressed this task for the English language through supervised learning techniques for automatic classification. The current state of the art involves the use of learning methods based on string kernels. However, such approaches require manually annotated training data for each type of semantic relationship, and have scalability problems when tens or hundreds of different types of relationships have to be extracted. This article discusses an approach for distantly supervised relation extraction over texts written in the Portuguese language, which uses an efficient technique for measuring similarity between relation instances, based on minwise hashing and on locality sensitive hashing. In the proposed method, the training examples are automatically collected from Wikipedia, corresponding to sentences that express semantic relationships between pairs of entities extracted from DBPedia. These examples are represented as sets of character quadgrams and other representative elements. The sets are indexed in a data structure that implements the idea of locality-sensitive hashing. To check which semantic relationship is expressed between a given pair of entities referenced in a sentence, the most similar training examples are searched, based on an approximation to the Jaccard coefficient, obtained through min-hashing. The relation class is assigned with basis on the weighted votes of the most similar examples. Tests with a dataset from Wikipedia validate the suitability of the proposed method, showing, for instance, that the method is able to extract 10 different types of semantic relations, 8 of them corresponding to asymmetric relations, with an average score of 55.6%, measured in terms of F1.

  20. A Software Engineering Approach based on WebML and BPMN to the Mediation Scenario of the SWS Challenge

    Science.gov (United States)

    Brambilla, Marco; Ceri, Stefano; Valle, Emanuele Della; Facca, Federico M.; Tziviskou, Christina

    Although Semantic Web Services are expected to produce a revolution in the development of Web-based systems, very few enterprise-wide design experiences are available; one of the main reasons is the lack of sound Software Engineering methods and tools for the deployment of Semantic Web applications. In this chapter, we present an approach to software development for the Semantic Web based on classical Software Engineering methods (i.e., formal business process development, computer-aided and component-based software design, and automatic code generation) and on semantic methods and tools (i.e., ontology engineering, semantic service annotation and discovery).

  1. Semantic content-based recommendations using semantic graphs.

    Science.gov (United States)

    Guo, Weisen; Kraines, Steven B

    2010-01-01

    Recommender systems (RSs) can be useful for suggesting items that might be of interest to specific users. Most existing content-based recommendation (CBR) systems are designed to recommend items based on text content, and the items in these systems are usually described with keywords. However, similarity evaluations based on keywords suffer from the ambiguity of natural languages. We present a semantic CBR method that uses Semantic Web technologies to recommend items that are more similar semantically with the items that the user prefers. We use semantic graphs to represent the items and we calculate the similarity scores for each pair of semantic graphs using an inverse graph frequency algorithm. The items having higher similarity scores to the items that are known to be preferred by the user are recommended.

  2. Personal semantics: at the crossroads of semantic and episodic memory.

    Science.gov (United States)

    Renoult, Louis; Davidson, Patrick S R; Palombo, Daniela J; Moscovitch, Morris; Levine, Brian

    2012-11-01

    Declarative memory is usually described as consisting of two systems: semantic and episodic memory. Between these two poles, however, may lie a third entity: personal semantics (PS). PS concerns knowledge of one's past. Although typically assumed to be an aspect of semantic memory, it is essentially absent from existing models of knowledge. Furthermore, like episodic memory (EM), PS is idiosyncratically personal (i.e., not culturally-shared). We show that, depending on how it is operationalized, the neural correlates of PS can look more similar to semantic memory, more similar to EM, or dissimilar to both. We consider three different perspectives to better integrate PS into existing models of declarative memory and suggest experimental strategies for disentangling PS from semantic and episodic memory. Copyright © 2012 Elsevier Ltd. All rights reserved.

  3. Grammaticalization and Semantic Maps: Evidence from Artificial Language

    Directory of Open Access Journals (Sweden)

    Remi van Trijp

    2010-01-01

    Full Text Available Semantic maps have offered linguists an appealing and empirically rooted methodology for describing recurrent structural patterns in language development and the multifunctionality of grammatical categories. Although some researchers argue that semantic maps are universal and given, others provide evidence that there are no fixed or universal maps. This paper takes the position that semantic maps are a useful way to visualize the grammatical evolution of a language (particularly the evolution of semantic structuring but that this grammatical evolution is a consequence of distributed processes whereby language users shape and reshape their language. So it is a challenge to find out what these processes are and whether they indeed generate the kind of semantic maps observed for human languages. This work takes a design stance towards the question of the emergence of linguistic structure and investigates how grammar can be formed in populations of autonomous artificial ?agents? that play ?language games? with each other about situations they perceive through a sensori-motor embodiment. The experiments reported here investigate whether semantic maps for case markers could emerge through grammaticalization processes without the need for a universal conceptual space.

  4. Rewriting Logic Semantics of a Plan Execution Language

    Science.gov (United States)

    Dowek, Gilles; Munoz, Cesar A.; Rocha, Camilo

    2009-01-01

    The Plan Execution Interchange Language (PLEXIL) is a synchronous language developed by NASA to support autonomous spacecraft operations. In this paper, we propose a rewriting logic semantics of PLEXIL in Maude, a high-performance logical engine. The rewriting logic semantics is by itself a formal interpreter of the language and can be used as a semantic benchmark for the implementation of PLEXIL executives. The implementation in Maude has the additional benefit of making available to PLEXIL designers and developers all the formal analysis and verification tools provided by Maude. The formalization of the PLEXIL semantics in rewriting logic poses an interesting challenge due to the synchronous nature of the language and the prioritized rules defining its semantics. To overcome this difficulty, we propose a general procedure for simulating synchronous set relations in rewriting logic that is sound and, for deterministic relations, complete. We also report on the finding of two issues at the design level of the original PLEXIL semantics that were identified with the help of the executable specification in Maude.

  5. Ontology alignment architecture for semantic sensor Web integration.

    Science.gov (United States)

    Fernandez, Susel; Marsa-Maestre, Ivan; Velasco, Juan R; Alarcos, Bernardo

    2013-09-18

    Sensor networks are a concept that has become very popular in data acquisition and processing for multiple applications in different fields such as industrial, medicine, home automation, environmental detection, etc. Today, with the proliferation of small communication devices with sensors that collect environmental data, semantic Web technologies are becoming closely related with sensor networks. The linking of elements from Semantic Web technologies with sensor networks has been called Semantic Sensor Web and has among its main features the use of ontologies. One of the key challenges of using ontologies in sensor networks is to provide mechanisms to integrate and exchange knowledge from heterogeneous sources (that is, dealing with semantic heterogeneity). Ontology alignment is the process of bringing ontologies into mutual agreement by the automatic discovery of mappings between related concepts. This paper presents a system for ontology alignment in the Semantic Sensor Web which uses fuzzy logic techniques to combine similarity measures between entities of different ontologies. The proposed approach focuses on two key elements: the terminological similarity, which takes into account the linguistic and semantic information of the context of the entity's names, and the structural similarity, based on both the internal and relational structure of the concepts. This work has been validated using sensor network ontologies and the Ontology Alignment Evaluation Initiative (OAEI) tests. The results show that the proposed techniques outperform previous approaches in terms of precision and recall.

  6. Ontology Alignment Architecture for Semantic Sensor Web Integration

    Directory of Open Access Journals (Sweden)

    Bernardo Alarcos

    2013-09-01

    Full Text Available Sensor networks are a concept that has become very popular in data acquisition and processing for multiple applications in different fields such as industrial, medicine, home automation, environmental detection, etc. Today, with the proliferation of small communication devices with sensors that collect environmental data, semantic Web technologies are becoming closely related with sensor networks. The linking of elements from Semantic Web technologies with sensor networks has been called Semantic Sensor Web and has among its main features the use of ontologies. One of the key challenges of using ontologies in sensor networks is to provide mechanisms to integrate and exchange knowledge from heterogeneous sources (that is, dealing with semantic heterogeneity. Ontology alignment is the process of bringing ontologies into mutual agreement by the automatic discovery of mappings between related concepts. This paper presents a system for ontology alignment in the Semantic Sensor Web which uses fuzzy logic techniques to combine similarity measures between entities of different ontologies. The proposed approach focuses on two key elements: the terminological similarity, which takes into account the linguistic and semantic information of the context of the entity’s names, and the structural similarity, based on both the internal and relational structure of the concepts. This work has been validated using sensor network ontologies and the Ontology Alignment Evaluation Initiative (OAEI tests. The results show that the proposed techniques outperform previous approaches in terms of precision and recall.

  7. Connectionism and Compositional Semantics

    Science.gov (United States)

    1989-05-01

    can use their hidden layers to learn difficult discriminations. such as panty or the Penzias two clumps/three clumps problem, where the output is...sauce." For novel sentences that are similar to the training sentences (e.g., train on "the girl hit the boy," test on -the boy hit the girl "), the...overridden by semantic considerations. as in this example from Wendy Lehnert (personal communicanon): (5) John saw the girl with the telescope in a red

  8. A Semantics of Synchronization.

    Science.gov (United States)

    1980-09-01

    suggestion of having very hungry philosophers. One can easily imagine the complexity of the equivalent implementation using semaphores . Synchronization types...Edinburgh, July 1978. [STAR79] Stark, E.W., " Semaphore Primitives and Fair Mutual Exclusion," TM-158, Laboratory for Computer Science, M.I.T., Cambridge...AD-AQ91 015 MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTE--ETC F/S 9/2 A SEMANTICS OF SYNCHRONIZATION .(U) .C SEP 80 C A SEAQUIST N00015-75

  9. Semantics, Conceptual Role

    OpenAIRE

    Block, Ned

    1997-01-01

    According to Conceptual Role Semantics ("CRS"), the meaning of a representation is the role of that representation in the cognitive life of the agent, e.g. in perception, thought and decision-making. It is an extension of the well known "use" theory of meaning, according to which the meaning of a word is its use in communication and more generally, in social interaction. CRS supplements external use by including the role of a symbol inside a computer or a brain. The uses appealed to are not j...

  10. Essential Annotation Schema for Ecology (EASE)-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Science.gov (United States)

    Pfaff, Claas-Thido; Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  11. Essential Annotation Schema for Ecology (EASE)—A framework supporting the efficient data annotation and faceted navigation in ecology

    Science.gov (United States)

    Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines. PMID:29023519

  12. Essential Annotation Schema for Ecology (EASE-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Directory of Open Access Journals (Sweden)

    Claas-Thido Pfaff

    Full Text Available Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  13. A Generalization of Inquisitive Semantics

    Czech Academy of Sciences Publication Activity Database

    Punčochář, Vít

    2016-01-01

    Roč. 45, č. 4 (2016), s. 399-428 ISSN 0022-3611 R&D Projects: GA ČR(CZ) GA13-21076S Institutional support: RVO:67985955 Keywords : Intuitionistic logic * Superintuitionistic logics * Inquisitive logic * Topological semantics * Kripke semantics * Disjunction Subject RIV: AA - Philosophy ; Religion

  14. The Problem of Naturalizing Semantics.

    Science.gov (United States)

    Sullivan, Arthur

    2000-01-01

    Investigates conceptual barriers prevalent in the works of both proponents and opponents of semantic naturalism. Searches for a tenable definition of naturalism according to which one can be a realist, a non-reductionist, and a naturalist about semantic content. (Author/VWL)

  15. On the Semantics of Focus

    Science.gov (United States)

    Kess, Joseph F.

    1975-01-01

    This article discusses the semantics of the notion of focus, insofar as it relates to Filipino languages. The evolution of this notion is reviewed, and an alternative explanation of it is given, stressing the fact that grammar and semantics should be kept separate in a discussion of focus. (CLK)

  16. Semantic Coherence Facilitates Distributional Learning.

    Science.gov (United States)

    Ouyang, Long; Boroditsky, Lera; Frank, Michael C

    2017-04-01

    Computational models have shown that purely statistical knowledge about words' linguistic contexts is sufficient to learn many properties of words, including syntactic and semantic category. For example, models can infer that "postman" and "mailman" are semantically similar because they have quantitatively similar patterns of association with other words (e.g., they both tend to occur with words like "deliver," "truck," "package"). In contrast to these computational results, artificial language learning experiments suggest that distributional statistics alone do not facilitate learning of linguistic categories. However, experiments in this paradigm expose participants to entirely novel words, whereas real language learners encounter input that contains some known words that are semantically organized. In three experiments, we show that (a) the presence of familiar semantic reference points facilitates distributional learning and (b) this effect crucially depends both on the presence of known words and the adherence of these known words to some semantic organization. Copyright © 2016 Cognitive Science Society, Inc.

  17. Polish Semantic Parser

    Directory of Open Access Journals (Sweden)

    Agnieszka Grudzinska

    2000-01-01

    Full Text Available Amount of information transferred by computers grows very rapidly thus outgrowing the average man's capability of reception. It implies computer programs increase in the demand for which would be able to perform an introductory classitication or even selection of information directed to a particular receiver. Due to the complexity of the problem, we restricted it to understanding short newspaper notes. Among many conceptions formulated so far, the conceptual dependency worked out by Roger Schank has been chosen. It is a formal language of description of the semantics of pronouncement integrated with a text understanding algorithm. Substantial part of each text transformation system is a semantic parser of the Polish language. It is a module, which as the first and the only one has an access to the text in the Polish language. lt plays the role of an element, which finds relations between words of the Polish language and the formal registration. It translates sentences written in the language used by people into the language theory. The presented structure of knowledge units and the shape of understanding process algorithms are universal by virtue of the theory. On the other hand the defined knowledge units and the rules used in the algorithms ure only examples because they are constructed in order to understand short newspaper notes.

  18. Latent semantic analysis.

    Science.gov (United States)

    Evangelopoulos, Nicholas E

    2013-11-01

    This article reviews latent semantic analysis (LSA), a theory of meaning as well as a method for extracting that meaning from passages of text, based on statistical computations over a collection of documents. LSA as a theory of meaning defines a latent semantic space where documents and individual words are represented as vectors. LSA as a computational technique uses linear algebra to extract dimensions that represent that space. This representation enables the computation of similarity among terms and documents, categorization of terms and documents, and summarization of large collections of documents using automated procedures that mimic the way humans perform similar cognitive tasks. We present some technical details, various illustrative examples, and discuss a number of applications from linguistics, psychology, cognitive science, education, information science, and analysis of textual data in general. WIREs Cogn Sci 2013, 4:683-692. doi: 10.1002/wcs.1254 CONFLICT OF INTEREST: The author has declared no conflicts of interest for this article. For further resources related to this article, please visit the WIREs website. © 2013 John Wiley & Sons, Ltd.

  19. Semantics and pragmatics.

    Science.gov (United States)

    McNally, Louise

    2013-05-01

    The fields of semantics and pragmatics are devoted to the study of conventionalized and context- or use-dependent aspects of natural language meaning, respectively. The complexity of human language as a semiotic system has led to considerable debate about how the semantics/pragmatics distinction should be drawn, if at all. This debate largely reflects contrasting views of meaning as a property of linguistic expressions versus something that speakers do. The fact that both views of meaning are essential to a complete understanding of language has led to a variety of efforts over the last 40 years to develop better integrated and more comprehensive theories of language use and interpretation. The most important advances have included the adaptation of propositional analyses of declarative sentences to interrogative, imperative and exclamative forms; the emergence of dynamic, game theoretic, and multi-dimensional theories of meaning; and the development of various techniques for incorporating context-dependent aspects of content into representations of context-invariant content with the goal of handling phenomena such as vagueness resolution, metaphor, and metonymy. WIREs Cogn Sci 2013, 4:285-297. doi: 10.1002/wcs.1227 For further resources related to this article, please visit the WIREs website. The authors declare no conflict of interest. Copyright © 2013 John Wiley & Sons, Ltd.

  20. CI-Miner: A Semantic Methodology to Integrate Scientists, Data and Documents through the Use of Cyber-Infrastructure

    Science.gov (United States)

    Pinheiro da Silva, P.; CyberShARE Center of Excellence

    2011-12-01

    Scientists today face the challenge of rethinking the manner in which they document and make available their processes and data in an international cyber-infrastructure of shared resources. Some relevant examples of new scientific practices in the realm of computational and data extraction sciences include: large scale data discovery; data integration; data sharing across distinct scientific domains, systematic management of trust and uncertainty; and comprehensive support for explaining processes and results. This talk introduces CI-Miner - an innovative hands-on, open-source, community-driven methodology to integrate these new scientific practices. It has been developed in collaboration with scientists, with the purpose of capturing, storing and retrieving knowledge about scientific processes and their products, thereby further supporting a new generation of science techniques based on data exploration. CI-Miner uses semantic annotations in the form of W3C Ontology Web Language-based ontologies and Proof Markup Language (PML)-based provenance to represent knowledge. This methodology specializes in general-purpose ontologies, projected into workflow-driven ontologies(WDOs) and into semantic abstract workflows (SAWs). Provenance in PML is CI-Miner's integrative component, which allows scientists to retrieve and reason with the knowledge represented in these new semantic documents. It serves additionally as a platform to share such collected knowledge with the scientific community participating in the international cyber-infrastructure. The integrated semantic documents that are tailored for the use of human epistemic agents may also be utilized by machine epistemic agents, since the documents are based on W3C Resource Description Framework (RDF) notation. This talk is grounded upon interdisciplinary lessons learned through the use of CI-Miner in support of government-funded national and international cyber-infrastructure initiatives in the areas of geo

  1. Pipeline to upgrade the genome annotations

    Directory of Open Access Journals (Sweden)

    Lijin K. Gopi

    2017-12-01

    Full Text Available Current era of functional genomics is enriched with good quality draft genomes and annotations for many thousands of species and varieties with the support of the advancements in the next generation sequencing technologies (NGS. Around 25,250 genomes, of the organisms from various kingdoms, are submitted in the NCBI genome resource till date. Each of these genomes was annotated using various tools and knowledge-bases that were available during the period of the annotation. It is obvious that these annotations will be improved if the same genome is annotated using improved tools and knowledge-bases. Here we present a new genome annotation pipeline, strengthened with various tools and knowledge-bases that are capable of producing better quality annotations from the consensus of the predictions from different tools. This resource also perform various additional annotations, apart from the usual gene predictions and functional annotations, which involve SSRs, novel repeats, paralogs, proteins with transmembrane helices, signal peptides etc. This new annotation resource is trained to evaluate and integrate all the predictions together to resolve the overlaps and ambiguities of the boundaries. One of the important highlights of this resource is the capability of predicting the phylogenetic relations of the repeats using the evolutionary trace analysis and orthologous gene clusters. We also present a case study, of the pipeline, in which we upgrade the genome annotation of Nelumbo nucifera (sacred lotus. It is demonstrated that this resource is capable of producing an improved annotation for a better understanding of the biology of various organisms.

  2. X-Informatics: Practical Semantic Science

    Science.gov (United States)

    Borne, K. D.

    2009-12-01

    The discipline of data science is merging with multiple science disciplines to form new X-informatics research disciplines. They are almost too numerous to name, but they include geoinformatics, bioinformatics, cheminformatics, biodiversity informatics, ecoinformatics, materials informatics, and the emerging discipline of astroinformatics. Within any X-informatics discipline, the information granules are unique to that discipline -- e.g., gene sequences in bio, the sky object in astro, and the spatial object in geo (such as points, lines, and polygons in the vector model, and pixels in the raster model). Nevertheless the goals are similar: transparent data re-use across subdisciplines and within education settings, information and data integration and fusion, personalization of user interactions with the data collection, semantic search and retrieval, and knowledge discovery. The implementation of an X-informatics framework enables these semantic e-science research goals. We describe the concepts, challenges, and new developments associated with the new discipline of astroinformatics, and how geoinformatics provides valuable lessons learned and a model for practical semantic science within a traditional science discipline through the accretion of data science methodologies (such as formal metadata creation, data models, data mining, information retrieval, knowledge engineering, provenance, taxonomies, and ontologies). The emerging concept of data-as-a-service (DaaS) builds upon the concept of smart data (or data DNA) for intelligent data management, automated workflows, and intelligent processing. Smart data, defined through X-informatics, enables several practical semantic science use cases, including self-discovery, data intelligence, automatic recommendations, relevance analysis, dimension reduction, feature selection, constraint-based mining, interdisciplinary data re-use, knowledge-sharing, data use in education, and more. We describe these concepts within the

  3. SCALEUS: Semantic Web Services Integration for Biomedical Applications.

    Science.gov (United States)

    Sernadela, Pedro; González-Castro, Lorena; Oliveira, José Luís

    2017-04-01

    In recent years, we have witnessed an explosion of biological data resulting largely from the demands of life science research. The vast majority of these data are freely available via diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This type of approach has achieved great results in the last few years, but proved to be unfeasible when information needs to be combined or shared among different and scattered sources. During recent years, many of these data distribution challenges have been solved with the adoption of semantic web. Despite the evident benefits of this technology, its adoption introduced new challenges related with the migration process, from existent systems to the semantic level. To facilitate this transition, we have developed Scaleus, a semantic web migration tool that can be deployed on top of traditional systems in order to bring knowledge, inference rules, and query federation to the existent data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers in the creation process of new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/ .

  4. BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments.

    Science.gov (United States)

    López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F

    2013-07-01

    Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products

    Directory of Open Access Journals (Sweden)

    Mingxin Gan

    2014-01-01

    Full Text Available Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may significantly overestimate semantic similarity between genes that are actually not functionally related, thereby yielding misleading results in applications. To overcome this limitation, we propose to represent a gene product as a vector that is composed of information contents of gene ontology terms annotated for the gene product, and we suggest calculating similarity between two gene products as the relatedness of their corresponding vectors using three measures: Pearson’s correlation coefficient, cosine similarity, and the Jaccard index. We focus on the biological process domain of the gene ontology and annotations of yeast proteins to study the effectiveness of the proposed measures. Results show that semantic similarity scores calculated using the proposed measures are more consistent with known biological knowledge than those derived using a list of existing methods, suggesting the effectiveness of our method in characterizing functional relationships between gene products.

  6. NegGOA: negative GO annotations selection using ontology structure.

    Science.gov (United States)

    Fu, Guangyuan; Wang, Jun; Yang, Bo; Yu, Guoxian

    2016-10-01

    Predicting the biological functions of proteins is one of the key challenges in the post-genomic era. Computational models have demonstrated the utility of applying machine learning methods to predict protein function. Most prediction methods explicitly require a set of negative examples-proteins that are known not carrying out a particular function. However, Gene Ontology (GO) almost always only provides the knowledge that proteins carry out a particular function, and functional annotations of proteins are incomplete. GO structurally organizes more than tens of thousands GO terms and a protein is annotated with several (or dozens) of these terms. For these reasons, the negative examples of a protein can greatly help distinguishing true positive examples of the protein from such a large candidate GO space. In this paper, we present a novel approach (called NegGOA) to select negative examples. Specifically, NegGOA takes advantage of the ontology structure, available annotations and potentiality of additional annotations of a protein to choose negative examples of the protein. We compare NegGOA with other negative examples selection algorithms and find that NegGOA produces much fewer false negatives than them. We incorporate the selected negative examples into an efficient function prediction model to predict the functions of proteins in Yeast, Human, Mouse and Fly. NegGOA also demonstrates improved accuracy than these comparing algorithms across various evaluation metrics. In addition, NegGOA is less suffered from incomplete annotations of proteins than these comparing methods. The Matlab and R codes are available at https://sites.google.com/site/guoxian85/neggoa gxyu@swu.edu.cn Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Annotating temporal information in clinical narratives.

    Science.gov (United States)

    Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

    2013-12-01

    Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. Estimating the annotation error rate of curated GO database sequence annotations

    Directory of Open Access Journals (Sweden)

    Brown Alfred L

    2007-05-01

    Full Text Available Abstract Background Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO sequence database (GOSeqLite. This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006 at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS had an estimated error rate of 49%. Conclusion While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.

  9. E-Learning System Overview Based on Semantic Web

    Science.gov (United States)

    Alsultanny, Yas A.

    2006-01-01

    The challenge of the semantic web is the provision of distributed information with well-defined meaning, understandable for different parties. e-Learning is efficient task relevant and just-in-time learning grown from the learning requirements of the new dynamically changing, distributed business world. In this paper we design an e-Learning system…

  10. Understanding and Teaching the Semantics of Terrorism: An Alternative Perspective.

    Science.gov (United States)

    Long, Kenneth J.

    1990-01-01

    Critiques conventional definitions of terrorism. Advocates sensitizing students to the semantics of terrorism and teaching skepticism of leaders who manipulate such concepts. Recommends using historical case studies to clarify issues, inform students about state and state-sponsored terrorism, and challenge students' preconceptions. Includes a…

  11. Semantic Linking and Contextualization for Social Forensic Text Analysis

    NARCIS (Netherlands)

    Ren, Z.; van Dijk, D.; Graus, D.; van der Knaap, N.; Henseler, H.; de Rijke, M.; Brynielsson, J.; Johansson, F.

    2013-01-01

    With the development of social media, forensic text analysis is becoming more and more challenging as forensic analysts have begun to include this information source in their practice. In this paper, we report on our recent work related to semantic search in e-discovery and propose the use of entity

  12. A Calculus of Circular Proofs and its Categorical Semantics

    DEFF Research Database (Denmark)

    Santocanale, Luigi

    2002-01-01

    We present a calculus of “circular proofs”: the graph underlying a proof is not a finite tree but instead it is allowed to contain a certain amount of cycles.The main challenge in developing a theory for the calculus is to define the semantics of proofs, since the usual method by induction...

  13. A Calculus of Circular Proofs and its Categorical Semantics

    DEFF Research Database (Denmark)

    Santocanale, Luigi

    2002-01-01

    We present a calculus of "circular proofs": the graph underlying a proof is not a finite tree but instead it is allowed to contain a certain amount of cycles.The main challenge in developing a theory for the calculus is to define the semantics of proofs, since the usual method by induction...

  14. Engineering semantic-based interactive multi-device web applications

    NARCIS (Netherlands)

    Bellekens, P.A.E.; Sluijs, van der K.A.M.; Aroyo, L.M.; Houben, G.J.P.M.; Baresi, L.; Fraternali, P.; Houben, G.J.

    2007-01-01

    To build high-quality personalized Web applications developers have to deal with a number of complex problems. We look at the growing class of personalized Web Applications that share three characteristic challenges. Firstly, the semantic problem of how to enable content reuse and integration.

  15. ANNOTATION SUPPORTED OCCLUDED OBJECT TRACKING

    Directory of Open Access Journals (Sweden)

    Devinder Kumar

    2012-08-01

    Full Text Available Tracking occluded objects at different depths has become as extremely important component of study for any video sequence having wide applications in object tracking, scene recognition, coding, editing the videos and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied on 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60cm, 80cm and 100cm respectively. Another experiment is performed on tracking humans with similar depth to authenticate the results. The paper also computes the frame by frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and further correcting the error leading to flawless tracking. This can be of great interest to computer scientists while designing surveillance systems etc.

  16. Annotating and Interpreting Linear and Cyclic Peptide Tandem Mass Spectra.

    Science.gov (United States)

    Niedermeyer, Timo Horst Johannes

    2016-01-01

    Nonribosomal peptides often possess pronounced bioactivity, and thus, they are often interesting hit compounds in natural product-based drug discovery programs. Their mass spectrometric characterization is difficult due to the predominant occurrence of non-proteinogenic monomers and, especially in the case of cyclic peptides, the complex fragmentation patterns observed. This makes nonribosomal peptide tandem mass spectra annotation challenging and time-consuming. To meet this challenge, software tools for this task have been developed. In this chapter, the workflow for using the software mMass for the annotation of experimentally obtained peptide tandem mass spectra is described. mMass is freely available (http://www.mmass.org), open-source, and the most advanced and user-friendly software tool for this purpose. The software enables the analyst to concisely annotate and interpret tandem mass spectra of linear and cyclic peptides. Thus, it is highly useful for accelerating the structure confirmation and elucidation of cyclic as well as linear peptides and depsipeptides.

  17. Microsyntactic Annotation of Corpora and its Use in Computational Linguistics Tasks

    Directory of Open Access Journals (Sweden)

    Iomdin Leonid

    2017-12-01

    Full Text Available Microsyntax is a linguistic discipline dealing with idiomatic elements whose important properties are strongly related to syntax. In a way, these elements may be viewed as transitional entities between the lexicon and the grammar, which explains why they are often underrepresented in both of these resource types: the lexicographer fails to see such elements as full-fledged lexical units, while the grammarian finds them too specific to justify the creation of individual well-developed rules. As a result, such elements are poorly covered by linguistic models used in advanced modern computational linguistic tasks like high-quality machine translation or deep semantic analysis. A possible way to mend the situation and improve the coverage and adequate treatment of microsyntactic units in linguistic resources is to develop corpora with microsyntactic annotation, closely linked to specially designed lexicons. The paper shows how this task is solved in the deeply annotated corpus of Russian, SynTagRus.

  18. A Methodology for the Development of RESTful Semantic Web Services for Gene Expression Analysis.

    Science.gov (United States)

    Guardia, Gabriela D A; Pires, Luís Ferreira; Vêncio, Ricardo Z N; Malmegrim, Kelen C R; de Farias, Cléver R G

    2015-01-01

    Gene expression studies are generally performed through multi-step analysis processes, which require the integrated use of a number of analysis tools. In order to facilitate tool/data integration, an increasing number of analysis tools have been developed as or adapted to semantic web services. In recent years, some approaches have been defined for the development and semantic annotation of web services created from legacy software tools, but these approaches still present many limitations. In addition, to the best of our knowledge, no suitable approach has been defined for the functional genomics domain. Therefore, this paper aims at defining an integrated methodology for the implementation of RESTful semantic web services created from gene expression analysis tools and the semantic annotation of such services. We have applied our methodology to the development of a number of services to support the analysis of different types of gene expression data, including microarray and RNASeq. All developed services are publicly available in the Gene Expression Analysis Services (GEAS) Repository at http://dcm.ffclrp.usp.br/lssb/geas. Additionally, we have used a number of the developed services to create different integrated analysis scenarios to reproduce parts of two gene expression studies documented in the literature. The first study involves the analysis of one-color microarray data obtained from multiple sclerosis patients and healthy donors. The second study comprises the analysis of RNA-Seq data obtained from melanoma cells to investigate the role of the remodeller BRG1 in the proliferation and morphology of these cells. Our methodology provides concrete guidelines and technical details in order to facilitate the systematic development of semantic web services. Moreover, it encourages the development and reuse of these services for the creation of semantically integrated solutions for gene expression analysis.

  19. A Methodology for the Development of RESTful Semantic Web Services for Gene Expression Analysis.

    Directory of Open Access Journals (Sweden)

    Gabriela D A Guardia

    Full Text Available Gene expression studies are generally performed through multi-step analysis processes, which require the integrated use of a number of analysis tools. In order to facilitate tool/data integration, an increasing number of analysis tools have been developed as or adapted to semantic web services. In recent years, some approaches have been defined for the development and semantic annotation of web services created from legacy software tools, but these approaches still present many limitations. In addition, to the best of our knowledge, no suitable approach has been defined for the functional genomics domain. Therefore, this paper aims at defining an integrated methodology for the implementation of RESTful semantic web services created from gene expression analysis tools and the semantic annotation of such services. We have applied our methodology to the development of a number of services to support the analysis of different types of gene expression data, including microarray and RNASeq. All developed services are publicly available in the Gene Expression Analysis Services (GEAS Repository at http://dcm.ffclrp.usp.br/lssb/geas. Additionally, we have used a number of the developed services to create different integrated analysis scenarios to reproduce parts of two gene expression studies documented in the literature. The first study involves the analysis of one-color microarray data obtained from multiple sclerosis patients and healthy donors. The second study comprises the analysis of RNA-Seq data obtained from melanoma cells to investigate the role of the remodeller BRG1 in the proliferation and morphology of these cells. Our methodology provides concrete guidelines and technical details in order to facilitate the systematic development of semantic web services. Moreover, it encourages the development and reuse of these services for the creation of semantically integrated solutions for gene expression analysis.

  20. Treating metadata as annotations: separating the content markup from the content

    Directory of Open Access Journals (Sweden)

    Fredrik Paulsson

    2007-11-01

    Full Text Available The use of digital learning resources creates an increasing need for semantic metadata, describing the whole resource, as well as parts of resources. Traditionally, schemas such as Text Encoding Initiative (TEI have been used to add semantic markup for parts of resources. This is not sufficient for use in a ”metadata ecology”, where metadata is distributed, coherent to different Application Profiles, and added by different actors. A new methodology, where metadata is “pointed in” as annotations, using XPointers, and RDF is proposed. A suggestion for how such infrastructure can be implemented, using existing open standards for metadata, and for the web is presented. We argue that such methodology and infrastructure is necessary to realize the decentralized metadata infrastructure needed for a “metadata ecology".

  1. Semantic attributes based texture generation

    Science.gov (United States)

    Chi, Huifang; Gan, Yanhai; Qi, Lin; Dong, Junyu; Madessa, Amanuel Hirpa

    2018-04-01

    Semantic attributes are commonly used for texture description. They can be used to describe the information of a texture, such as patterns, textons, distributions, brightness, and so on. Generally speaking, semantic attributes are more concrete descriptors than perceptual features. Therefore, it is practical to generate texture images from semantic attributes. In this paper, we propose to generate high-quality texture images from semantic attributes. Over the last two decades, several works have been done on texture synthesis and generation. Most of them focusing on example-based texture synthesis and procedural texture generation. Semantic attributes based texture generation still deserves more devotion. Gan et al. proposed a useful joint model for perception driven texture generation. However, perceptual features are nonobjective spatial statistics used by humans to distinguish different textures in pre-attentive situations. To give more describing information about texture appearance, semantic attributes which are more in line with human description habits are desired. In this paper, we use sigmoid cross entropy loss in an auxiliary model to provide enough information for a generator. Consequently, the discriminator is released from the relatively intractable mission of figuring out the joint distribution of condition vectors and samples. To demonstrate the validity of our method, we compare our method to Gan et al.'s method on generating textures by designing experiments on PTD and DTD. All experimental results show that our model can generate textures from semantic attributes.

  2. Applied and implied semantics in crystallographic publishing

    Directory of Open Access Journals (Sweden)

    McMahon Brian

    2012-08-01

    Full Text Available Abstract Background Crystallography is a data-rich, software-intensive scientific discipline with a community that has undertaken direct responsibility for publishing its own scientific journals. That community has worked actively to develop information exchange standards allowing readers of structure reports to access directly, and interact with, the scientific content of the articles. Results Structure reports submitted to some journals of the International Union of Crystallography (IUCr can be automatically validated and published through an efficient and cost-effective workflow. Readers can view and interact with the structures in three-dimensional visualization applications, and can access the experimental data should they wish to perform their own independent structure solution and refinement. The journals also layer on top of this facility a number of automated annotations and interpretations to add further scientific value. Conclusions The benefits of semantically rich information exchange standards have revolutionised the scholarly publishing process for crystallography, and establish a model relevant to many other physical science disciplines.

  3. Ratsnake: A Versatile Image Annotation Tool with Application to Computer-Aided Diagnosis

    Directory of Open Access Journals (Sweden)

    D. K. Iakovidis

    2014-01-01

    Full Text Available Image segmentation and annotation are key components of image-based medical computer-aided diagnosis (CAD systems. In this paper we present Ratsnake, a publicly available generic image annotation tool providing annotation efficiency, semantic awareness, versatility, and extensibility, features that can be exploited to transform it into an effective CAD system. In order to demonstrate this unique capability, we present its novel application for the evaluation and quantification of salient objects and structures of interest in kidney biopsy images. Accurate annotation identifying and quantifying such structures in microscopy images can provide an estimation of pathogenesis in obstructive nephropathy, which is a rather common disease with severe implication in children and infants. However a tool for detecting and quantifying the disease is not yet available. A machine learning-based approach, which utilizes prior domain knowledge and textural image features, is considered for the generation of an image force field customizing the presented tool for automatic evaluation of kidney biopsy images. The experimental evaluation of the proposed application of Ratsnake demonstrates its efficiency and effectiveness and promises its wide applicability across a variety of medical imaging domains.

  4. Mapping the Structure of Semantic Memory

    Science.gov (United States)

    Morais, Ana Sofia; Olsson, Henrik; Schooler, Lael J.

    2013-01-01

    Aggregating snippets from the semantic memories of many individuals may not yield a good map of an individual's semantic memory. The authors analyze the structure of semantic networks that they sampled from individuals through a new snowball sampling paradigm during approximately 6 weeks of 1-hr daily sessions. The semantic networks of individuals…

  5. Adaptive semantics visualization

    CERN Document Server

    Nazemi, Kawa

    2016-01-01

    This book introduces a novel approach for intelligent visualizations that adapts the different visual variables and data processing to human’s behavior and given tasks. Thereby a number of new algorithms and methods are introduced to satisfy the human need of information and knowledge and enable a usable and attractive way of information acquisition. Each method and algorithm is illustrated in a replicable way to enable the reproduction of the entire “SemaVis” system or parts of it. The introduced evaluation is scientifically well-designed and performed with more than enough participants to validate the benefits of the methods. Beside the introduced new approaches and algorithms, readers may find a sophisticated literature review in Information Visualization and Visual Analytics, Semantics and information extraction, and intelligent and adaptive systems. This book is based on an awarded and distinguished doctoral thesis in computer science.

  6. Communication of Semantic Properties

    DEFF Research Database (Denmark)

    Lenau, Torben Anker; Boelskifte, Per

    2004-01-01

    The selection of materials and planning for production play a key role for the design of physical products. Product function, appearance and expression are influenced by the chosen materials and how they are shaped. However these properties are not carried by the material itself, but by the speci......The selection of materials and planning for production play a key role for the design of physical products. Product function, appearance and expression are influenced by the chosen materials and how they are shaped. However these properties are not carried by the material itself...... processes. This working paper argues for the need for a commonly accepted terminology used to communicate semantic product properties. Designers and others involved in design processes are dependent of a sharp and clear verbal communication. Search facilities in computer programs for product and material...

  7. Learning for Semantic Parsing with Kernels under Various Forms of Supervision

    Science.gov (United States)

    2007-08-01

    natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing...sentences are semantically tractable. This indi- cates that Geoquery is more challenging domain for semantic parsing than ATIS. In the past, there have been a...Combining parsers. In Proceedings of the Conference on Em- pirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/ VLC -99), pp. 187–194

  8. Contextually guided very-high-resolution imagery classification with semantic segments

    Science.gov (United States)

    Zhao, Wenzhi; Du, Shihong; Wang, Qiao; Emery, William J.

    2017-10-01

    Contextual information, revealing relationships and dependencies between image objects, is one of the most important information for the successful interpretation of very-high-resolution (VHR) remote sensing imagery. Over the last decade, geographic object-based image analysis (GEOBIA) technique has been widely used to first divide images into homogeneous parts, and then to assign semantic labels according to the properties of image segments. However, due to the complexity and heterogeneity of VHR images, segments without semantic labels (i.e., semantic-free segments) generated with low-level features often fail to represent geographic entities (such as building roofs usually be partitioned into chimney/antenna/shadow parts). As a result, it is hard to capture contextual information across geographic entities when using semantic-free segments. In contrast to low-level features, "deep" features can be used to build robust segments with accurate labels (i.e., semantic segments) in order to represent geographic entities at higher levels. Based on these semantic segments, semantic graphs can be constructed to capture contextual information in VHR images. In this paper, semantic segments were first explored with convolutional neural networks (CNN) and a conditional random field (CRF) model was then applied to model the contextual information between semantic segments. Experimental results on two challenging VHR datasets (i.e., the Vaihingen and Beijing scenes) indicate that the proposed method is an improvement over existing image classification techniques in classification performance (overall accuracy ranges from 82% to 96%).

  9. Workspaces in the Semantic Web

    Science.gov (United States)

    Wolfe, Shawn R.; Keller, RIchard M.

    2005-01-01

    Due to the recency and relatively limited adoption of Semantic Web technologies. practical issues related to technology scaling have received less attention than foundational issues. Nonetheless, these issues must be addressed if the Semantic Web is to realize its full potential. In particular, we concentrate on the lack of scoping methods that reduce the size of semantic information spaces so they are more efficient to work with and more relevant to an agent's needs. We provide some intuition to motivate the need for such reduced information spaces, called workspaces, give a formal definition, and suggest possible methods of deriving them.

  10. Semantic acquisition games harnessing manpower for creating semantics

    CERN Document Server

    Šimko, Jakub

    2014-01-01

    A comprehensive and extensive review of state-of-the-art in semantics acquisition game (SAG) design A set of design patterns for SAG designers A set of case studies (real SAG projects) demonstrating the use of SAG design patterns

  11. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it possible to point out objects of interest within an image and add a verbal description. To create an annota- tion...

  12. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...

  13. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of seman- tic classes or dot types in English, Dan- ish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods...

  14. Black English Annotations for Elementary Reading Programs.

    Science.gov (United States)

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  15. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  16. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  17. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    Directory of Open Access Journals (Sweden)

    Qiandong Zeng

    2010-10-01

    Full Text Available Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  18. Annotation an effective device for student feedback: a critical review of the literature.

    Science.gov (United States)

    Ball, Elaine C

    2010-05-01

    of annotation influences student learning and assessment or, indeed, helps tutors to employ better annotative practices [Juwah, C., Macfarlane-Dick, D., Matthew, B., Nicol, D., Ross, D., Smith, B., 2004. Enhancing student learning through effective formative feedback. The Higher Education Academy, 1-40; Jewitt, C., Kress, G., 2005. English in classrooms: only write down what you need to know: annotation for what? English in Education, 39(1), 5-18]. There is little evidence on ways to heighten students' self-awareness when their essays are returned with annotated feedback [Storch, N., Tapper, J., 1997. Student annotations: what NNS and NS university students say about their own writing. Journal of Second Language Writing, 6(3), 245-265]. The literature review clarifies forms of annotation as feedback practice and offers a summary of the challenges and usefulness of annotation. Copyright 2009. Published by Elsevier Ltd.

  19. Essential Requirements for Digital Annotation Systems

    Directory of Open Access Journals (Sweden)

    ADRIANO, C. M.

    2012-06-01

    Full Text Available Digital annotation systems are usually based on partial scenarios and arbitrary requirements. Accidental and essential characteristics are usually mixed in non explicit models. Documents and annotations are linked together accidentally according to the current technology, allowing for the development of disposable prototypes, but not to the support of non-functional requirements such as extensibility, robustness and interactivity. In this paper we perform a careful analysis on the concept of annotation, studying the scenarios supported by digital annotation tools. We also derived essential requirements based on a classification of annotation systems applied to existing tools. The analysis performed and the proposed classification can be applied and extended to other type of collaborative systems.

  20. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  1. XSemantic: An Extension of LCA Based XML Semantic Search

    Science.gov (United States)

    Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai

    One of the most convenient ways to query XML data is a keyword search because it does not require any knowledge of XML structure or learning a new user interface. However, the keyword search is ambiguous. The users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to complete in terms of semantics. Firstly, the semantic term expansion, our system is robust from the ambiguous keywords by using the domain ontology. Secondly, to return semantic meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Thirdly, we present the semantic ranking that reflects the degree of similarity as well as the semantic relationship so that the search results with the higher relevance are presented to the users first. On the other hand, in the LCA and the proximity search approaches, we investigated the problem of information included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define our simple rule without any requirement on the schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.

  2. Ion implantation: an annotated bibliography

    International Nuclear Information System (INIS)

    Ting, R.N.; Subramanyam, K.

    1975-10-01

    Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping such as thermal diffusion and epitaxy, in view of its advantages such as high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented have also been listed separately

  3. Semantic Knowledge Representation (SKR) API

    Data.gov (United States)

    U.S. Department of Health & Human Services — The SKR Project was initiated at NLM in order to develop programs to provide usable semantic representation of biomedical free text by building on resources...

  4. Problem Solving with General Semantics.

    Science.gov (United States)

    Hewson, David

    1996-01-01

    Discusses how to use general semantics formulations to improve problem solving at home or at work--methods come from the areas of artificial intelligence/computer science, engineering, operations research, and psychology. (PA)

  5. SCC: Semantic Context Cascade for Efficient Action Detection

    KAUST Repository

    Heilbron, Fabian Caba

    2017-11-09

    Despite the recent advances in large-scale video analysis, action detection remains as one of the most challenging unsolved problems in computer vision. This snag is in part due to the large volume of data that needs to be analyzed to detect actions in videos. Existing approaches have mitigated the computational cost, but still, these methods lack rich high-level semantics that helps them to localize the actions quickly. In this paper, we introduce a Semantic Cascade Context (SCC) model that aims to detect action in long video sequences. By embracing semantic priors associated with human activities, SCC produces high-quality class-specific action proposals and prune unrelated activities in a cascade fashion. Experimental results in ActivityNet unveils that SCC achieves state-of-the-art performance for action detection while operating at real time.

  6. An Architecture for Semantically Interoperable Electronic Health Records.

    Science.gov (United States)

    Toffanello, André; Gonçalves, Ricardo; Kitajima, Adriana; Puttini, Ricardo; Aguiar, Atualpa

    2017-01-01

    Despite the increasing adhesion of electronic health records, the challenge of semantic interoperability remains unsolved. The fact that different parties can exchange messages does not mean they can understand the underlying clinical meaning, therefore, it cannot be assumed or treated as a requirement. This work introduces an architecture designed to achieve semantic interoperability, in a way which organizations that follow different policies may still share medical information through a common infrastructure comparable to an ecosystem, whose organisms are exemplified within the Brazilian scenario. Nonetheless, the proposed approach describes a service-oriented design with modules adaptable to different contexts. We also discuss the establishment of an enterprise service bus to mediate a health infrastructure defined on top of international standards, such as openEHR and IHE. Moreover, we argue that, in order to achieve truly semantic interoperability in a wide sense, a proper profile must be published and maintained.

  7. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements....... The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine onset of head movements and to suggest which kind of head movement is taking place....

  8. NASA and The Semantic Web

    Science.gov (United States)

    Ashish, Naveen

    2005-01-01

    We provide an overview of several ongoing NASA endeavors based on concepts, systems, and technology from the Semantic Web arena. Indeed NASA has been one of the early adopters of Semantic Web Technology and we describe ongoing and completed R&D efforts for several applications ranging from collaborative systems to airspace information management to enterprise search to scientific information gathering and discovery systems at NASA.

  9. Are Some Semantic Changes Predictable?

    DEFF Research Database (Denmark)

    Schousboe, Steen

    2010-01-01

      Historical linguistics is traditionally concerned with phonology and syntax. With the exception of grammaticalization - the development of auxiliary verbs, the syntactic rather than localistic use of prepositions, etc. - semantic change has usually not been described as a result of regular...... developments, but only as specific meaning changes in individual words. This paper will suggest some regularities in semantic change, regularities which, like sound laws, have predictive power and can be tested against recorded languages....

  10. Efficient computation of argumentation semantics

    CERN Document Server

    Liao, Beishui

    2013-01-01

    Efficient Computation of Argumentation Semantics addresses argumentation semantics and systems, introducing readers to cutting-edge decomposition methods that drive increasingly efficient logic computation in AI and intelligent systems. Such complex and distributed systems are increasingly used in the automation and transportation systems field, and particularly autonomous systems, as well as more generic intelligent computation research. The Series in Intelligent Systems publishes titles that cover state-of-the-art knowledge and the latest advances in research and development in intelligen

  11. Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud

    Directory of Open Access Journals (Sweden)

    Bahar Sateli

    2015-12-01

    Full Text Available Motivation. Finding relevant scientific literature is one of the essential tasks researchers are facing on a daily basis. Digital libraries and web information retrieval techniques provide rapid access to a vast amount of scientific literature. However, no further automated support is available that would enable fine-grained access to the knowledge ‘stored’ in these documents. The emerging domain of Semantic Publishing aims at making scientific knowledge accessible to both humans and machines, by adding semantic annotations to content, such as a publication’s contributions, methods, or application domains. However, despite the promises of better knowledge access, the manual annotation of existing research literature is prohibitively expensive for wide-spread adoption. We argue that a novel combination of three distinct methods can significantly advance this vision in a fully-automated way: (i Natural Language Processing (NLP for Rhetorical Entity (RE detection; (ii Named Entity (NE recognition based on the Linked Open Data (LOD cloud; and (iii automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in documents with the machine-readable LOD cloud.Results. We present a complete workflow to transform scientific literature into a semantic knowledge base, based on the W3C standards RDF and RDFS. A text mining pipeline, implemented based on the GATE framework, automatically extracts rhetorical entities of type Claims and Contributions from full-text scientific literature. These REs are further enriched with named entities, represented as URIs to the linked open data cloud, by integrating the DBpedia Spotlight tool into our workflow. Text mining results are stored in a knowledge base through a flexible export process that provides for a dynamic mapping of semantic annotations to LOD vocabularies through rules stored in the knowledge base. We created a gold standard corpus from computer

  12. Graph-based Operational Semantics of a Lazy Functional Languages

    DEFF Research Database (Denmark)

    Rose, Kristoffer Høgsbro

    1992-01-01

    Presents Graph Operational Semantics (GOS): a semantic specification formalism based on structural operational semantics and term graph rewriting. Demonstrates the method by specifying the dynamic ...

  13. COTARD SYNDROME IN SEMANTIC DEMENTIA

    Science.gov (United States)

    Mendez, Mario F.; Ramírez-Bermúdez, Jesús

    2011-01-01

    Background Semantic dementia is a neurodegenerative disorder characterized by the loss of meaning of words or concepts. semantic dementia can offer potential insights into the mechanisms of content-specific delusions. Objective The authors present a rare case of semantic dementia with Cotard syndrome, a delusion characterized by nihilism or self-negation. Method The semantic deficits and other features of semantic dementia were evaluated in relation to the patient's Cotard syndrome. Results Mrs. A developed the delusional belief that she was wasting and dying. This occurred after she lost knowledge for her somatic discomforts and sensations and for the organs that were the source of these sensations. Her nihilistic beliefs appeared to emerge from her misunderstanding of her somatic sensations. Conclusion This unique patient suggests that a mechanism for Cotard syndrome is difficulty interpreting the nature and source of internal pains and sensations. We propose that loss of semantic knowledge about one's own body may lead to the delusion of nihilism or death. PMID:22054629

  14. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.

  15. Assessment of disease named entity recognition on a corpus of annotated sentences.

    Science.gov (United States)

    Jimeno, Antonio; Jimenez-Ruiz, Ernesto; Lee, Vivian; Gaudan, Sylvain; Berlanga, Rafael; Rebholz-Schuhmann, Dietrich

    2008-04-11

    In recent years, the recognition of semantic types from the biomedical scientific literature has been focused on named entities like protein and gene names (PGNs) and gene ontology terms (GO terms). Other semantic types like diseases have not received the same level of attention. Different solutions have been proposed to identify disease named entities in the scientific literature. While matching the terminology with language patterns suffers from low recall (e.g., Whatizit) other solutions make use of morpho-syntactic features to better cover the full scope of terminological variability (e.g., MetaMap). Currently, MetaMap that is provided from the National Library of Medicine (NLM) is the state of the art solution for the annotation of concepts from UMLS (Unified Medical Language System) in the literature. Nonetheless, its performance has not yet been assessed on an annotated corpus. In addition, little effort has been invested so far to generate an annotated dataset that links disease entities in text to disease entries in a database, thesaurus or ontology and that could serve as a gold standard to benchmark text mining solutions. As part of our research work, we have taken a corpus that has been delivered in the past for the identification of associations of genes to diseases based on the UMLS Metathesaurus and we have reprocessed and re-annotated the corpus. We have gathered annotations for disease entities from two curators, analyzed their disagreement (0.51 in the kappa-statistic) and composed a single annotated corpus for public use. Thereafter, three solutions for disease named entity recognition including MetaMap have been applied to the corpus to automatically annotate it with UMLS Metathesaurus concepts. The resulting annotations have been benchmarked to compare their performance. The annotated corpus is publicly available at ftp://ftp.ebi.ac.uk/pub/software/textmining/corpora/diseases and can serve as a benchmark to other systems. In addition, we found

  16. COGNATE: comparative gene annotation characterizer.

    Science.gov (United States)

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https

  17. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

    We present two new NER datasets for Twitter; a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a......) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets, (b) state-of-the-art performance across various datasets can beobtained from crowdsourced annotations, making it more feasible...

  18. AIGO: Towards a unified framework for the Analysis and the Inter-comparison of GO functional annotations

    Directory of Open Access Journals (Sweden)

    Defoin-Platel Michael

    2011-11-01

    Full Text Available Abstract Background In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. Results In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo for the Analysis and the Inter-comparison of the products of Gene Ontology (GO annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. Conclusions This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.

  19. Process-oriented semantic web search

    CERN Document Server

    Tran, DT

    2011-01-01

    The book is composed of two main parts. The first part is a general study of Semantic Web Search. The second part specifically focuses on the use of semantics throughout the search process, compiling a big picture of Process-oriented Semantic Web Search from different pieces of work that target specific aspects of the process.In particular, this book provides a rigorous account of the concepts and technologies proposed for searching resources and semantic data on the Semantic Web. To collate the various approaches and to better understand what the notion of Semantic Web Search entails, this bo

  20. Bridging the Gap: Enriching YouTube Videos with Jazz Music Annotations

    Directory of Open Access Journals (Sweden)

    Stefan Balke

    2018-02-01

    Full Text Available Web services allow permanent access to music from all over the world. Especially in the case of web services with user-supplied content, e.g., YouTube™, the available metadata is often incomplete or erroneous. On the other hand, a vast amount of high-quality and musically relevant metadata has been annotated in research areas such as Music Information Retrieval (MIR. Although they have great potential, these musical annotations are often inaccessible to users outside the academic world. With our contribution, we want to bridge this gap by enriching publicly available multimedia content with musical annotations available in research corpora, while maintaining easy access to the underlying data. Our web-based tools offer researchers and music lovers novel possibilities to interact with and navigate through the content. In this paper, we consider a research corpus called the Weimar Jazz Database (WJD as an illustrating example scenario. The WJD contains various annotations related to famous jazz solos. First, we establish a link between the WJD annotations and corresponding YouTube videos employing existing retrieval techniques. With these techniques, we were able to identify 988 corresponding YouTube videos for 329 solos out of 456 solos contained in the WJD. We then embed the retrieved videos in a recently developed web-based platform and enrich the videos with solo transcriptions that are part of the WJD. Furthermore, we integrate publicly available data resources from the Semantic Web in order to extend the presented information, for example, with a detailed discography or artists-related information. Our contribution illustrates the potential of modern web-based technologies for the digital humanities, and novel ways for improving access and interaction with digitized multimedia content.

  1. Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

    Directory of Open Access Journals (Sweden)

    Merchant Sabeeha S

    2011-07-01

    Full Text Available Abstract Background Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. Description The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of

  2. The duplicated genes database: identification and functional annotation of co-localised duplicated genes across genomes.

    Directory of Open Access Journals (Sweden)

    Marion Ouedraogo

    Full Text Available BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on a same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Gene Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Gene Database can be freely accessed through the DGD website at http://dgd.genouest.org.

  3. Big Data Semantics

    NARCIS (Netherlands)

    Ceravolo, Paolo; Azzini, Antonia; Angelini, Marco; Catarci, Tiziana; Cudré-Mauroux, Philippe; Damiani, Ernesto; Mazak, Alexandra; van Keulen, Maurice; Jarrar, Mustafa; Santucci, Giuseppe; Sattler, Kai-Uwe; Scannapieco, Monica; Wimmer, Manuel; Wrembel, Robert; Zaraket, Fadi

    2018-01-01

    Big Data technology has discarded traditional data modeling approaches as no longer applicable to distributed data processing. It is, however, largely recognized that Big Data impose novel challenges in data and infrastructure management. Indeed, multiple components and procedures must be

  4. Annotations to quantum statistical mechanics

    CERN Document Server

    Kim, In-Gee

    2018-01-01

    This book is a rewritten and annotated version of Leo P. Kadanoff and Gordon Baym’s lectures that were presented in the book Quantum Statistical Mechanics: Green’s Function Methods in Equilibrium and Nonequilibrium Problems. The lectures were devoted to a discussion on the use of thermodynamic Green’s functions in describing the properties of many-particle systems. The functions provided a method for discussing finite-temperature problems with no more conceptual difficulty than ground-state problems, and the method was equally applicable to boson and fermion systems and equilibrium and nonequilibrium problems. The lectures also explained nonequilibrium statistical physics in a systematic way and contained essential concepts on statistical physics in terms of Green’s functions with sufficient and rigorous details. In-Gee Kim thoroughly studied the lectures during one of his research projects but found that the unspecialized method used to present them in the form of a book reduced their readability. He st...

  5. Meteor showers an annotated catalog

    CERN Document Server

    Kronk, Gary W

    2014-01-01

    Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...

  6. Only time will tell - why temporal information is essential for our neuroscientific understanding of semantics.

    Science.gov (United States)

    Hauk, Olaf

    2016-08-01

    Theoretical developments about the nature of semantic representations and processes should be accompanied by a discussion of how these theories can be validated on the basis of empirical data. Here, I elaborate on the link between theory and empirical research, highlighting the need for temporal information in order to distinguish fundamental aspects of semantics. The generic point that fast cognitive processes demand fast measurement techniques has been made many times before, although arguably more often in the psychophysiological community than in the metabolic neuroimaging community. Many reviews on the neuroscience of semantics mostly or even exclusively focus on metabolic neuroimaging data. Following an analysis of semantics in terms of the representations and processes involved, I argue that fundamental theoretical debates about the neuroscience of semantics can only be concluded on the basis of data with sufficient temporal resolution. Any "semantic effect" may result from a conflation of long-term memory representations, retrieval and working memory processes, mental imagery, and episodic memory. This poses challenges for all neuroimaging modalities, but especially for those with low temporal resolution. It also throws doubt on the usefulness of contrasts between meaningful and meaningless stimuli, which may differ on a number of semantic and non-semantic dimensions. I will discuss the consequences of this analysis for research on the role of convergence zones or hubs and distributed modal brain networks, top-down modulation of task and context as well as interactivity between levels of the processing hierarchy, for example in the framework of predictive coding.

  7. Psychologizing the Semantics of Fiction

    Directory of Open Access Journals (Sweden)

    John Woods

    2010-04-01

    Full Text Available Psychologiser la sémantique de la fictionLes théoriciens sémantistes de la fiction cherchent typiquement à expliquer nos relations sémantiques au fictionnel dans le contexte plus général des théories de la référence, privilégiant une explication de la sémantique sur le psychologique. Dans cet article, nous défendons une dépendance inverse. Par l’éclaircissement de nos relations psychologiques au fictionnel, nous trouverons un guide pour savoir comment développer une sémantique de la fiction. S’ensuivra une esquisse de la sémantique.Semantic theorists of fiction typically look for an account of our semantic relations to the fictional within general-purpose theories of reference, privileging an explanation of the semantic over the psychological. In this paper, we counsel a reverse dependency. In sorting out our psychological relations to the fictional, there is useful guidance about how to proceed with the semantics of fiction. A sketch of the semantics follows.

  8. Semantic graphs and associative memories

    Science.gov (United States)

    Pomi, Andrés; Mizraji, Eduardo

    2004-12-01

    Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.

  9. ERPs, semantic processing and age.

    Science.gov (United States)

    Miyamoto, T; Katayama, J; Koyama, T

    1998-06-01

    ERPs (N400, LPC and CNV) were elicited in two sets of subjects grouped according to age (young vs. elderly) using a word-pair category matching paradigm. Each prime consisted of a Japanese noun (constructed from two to four characters of the Hiragana) followed by one Chinese character (Kanji) as the target, this latter representing one of five semantic categories. There were two equally probable target conditions: match or mismatch. Each target was preceded by a prime, either belonging to, or not belonging to, the same semantic category. The subjects were required to respond with a specified button press to the given target according to the condition. We found RTs to be longer in the elderly subjects and under the mismatch condition. N400 amplitude was reduced in the elderly subjects under the mismatch condition and there was no difference between match and mismatch response, which were similar in amplitude to that under match condition for the young subjects. In addition, the CNV amplitudes were larger in the elderly subjects. These results suggested that functional changes in semantic processing through aging (larger semantic networks and diffuse semantic activation) were the cause of this N400 reduction, attributing a subsidiary role to attentional disturbance. We also discuss the importance of taking age-related changes into consideration in clinical studies.

  10. The influence of annotation in graphical organizers

    NARCIS (Netherlands)

    Bezdan, Eniko; Kester, Liesbeth; Kirschner, Paul A.

    2013-01-01

    Bezdan, E., Kester, L., & Kirschner, P. A. (2012, 29-31 August). The influence of annotation in graphical organizers. Poster presented at the biannual meeting of the EARLI Special Interest Group Comprehension of Text and Graphics, Grenoble, France.

  11. An Informally Annotated Bibliography of Sociolinguistics.

    Science.gov (United States)

    Tannen, Deborah

    This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…

  12. The Community Junior College: An Annotated Bibliography.

    Science.gov (United States)

    Rarig, Emory W., Jr., Ed.

    This annotated bibliography on the junior college is arranged by topic: research tools, history, functions and purposes, organization and administration, students, programs, personnel, facilities, and research. It covers publications through the fall of 1965 and has an author index. (HH)

  13. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  14. Annotated Tsunami bibliography: 1962-1976

    International Nuclear Information System (INIS)

    Pararas-Carayannis, G.; Dong, B.; Farmer, R.

    1982-08-01

    This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts

  15. GRADUATE AND PROFESSIONAL EDUCATION, AN ANNOTATED BIBLIOGRAPHY.

    Science.gov (United States)

    HEISS, ANN M.; AND OTHERS

    THIS ANNOTATED BIBLIOGRAPHY CONTAINS REFERENCES TO GENERAL GRADUATE EDUCATION AND TO EDUCATION FOR THE FOLLOWING PROFESSIONAL FIELDS--ARCHITECTURE, BUSINESS, CLINICAL PSYCHOLOGY, DENTISTRY, ENGINEERING, LAW, LIBRARY SCIENCE, MEDICINE, NURSING, SOCIAL WORK, TEACHING, AND THEOLOGY. (HW)

  16. Microtask crowdsourcing for disease mention annotation in PubMed abstracts.

    Science.gov (United States)

    Good, Benjamin M; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2015-01-01

    Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses. Many biological natural language processing (BioNLP) projects attempt to address this challenge, but the state of the art still leaves much room for improvement. Progress in BioNLP research depends on large, annotated corpora for evaluating information extraction systems and training machine learning models. Traditionally, such corpora are created by small numbers of expert annotators often working over extended periods of time. Recent studies have shown that workers on microtask crowdsourcing platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of the AMT in capturing disease mentions in PubMed abstracts. We used the NCBI Disease corpus as a gold standard for refining and benchmarking our crowdsourcing protocol. After several iterations, we arrived at a protocol that reproduced the annotations of the 593 documents in the 'training set' of this gold standard with an overall F measure of 0.872 (precision 0.862, recall 0.883). The output can also be tuned to optimize for precision (max = 0.984 when recall = 0.269) or recall (max = 0.980 when precision = 0.436). Each document was completed by 15 workers, and their annotations were merged based on a simple voting method. In total 145 workers combined to complete all 593 documents in the span of 9 days at a cost of $.066 per abstract per worker. The quality of the annotations, as judged with the F measure, increases with the number of workers assigned to each task; however minimal performance gains were observed beyond 8 workers per task. These results add further evidence that microtask crowdsourcing can be a valuable tool for generating well-annotated corpora in BioNLP. Data produced for this analysis are available at http://figshare.com/articles/Disease_Mention_Annotation_with_Mechanical_Turk/1126402.

  17. BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

    Science.gov (United States)

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of the genome. It is especially efficient detecting unexpected genes horizontally acquired from bacterial or archaeal distant genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. BG7 current version – which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally in any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge amount of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak in which BG7 was used to get the first annotations right the next day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  18. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Pablo Pareja-Tobes

    Full Text Available BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of the genome. It is especially efficient detecting unexpected genes horizontally acquired from bacterial or archaeal distant genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. BG7 current version - which is developed in Java, takes advantage of Amazon Web Services (AWS cloud computing features, but it can also be run locally in any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge amount of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak in which BG7 was used to get the first annotations right the next day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future.

  19. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered as a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered as a functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and timeconsuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameters calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results. Finally

  20. Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases.

    Science.gov (United States)

    Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

    2011-07-01

    Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.

  1. Semantics-Based Intelligent Indexing and Retrieval of Digital Images - A Case Study

    Science.gov (United States)

    Osman, Taha; Thakker, Dhavalkumar; Schaefer, Gerald

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they typically rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this chapter we present a semantically enabled image annotation and retrieval engine that is designed to satisfy the requirements of commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matchmaking the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as presenting our initial thoughts on exploiting lexical databases for explicit semantic-based query expansion.

  2. Computational methods for corpus annotation and analysis

    CERN Document Server

    Lu, Xiaofei

    2014-01-01

    This book reviews computational tools for lexical, syntactic, semantic, pragmatic and discourse analysis, with instructions on how to obtain, install and use each tool. Covers studies using Natural Language Processing, and offers ideas for better integration.

  3. Fluid Annotations in a Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to l...... to layer fluid annotations and links on top of abitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required....

  4. Toward an E-Government Semantic Platform

    Science.gov (United States)

    Sbodio, Marco Luca; Moulin, Claude; Benamou, Norbert; Barthès, Jean-Paul

    This chapter describes the major aspects of an e-government platform in which semantics underpins more traditional technologies in order to enable new capabilities and to overcome technical and cultural challenges. The design and development of such an e-government Semantic Platform has been conducted with the financial support of the European Commission through the Terregov research project: "Impact of e-government on Territorial Government Services" (Terregov 2008). The goal of this platform is to let local government and government agencies offer online access to their services in an interoperable way, and to allow them to participate in orchestrated processes involving services provided by multiple agencies. Implementing a business process through an electronic procedure is indeed a core goal in any networked organization. However, the field of e-government brings specific constraints to the operations allowed in procedures, especially concerning the flow of private citizens' data: because of legal reasons in most countries, such data are allowed to circulate only from agency to agency directly. In order to promote transparency and responsibility in e-government while respecting the specific constraints on data flows, Terregov supports the creation of centrally controlled orchestrated processes; while the cross agencies data flows are centrally managed, data flow directly across agencies.

  5. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex with a high amount of repeats, genome duplication and tandem duplication. Gene encodes a wealth of information useful in studying organism and it is critical to have high quality and stable gene annotation. Thanks to advancement of sequencing technology, many plant species genomes have been sequenced and transcriptomes are also sequenced. To use these vastly large amounts of sequence data to make gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim with aid of a RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for detail. Here we present genome annotation of JGI flagship green plants produced by this pipeline plus Arabidopsis and rice except for chlamy which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and accessible via JGI Phytozome portal whose URL and front page snapshot are shown below.

  6. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of human genome. PMID:19594883

  7. Compiling Dictionaries Using Semantic Domains*

    Directory of Open Access Journals (Sweden)

    Ronald Moe

    2011-10-01

    Full Text Available

    Abstract: The task of providing dictionaries for all the world's languages is prodigious, re-quiring efficient techniques. The text corpus method cannot be used for minority languages lacking texts. To meet the need, the author has constructed a list of 1 600 semantic domains, which he has successfully used to collect words. In a workshop setting, a group of speakers can collect as many as 17 000 words in ten days. This method results in a classified word list that can be efficiently expanded into a full dictionary. The method works because the mental lexicon is a giant web or-ganized around key concepts. A semantic domain can be defined as an important concept together with the words directly related to it by lexical relations. A person can utilize the mental web to quickly jump from word to word within a domain. The author is developing a template for each domain to aid in collecting words and in de-scribing their semantics. Investigating semantics within the context of a domain yields many in-sights. The method permits the production of both alphabetically and semantically organized dic-tionaries. The list of domains is intended to be universal in scope and applicability. Perhaps due to universals of human experience and universals of linguistic competence, there are striking simi-larities in various lists of semantic domains developed for languages around the world. Using a standardized list of domains to classify multiple dictionaries opens up possibilities for cross-lin-guistic research into semantic and lexical universals.

    Keywords: SEMANTIC DOMAINS, SEMANTIC FIELDS, SEMANTIC CATEGORIES, LEX-ICAL RELATIONS, SEMANTIC PRIMITIVES, DOMAIN TEMPLATES, MENTAL LEXICON, SEMANTIC UNIVERSALS, MINORITY LANGUAGES, LEXICOGRAPHY

    Opsomming: Samestelling van woordeboeke deur gebruikmaking van se-mantiese domeine. Die taak van die voorsiening van woordeboeke aan al die tale van die wêreld is geweldig en vereis doeltreffende tegnieke. Die

  8. Ontology Matching with Semantic Verification.

    Science.gov (United States)

    Jean-Mary, Yves R; Shironoshita, E Patrick; Kabuka, Mansur R

    2009-09-01

    ASMOV (Automated Semantic Matching of Ontologies with Verification) is a novel algorithm that uses lexical and structural characteristics of two ontologies to iteratively calculate a similarity measure between them, derives an alignment, and then verifies it to ensure that it does not contain semantic inconsistencies. In this paper, we describe the ASMOV algorithm, and then present experimental results that measure its accuracy using the OAEI 2008 tests, and that evaluate its use with two different thesauri: WordNet, and the Unified Medical Language System (UMLS). These results show the increased accuracy obtained by combining lexical, structural and extensional matchers with semantic verification, and demonstrate the advantage of using a domain-specific thesaurus for the alignment of specialized ontologies.

  9. Behavior Modification Through Covert Semantic Desensitization

    Science.gov (United States)

    Hekmat, Hamid; Vanian, Daniel

    1971-01-01

    Results support the hypothesized relationship between meaning and phobia. Semantic desensitization techniques based on counter conditioning of meaning were significantly effective in altering the semantic value of the word from unpleasantness to neutrality. (Author)

  10. Relaxation as a Factor in Semantic Desensitization

    Science.gov (United States)

    Bechtel, James E.; McNamara, J. Regis

    1975-01-01

    Relaxation and semantic desensitization were used to alleviate the fear of phobic females. Results showed that semantic desensitization, alone or in combination with relaxation, failed to modify the evaluative meanings evoked by the feared object. (SE)

  11. Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries.

    Science.gov (United States)

    Lin, Ching-Heng; Wu, Nai-Yuan; Lai, Wei-Shao; Liou, Der-Ming

    2015-01-01

    Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, pgenerating entry-level interoperable clinical documents. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.comFor numbered affiliation see end of article.

  12. Biology Question Generation from a Semantic Network

    Science.gov (United States)

    Zhang, Lishan

    Science instructors need questions for use in exams, homework assignments, class discussions, reviews, and other instructional activities. Textbooks never have enough questions, so instructors must find them from other sources or generate their own questions. In order to supply instructors with biology questions, a semantic network approach was developed for generating open response biology questions. The generated questions were compared to professional authorized questions. To boost students' learning experience, adaptive selection was built on the generated questions. Bayesian Knowledge Tracing was used as embedded assessment of the student's current competence so that a suitable question could be selected based on the student's previous performance. A between-subjects experiment with 42 participants was performed, where half of the participants studied with adaptive selected questions and the rest studied with mal-adaptive order of questions. Both groups significantly improved their test scores, and the participants in adaptive group registered larger learning gains than participants in the control group. To explore the possibility of generating rich instructional feedback for machine-generated questions, a question-paragraph mapping task was identified. Given a set of questions and a list of paragraphs for a textbook, the goal of the task was to map the related paragraphs to each question. An algorithm was developed whose performance was comparable to human annotators. A multiple-choice question with high quality distractors (incorrect answers) can be pedagogically valuable as well as being much easier to grade than open-response questions. Thus, an algorithm was developed to generate good distractors for multiple-choice questions. The machine-generated multiple-choice questions were compared to human-generated questions in terms of three measures: question difficulty, question discrimination and distractor usefulness. By recruiting 200 participants from

  13. Tile-Level Annotation of Satellite Images Using Multi-Level Max-Margin Discriminative Random Field

    Directory of Open Access Journals (Sweden)

    Hong Sun

    2013-05-01

    Full Text Available This paper proposes a multi-level max-margin discriminative analysis (M3DA framework, which takes both coarse and fine semantics into consideration, for the annotation of high-resolution satellite images. In order to generate more discriminative topic-level features, the M3DA uses the maximum entropy discrimination latent Dirichlet Allocation (MedLDA model. Moreover, for improving the spatial coherence of visual words neglected by M3DA, conditional random field (CRF is employed to optimize the soft label field composed of multiple label posteriors. The framework of M3DA enables one to combine word-level features (generated by support vector machines and topic-level features (generated by MedLDA via the bag-of-words representation. The experimental results on high-resolution satellite images have demonstrated that, using the proposed method can not only obtain suitable semantic interpretation, but also improve the annotation performance by taking into account the multi-level semantics and the contextual information.

  14. A generalized notion of semantic independence

    DEFF Research Database (Denmark)

    Fränzle, Martin; Stengel, Bernhard von; Wittmüss, Arne

    1995-01-01

    For programs represented semantically as relations, a concept of semantic independence is defined that is more general than previously stated notions. It allows for shared input variables and irrelevant interference due to nondeterminism.......For programs represented semantically as relations, a concept of semantic independence is defined that is more general than previously stated notions. It allows for shared input variables and irrelevant interference due to nondeterminism....

  15. The semantic structure of gratitude

    Directory of Open Access Journals (Sweden)

    Smirnov, Alexander V.

    2016-06-01

    Full Text Available In the modern social and economic environment of Russia, gratitude might be considered an ambiguous phenomenon. It can have different meaning for a person in different contexts and can manifest itself differently as well (that is, as an expression of sincere feelings or as an element of corruption. In this respect it is topical to investigate the system of meanings and relationships that define the semantic space of gratitude. The goal of the study was the investigation and description of the content and structure of the semantic space of the gratitude phenomenon as well as the determination of male, female, age, and ethnic peculiarities of the expression of gratitude. The objective was achieved by using the semantic differential designed by the authors to investigate attitudes toward gratitude. This investigation was carried out with the participation of 184 respondents (Russians, Tatars, Ukrainians, Jews living in the Russian Federation, Belarus, Kazakhstan, Tajikistan, Israel, Australia, Canada, and the United Kingdom and identifying themselves as representatives of one of these nationalities. The structural components of gratitude were singled out by means of exploratory factor analysis of the empirical data from the designed semantic differential. Gender, age, and ethnic differences were differentiated by means of Student’s t-test. Gratitude can be represented by material and nonmaterial forms as well as by actions in response to help given. The empirical data allowed us to design the ethnically nonspecified semantic structure of gratitude. During the elaboration of the differential, semantic universals of gratitude, which constitute its psychosemantic content, were distinguished. Peculiarities of attitudes toward gratitude by those in different age and gender groups were revealed. Differences in the degree of manifestation of components of the psychosemantic structure of gratitude related to ethnic characteristics were not discovered

  16. Learning from Weak and Noisy Labels for Semantic Segmentation

    KAUST Repository

    Lu, Zhiwu

    2016-04-08

    A weakly supervised semantic segmentation (WSSS) method aims to learn a segmentation model from weak (image-level) as opposed to strong (pixel-level) labels. By avoiding the tedious pixel-level annotation process, it can exploit the unlimited supply of user-tagged images from media-sharing sites such as Flickr for large scale applications. However, these ‘free’ tags/labels are often noisy and few existing works address the problem of learning with both weak and noisy labels. In this work, we cast the WSSS problem into a label noise reduction problem. Specifically, after segmenting each image into a set of superpixels, the weak and potentially noisy image-level labels are propagated to the superpixel level resulting in highly noisy labels; the key to semantic segmentation is thus to identify and correct the superpixel noisy labels. To this end, a novel L1-optimisation based sparse learning model is formulated to directly and explicitly detect noisy labels. To solve the L1-optimisation problem, we further develop an efficient learning algorithm by introducing an intermediate labelling variable. Extensive experiments on three benchmark datasets show that our method yields state-of-the-art results given noise-free labels, whilst significantly outperforming the existing methods when the weak labels are also noisy.

  17. Learning from Weak and Noisy Labels for Semantic Segmentation

    KAUST Repository

    Lu, Zhiwu; Fu, Zhenyong; Xiang, Tao; Han, Peng; Wang, Liwei; Gao, Xin

    2016-01-01

    A weakly supervised semantic segmentation (WSSS) method aims to learn a segmentation model from weak (image-level) as opposed to strong (pixel-level) labels. By avoiding the tedious pixel-level annotation process, it can exploit the unlimited supply of user-tagged images from media-sharing sites such as Flickr for large scale applications. However, these ‘free’ tags/labels are often noisy and few existing works address the problem of learning with both weak and noisy labels. In this work, we cast the WSSS problem into a label noise reduction problem. Specifically, after segmenting each image into a set of superpixels, the weak and potentially noisy image-level labels are propagated to the superpixel level resulting in highly noisy labels; the key to semantic segmentation is thus to identify and correct the superpixel noisy labels. To this end, a novel L1-optimisation based sparse learning model is formulated to directly and explicitly detect noisy labels. To solve the L1-optimisation problem, we further develop an efficient learning algorithm by introducing an intermediate labelling variable. Extensive experiments on three benchmark datasets show that our method yields state-of-the-art results given noise-free labels, whilst significantly outperforming the existing methods when the weak labels are also noisy.

  18. Semantic Query Processing : Estimating Relational Purity

    NARCIS (Netherlands)

    Kalo, Jan-Christoph; Lofi, C.; Maseli, René Pascal; Balke, Wolf-Tilo; Leyer, M.

    2017-01-01

    The use of semantic information found in structured knowledge bases has become an integral part of the processing pipeline of modern intelligent in-
    formation systems. However, such semantic information is frequently insuffi-cient to capture the rich semantics demanded by the applications, and

  19. A Model for Semantic IS Standards

    NARCIS (Netherlands)

    Folmer, Erwin Johan Albert; Oude Luttighuis, Paul; van Hillegersberg, Jos

    2011-01-01

    We argue that, in order to suggest improvements of any kind to semantic information system (IS) standards, better understanding of the conceptual structure of semantic IS standard is required. This study develops a model for semantic IS standard, based on literature and expert knowledge. The model

  20. Measurement of Learning Process by Semantic Annotation Technique on Bloom's Taxonomy Vocabulary

    Science.gov (United States)

    Yanchinda, Jirawit; Yodmongkol, Pitipong; Chakpitak, Nopasit

    2016-01-01

    A lack of science and technology knowledge understanding of most rural people who had the highest education at elementary education level more than others level is unsuccessfully transferred appropriate technology knowledge for rural sustainable development. This study provides the measurement of the learning process by on Bloom's Taxonomy…

  1. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description.

    Science.gov (United States)

    Sahoo, Satya S; Valdez, Joshua; Rueschman, Michael

    2016-01-01

    Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility " for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data and it has been long used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by World Wide Web Consortium (W3C), to represent both: (a) data provenance, and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project.

  2. Semantic heterogeneity: comparing new semantic web approaches with those of digital libraries

    OpenAIRE

    Krause, Jürgen

    2008-01-01

    To demonstrate that newer developments in the semantic web community, particularly those based on ontologies (simple knowledge organization system and others) mitigate common arguments from the digital library (DL) community against participation in the Semantic web. The approach is a semantic web discussion focusing on the weak structure of the Web and the lack of consideration given to the semantic content during indexing. The points criticised by the semantic web and ontology approaches ar...

  3. Annotated chemical patent corpus: a gold standard for text mining.

    Directory of Open Access Journals (Sweden)

    Saber A Akhondi

    Full Text Available Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line break due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.

  4. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  5. IBRI-CASONTO: Ontology-based semantic search engine

    OpenAIRE

    Awny Sayed; Amal Al Muqrishi

    2017-01-01

    The vast availability of information, that added in a very fast pace, in the data repositories creates a challenge in extracting correct and accurate information. Which has increased the competition among developers in order to gain access to technology that seeks to understand the intent researcher and contextual meaning of terms. While the competition for developing an Arabic Semantic Search systems are still in their infancy, and the reason could be traced back to the complexity of Arabic ...

  6. Using Semantic Web Services for Context-Aware Mobile Applications

    OpenAIRE

    Sheshagiri , Mithun; Sadeh , Norman; Gandon , Fabien

    2004-01-01

    International audience; One way of overcoming the challenges associated with mobile and pervasive computing environments involves providing users with higher levels of automation. This in turn requires capturing the context within which the user operates. In this paper, we describe ongoing research aimed leveraging Semantic Web Services in support of context awareness. This includes modeling sources of contextual information as web services that can be automatically discovered and accessed by...

  7. Towards semantic software engineering enviroments

    NARCIS (Netherlands)

    Falbo, R.A.; Guizzardi, G.; Natali, A.; Bertollo, G.; Ruy, F.; Mian, P.; Tortora, G.; Chang, S.K.

    2002-01-01

    Software tools processing partially common set of data should share an understanding of what these data mean. Since ontologies have been used to express formally a shared understanding of information, we argue that they are a way towards Semantic SEEs. In this paper we discuss an ontology-based

  8. The Semantic Web in Education

    Science.gov (United States)

    Ohler, Jason

    2008-01-01

    The semantic web or Web 3.0 makes information more meaningful to people by making it more understandable to machines. In this article, the author examines the implications of Web 3.0 for education. The author considers three areas of impact: knowledge construction, personal learning network maintenance, and personal educational administration.…

  9. Quest for a Computerised Semantics.

    Science.gov (United States)

    Leslie, Adrian R.

    The objective of this thesis was to colligate the various strands of research in the literature of computational linguistics that have to do with the computational treatment of semantic content so as to encode it into a computerized dictionary. In chapter 1 the course of mechanical translation (1947-1960) and quantitative linguistics is traced to…

  10. Russian nominal semantics and morphology

    DEFF Research Database (Denmark)

    Nørgård-Sørensen, Jens

    The principal idea behind this book is that lexis and grammar make up a single coherent structure. It is shown that the grammatical patterns of the different classes of Russian nominals are closely interconnected. They can be described as reflecting a limited set of semantic distinctions which ar...... or weaker, of Russian. Students will see a pattern in what is traditionally described as disparate subsystems, and linguists may be inspired to consider the theoretical points concerning language as a coherent system, determining usage.......The principal idea behind this book is that lexis and grammar make up a single coherent structure. It is shown that the grammatical patterns of the different classes of Russian nominals are closely interconnected. They can be described as reflecting a limited set of semantic distinctions which...... are also rooted in the lexical-semantic classification of Russian nouns. The presentation focuses on semantics, both lexical and grammatical, and not least the connection between these two levels of content. The principal theoretical impact is the insight that grammar and lexis should not be seen...

  11. Semantic Reasoning for Scene Interpretation

    DEFF Research Database (Denmark)

    Jensen, Lars Baunegaard With; Baseski, Emre; Pugeault, Nicolas

    2008-01-01

    In this paper, we propose a hierarchical architecture for representing scenes, covering 2D and 3D aspects of visual scenes as well as the semantic relations between the different aspects. We argue that labeled graphs are a suitable representational framework for this representation and demonstrat...

  12. Semantic Preview Benefit during Reading

    Science.gov (United States)

    Hohenstein, Sven; Kliegl, Reinhold

    2014-01-01

    Word features in parafoveal vision influence eye movements during reading. The question of whether readers extract semantic information from parafoveal words was studied in 3 experiments by using a gaze-contingent display change technique. Subjects read German sentences containing 1 of several preview words that were replaced by a target word…

  13. Semantic search during divergent thinking.

    Science.gov (United States)

    Hass, Richard W

    2017-09-01

    Divergent thinking, as a method of examining creative cognition, has not been adequately analyzed in the context of modern cognitive theories. This article casts divergent thinking responding in the context of theories of memory search. First, it was argued that divergent thinking tasks are similar to semantic fluency tasks, but are more constrained, and less well structured. Next, response time distributions from 54 participants were analyzed for temporal and semantic clustering. Participants responded to two prompts from the alternative uses test: uses for a brick and uses for a bottle, for two minutes each. Participants' cumulative response curves were negatively accelerating, in line with theories of search of associative memory. However, results of analyses of semantic and temporal clustering suggested that clustering is less evident in alternative uses responding compared to semantic fluency tasks. This suggests either that divergent thinking responding does not involve an exhaustive search through a clustered memory trace, but rather that the process is more exploratory, yielding fewer overall responses that tend to drift away from close associates of the divergent thinking prompt. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. SEMSIN SEMANTIC AND SYNTACTIC PARSER

    Directory of Open Access Journals (Sweden)

    K. K. Boyarsky

    2015-09-01

    Full Text Available The paper deals with the principle of operation for SemSin semantic and syntactic parser creating a dependency tree for the Russian language sentences. The parser consists of 4 blocks: a dictionary, morphological analyzer, production rules and lexical analyzer. An important logical part of the parser is pre-syntactical module, which harmonizes and complements morphological analysis results, separates the text paragraphs into individual sentences, and also carries out predisambiguation. Characteristic feature of the presented parser is an open type of control – it is done by means of a set of production rules. A varied set of commands provides the ability to both morphological and semantic-syntactic analysis of the sentence. The paper presents the sequence of rules usage and examples of their work. Specific feature of the rules is the decision making on establishment of syntactic links with simultaneous removal of the morphological and semantic ambiguity. The lexical analyzer provides the execution of commands and rules, and manages the parser in manual or automatic modes of the text analysis. In the first case, the analysis is performed interactively with the possibility of step-by-step execution of the rules and scanning the resulting parse tree. In the second case, analysis results are filed in an xml-file. Active usage of syntactic and semantic dictionary information gives the possibility to reduce significantly the ambiguity of parsing. In addition to marking the text, the parser is also usable as a tool for information extraction from natural language texts.

  15. Bare coordination: the semantic shift

    NARCIS (Netherlands)

    de Swart, Henriette; Le Bruyn, Bert

    2014-01-01

    This paper develops an analysis of the syntax-semantics interface of two types of split coordination structures. In the first type, two bare singular count nouns appear as arguments in a coordinated structure, as in bride and groom were happy. We call this the N&N construction. In the second type,

  16. A development framework for semantically interoperable health information systems.

    Science.gov (United States)

    Lopez, Diego M; Blobel, Bernd G M E

    2009-02-01

    Semantic interoperability is a basic challenge to be met for new generations of distributed, communicating and co-operating health information systems (HIS) enabling shared care and e-Health. Analysis, design, implementation and maintenance of such systems and intrinsic architectures have to follow a unified development methodology. The Generic Component Model (GCM) is used as a framework for modeling any system to evaluate and harmonize state of the art architecture development approaches and standards for health information systems as well as to derive a coherent architecture development framework for sustainable, semantically interoperable HIS and their components. The proposed methodology is based on the Rational Unified Process (RUP), taking advantage of its flexibility to be configured for integrating other architectural approaches such as Service-Oriented Architecture (SOA), Model-Driven Architecture (MDA), ISO 10746, and HL7 Development Framework (HDF). Existing architectural approaches have been analyzed, compared and finally harmonized towards an architecture development framework for advanced health information systems. Starting with the requirements for semantic interoperability derived from paradigm changes for health information systems, and supported in formal software process engineering methods, an appropriate development framework for semantically interoperable HIS has been provided. The usability of the framework has been exemplified in a public health scenario.

  17. Semantic Segmentation of Aerial Images with AN Ensemble of Cnns

    Science.gov (United States)

    Marmanis, D.; Wegner, J. D.; Galliani, S.; Schindler, K.; Datcu, M.; Stilla, U.

    2016-06-01

    This paper describes a deep learning approach to semantic segmentation of very high resolution (aerial) images. Deep neural architectures hold the promise of end-to-end learning from raw images, making heuristic feature design obsolete. Over the last decade this idea has seen a revival, and in recent years deep convolutional neural networks (CNNs) have emerged as the method of choice for a range of image interpretation tasks like visual recognition and object detection. Still, standard CNNs do not lend themselves to per-pixel semantic segmentation, mainly because one of their fundamental principles is to gradually aggregate information over larger and larger image regions, making it hard to disentangle contributions from different pixels. Very recently two extensions of the CNN framework have made it possible to trace the semantic information back to a precise pixel position: deconvolutional network layers undo the spatial downsampling, and Fully Convolution Networks (FCNs) modify the fully connected classification layers of the network in such a way that the location of individual activations remains explicit. We design a FCN which takes as input intensity and range data and, with the help of aggressive deconvolution and recycling of early network layers, converts them into a pixelwise classification at full resolution. We discuss design choices and intricacies of such a network, and demonstrate that an ensemble of several networks achieves excellent results on challenging data such as the ISPRS semantic labeling benchmark, using only the raw data as input.

  18. Vital analysis: annotating sensed physiological signals with the stress levels of first responders in action.

    Science.gov (United States)

    Gomes, P; Kaiseler, M; Queirós, C; Oliveira, M; Lopes, B; Coimbra, M

    2012-01-01

    First responders such as firefighters are exposed to extreme stress and fatigue situations during their work routines. It is thus desirable to monitor their health using wearable sensing but this is a complex and still unsolved research challenge that requires large amounts of properly annotated physiological signals data. In this paper we show that the information gathered by our Vital Analysis Framework can support the annotation of these vital signals with the stress levels perceived by the target user, confirmed by the analysis of more than 4600 hours of data collected from real firefighters in action, including 717 answers to event questionnaires from a total of 454 different events.

  19. MPEG-7 based video annotation and browsing

    Science.gov (United States)

    Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens

    2003-11-01

    The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kind of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user-interface in order to provide content-based access to the video stream, but further for media browsing on a streaming server.

  20. ACID: annotation of cassette and integron data

    Directory of Open Access Journals (Sweden)

    Stokes Harold W

    2009-04-01

    Full Text Available Abstract Background Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms. Description By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided. Conclusion ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.

  1. Phonological learning in semantic dementia.

    Science.gov (United States)

    Jefferies, Elizabeth; Bott, Samantha; Ehsan, Sheeba; Lambon Ralph, Matthew A

    2011-04-01

    Patients with semantic dementia (SD) have anterior temporal lobe (ATL) atrophy that gives rise to a highly selective deterioration of semantic knowledge. Despite pronounced anomia and poor comprehension of words and pictures, SD patients have well-formed, fluent speech and normal digit span. Given the intimate connection between phonological STM and word learning revealed by both neuropsychological and developmental studies, SD patients might be expected to show good acquisition of new phonological forms, even though their ability to map these onto meanings is impaired. In contradiction of these predictions, a limited amount of previous research has found poor learning of new phonological forms in SD. In a series of experiments, we examined whether SD patient, GE, could learn novel phonological sequences and, if so, under which circumstances. GE showed normal benefits of phonological knowledge in STM (i.e., normal phonotactic frequency and phonological similarity effects) but reduced support from semantic memory (i.e., poor immediate serial recall for semantically degraded words, characterised by frequent item errors). Next, we demonstrated normal learning of serial order information for repeated lists of single-digit number words using the Hebb paradigm: these items were well-understood allowing them to be repeated without frequent item errors. In contrast, patient GE showed little learning of nonsense syllable sequences using the same Hebb paradigm. Detailed analysis revealed that both GE and the controls showed a tendency to learn their own errors as opposed to the target items. Finally, we showed normal learning of phonological sequences for GE when he was prevented from repeating his errors. These findings confirm that the ATL atrophy in SD disrupts phonological processing for semantically degraded words but leaves the phonological architecture intact. Consequently, when item errors are minimised, phonological STM can support the acquisition of new phoneme

  2. Concept indexing and expansion for social multimedia websites based on semantic processing and graph analysis

    Science.gov (United States)

    Lin, Po-Chuan; Chen, Bo-Wei; Chang, Hangbae

    2016-07-01

    This study presents a human-centric technique for social video expansion based on semantic processing and graph analysis. The objective is to increase metadata of an online video and to explore related information, thereby facilitating user browsing activities. To analyze the semantic meaning of a video, shots and scenes are firstly extracted from the video on the server side. Subsequently, this study uses annotations along with ConceptNet to establish the underlying framework. Detailed metadata, including visual objects and audio events among the predefined categories, are indexed by using the proposed method. Furthermore, relevant online media associated with each category are also analyzed to enrich the existing content. With the above-mentioned information, users can easily browse and search the content according to the link analysis and its complementary knowledge. Experiments on a video dataset are conducted for evaluation. The results show that our system can achieve satisfactory performance, thereby demonstrating the feasibility of the proposed idea.

  3. SEMANTIC INDEXING OF TERRASAR-X AND IN SITU DATA FOR URBAN ANALYTICS

    Directory of Open Access Journals (Sweden)

    D. Espinoza Molina

    2015-12-01

    Full Text Available This paper presents the semantic indexing of TerraSAR-X images and in situ data. Image processing together with machine learning methods, relevance feedback techniques, and human expertise are used to annotate the image content into a land use land cover catalogue. All the generated information is stored into a geo-database supporting the link between different types of information and the computation of queries and analytics. We used 11 TerraSAR-X scenes over Germany and LUCAS as in situ data. The semantic index is composed of about 73 land use land cover categories found in TerraSAR-X test dataset and 84 categories found in LUCAS dataset.

  4. A Denotational Semantics for Communicating Unstructured Code

    Directory of Open Access Journals (Sweden)

    Nils Jähnig

    2015-03-01

    Full Text Available An important property of programming language semantics is that they should be compositional. However, unstructured low-level code contains goto-like commands making it hard to define a semantics that is compositional. In this paper, we follow the ideas of Saabas and Uustalu to structure low-level code. This gives us the possibility to define a compositional denotational semantics based on least fixed points to allow for the use of inductive verification methods. We capture the semantics of communication using finite traces similar to the denotations of CSP. In addition, we examine properties of this semantics and give an example that demonstrates reasoning about communication and jumps. With this semantics, we lay the foundations for a proof calculus that captures both, the semantics of unstructured low-level code and communication.

  5. A Semantics for Distributed Execution of Statemate

    DEFF Research Database (Denmark)

    Fränzle, Martin; Niehaus, Jürgen; Metzner, Alexander

    2003-01-01

    We present a semantics for the statechart variant implemented in the Statemate product of i-Logix. Our semantics enables distributed code generation for Statemate models in the context of rapid prototyping for embedded control applications. We argue that it seems impossible to efficiently generate......, the changes made regarding the interaction of distributed model parts are similar to the interaction between the model and its environment in the original semantics, thus giving designers a familiar execution model. The semantics has been implemented in Grace, a framework for rapid prototyping code generation...... distributed code using the original Statemate semantics. The new, distributed semantics has the advantages that, first, it enables the generation of efficient distributed code, second, it preserves many aspects of the original semantics for those parts of a model that are not distributed, and third...

  6. Alignment of the UMLS semantic network with BioTop: methodology and assessment.

    Science.gov (United States)

    Schulz, Stefan; Beisswanger, Elena; van den Hoek, László; Bodenreider, Olivier; van Mulligen, Erik M

    2009-06-15

    For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed aligning the SN with BioTop. The theoretical foundations and the practical realization of the alignment are being described, with a focus on the design decisions taken, the problems encountered and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability. The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up several design errors that could be fixed. A comparison between a human curator and the ontology yielded only a low agreement. Ontology reasoning was also used to successfully identify 133 inconsistent semantic-type combinations. BioTop, the OWL DL representation of the UMLS SN, and the mapping ontology are available at http://www.purl.org/biotop/.

  7. SEMANTIC INDEXING OF TERRASAR-X AND IN SITU DATA FOR URBAN ANALYTICS

    OpenAIRE

    Espinoza Molina, D.; Alonso, K.; Datcu, M.

    2015-01-01

    This paper presents the semantic indexing of TerraSAR-X images and in situ data. Image processing together with machine learning methods, relevance feedback techniques, and human expertise are used to annotate the image content into a land use land cover catalogue. All the generated information is stored into a geo-database supporting the link between different types of information and the computation of queries and analytics. We used 11 TerraSAR-X scenes over Germany and LUCAS as in situ dat...

  8. A shortest-path graph kernel for estimating gene product semantic similarity

    Directory of Open Access Journals (Sweden)

    Alvarez Marco A

    2011-07-01

    Full Text Available Abstract Background Existing methods for calculating semantic similarity between gene products using the Gene Ontology (GO often rely on external resources, which are not part of the ontology. Consequently, changes in these external resources like biased term distribution caused by shifting of hot research topics, will affect the calculation of semantic similarity. One way to avoid this problem is to use semantic methods that are "intrinsic" to the ontology, i.e. independent of external knowledge. Results We present a shortest-path graph kernel (spgk method that relies exclusively on the GO and its structure. In spgk, a gene product is represented by an induced subgraph of the GO, which consists of all the GO terms annotating it. Then a shortest-path graph kernel is used to compute the similarity between two graphs. In a comprehensive evaluation using a benchmark dataset, spgk compares favorably with other methods that depend on external resources. Compared with simUI, a method that is also intrinsic to GO, spgk achieves slightly better results on the benchmark dataset. Statistical tests show that the improvement is significant when the resolution and EC similarity correlation coefficient are used to measure the performance, but is insignificant when the Pfam similarity correlation coefficient is used. Conclusions Spgk uses a graph kernel method in polynomial time to exploit the structure of the GO to calculate semantic similarity between gene products. It provides an alternative to both methods that use external resources and "intrinsic" methods with comparable performance.

  9. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. nGASP - the nematode genome annotation assessment project

    Energy Technology Data Exchange (ETDEWEB)

    Coghlan, A; Fiedler, T J; McKay, S J; Flicek, P; Harris, T W; Blasiar, D; Allen, J; Stein, L D

    2008-12-19

    While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets for 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second place. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy as reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs were the most challenging for gene-finders. While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets for 10 Mb of the C

  11. GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

    Science.gov (United States)

    Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

    2017-10-01

    Owing to wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated, such as the gene structure, alternative splicing, and noncoding loci. Annotation information of genome is prevalently stored as plain text in General Feature Format (GFF), which could be hundreds or thousands Mb in size. Therefore, it is a challenge for manipulating GFF file for biologists who have no bioinformatic skill. In this study, we provide a web server (GFFview) for parsing the annotation information of eukaryotic genome and then generating statistical description of six indices for visualization. GFFview is very useful for investigating quality and difference of the de novo assembled transcriptome in RNA-seq studies.

  12. Motion lecture annotation system to learn Naginata performances

    Science.gov (United States)

    Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko

    2013-12-01

    This paper describes a learning assistant system using motion capture data and annotation to teach "Naginata-jutsu" (a skill to practice Japanese halberd) performance. There are some video annotation tools such as YouTube. However these video based tools have only single angle of view. Our approach that uses motion-captured data allows us to view any angle. A lecturer can write annotations related to parts of body. We have made a comparison of effectiveness between the annotation tool of YouTube and the proposed system. The experimental result showed that our system triggered more annotations than the annotation tool of YouTube.

  13. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.......This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given....

  14. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  15. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  16. A Flexible Object-of-Interest Annotation Framework for Online Video Portals

    Directory of Open Access Journals (Sweden)

    Robert Sorschag

    2012-02-01

    Full Text Available In this work, we address the use of object recognition techniques to annotate what is shown where in online video collections. These annotations are suitable to retrieve specific video scenes for object related text queries which is not possible with the manually generated metadata that is used by current portals. We are not the first to present object annotations that are generated with content-based analysis methods. However, the proposed framework possesses some outstanding features that offer good prospects for its application in real video portals. Firstly, it can be easily used as background module in any video environment. Secondly, it is not based on a fixed analysis chain but on an extensive recognition infrastructure that can be used with all kinds of visual features, matching and machine learning techniques. New recognition approaches can be integrated into this infrastructure with low development costs and a configuration of the used recognition approaches can be performed even on a running system. Thus, this framework might also benefit from future advances in computer vision. Thirdly, we present an automatic selection approach to support the use of different recognition strategies for different objects. Last but not least, visual analysis can be performed efficiently on distributed, multi-processor environments and a database schema is presented to store the resulting video annotations as well as the off-line generated low-level features in a compact form. We achieve promising results in an annotation case study and the instance search task of the TRECVID 2011 challenge.

  17. MiMiR: a comprehensive solution for storage, annotation and exchange of microarray data

    Directory of Open Access Journals (Sweden)

    Rahman Fatimah

    2005-11-01

    Full Text Available Abstract Background The generation of large amounts of microarray data presents challenges for data collection, annotation, exchange and analysis. Although there are now widely accepted formats, minimum standards for data content and ontologies for microarray data, only a few groups are using them together to build and populate large-scale databases. Structured environments for data management are crucial for making full use of these data. Description The MiMiR database provides a comprehensive infrastructure for microarray data annotation, storage and exchange and is based on the MAGE format. MiMiR is MIAME-supportive, customised for use with data generated on the Affymetrix platform and includes a tool for data annotation using ontologies. Detailed information on the experiment, methods, reagents and signal intensity data can be captured in a systematic format. Reports screens permit the user to query the database, to view annotation on individual experiments and provide summary statistics. MiMiR has tools for automatic upload of the data from the microarray scanner and export to databases using MAGE-ML. Conclusion MiMiR facilitates microarray data management, annotation and exchange, in line with international guidelines. The database is valuable for underpinning research activities and promotes a systematic approach to data handling. Copies of MiMiR are freely available to academic groups under licence.

  18. Solar Tutorial and Annotation Resource (STAR)

    Science.gov (United States)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR" suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven

  19. Homophonic and semantic priming of Japanese Kanji words: a time course study.

    Science.gov (United States)

    Chen, Hsi-Chin; Yamauchi, Takashi; Tamaoka, Katsuo; Vaid, Jyotsna

    2007-02-01

    In an examination of the time course of activation of phonological and semantic information in processing kanji script, two lexical decision experiments were conducted with native readers of Japanese. Kanji targets were preceded at short (85-msec) and long (150-msec) intervals by homophonic, semantically related, or unrelated primes presented in kanji (Experiment 1) or by hiragana transcriptions of the kanji primes (Experiment 2). When primes were in kanji, semantic relatedness facilitated kanji target recognition at both intervals but homophonic relatedness did not. When primes were in hiragana, kanji target recognition was facilitated by homophonic relatedness at both intervals and by semantic relatedness only at the longer interval. The absence of homophonic priming of kanji targets by kanji primes challenges the universal phonology principle's claim that phonology is central to accessing meaning from print. The stimuli used in the present study may be downloaded from www.psychonomic.org/archive.

  20. A Postcolonial Semantics of Personhood

    DEFF Research Database (Denmark)

    Levisen, Carsten

    that provide an answer to the question: “what makes up a person?” The paper aims toarticulate semantic explications and cultural scripts for personhood constructs in Bislama, beingmindful of the anglicizations, contradictions, and reinventions that are characteristic of postcolonialdiscourse......, 2013-2015 (Levisen 2016a, 2016b). I willfocus on the keyword tingting ‘mind, heart’ (from English ‘think-think’), and the related concepts speret(from English ‘spirit’), devil (from English ‘devil’), and pija (from English ‘picture’), as well as morerecent imports from English: maen (mind), sol (soul...... levels. Traditionalterms like devil and pija are being problematized by urban speakers, and are both in decline. Sol,maen, and had have become more common, and speret/spirit has undergone a semanticanglicization. Tingting remains the key construct, around which Bislama personhood semantics isorganized...